We encountered a problem that the disconnect command hangs.
After analyzing the log and stack, we found that the triggering
process is as follows:
CPU0 CPU1
nvme_rdma_error_recovery_work
nvme_rdma_teardown_io_queues
nvme_do_delete_ctrl nvme_stop_queues
nvme_remove_namespaces
--clear ctrl->namespaces
nvme_start_queues
--no ns in ctrl->namespaces
nvme_ns_remove return(because ctrl is deleting)
blk_freeze_queue
blk_mq_freeze_queue_wait
--wait for ns to unquiesce to clean infligt IO, hang forever
This problem was not found in older kernels because we will flush
err work in nvme_stop_ctrl before nvme_remove_namespaces.It does not
seem to be modified for functional reasons, the patch can be revert
to solve the problem.
Revert commit 794a4cb3d2 ("nvme: remove the .stop_ctrl callout")
Signed-off-by: Ruozhu Li <liruozhu@huawei.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
queue stoppage and inflight requests cancellation is fully fenced from
io_work and thus failing a request from this context. Hence we don't
need to try to guess from the socket retcode if this failure is because
the queue is about to be torn down or not.
We are perfectly safe to just fail it, the request will not be cancelled
later on.
This solves possible very long shutdown delays when the users issues a
'nvme disconnect-all'
Reported-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Data digest calculation iterates over command mapped iovec. However
since commit bac04454ef we unmap the iovec before we handle the data
digest, and since commit 69b85e1f1d we clear nr_mapped when we unmap
the iov.
Instead of open-coding the command iov traversal, simply call
crypto_ahash_digest with the command sg that is already allocated (we
already do that for the send path). Rename nvmet_tcp_send_ddgst to
nvmet_tcp_calc_ddgst and call it from send and recv paths.
Fixes: 69b85e1f1d ("nvmet-tcp: add an helper to free the cmd buffers")
Fixes: bac04454ef ("nvmet-tcp: fix kmap leak when data digest in use")
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
This device shares the PCI ID with the Samsung 970 Evo Plus that
does not need or want the quirks. Move the the quirk entry to the
core table based on the model number instead.
Fixes: bc360b0b16 ("nvme-pci: add quirks for Samsung X5 SSDs")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Pankaj Raghav <p.raghav@samsung.com>
The Micron MTFDKBA2T0TFH device reports the same subsysem NQN for
all devices. Add a quick to ignore it.
Signed-off-by: Leo Savernik <l.savernik@aon.at>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Like commit 5611ec2b98 ("nvme-pci: prevent SK hynix PC400 from using
Write Zeroes command"), UMIS and Samsung has the same issue:
[ 6305.633887] blk_update_request: operation not supported error,
dev nvme0n1, sector 340812032 op 0x9:(WRITE_ZEROES) flags 0x0
phys_seg 0 prio class 0
So also disable Write Zeroes command on UMIS and Samsung.
Signed-off-by: rasheed.hsueh <rasheed.hsueh@lcfc.corp-partner.google.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When ZHITAI TiPro7000 SSDs entered deepest power state(ps4)
it has the same APST sleep problem as Kingston A2000.
by chance the system crashes and displays the same dmesg info:
https://bugzilla.kernel.org/show_bug.cgi?id=195039#c65
As the Archlinux wiki suggest (enlat + exlat) < 25000 is fine
and my testing shows no system crashes ever since.
Therefore disabling the deepest power state will fix the APST sleep issue.
https://wiki.archlinux.org/title/Solid_state_drive/NVMe
This is the APST data from 'nvme id-ctrl /dev/nvme1'
NVME Identify Controller:
vid : 0x1e49
ssvid : 0x1e49
sn : [...]
mn : ZHITAI TiPro7000 1TB
fr : ZTA32F3Y
[...]
ps 0 : mp:3.50W operational enlat:5 exlat:5 rrt:0 rrl:0
rwt:0 rwl:0 idle_power:- active_power:-
ps 1 : mp:3.30W operational enlat:50 exlat:100 rrt:1 rrl:1
rwt:1 rwl:1 idle_power:- active_power:-
ps 2 : mp:2.80W operational enlat:50 exlat:200 rrt:2 rrl:2
rwt:2 rwl:2 idle_power:- active_power:-
ps 3 : mp:0.1500W non-operational enlat:500 exlat:5000 rrt:3 rrl:3
rwt:3 rwl:3 idle_power:- active_power:-
ps 4 : mp:0.0200W non-operational enlat:2000 exlat:60000 rrt:4 rrl:4
rwt:4 rwl:4 idle_power:- active_power:-
Signed-off-by: Ning Wang <ningwang35@outlook.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
ADATA XPG GAMMIX S50 drives report bogus eui64 values that appear to
be the same across drives in one system. Quirk them out so they are
not marked as "non globally unique" duplicates.
Signed-off-by: Stefan Reiter <stefan@pimaker.at>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Many users have encountered IO timeouts with a CSTS value of 0xffffffff,
which indicates a failure to read the register. While there are various
potential causes for this observation, faulty NVMe APST has been the
culprit quite frequently. Add the recommended troubleshooting steps in
the error output when this condition occurs.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The recent global id check is finding poorly implemented devices in the
wild. Include relavant device information in the output to help quicken
an appropriate quirk patch.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
This provides more context to users.
Old message:
[ 00.000000] No UUID available providing old NGUID
New message:
[ 00.000000] block nvme0n1: No UUID available providing old NGUID
Fixes: d934f9848a ("nvme: provide UUID value to userspace")
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Mostly small bug fixes plus other trivial updates. The major change
of note is moving ufs out of scsi and a minor update to lpfc vmid
handling.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYptvayYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishb1rAPwJWSl1
kB9Qt1xPa1JK0Z7To2LRkrQ4uxYbAnpTLEP6UgEA3YuBccXRppcYQe5CXCrcZryz
BtTbtI3ApiV40xC/SRk=
=Uit3
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
"Mostly small bug fixes plus other trivial updates.
The major change of note is moving ufs out of scsi and a minor update
to lpfc vmid handling"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (24 commits)
scsi: qla2xxx: Remove unused 'ql_dm_tgt_ex_pct' parameter
scsi: qla2xxx: Remove setting of 'req' and 'rsp' parameters
scsi: mpi3mr: Fix kernel-doc
scsi: lpfc: Add support for ATTO Fibre Channel devices
scsi: core: Return BLK_STS_TRANSPORT for ALUA transitioning
scsi: sd_zbc: Prevent zone information memory leak
scsi: sd: Fix potential NULL pointer dereference
scsi: mpi3mr: Rework mrioc->bsg_device model to fix warnings
scsi: myrb: Fix up null pointer access on myrb_cleanup()
scsi: core: Unexport scsi_bus_type
scsi: sd: Don't call blk_cleanup_disk() in sd_probe()
scsi: ufs: ufshcd: Delete unnecessary NULL check
scsi: isci: Fix typo in comment
scsi: pmcraid: Fix typo in comment
scsi: smartpqi: Fix typo in comment
scsi: qedf: Fix typo in comment
scsi: esas2r: Fix typo in comment
scsi: storvsc: Fix typo in comment
scsi: ufs: Split the drivers/scsi/ufs directory
scsi: qla1280: Remove redundant variable
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmKZmkoQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpqyrD/4iyg2ULBPyljLoM3Ed8AONbrFBApenKDFN
FjiFRZNll8zvJLTtP0GQqJIrljPuRlqb0IkbgqXl+yvPZ+wpB9HgIe2aohkOaqqJ
KS49UNR/aIHwC2y7lwlcsdgVqqoPdc4wnZeaQvCsWPCBhCca/k0kR7s3uEhHMK92
OWpV9osl/thLfOBwwt4IEaO1Koz8PM/fCR4XA2KLbs8E4P8EcSFglqi0ap7foLNr
pQAJIlPjkmF6nw4xg5fdBjVBo//kVcuf9IBMi5/XinmUL1taFAcn5WyeOvBbi0Fs
Sqp/pKkveM8xWZKrDyA/wf8nzRNpBBl6TOQEMFtV6FkZrij3pbKHlCiyL2gBR13e
5gkbVvXgtdqDVlnqlvIV/Swfh5YQFtn7+vlHFOUjP+iObsBRo4fTvDhPTLoO/VCf
tIAA7xnq/pquRKS/QGGC7ZxRVc3T1r+EpvbBP5Dc4CDGbbbyZLCSrOh5HWIb0I3k
95GSiipTtf54KqZ9HiG/u+xNAFIdapXgU4Xm+JyDRWLdxSs5Nmy8VgpdflvxBfuo
hCvJJw3vtDusjyHc7IafxaZlJVQT9tPgPshs8GfrCMCP19RCALD/5irspFh6dD35
BQTIkhC68XNa0iNn/NTP3uxir/JwRoovxQkA9eD+r1NHsAbL8GTypfr5kKJxDaIK
UhawfyZE3Q==
=YqhC
-----END PGP SIGNATURE-----
Merge tag 'for-5.19/drivers-2022-06-02' of git://git.kernel.dk/linux-block
Pull more block driver updates from Jens Axboe:
"A collection of stragglers that were late on sending in their changes
and just followup fixes.
- NVMe fixes pull request via Christoph:
- set controller enable bit in a separate write (Niklas Cassel)
- disable namespace identifiers for the MAXIO MAP1001 (Christoph)
- fix a comment typo (Julia Lawall)"
- MD fixes pull request via Song:
- Remove uses of bdevname (Christoph Hellwig)
- Bug fixes (Guoqing Jiang, and Xiao Ni)
- bcache fixes series (Coly)
- null_blk zoned write fix (Damien)
- nbd fixes (Yu, Zhang)
- Fix for loop partition scanning (Christoph)"
* tag 'for-5.19/drivers-2022-06-02' of git://git.kernel.dk/linux-block: (23 commits)
block: null_blk: Fix null_zone_write()
nvmet: fix typo in comment
nvme: set controller enable bit in a separate write
nvme-pci: disable namespace identifiers for the MAXIO MAP1001
bcache: avoid unnecessary soft lockup in kworker update_writeback_rate()
nbd: use pr_err to output error message
nbd: fix possible overflow on 'first_minor' in nbd_dev_add()
nbd: fix io hung while disconnecting device
nbd: don't clear 'NBD_CMD_INFLIGHT' flag if request is not completed
nbd: fix race between nbd_alloc_config() and module removal
nbd: call genl_unregister_family() first in nbd_cleanup()
md: bcache: check the return value of kzalloc() in detached_dev_do_request()
bcache: memset on stack variables in bch_btree_check() and bch_sectors_dirty_init()
block, loop: support partitions without scanning
bcache: avoid journal no-space deadlock by reserving 1 journal bucket
bcache: remove incremental dirty sector counting for bch_sectors_dirty_init()
bcache: improve multithreaded bch_sectors_dirty_init()
bcache: improve multithreaded bch_btree_check()
md: fix double free of io_acct_set bioset
md: Don't set mddev private to NULL in raid0 pers->free
...
Spelling mistake (triple letters) in comment.
Detected with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The NVM Express Base Specification 2.0 specifies in the description
of the CC – Controller Configuration register:
"Host software shall set the Arbitration Mechanism Selected (CC.AMS),
the Memory Page Size (CC.MPS), and the I/O Command Set Selected (CC.CSS)
to valid values prior to enabling the controller by setting CC.EN to ‘1’.
While we haven't seen any controller misbehaving while setting all bits
in a single write, let's do it in the order that it is written in the
spec, as there could potentially be controllers that are implemented to
rely on the configuration bits being set before enabling the controller.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Let the caller set it together with the end_io_data instead of passing
a pointless argument. Note the the target code did in fact already
set it and then just overrode it again by calling blk_execute_rq_nowait.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220524121530.943123-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Small collection of incremental improvement patches:
- Minor code cleanup patches, comment improvements, etc from static tools
- Clean the some of the kernel caps, reducing the historical stealth uAPI
leftovers
- Bug fixes and minor changes for rdmavt, hns, rxe, irdma
- Remove unimplemented cruft from rxe
- Reorganize UMR QP code in mlx5 to avoid going through the IB verbs layer
- flush_workqueue(system_unbound_wq) removal
- Ensure rxe waits for objects to be unused before allowing the core to
free them
- Several rc quality bug fixes for hfi1
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCYo+NxgAKCRCFwuHvBreF
YbqSAQDJ+QolaATUvOQUPLbuLopUCJLe95VS15Kl3SNXiVUUFAEA8DLL1s6+WShd
AgypUxGHipx5BAytrn45/WiwuDeEbQ8=
=jgTl
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"Small collection of incremental improvement patches:
- Minor code cleanup patches, comment improvements, etc from static
tools
- Clean the some of the kernel caps, reducing the historical stealth
uAPI leftovers
- Bug fixes and minor changes for rdmavt, hns, rxe, irdma
- Remove unimplemented cruft from rxe
- Reorganize UMR QP code in mlx5 to avoid going through the IB verbs
layer
- flush_workqueue(system_unbound_wq) removal
- Ensure rxe waits for objects to be unused before allowing the core
to free them
- Several rc quality bug fixes for hfi1"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (67 commits)
RDMA/rtrs-clt: Fix one kernel-doc comment
RDMA/hfi1: Remove all traces of diagpkt support
RDMA/hfi1: Consolidate software versions
RDMA/hfi1: Remove pointless driver version
RDMA/hfi1: Fix potential integer multiplication overflow errors
RDMA/hfi1: Prevent panic when SDMA is disabled
RDMA/hfi1: Prevent use of lock before it is initialized
RDMA/rxe: Fix an error handling path in rxe_get_mcg()
IB/core: Fix typo in comment
RDMA/core: Fix typo in comment
IB/hf1: Fix typo in comment
IB/qib: Fix typo in comment
IB/iser: Fix typo in comment
RDMA/mlx4: Avoid flush_scheduled_work() usage
IB/isert: Avoid flush_scheduled_work() usage
RDMA/mlx5: Remove duplicate pointer assignment in mlx5_ib_alloc_implicit_mr()
RDMA/qedr: Remove unnecessary synchronize_irq() before free_irq()
RDMA/hns: Use hr_reg_read() instead of remaining roce_get_xxx()
RDMA/hns: Use hr_reg_xxx() instead of remaining roce_set_xxx()
RDMA/irdma: Add SW mechanism to generate completions on error
...
There are minor updates to SoC specific drivers for chips by Rockchip,
Samsung, NVIDIA, TI, NXP, i.MX, Qualcomm, and Broadcom. Noteworthy
driver changes include:
- Several conversions of DT bindings to yaml format.
- Renesas adds driver support for R-Car V4H, RZ/V2M and RZ/G2UL SoCs.
- Qualcomm adds a bus driver for the SSC (Snapdragon Sensor Core),
and support for more chips in the RPMh power domains and the soc-id.
- NXP has a new driver for the HDMI blk-ctrl on i.MX8MP.
- Apple M1 gains support for the on-chip NVMe controller, making it
possible to finally use the internal disks. This also includes SoC
drivers for their RTKit IPC and for the SART DMA address filter.
For other subsystems that merge their drivers through the SoC tree,
we have
- Firmware drivers for the ARM firmware stack including TEE, OP-TEE,
SCMI and FF-A get a number of smaller updates and cleanups. OP-TEE
now has a cache for firmware argument structures as an optimization,
and SCMI now supports the 3.1 version of the specification.
- Reset controller updates to Amlogic, ASpeed, Renesas and ACPI drivers
- Memory controller updates for Tegra, and a few updates for other
platforms.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmKOXOoACgkQmmx57+YA
GNlpVQ//eQGfL0WktE5G/y0mCVuVHtXT5nSjHMgjTOdb9+QvaATCfxnLXvP7Gq7C
7YzJd68G+2ZC4rUkkjTxyMICT7eIrJSAIAFn4PWee4EQ5DfbHgG+1tToTjxqb+QQ
6wGB5MVaYUhjZE30kY2E8a+OKxHtEnkt9wcch6ei0vzsMZquQJF6byfHd5+I4Knd
CyDmXX8ZGXK3FnhvuBLr3Rgwyhs0X4Ju7UaONLZxBYxdnh8WmymRszmMnv5qEkub
KDe8fbhFamOT3Z55JdCA5xq7LvUzjsKpTGFxFcS0ptbkTmtAsuyYqqiWvAPx3D5u
5TxVGSx9QKid6fpIsITZ2ptO6fgljh1W9b/3Y3/eltudXsM1qqSxyN2Hre+M9egf
WEDADqbNR5Y5+bq1iZWI348jXkNHVPpsLHI9Ihqf4yyrKwFkmRmNLnws53XTAPH2
FPXZvJjwFDBDHGfewSoLFePXUPNytVLXbr6Mq72ZyTDIBDU8Mxh666Wd8bu4tgbG
1Y2pMjDIdXDOsljM6Of5D3XjM1kuDwEmFxWGy+cKLgoEbHLeE1xIbTjUir4687+d
VNHdtsIRFPRZzz2lUSmI8vlA2aewMWrkOF/Ulz8xh6gG8uitMSfOxghg4IWOfRVM
mlvgFP5eqTInmQcbWRxaRO9JzP+rPp1sAcEpsBmuEHw5Akflbc8=
=XoLF
-----END PGP SIGNATURE-----
Merge tag 'arm-drivers-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM driver updates from Arnd Bergmann:
"There are minor updates to SoC specific drivers for chips by Rockchip,
Samsung, NVIDIA, TI, NXP, i.MX, Qualcomm, and Broadcom.
Noteworthy driver changes include:
- Several conversions of DT bindings to yaml format.
- Renesas adds driver support for R-Car V4H, RZ/V2M and RZ/G2UL SoCs.
- Qualcomm adds a bus driver for the SSC (Snapdragon Sensor Core),
and support for more chips in the RPMh power domains and the
soc-id.
- NXP has a new driver for the HDMI blk-ctrl on i.MX8MP.
- Apple M1 gains support for the on-chip NVMe controller, making it
possible to finally use the internal disks. This also includes SoC
drivers for their RTKit IPC and for the SART DMA address filter.
For other subsystems that merge their drivers through the SoC tree, we
have
- Firmware drivers for the ARM firmware stack including TEE, OP-TEE,
SCMI and FF-A get a number of smaller updates and cleanups. OP-TEE
now has a cache for firmware argument structures as an
optimization, and SCMI now supports the 3.1 version of the
specification.
- Reset controller updates to Amlogic, ASpeed, Renesas and ACPI
drivers
- Memory controller updates for Tegra, and a few updates for other
platforms"
* tag 'arm-drivers-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (159 commits)
memory: tegra: Add MC error logging on Tegra186 onward
memory: tegra: Add memory controller channels support
memory: tegra: Add APE memory clients for Tegra234
memory: tegra: Add Tegra234 support
nvme-apple: fix sparse endianess warnings
soc/tegra: pmc: Document core domain fields
soc: qcom: pdr: use static for servreg_* variables
soc: imx: fix semicolon.cocci warnings
soc: renesas: R-Car V3U is R-Car Gen4
soc: imx: add i.MX8MP HDMI blk-ctrl
soc: imx: imx8m-blk-ctrl: Add i.MX8MP media blk-ctrl
soc: imx: add i.MX8MP HSIO blk-ctrl
soc: imx: imx8m-blk-ctrl: set power device name
soc: qcom: llcc: Add sc8180x and sc8280xp configurations
dt-bindings: arm: msm: Add sc8180x and sc8280xp LLCC compatibles
soc/tegra: pmc: Select REGMAP
dt-bindings: reset: st,sti-powerdown: Convert to yaml
dt-bindings: reset: st,sti-picophyreset: Convert to yaml
dt-bindings: reset: socfpga: Convert to yaml
dt-bindings: reset: snps,axs10x-reset: Convert to yaml
...
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmKKlIAeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGC3oH/iPm/fLG2sJut8My
sU0RC9K+6ESV5h2Qy6k00/lqKstlu4EvBjw4V8vYpx3Q2+hbSFMn2SeWqqqT3Lkk
Zb8KINCFuuyMtdCBb42PV0zhUf5pCQF7ocm/Ae4jllDHtPmqk3WJ6IGtZBK5JBlw
z6RR/wKt0y0MRj9eZyPyYjOee2L2vuVh4tgnexK/4L8g2ZtMMRThhvUzSMWG4zxR
STYYNp0uFcfT1Vt85+ODevFH4TvdECAj+SqAegN+seHLM17YY7M0/WiIYpxGRv8P
lIpDQl4PBU8EBkpI5hkpJ/3qPincbuVOMLsYfxFtpcjjG12vGjFp2krGpS3TedZQ
3mvaJ7c=
=vLke
-----END PGP SIGNATURE-----
Merge tag 'v5.18' into rdma.git for-next
Following patches have dependencies.
Resolve the merge conflict in
drivers/net/ethernet/mellanox/mlx5/core/main.c by keeping the new names
for the fs functions following linux-next:
https://lore.kernel.org/r/20220519113529.226bc3e2@canb.auug.org.au/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmKKrTcQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgph/REAC0/7odRfJeTJ1PkJhSKFc7dhyS7rK4du2s
3+z+H6Yeua2yVIJb0mYYGEJcOUUQ9nD2T9424n3NzDOw88U4y8Vg2YEH+UiJBuj4
AJoxPNkQdxL7WzmwHmRNLCcOOFhISLqWiCJSr45d+LP1f6aO24Q9lewYWxtNA4TW
mqb7Ne7e3Z77m9rmsCsZ26bzQHg1EEQ6qgjZM9tqMhOeTqYhmrqfrD9KtG8TIkpK
N8277E5QcequHf7v6VpKqEOzf3d2kx55JaZdu+oxLPVMED3wJJFwcYF1/xmM7Fgx
tp7xCjqqUHXwKvJNCFJpnvw+cXu0Ct7cWOIG4ROCvaTD4vBI1KzZLc0gO7pKFW0Y
hNIlMXr4n8PmonS81tMV4TqmRWxedX/jxuaeJCVNr89PqYU4luPpigJZqv7rlGry
KZUlktQot22M/7FC2MS6KhgbQKLPrRGTAEyY/JNwBHckCZiduWQFlmKLQ926xQIJ
6vdjSzHK5MrT/d+yow3bGFxAJWloGJ+L+RsH0b+WikF81+6ic9P3AoStgbVilfKD
6sbjcju8SShDlQ+W/Ocm0rHC+i/RDKT3QqItXgfhA/1FfMPODQGc/xcZg+AdTswn
VSnUIkvk9/mTO0StilVfNJDfG1QkSpJ5Ilvs/DnIahZj6IG4QbJvtnVNbmQX6ptz
AUB4DdGwXg==
=geQL
-----END PGP SIGNATURE-----
Merge tag 'for-5.19/drivers-2022-05-22' of git://git.kernel.dk/linux-block
Pull block driver updates from Jens Axboe:
"Here are the driver updates queued up for 5.19. This contains:
- NVMe pull requests via Christoph:
- tighten the PCI presence check (Stefan Roese)
- fix a potential NULL pointer dereference in an error path (Kyle
Miller Smith)
- fix interpretation of the DMRSL field (Tom Yan)
- relax the data transfer alignment (Keith Busch)
- verbose error logging improvements (Max Gurtovoy, Chaitanya
Kulkarni)
- misc cleanups (Chaitanya Kulkarni, Christoph)
- set non-mdts limits in nvme_scan_work (Chaitanya Kulkarni)
- add support for TP4084 - Time-to-Ready Enhancements (Christoph)
- MD pull request via Song:
- Improve annotation in raid5 code, by Logan Gunthorpe
- Support MD_BROKEN flag in raid-1/5/10, by Mariusz Tkaczyk
- Other small fixes/cleanups
- null_blk series making the configfs side much saner (Damien)
- Various minor drbd cleanups and fixes (Haowen, Uladzislau, Jiapeng,
Arnd, Cai)
- Avoid using the system workqueue (and hence flushing it) in rnbd
(Jack)
- Avoid using the system workqueue (and hence flushing it) in aoe
(Tetsuo)
- Series fixing discard_alignment issues in drivers (Christoph)
- Small series fixing drivers poking at disk->part0 for openers
information (Christoph)
- Series fixing deadlocks in loop (Christoph, Tetsuo)
- Remove loop.h and add SPDX headers (Christoph)
- Various fixes and cleanups (Julia, Xie, Yu)"
* tag 'for-5.19/drivers-2022-05-22' of git://git.kernel.dk/linux-block: (72 commits)
mtip32xx: fix typo in comment
nvme: set non-mdts limits in nvme_scan_work
nvme: add support for TP4084 - Time-to-Ready Enhancements
nvme: split the enum used for various register constants
nbd: Fix hung on disconnect request if socket is closed before
nvme-fabrics: add a request timeout helper
nvme-pci: harden drive presence detect in nvme_dev_disable()
nvme-pci: fix a NULL pointer dereference in nvme_alloc_admin_tags
nvme: mark internal passthru request RQF_QUIET
nvme: remove unneeded include from constants file
nvme: add missing status values to verbose logging
nvme: set dma alignment to dword
nvme: fix interpretation of DMRSL
loop: remove most the top-of-file boilerplate comment from the UAPI header
loop: remove most the top-of-file boilerplate comment
loop: add a SPDX header
loop: remove loop.h
block: null_blk: Improve device creation with configfs
block: null_blk: Cleanup messages
block: null_blk: Cleanup device creation and deletion
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmKKrUsQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpgDjD/44hY9h0JsOLoRH1IvFtuaH6n718JXuqG17
hHCfmnAUVqj2jT00IUbVlUTd905bCGpfrodBL3PAmPev1zZHOUd/MnJKrSynJ+/s
NJEMZQaHxLmocNDpJ1sZo7UbAFErsZXB0gVYUO8cH2bFYNu84H1mhRCOReYyqmvQ
aIAASX5qRB/ciBQCivzAJl2jTdn4WOn5hWi9RLidQB7kSbaXGPmgKAuN88WI4H7A
zQgAkEl2EEquyMI5tV1uquS7engJaC/4PsenF0S9iTyrhJLjneczJBJZKMLeMR8d
sOm6sKJdpkrfYDyaA4PIkgmLoEGTtwGpqGHl4iXTyinUAxJoca5tmPvBb3wp66GE
2Mr7pumxc1yJID2VHbsERXlOAX3aZNCowx2gum2MTRIO8g11Eu3aaVn2kv37MBJ2
4R2a/cJFl5zj9M8536cG+Yqpy0DDVCCQKUIqEupgEu1dyfpznyWH5BTAHXi1E8td
nxUin7uXdD0AJkaR0m04McjS/Bcmc1dc6I8xvkdUFYBqYCZWpKOTiEpIBlHg0XJA
sxdngyz5lSYTGVA4o4QCrdR0Tx1n36A1IYFuQj0wzxBJYZ02jEZuII/A3dd+8hiv
EY+VeUQeVIXFFuOcY+e0ScPpn7Nr17hAd1en/j2Hcoe4ZE8plqG2QTcnwgflcbis
iomvJ4yk0Q==
=0Rw1
-----END PGP SIGNATURE-----
Merge tag 'for-5.19/block-2022-05-22' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:
"Here are the core block changes for 5.19. This contains:
- blk-throttle accounting fix (Laibin)
- Series removing redundant assignments (Michal)
- Expose bio cache via the bio_set, so that DM can use it (Mike)
- Finish off the bio allocation interface cleanups by dealing with
the weirdest member of the family. bio_kmalloc combines a kmalloc
for the bio and bio_vecs with a hidden bio_init call and magic
cleanup semantics (Christoph)
- Clean up the block layer API so that APIs consumed by file systems
are (almost) only struct block_device based, so that file systems
don't have to poke into block layer internals like the
request_queue (Christoph)
- Clean up the blk_execute_rq* API (Christoph)
- Clean up various lose end in the blk-cgroup code to make it easier
to follow in preparation of reworking the blkcg assignment for bios
(Christoph)
- Fix use-after-free issues in BFQ when processes with merged queues
get moved to different cgroups (Jan)
- BFQ fixes (Jan)
- Various fixes and cleanups (Bart, Chengming, Fanjun, Julia, Ming,
Wolfgang, me)"
* tag 'for-5.19/block-2022-05-22' of git://git.kernel.dk/linux-block: (83 commits)
blk-mq: fix typo in comment
bfq: Remove bfq_requeue_request_body()
bfq: Remove superfluous conversion from RQ_BIC()
bfq: Allow current waker to defend against a tentative one
bfq: Relax waker detection for shared queues
blk-cgroup: delete rcu_read_lock_held() WARN_ON_ONCE()
blk-throttle: Set BIO_THROTTLED when bio has been throttled
blk-cgroup: Remove unnecessary rcu_read_lock/unlock()
blk-cgroup: always terminate io.stat lines
block, bfq: make bfq_has_work() more accurate
block, bfq: protect 'bfqd->queued' by 'bfqd->lock'
block: cleanup the VM accounting in submit_bio
block: Fix the bio.bi_opf comment
block: reorder the REQ_ flags
blk-iocost: combine local_stat and desc_stat to stat
block: improve the error message from bio_check_eod
block: allow passing a NULL bdev to bio_alloc_clone/bio_init_clone
block: remove superfluous calls to blkcg_bio_issue_init
kthread: unexport kthread_blkcg
blk-cgroup: cleanup blkcg_maybe_throttle_current
...
Add two new opcodes that userspace can use for admin commands:
NVME_URING_CMD_ADMIN : non-vectroed
NVME_URING_CMD_ADMIN_VEC : vectored variant
Wire up support when these are issued on controller node(/dev/nvmeX).
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220520090630.70394-3-joshi.k@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Factor out a helper consolidating the error checks, and fix typo in a
comment too. This is in preparation to support admin commands on this
path.
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220520090630.70394-2-joshi.k@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add nvme_fc_io_getuuid() to the nvme-fc transport. The routine is invoked
by the FC LLDD on a per-I/O request basis. The routine translates from the
FC-specific request structure to the bio and the cgroup structure in order
to obtain the FC appid stored in the cgroup structure. If a value is not
set or a bio is not found, a NULL appid (aka uuid) will be returned to the
LLDD.
Link: https://lore.kernel.org/r/20220519123110.17361-2-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In current implementation we set the non-mdts limits by calling
nvme_init_non_mdts_limits() from nvme_init_ctrl_finish().
This also tries to set the limits for the discovery controller which
has no I/O queues resulting in the warning message reported by the
nvme_log_error() when running blktest nvme/002: -
[ 2005.155946] run blktests nvme/002 at 2022-04-09 16:57:47
[ 2005.192223] loop: module loaded
[ 2005.196429] nvmet: adding nsid 1 to subsystem blktests-subsystem-0
[ 2005.200334] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
<------------------------------SNIP---------------------------------->
[ 2008.958108] nvmet: adding nsid 1 to subsystem blktests-subsystem-997
[ 2008.962082] nvmet: adding nsid 1 to subsystem blktests-subsystem-998
[ 2008.966102] nvmet: adding nsid 1 to subsystem blktests-subsystem-999
[ 2008.973132] nvmet: creating discovery controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN testhostnqn.
*[ 2008.973196] nvme1: Identify(0x6), Invalid Field in Command (sct 0x0 / sc 0x2) MORE DNR*
[ 2008.974595] nvme nvme1: new ctrl: "nqn.2014-08.org.nvmexpress.discovery"
[ 2009.103248] nvme nvme1: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"
Move the call of nvme_init_non_mdts_limits() to nvme_scan_work() after
we verify that I/O queues are created since that is a converging point
for each transport where these limits are actually used.
1. FC :
nvme_fc_create_association()
...
nvme_fc_create_io_queues(ctrl);
...
nvme_start_ctrl()
nvme_scan_queue()
nvme_scan_work()
2. PCIe:-
nvme_reset_work()
...
nvme_setup_io_queues()
nvme_create_io_queues()
nvme_alloc_queue()
...
nvme_start_ctrl()
nvme_scan_queue()
nvme_scan_work()
3. RDMA :-
nvme_rdma_setup_ctrl
...
nvme_rdma_configure_io_queues
...
nvme_start_ctrl()
nvme_scan_queue()
nvme_scan_work()
4. TCP :-
nvme_tcp_setup_ctrl
...
nvme_tcp_configure_io_queues
...
nvme_start_ctrl()
nvme_scan_queue()
nvme_scan_work()
* nvme_scan_work()
...
nvme_validate_or_alloc_ns()
nvme_alloc_ns()
nvme_update_ns_info()
nvme_update_disk_info()
nvme_config_discard() <---
blk_queue_max_write_zeroes_sectors() <---
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Add support for using longer timeouts during controller initialization
and letting the controller come up with namespaces that are not ready
for I/O yet. We skip these not ready namespaces during scanning and
only bring them online once anoter scan is kicked off by the AEN that
is set when the NRDY bit gets set in the I/O Command Set Independent
Identify Namespace Data Structure. This asynchronous probing avoids
blocking the kernel boot when controllers take a very long time to
recover after unclean shutdowns (up to minutes).
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
The RDAMA and TCP transport both complete the timed out request in the
same manner and hence code is duplicated. Add and use the helper
nvmf_complete_timed_out_request() to remove the duplicate code.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
On our ZynqMP system we observe, that a NVMe drive that resets itself
while doing a firmware update causes a Kernel crash like this:
[ 67.720772] pcieport 0000:02:02.0: pciehp: Slot(2): Link Down
[ 67.720783] pcieport 0000:02:02.0: pciehp: Slot(2): Card not present
[ 67.720795] nvme 0000:04:00.0: PME# disabled
[ 67.720849] Internal error: synchronous external abort: 96000010 [#1] PREEMPT SMP
[ 67.720853] nwl-pcie fd0e0000.pcie: Slave error
Analysis: When nvme_dev_disable() is called because of this PCIe hotplug
event, pci_is_enabled() is still true. And accessing the NVMe drive
which is currently not available as it's in reboot process causes this
"synchronous external abort" on this ARM64 platform.
This patch adds the pci_device_is_present() check as well, which returns
false in this "Card not present" hot-plug case. With this change, the
NVMe driver does not try to access the NVMe registers any more and the
FW update finishes without any problems.
Signed-off-by: Stefan Roese <sr@denx.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
In nvme_alloc_admin_tags, the admin_q can be set to an error (typically
-ENOMEM) if the blk_mq_init_queue call fails to set up the queue, which
is checked immediately after the call. However, when we return the error
message up the stack, to nvme_reset_work the error takes us to
nvme_remove_dead_ctrl()
nvme_dev_disable()
nvme_suspend_queue(&dev->queues[0]).
Here, we only check that the admin_q is non-NULL, rather than not
an error or NULL, and begin quiescing a queue that never existed, leading
to bad / NULL pointer dereference.
Signed-off-by: Kyle Smith <kyles@hpe.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Most of the internal passthru commands use __nvme_submit_sync_cmd()
interface. There are few places we open code the request submission :-
1. nvme_keep_alive_work(struct work_struct *work)
2. nvme_timeout(struct request *req, bool reserved)
3. nvme_delete_queue(struct nvme_queue *nvmeq, u8 opcode)
Mark the internal passthru request quiet so that we can skip the verbose
error message from nvme_log_error() in nvme_end_req() completion path,
this will be consistent with what we have in __nvme_submit_sync_cmd().
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Alan Adamson <alan.adamson@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Log a few more path related status codes.
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The nvme specification only requires qword alignment for segment
descriptors, and the driver already guarantees that. The spec has always
allowed user data to be dword aligned, which is what the queue's
attribute is for, so relax the alignment requirement to that value.
While we could allow byte alignment for some controllers when using
SGLs, we still need to support PRP, and that only allows dword.
Fixes: 3b2a1ebceb ("nvme: set dma alignment to qword")
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
DMRSLl is in the unit of logical blocks, while max_discard_sectors is
in the unit of "linux sector".
Signed-off-by: Tom Yan <tom.ty89@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
wire up support for async passthru that takes an array of buffers (using
iovec). Exposed via a new op NVME_URING_CMD_IO_VEC. Same 'struct
nvme_uring_cmd' is to be used with -
1. cmd.addr as base address of user iovec array
2. cmd.data_len as count of iovec array elements
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220511054750.20432-6-joshi.k@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Introduce handler for fops->uring_cmd(), implementing async passthru
on char device (/dev/ngX). The handler supports newly introduced
operation NVME_URING_CMD_IO. This operates on a new structure
nvme_uring_cmd, which is similar to struct nvme_passthru_cmd64 but
without the embedded 8b result field. This field is not needed since
uring-cmd allows to return additional result via big-CQE.
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220511054750.20432-5-joshi.k@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Divide the work into two helpers, namely nvme_alloc_user_request and
nvme_execute_user_rq. This is a prep patch, to help wiring up
uring-cmd support in nvme.
Signed-off-by: Christoph Hellwig <hch@lst.de>
[axboe: fold in fix for assuming bio is non-NULL]
Link: https://lore.kernel.org/r/20220511054750.20432-4-joshi.k@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The new nvme-apple driver is missing a few conversions to and
from little-endian data:
drivers/nvme/host/apple.c:291:19: sparse: sparse: incorrect type in assignment (different base types) @@ expected unsigned long long [usertype] prp1 @@ got restricted __le64 [usertype] prp1 @@
drivers/nvme/host/apple.c:291:19: sparse: expected unsigned long long [usertype] prp1
drivers/nvme/host/apple.c:291:19: sparse: got restricted __le64 [usertype] prp1
drivers/nvme/host/apple.c:292:19: sparse: sparse: incorrect type in assignment (different base types) @@ expected unsigned long long [usertype] prp2 @@ got restricted __le64 [usertype] prp2 @@
drivers/nvme/host/apple.c:293:21: sparse: sparse: incorrect type in assignment (different base types) @@ expected unsigned int [usertype] length @@ got restricted __le16 [usertype] length @@
drivers/nvme/host/apple.c:351:52: sparse: sparse: incorrect type in initializer (different base types) @@ expected unsigned int [usertype] next_dma_addr @@ got restricted __le64 [usertype] @@
drivers/nvme/host/apple.c:456:45: sparse: sparse: incorrect type in assignment (different base types) @@ expected restricted __le64 [usertype] @@ got unsigned int [addressable] [usertype] prp_dma @@
drivers/nvme/host/apple.c:459:31: sparse: sparse: incorrect type in assignment (different base types) @@ expected restricted __le64 [usertype] @@ got unsigned long long [assigned] [usertype] dma_addr @@
drivers/nvme/host/apple.c:474:25: sparse: sparse: incorrect type in assignment (different base types) @@ expected restricted __le64 [usertype] prp1 @@ got unsigned int [usertype] dma_address @@
drivers/nvme/host/apple.c:475:25: sparse: sparse: incorrect type in assignment (different base types) @@ expected restricted __le64 [usertype] prp2 @@ got unsigned int [usertype] first_dma @@
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
- RTKit IPC library required to boot and communicate with
co-processors embedded inside Apple SoCs
- SART DMA address filter required to allow some DMA transactions for
the NVMe co-processor
- NVMe platform driver
The following minor changes since v3 on the mailing list have been
folded in:
- sart: %llx -> %pa for a phys_addr_t
- rtkit:/sart: Drop IS_ENABLED inside headers
- rtkit: Use EXPORT_SYMBOL_GPL instead of EXPORT_SYMBOL
- nvme: Set NVME_REQ_CANCELLED in the timeout handler
- nvme: Use DEFINE_SIMPLE_DEV_PM_OPS instead of #ifdef CONFIG_PM_SLEEP
-----BEGIN PGP SIGNATURE-----
iIkEABYKADEWIQRUAfN3OZ14tapALxsd6alOrecRcQUCYm/5lhMcc3ZlbkBzdmVu
cGV0ZXIuZGV2AAoJEB3pqU6t5xFxITsBAI02bdBvHQAFouu1WzhLzlAPNqjGv3Th
YFOJL4y2oyigAP9zkIGP0e1JcpApPxvlmQnM340ZY+TCth+ahoCWlS4VAg==
=b0QX
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmJ0IIIACgkQmmx57+YA
GNnxlg/+M0d7awpSyiN2g2aoPN9PzaDiWgjSDJ30THfY+JodFmK/Uw1L2dO32+md
0UMuMH9Vtiy3NXf/dDoLU+kiNQgT1ZY7Iw2220GZyHxRUC/7RZ/zGNvAb1FclDhK
YEYtlA2fK1MvlhEcriD8ytP78R80UA7Cx4o5T8CkgfN2OtjLip72aMCFtzRfxMk6
W0GQvkTpDtp3qn59Wlz/JEJKMT+q0txxq1QkPW93Su190S/7Lk/nd+CkmFDHx54n
2LFa33JuZZfNbzPMN9emTl7G5Z7Y241zYBMHvtAzd3WdaDZ6I1apeVnDXrNRhysC
4B8si4qbo9NORhWR96okll0MCiBNZ1UfvS4U+/8A1b0CHwX5rEUKLNOXIfO8xApi
L3HvVUpYPJw8w3aK2SKGVt2aQcnbluy9+LJQgHByPFbeIV80OjWoR4xSSZi2XErl
tp4vmyUBTR8g85EnCqcc8OWCdl+yMjX+NmftLk/XB7/xD+HtqKocqizOVho/9u+K
rv6Ivx6JHk33LFG/jJkN6q8OncgfSRirg0oy6IF/dy2xS2U3OtfaIXHIy9NosC36
B3bm8mb3CVQJxazMYl3rstBGQoN1YpuwOsHRUWiWG1jDnXFwCSck9b5OH+y64PME
45V27Rx1ftTgrOWgsllIqsviWADO4Qj7n5YXYvC+hGrBeDnM8pQ=
=gi38
-----END PGP SIGNATURE-----
Merge tag 'asahi-soc-rtkit-sart-nvme-for-5.19' of https://github.com/AsahiLinux/linux into arm/drivers
Apple SoC NVMe driver and dependencies:
- RTKit IPC library required to boot and communicate with
co-processors embedded inside Apple SoCs
- SART DMA address filter required to allow some DMA transactions for
the NVMe co-processor
- NVMe platform driver
The following minor changes since v3 on the mailing list have been
folded in:
- sart: %llx -> %pa for a phys_addr_t
- rtkit:/sart: Drop IS_ENABLED inside headers
- rtkit: Use EXPORT_SYMBOL_GPL instead of EXPORT_SYMBOL
- nvme: Set NVME_REQ_CANCELLED in the timeout handler
- nvme: Use DEFINE_SIMPLE_DEV_PM_OPS instead of #ifdef CONFIG_PM_SLEEP
* tag 'asahi-soc-rtkit-sart-nvme-for-5.19' of https://github.com/AsahiLinux/linux:
nvme-apple: Add initial Apple SoC NVMe driver
dt-bindings: nvme: Add Apple ANS NVMe
soc: apple: Add SART driver
dt-bindings: iommu: Add Apple SART DMA address filter
soc: apple: Add RTKit IPC library
soc: apple: Always include Makefile
Link: https://lore.kernel.org/r/20220505154020.84638-1-sven@svenpeter.dev
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The nvme driver never sets a discard_alignment, so it also doens't need
to clear it to zero.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220418045314.360785-10-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
nvme-fc appid support needs CONFIG_BLK_CGROUP_FC_APPID to work, so disable
the whole code if the option is not set.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20220420042723.1010598-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Apple SoCs such as the M1 come with an embedded NVMe controller that
is not attached to any PCIe bus. Additionally, it doesn't conform
to the NVMe specification and requires a bunch of changes to command
submission and IOMMU configuration to work.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sven Peter <sven@svenpeter.dev>
Secure erase is a very different operation from discard in that it is
a data integrity operation vs hint. Fully split the limits and helper
infrastructure to make the separation more clear.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd]
Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> [nifs2]
Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> [f2fs]
Acked-by: Coly Li <colyli@suse.de> [bcache]
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Acked-by: Chao Yu <chao@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-27-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Just use a non-zero max_discard_sectors as an indicator for discard
support, similar to what is done for write zeroes.
The only places where needs special attention is the RAID5 driver,
which must clear discard support for security reasons by default,
even if the default stacking rules would allow for it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd]
Acked-by: Jan Höppner <hoeppner@linux.ibm.com> [s390]
Acked-by: Coly Li <colyli@suse.de> [bcache]
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-25-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>