linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-23 04:31:50 +00:00

Author	SHA1	Message	Date
Mike Christie	e344c00e7c	scsi: target: core: Unexport target_queue_submission() target_queue_submission() is not called by drivers anymore so unexport it. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230928020907.5730-7-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-10-13 15:53:58 -04:00
Mike Christie	e2f4ea4013	scsi: target: Allow userspace to request direct submissions This allows userspace to request the fabric drivers do direct submissions if they support it. With the new device file, submit_type, users can write 0 - 2 to control how commands are submitted to the backend: 0 - TARGET_FABRIC_DEFAULT_SUBMIT - LIO will use the fabric's default submission type. This is the default for compat. 1 - TARGET_DIRECT_SUBMIT - LIO will submit the cmd to the backend from the calling context if the fabric the cmd was received on supports it, else it will use the fabric's default type. 2 - TARGET_QUEUE_SUBMIT - LIO will queue the cmd to the LIO submission workqueue which will pass it to the backend. When using an NVMe drive and vhost-scsi with direct submission we see around a 20% improvement in 4K I/Os: fio jobs 1 2 4 8 10 -------------------------------------------------- defer 94K 190K 394K 770K 890K direct 128K 252K 488K 950K - And when using the queueing mode, we now no longer see issues like where the iSCSI tx thread is blocked in the block layer waiting on a tag so it can't respond to a nop or perform I/Os for other LUs. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230928020907.5730-6-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-10-13 15:53:58 -04:00
Mike Christie	428926796e	scsi: target: core: Kill transport_handle_cdb_direct() Move the code from transport_handle_cdb_direct() to target_submit() and have iSCSI call target_submit(). Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230928020907.5730-5-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-10-13 15:53:58 -04:00
Mike Christie	5c48a4ea32	scsi: target: core: Move buffer clearing hack Move the hack to clear some buffers to transport_handle_cdb_direct() so we can eventually merge transport_handle_cdb_direct() and target_submit(). This also fixes up the comment so it's clear it was only for udev and reflects that the referenced function does not exist and we now allow more than 1 page for control CDBs. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230928020907.5730-4-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-10-13 15:53:58 -04:00
Mike Christie	ee48345e1c	scsi: target: core: Move core_alua_check_nonop_delay() call Move core_alua_check_nonop_delay() to transport_handle_cdb_direct() so the iSCSI target driver doesn't have to call as many core functions directly. We will eventually merge transport_handle_cdb_direct and target_submit so iSCSI and the other drivers call a common function. It will also be helpful as preparation for future changes which allow the iSCSI target to defer command submission to the LIO submission workqueue, because we will have a common submission function for that which will be based on transport_handle_cdb_direct()/target_submit(). Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230928020907.5730-3-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-10-13 15:53:57 -04:00
Mike Christie	194605d45d	scsi: target: Have drivers report if they support direct submissions In some cases, like with multiple LUN targets or where the target has to respond to transport level requests from the receiving context it can be better to defer cmd submission to a helper thread. If the backend driver blocks on something like request/tag allocation it can block the entire target submission path and other LUs and transport IO on that session. In other cases like single LUN targets with storage that can support all the commands that the target can queue, then it's best to submit the cmd to the backend from the target's cmd receiving context. Subsequent commits will allow the user to config what they prefer, but drivers like loop can't directly submit because they can be called from a context that can't sleep. And, drivers like vhost-scsi can support direct submission, but need to keep their default behavior of deferring execution to avoid possible regressions where the backend can block. Make the drivers tell LIO core if they support direct submissions and their current default, so we can prevent users from misconfiguring the system and initialize devices correctly. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230928020907.5730-2-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-10-13 15:53:57 -04:00
Mike Christie	40ddd6df93	scsi: target: iscs: Make write_pending_must_be_called a bit field Subsequent commits add more on/off type of settings to the target_core_fabric_ops struct so this makes write_pending_must_be_called a bit field instead of a bool to better organize the settings. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230928020907.5730-1-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-10-13 15:53:57 -04:00
Linus Torvalds	b89b029377	SCSI misc on 20230902 Updates to the usual drivers (ufs, lpfc, qla2xxx, mpi3mr, libsas) and the usual minor updates and bug fixes but no significant core changes. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZPLlcCYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishXH4AP9Mm8Hk ICCySnXJ/VcxrAU2ekUlPxGT8CNT2WqRhIqKegEAu0jagPyNcTjX+oBCNVOM33zU p5uU6IEHDA3d36qu04w= =8gks -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "Updates to the usual drivers (ufs, lpfc, qla2xxx, mpi3mr, libsas) and the usual minor updates and bug fixes but no significant core changes" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (116 commits) scsi: storvsc: Handle additional SRB status values scsi: libsas: Delete sas_ata_task.retry_count scsi: libsas: Delete sas_ata_task.stp_affil_pol scsi: libsas: Delete sas_ata_task.set_affil_pol scsi: libsas: Delete sas_ssp_task.task_prio scsi: libsas: Delete sas_ssp_task.enable_first_burst scsi: libsas: Delete sas_ssp_task.retry_count scsi: libsas: Delete struct scsi_core scsi: libsas: Delete enum sas_phy_type scsi: libsas: Delete enum sas_class scsi: libsas: Delete sas_ha_struct.lldd_module scsi: target: Fix write perf due to unneeded throttling scsi: lpfc: Do not abuse UUID APIs and LPFC_COMPRESS_VMID_SIZE scsi: pm8001: Remove unused declarations scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock scsi: elx: sli4: Remove code duplication scsi: bfa: Replace one-element array with flexible-array member in struct fc_rscn_pl_s scsi: qla2xxx: Remove unused declarations scsi: pmcraid: Use pci_dev_id() to simplify the code scsi: pm80xx: Set RETFIS when requested by libsas ...	2023-09-02 12:02:41 -07:00
Mike Christie	84c073fd89	scsi: target: Fix write perf due to unneeded throttling The write back throttling (WBT) code checks if REQ_SYNC \| REQ_IDLE is set to determine if a write is O_DIRECT vs buffered. If the bits are not set then it assumes it's a buffered write and will throttle LIO if we hit certain metrics. LIO itself is not using the buffer cache and is doing direct I/O, so this has us set the direct bits so we are not throttled. When the initiator application is doing direct I/O this can greatly improve performance. It depends on the backend device but we have seen where the WBT code is throttling writes to only 20K IOPs with 4K I/Os when the device can support 100K+. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230817192902.346791-1-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-08-21 17:20:48 -04:00
Jinyoung Choi	80814b8e35	bio-integrity: update the payload size in bio_integrity_add_page() Previously, the bip's bi_size has been set before an integrity pages were added. If a problem occurs in the process of adding pages for bip, the bi_size mismatch problem must be dealt with. When the page is successfully added to bvec, the bi_size is updated. The parts affected by the change were also contained in this commit. Cc: Christoph Hellwig <hch@lst.de> Cc: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jinyoung Choi <j-young.choi@samsung.com> Tested-by: "Martin K. Petersen" <martin.petersen@oracle.com> Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20230803024956epcms2p38186a17392706650c582d38ef3dbcd32@epcms2p3 Signed-off-by: Jens Axboe <axboe@kernel.dk>	2023-08-09 16:05:35 -06:00
Martin K. Petersen	31799f9e6a	Merge patch series "scsi: target: iscsi: Get rid of sprintf in iscsi_target_configfs.c" Konstantin Shelekhin <k.shelekhin@yadro.com> says: This patch series cleanses iscsi_target_configfs.c of sprintf usage. The first patch fixes the real problem, the second just makes sure we are on the safe side from now on. I've reproduced the issue fixed in the first patch by utilizing this cool thing: https://git.sr.ht/~kshelekhin/scapy-iscsi Yeah, shameless promoting of my own tools, but I like the simplicity of scapy and writing tests in C with libiscsi can be a little cumbersome. Check it out: #!/usr/bin/env python3 # Let's cause some DoS in iSCSI target import sys from scapy.supersocket import StreamSocket from scapy_iscsi.iscsi import * cpr = { "InitiatorName": "iqn.2016-04.com.open-iscsi:e476cd9e4e59", "TargetName": "iqn.2023-07.com.example:target", "HeaderDigest": "None", "DataDigest": "None", } spr = { "SessionType": "Normal", "ErrorRecoveryLevel": 0, "DefaultTime2Retain": 0, "DefaultTime2Wait": 2, "ImmediateData": "Yes", "FirstBurstLength": 65536, "MaxBurstLength": 262144, "MaxRecvDataSegmentLength": 262144, "MaxOutstandingR2T": 1, } if len(sys.argv) != 3: print("usage: dos.py <host> <port>", file=sys.stderr) exit(1) host = sys.argv[1] port = int(sys.argv[2]) isid = 0xB00B tsih = 0 connections = [] for i in range(0, 127): s = socket.socket() s.connect((host, port)) s = StreamSocket(s, ISCSI) ds = cpr if i > 0 else cpr \| spr lirq = ISCSI() / LoginRequest(isid=isid, tsih=tsih, cid=i, ds=kv2text(ds)) lirs = s.sr1(lirq) tsih = lirs.tsih connections.append(s) input() Link: https://lore.kernel.org/r/20230722152657.168859-1-k.shelekhin@yadro.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-07-31 12:11:17 -04:00
Konstantin Shelekhin	c0431feb0a	scsi: target: iscsi: Stop using sprintf() in iscsi_target_configfs.c Get rid of sprintf() in favor of sysfs_emit(). The latter ensures not to overflow the given buffer. Signed-off-by: Konstantin Shelekhin <k.shelekhin@yadro.com> Link: https://lore.kernel.org/r/20230722152657.168859-3-k.shelekhin@yadro.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-07-31 12:09:58 -04:00
Konstantin Shelekhin	801f287c93	scsi: target: iscsi: Fix buffer overflow in lio_target_nacl_info_show() The function lio_target_nacl_info_show() uses sprintf() in a loop to print details for every iSCSI connection in a session without checking for the buffer length. With enough iSCSI connections it's possible to overflow the buffer provided by configfs and corrupt the memory. This patch replaces sprintf() with sysfs_emit_at() that checks for buffer boundries. Signed-off-by: Konstantin Shelekhin <k.shelekhin@yadro.com> Link: https://lore.kernel.org/r/20230722152657.168859-2-k.shelekhin@yadro.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-07-31 12:09:58 -04:00
Maurizio Lombardi	30a5b62e1c	scsi: target: iscsi: Remove the unused netif_timeout attribute This attribute has never been used, remove it. Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Link: https://lore.kernel.org/r/20230630155309.46061-1-mlombard@redhat.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-07-11 12:33:32 -04:00
Linus Torvalds	7fcd473a64	SCSI misc on 20230708 A few late arriving patches that missed the initial pull request. It's mostly bug fixes (the dt-bindings is a fix for the initial pull). Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZKmvwSYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishResAQCPDbBh omMRBE+W+Vx2TgOJGjo/F+T1D2JjBhLIGpNVggEApJtgrQutAToiCU/qIP9GOTl7 evetzh5boMMuyD2s7ak= =pi4v -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull more SCSI updates from James Bottomley: "A few late arriving patches that missed the initial pull request. It's mostly bug fixes (the dt-bindings is a fix for the initial pull)" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: core: Remove unused function declaration scsi: target: docs: Remove tcm_mod_builder.py scsi: target: iblock: Quiet bool conversion warning with pr_preempt use scsi: dt-bindings: ufs: qcom: Fix ICE phandle scsi: core: Simplify scsi_cdl_check_cmd() scsi: isci: Fix comment typo scsi: smartpqi: Replace one-element arrays with flexible-array members scsi: target: tcmu: Replace strlcpy() with strscpy() scsi: ncr53c8xx: Replace strlcpy() with strscpy() scsi: lpfc: Fix lpfc_name struct packing	2023-07-08 12:35:18 -07:00
Linus Torvalds	ca7ce08d6a	SCSI misc on 20230629 Updates to the usual drivers (ufs, pm80xx, libata-scsi, smartpqi, lpfc, qla2xxx). We have a couple of major core changes impacting other systems: Command Duration Limits, which spills into block and ATA and block level Persistent Reservation Operations, which touches block, nvme, target and dm (both of which are added with merge commits containing a cover letter explaining what's going on). Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZJ19cSYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishfZpAQCQBuWR ELcOhsaG5KzO6xLWcH8mjsOoxffKvazZjTKXlAD5ATEv7++E250oKS3t+yfjae5I Lc195MlDju85ItUQgfk= =U9ik -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "Updates to the usual drivers (ufs, pm80xx, libata-scsi, smartpqi, lpfc, qla2xxx). We have a couple of major core changes impacting other systems: - Command Duration Limits, which spills into block and ATA - block level Persistent Reservation Operations, which touches block, nvme, target and dm Both of these are added with merge commits containing a cover letter explaining what's going on" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (187 commits) scsi: core: Improve warning message in scsi_device_block() scsi: core: Replace scsi_target_block() with scsi_block_targets() scsi: core: Don't wait for quiesce in scsi_device_block() scsi: core: Don't wait for quiesce in scsi_stop_queue() scsi: core: Merge scsi_internal_device_block() and device_block() scsi: sg: Increase number of devices scsi: bsg: Increase number of devices scsi: qla2xxx: Remove unused nvme_ls_waitq wait queue scsi: ufs: ufs-pci: Add support for Intel Arrow Lake scsi: sd: sd_zbc: Use PAGE_SECTORS_SHIFT scsi: ufs: wb: Add explicit flush_threshold sysfs attribute scsi: ufs: ufs-qcom: Switch to the new ICE API scsi: ufs: dt-bindings: qcom: Add ICE phandle scsi: ufs: ufs-mediatek: Set UFSHCD_QUIRK_MCQ_BROKEN_RTC quirk scsi: ufs: ufs-mediatek: Set UFSHCD_QUIRK_MCQ_BROKEN_INTR quirk scsi: ufs: core: Add host quirk UFSHCD_QUIRK_MCQ_BROKEN_RTC scsi: ufs: core: Add host quirk UFSHCD_QUIRK_MCQ_BROKEN_INTR scsi: ufs: core: Remove dedicated hwq for dev command scsi: ufs: core: mcq: Fix the incorrect OCS value for the device command scsi: ufs: dt-bindings: samsung,exynos: Drop unneeded quotes ...	2023-06-30 11:57:07 -07:00
Mike Christie	40863cb945	scsi: target: iblock: Quiet bool conversion warning with pr_preempt use We want to pass in true for pr_preempt's argument if we are doing a PRO_PREEMPT_AND_ABORT, so just test sa against PRO_PREEMPT_AND_ABORT, and pass the result directly to pr_preempt. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202306221655.Kwtqi1gI-lkp@intel.com/ Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230623161136.6270-1-michael.christie@oracle.com Reviewed-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-06-28 21:58:36 -04:00
Linus Torvalds	3a8a670eee	Networking changes for 6.5. Core ---- - Rework the sendpage & splice implementations. Instead of feeding data into sockets page by page extend sendmsg handlers to support taking a reference on the data, controlled by a new flag called MSG_SPLICE_PAGES. Rework the handling of unexpected-end-of-file to invoke an additional callback instead of trying to predict what the right combination of MORE/NOTLAST flags is. Remove the MSG_SENDPAGE_NOTLAST flag completely. - Implement SCM_PIDFD, a new type of CMSG type analogous to SCM_CREDENTIALS, but it contains pidfd instead of plain pid. - Enable socket busy polling with CONFIG_RT. - Improve reliability and efficiency of reporting for ref_tracker. - Auto-generate a user space C library for various Netlink families. Protocols --------- - Allow TCP to shrink the advertised window when necessary, prevent sk_rcvbuf auto-tuning from growing the window all the way up to tcp_rmem[2]. - Use per-VMA locking for "page-flipping" TCP receive zerocopy. - Prepare TCP for device-to-device data transfers, by making sure that payloads are always attached to skbs as page frags. - Make the backoff time for the first N TCP SYN retransmissions linear. Exponential backoff is unnecessarily conservative. - Create a new MPTCP getsockopt to retrieve all info (MPTCP_FULL_INFO). - Avoid waking up applications using TLS sockets until we have a full record. - Allow using kernel memory for protocol ioctl callbacks, paving the way to issuing ioctls over io_uring. - Add nolocalbypass option to VxLAN, forcing packets to be fully encapsulated even if they are destined for a local IP address. - Make TCPv4 use consistent hash in TIME_WAIT and SYN_RECV. Ensure in-kernel ECMP implementation (e.g. Open vSwitch) select the same link for all packets. Support L4 symmetric hashing in Open vSwitch. - PPPoE: make number of hash bits configurable. - Allow DNS to be overwritten by DHCPACK in the in-kernel DHCP client (ipconfig). - Add layer 2 miss indication and filtering, allowing higher layers (e.g. ACL filters) to make forwarding decisions based on whether packet matched forwarding state in lower devices (bridge). - Support matching on Connectivity Fault Management (CFM) packets. - Hide the "link becomes ready" IPv6 messages by demoting their printk level to debug. - HSR: don't enable promiscuous mode if device offloads the proto. - Support active scanning in IEEE 802.15.4. - Continue work on Multi-Link Operation for WiFi 7. BPF --- - Add precision propagation for subprogs and callbacks. This allows maintaining verification efficiency when subprograms are used, or in fact passing the verifier at all for complex programs, especially those using open-coded iterators. - Improve BPF's {g,s}setsockopt() length handling. Previously BPF assumed the length is always equal to the amount of written data. But some protos allow passing a NULL buffer to discover what the output buffer should be, without writing anything. - Accept dynptr memory as memory arguments passed to helpers. - Add routing table ID to bpf_fib_lookup BPF helper. - Support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands. - Drop bpf_capable() check in BPF_MAP_FREEZE command (used to mark maps as read-only). - Show target_{obj,btf}_id in tracing link fdinfo. - Addition of several new kfuncs (most of the names are self-explanatory): - Add a set of new dynptr kfuncs: bpf_dynptr_adjust(), bpf_dynptr_is_null(), bpf_dynptr_is_rdonly(), bpf_dynptr_size() and bpf_dynptr_clone(). - bpf_task_under_cgroup() - bpf_sock_destroy() - force closing sockets - bpf_cpumask_first_and(), rework bpf_cpumask_any() kfuncs Netfilter --------- - Relax set/map validation checks in nf_tables. Allow checking presence of an entry in a map without using the value. - Increase ip_vs_conn_tab_bits range for 64BIT builds. - Allow updating size of a set. - Improve NAT tuple selection when connection is closing. Driver API ---------- - Integrate netdev with LED subsystem, to allow configuring HW "offloaded" blinking of LEDs based on link state and activity (i.e. packets coming in and out). - Support configuring rate selection pins of SFP modules. - Factor Clause 73 auto-negotiation code out of the drivers, provide common helper routines. - Add more fool-proof helpers for managing lifetime of MDIO devices associated with the PCS layer. - Allow drivers to report advanced statistics related to Time Aware scheduler offload (taprio). - Allow opting out of VF statistics in link dump, to allow more VFs to fit into the message. - Split devlink instance and devlink port operations. New hardware / drivers ---------------------- - Ethernet: - Synopsys EMAC4 IP support (stmmac) - Marvell 88E6361 8 port (5x1GE + 3x2.5GE) switches - Marvell 88E6250 7 port switches - Microchip LAN8650/1 Rev.B0 PHYs - MediaTek MT7981/MT7988 built-in 1GE PHY driver - WiFi: - Realtek RTL8192FU, 2.4 GHz, b/g/n mode, 2T2R, 300 Mbps - Realtek RTL8723DS (SDIO variant) - Realtek RTL8851BE - CAN: - Fintek F81604 Drivers ------- - Ethernet NICs: - Intel (100G, ice): - support dynamic interrupt allocation - use meta data match instead of VF MAC addr on slow-path - nVidia/Mellanox: - extend link aggregation to handle 4, rather than just 2 ports - spawn sub-functions without any features by default - OcteonTX2: - support HTB (Tx scheduling/QoS) offload - make RSS hash generation configurable - support selecting Rx queue using TC filters - Wangxun (ngbe/txgbe): - add basic Tx/Rx packet offloads - add phylink support (SFP/PCS control) - Freescale/NXP (enetc): - report TAPRIO packet statistics - Solarflare/AMD: - support matching on IP ToS and UDP source port of outer header - VxLAN and GENEVE tunnel encapsulation over IPv4 or IPv6 - add devlink dev info support for EF10 - Virtual NICs: - Microsoft vNIC: - size the Rx indirection table based on requested configuration - support VLAN tagging - Amazon vNIC: - try to reuse Rx buffers if not fully consumed, useful for ARM servers running with 16kB pages - Google vNIC: - support TCP segmentation of >64kB frames - Ethernet embedded switches: - Marvell (mv88e6xxx): - enable USXGMII (88E6191X) - Microchip: - lan966x: add support for Egress Stage 0 ACL engine - lan966x: support mapping packet priority to internal switch priority (based on PCP or DSCP) - Ethernet PHYs: - Broadcom PHYs: - support for Wake-on-LAN for BCM54210E/B50212E - report LPI counter - Microsemi PHYs: support RGMII delay configuration (VSC85xx) - Micrel PHYs: receive timestamp in the frame (LAN8841) - Realtek PHYs: support optional external PHY clock - Altera TSE PCS: merge the driver into Lynx PCS which it is a variant of - CAN: Kvaser PCIEcan: - support packet timestamping - WiFi: - Intel (iwlwifi): - major update for new firmware and Multi-Link Operation (MLO) - configuration rework to drop test devices and split the different families - support for segmented PNVM images and power tables - new vendor entries for PPAG (platform antenna gain) feature - Qualcomm 802.11ax (ath11k): - Multiple Basic Service Set Identifier (MBSSID) and Enhanced MBSSID Advertisement (EMA) support in AP mode - support factory test mode - RealTek (rtw89): - add RSSI based antenna diversity - support U-NII-4 channels on 5 GHz band - RealTek (rtl8xxxu): - AP mode support for 8188f - support USB RX aggregation for the newer chips Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmSbJM4ACgkQMUZtbf5S IrtoDhAAhEim1+LBIKf4lhPcVdZ2p/TkpnwTz5jsTwSeRBAxTwuNJ2fQhFXg13E3 MnRq6QaEp8G4/tA/gynLvQop+FEZEnv+horP0zf/XLcC8euU7UrKdrpt/4xxdP07 IL/fFWsoUGNO+L9LNaHwBo8g7nHvOkPscHEBHc2Xrvzab56TJk6vPySfLqcpKlNZ CHWDwTpgRqNZzSKiSpoMVd9OVMKUXcPYHpDmfEJ5l+e8vTXmZzOLHrSELHU5nP5f mHV7gxkDCTshoGcaed7UTiOvgu1p6E5EchDJxiLaSUbgsd8SZ3u4oXwRxgj33RK/ fB2+UaLrRt/DdlHvT/Ph8e8Ygu77yIXMjT49jsfur/zVA0HEA2dFb7V6QlsYRmQp J25pnrdXmE15llgqsC0/UOW5J1laTjII+T2T70UOAqQl4LWYAQDG4WwsAqTzU0KY dueydDouTp9XC2WYrRUEQxJUzxaOaazskDUHc5c8oHp/zVBT+djdgtvVR9+gi6+7 yy4elI77FlEEqL0ItdU/lSWINayAlPLsIHkMyhSGKX0XDpKjeycPqkNx4UterXB/ JKIR5RBWllRft+igIngIkKX0tJGMU0whngiw7d1WLw25wgu4sB53hiWWoSba14hv tXMxwZs5iGaPcT38oRVMZz8I1kJM4Dz3SyI7twVvi4RUut64EG4= =9i4I -----END PGP SIGNATURE----- Merge tag 'net-next-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking changes from Jakub Kicinski: "WiFi 7 and sendpage changes are the biggest pieces of work for this release. The latter will definitely require fixes but I think that we got it to a reasonable point. Core: - Rework the sendpage & splice implementations Instead of feeding data into sockets page by page extend sendmsg handlers to support taking a reference on the data, controlled by a new flag called MSG_SPLICE_PAGES Rework the handling of unexpected-end-of-file to invoke an additional callback instead of trying to predict what the right combination of MORE/NOTLAST flags is Remove the MSG_SENDPAGE_NOTLAST flag completely - Implement SCM_PIDFD, a new type of CMSG type analogous to SCM_CREDENTIALS, but it contains pidfd instead of plain pid - Enable socket busy polling with CONFIG_RT - Improve reliability and efficiency of reporting for ref_tracker - Auto-generate a user space C library for various Netlink families Protocols: - Allow TCP to shrink the advertised window when necessary, prevent sk_rcvbuf auto-tuning from growing the window all the way up to tcp_rmem[2] - Use per-VMA locking for "page-flipping" TCP receive zerocopy - Prepare TCP for device-to-device data transfers, by making sure that payloads are always attached to skbs as page frags - Make the backoff time for the first N TCP SYN retransmissions linear. Exponential backoff is unnecessarily conservative - Create a new MPTCP getsockopt to retrieve all info (MPTCP_FULL_INFO) - Avoid waking up applications using TLS sockets until we have a full record - Allow using kernel memory for protocol ioctl callbacks, paving the way to issuing ioctls over io_uring - Add nolocalbypass option to VxLAN, forcing packets to be fully encapsulated even if they are destined for a local IP address - Make TCPv4 use consistent hash in TIME_WAIT and SYN_RECV. Ensure in-kernel ECMP implementation (e.g. Open vSwitch) select the same link for all packets. Support L4 symmetric hashing in Open vSwitch - PPPoE: make number of hash bits configurable - Allow DNS to be overwritten by DHCPACK in the in-kernel DHCP client (ipconfig) - Add layer 2 miss indication and filtering, allowing higher layers (e.g. ACL filters) to make forwarding decisions based on whether packet matched forwarding state in lower devices (bridge) - Support matching on Connectivity Fault Management (CFM) packets - Hide the "link becomes ready" IPv6 messages by demoting their printk level to debug - HSR: don't enable promiscuous mode if device offloads the proto - Support active scanning in IEEE 802.15.4 - Continue work on Multi-Link Operation for WiFi 7 BPF: - Add precision propagation for subprogs and callbacks. This allows maintaining verification efficiency when subprograms are used, or in fact passing the verifier at all for complex programs, especially those using open-coded iterators - Improve BPF's {g,s}setsockopt() length handling. Previously BPF assumed the length is always equal to the amount of written data. But some protos allow passing a NULL buffer to discover what the output buffer should* be, without writing anything - Accept dynptr memory as memory arguments passed to helpers - Add routing table ID to bpf_fib_lookup BPF helper - Support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands - Drop bpf_capable() check in BPF_MAP_FREEZE command (used to mark maps as read-only) - Show target_{obj,btf}_id in tracing link fdinfo - Addition of several new kfuncs (most of the names are self-explanatory): - Add a set of new dynptr kfuncs: bpf_dynptr_adjust(), bpf_dynptr_is_null(), bpf_dynptr_is_rdonly(), bpf_dynptr_size() and bpf_dynptr_clone(). - bpf_task_under_cgroup() - bpf_sock_destroy() - force closing sockets - bpf_cpumask_first_and(), rework bpf_cpumask_any() kfuncs Netfilter: - Relax set/map validation checks in nf_tables. Allow checking presence of an entry in a map without using the value - Increase ip_vs_conn_tab_bits range for 64BIT builds - Allow updating size of a set - Improve NAT tuple selection when connection is closing Driver API: - Integrate netdev with LED subsystem, to allow configuring HW "offloaded" blinking of LEDs based on link state and activity (i.e. packets coming in and out) - Support configuring rate selection pins of SFP modules - Factor Clause 73 auto-negotiation code out of the drivers, provide common helper routines - Add more fool-proof helpers for managing lifetime of MDIO devices associated with the PCS layer - Allow drivers to report advanced statistics related to Time Aware scheduler offload (taprio) - Allow opting out of VF statistics in link dump, to allow more VFs to fit into the message - Split devlink instance and devlink port operations New hardware / drivers: - Ethernet: - Synopsys EMAC4 IP support (stmmac) - Marvell 88E6361 8 port (5x1GE + 3x2.5GE) switches - Marvell 88E6250 7 port switches - Microchip LAN8650/1 Rev.B0 PHYs - MediaTek MT7981/MT7988 built-in 1GE PHY driver - WiFi: - Realtek RTL8192FU, 2.4 GHz, b/g/n mode, 2T2R, 300 Mbps - Realtek RTL8723DS (SDIO variant) - Realtek RTL8851BE - CAN: - Fintek F81604 Drivers: - Ethernet NICs: - Intel (100G, ice): - support dynamic interrupt allocation - use meta data match instead of VF MAC addr on slow-path - nVidia/Mellanox: - extend link aggregation to handle 4, rather than just 2 ports - spawn sub-functions without any features by default - OcteonTX2: - support HTB (Tx scheduling/QoS) offload - make RSS hash generation configurable - support selecting Rx queue using TC filters - Wangxun (ngbe/txgbe): - add basic Tx/Rx packet offloads - add phylink support (SFP/PCS control) - Freescale/NXP (enetc): - report TAPRIO packet statistics - Solarflare/AMD: - support matching on IP ToS and UDP source port of outer header - VxLAN and GENEVE tunnel encapsulation over IPv4 or IPv6 - add devlink dev info support for EF10 - Virtual NICs: - Microsoft vNIC: - size the Rx indirection table based on requested configuration - support VLAN tagging - Amazon vNIC: - try to reuse Rx buffers if not fully consumed, useful for ARM servers running with 16kB pages - Google vNIC: - support TCP segmentation of >64kB frames - Ethernet embedded switches: - Marvell (mv88e6xxx): - enable USXGMII (88E6191X) - Microchip: - lan966x: add support for Egress Stage 0 ACL engine - lan966x: support mapping packet priority to internal switch priority (based on PCP or DSCP) - Ethernet PHYs: - Broadcom PHYs: - support for Wake-on-LAN for BCM54210E/B50212E - report LPI counter - Microsemi PHYs: support RGMII delay configuration (VSC85xx) - Micrel PHYs: receive timestamp in the frame (LAN8841) - Realtek PHYs: support optional external PHY clock - Altera TSE PCS: merge the driver into Lynx PCS which it is a variant of - CAN: Kvaser PCIEcan: - support packet timestamping - WiFi: - Intel (iwlwifi): - major update for new firmware and Multi-Link Operation (MLO) - configuration rework to drop test devices and split the different families - support for segmented PNVM images and power tables - new vendor entries for PPAG (platform antenna gain) feature - Qualcomm 802.11ax (ath11k): - Multiple Basic Service Set Identifier (MBSSID) and Enhanced MBSSID Advertisement (EMA) support in AP mode - support factory test mode - RealTek (rtw89): - add RSSI based antenna diversity - support U-NII-4 channels on 5 GHz band - RealTek (rtl8xxxu): - AP mode support for 8188f - support USB RX aggregation for the newer chips" tag 'net-next-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1602 commits) net: scm: introduce and use scm_recv_unix helper af_unix: Skip SCM_PIDFD if scm->pid is NULL. net: lan743x: Simplify comparison netlink: Add __sock_i_ino() for __netlink_diag_dump(). net: dsa: avoid suspicious RCU usage for synced VLAN-aware MAC addresses Revert "af_unix: Call scm_recv() only after scm_set_cred()." phylink: ReST-ify the phylink_pcs_neg_mode() kdoc libceph: Partially revert changes to support MSG_SPLICE_PAGES net: phy: mscc: fix packet loss due to RGMII delays net: mana: use vmalloc_array and vcalloc net: enetc: use vmalloc_array and vcalloc ionic: use vmalloc_array and vcalloc pds_core: use vmalloc_array and vcalloc gve: use vmalloc_array and vcalloc octeon_ep: use vmalloc_array and vcalloc net: usb: qmi_wwan: add u-blox 0x1312 composition perf trace: fix MSG_SPLICE_PAGES build error ipvlan: Fix return value of ipvlan_queue_xmit() netfilter: nf_tables: fix underflow in chain reference counter netfilter: nf_tables: unbind non-anonymous set if rule construction fails ...	2023-06-28 16:43:10 -07:00
Linus Torvalds	a0433f8cae	for-6.5/block-2023-06-23 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmSV8dwQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpilGD/9Yys1oxIXJpRf00fzrylAlBthRxMjFQVWw zAut106hAQiBHvU8IkmGA3MvEFVHxtzwYhHI7IR8K3aZBIqscweCqmVI9JyogJw9 U9Twnzel47VmuKdM94FeoN+hbj1fP8EWTjzmy67/zEEfFCdmHvNlMi3lSrGYIpFy 39LxTB99Y4UarM5PtWbes37GYYljzMSWKuo4AfBkvq1eQa+sZ0Vq2xAABKq3UM7f apqhgHtkJooRePDP0eQp+kAyyVMgW2jIK+oIdJDxNF3CKTu2w40RzaYz6fp+jVSU H4R/xS59GW4/xql+VBJDh/qJg9K62DPPYjlW8BmSR8+IjvfFpsyH3/MacE50CD3P 20fs/Mnj49H79fDrQEHJI53cOOb2EmUitbwLbvOcColNTPpt8loBtdQxjF2RMU8R Nyort9DJPFclYCxky1LYg1CNEC2Ln4Zy/jD47wPvqRmOQphOoVlV/hPnOEqvjaZC 49Vn70W2DeE9cXvYI7ha+XIg6/oj+Gs3iusEbV08Ci7EAtXgI+ZUUsQ97K8UNiUh h2lqSJtuI7lBpYP9sf+BeCch5UCC+xGYyTdoM5f58lehWBBPtbs0g7S9RyRyOYxe n+yxEUo3dAGzJ/xsKAjinbZfeWIpr0b1TkAh4w3Cq/BKzRr9Bp8lBAxYuancbQ+Y 1ADPteUOTA== =zP4Y -----END PGP SIGNATURE----- Merge tag 'for-6.5/block-2023-06-23' of git://git.kernel.dk/linux Pull block updates from Jens Axboe: - NVMe pull request via Keith: - Various cleanups all around (Irvin, Chaitanya, Christophe) - Better struct packing (Christophe JAILLET) - Reduce controller error logs for optional commands (Keith) - Support for >=64KiB block sizes (Daniel Gomez) - Fabrics fixes and code organization (Max, Chaitanya, Daniel Wagner) - bcache updates via Coly: - Fix a race at init time (Mingzhe Zou) - Misc fixes and cleanups (Andrea, Thomas, Zheng, Ye) - use page pinning in the block layer for dio (David) - convert old block dio code to page pinning (David, Christoph) - cleanups for pktcdvd (Andy) - cleanups for rnbd (Guoqing) - use the unchecked __bio_add_page() for the initial single page additions (Johannes) - fix overflows in the Amiga partition handling code (Michael) - improve mq-deadline zoned device support (Bart) - keep passthrough requests out of the IO schedulers (Christoph, Ming) - improve support for flush requests, making them less special to deal with (Christoph) - add bdev holder ops and shutdown methods (Christoph) - fix the name_to_dev_t() situation and use cases (Christoph) - decouple the block open flags from fmode_t (Christoph) - ublk updates and cleanups, including adding user copy support (Ming) - BFQ sanity checking (Bart) - convert brd from radix to xarray (Pankaj) - constify various structures (Thomas, Ivan) - more fine grained persistent reservation ioctl capability checks (Jingbo) - misc fixes and cleanups (Arnd, Azeem, Demi, Ed, Hengqi, Hou, Jan, Jordy, Li, Min, Yu, Zhong, Waiman) * tag 'for-6.5/block-2023-06-23' of git://git.kernel.dk/linux: (266 commits) scsi/sg: don't grab scsi host module reference ext4: Fix warning in blkdev_put() block: don't return -EINVAL for not found names in devt_from_devname cdrom: Fix spectre-v1 gadget block: Improve kernel-doc headers blk-mq: don't insert passthrough request into sw queue bsg: make bsg_class a static const structure ublk: make ublk_chr_class a static const structure aoe: make aoe_class a static const structure block/rnbd: make all 'class' structures const block: fix the exclusive open mask in disk_scan_partitions block: add overflow checks for Amiga partition support block: change all __u32 annotations to __be32 in affs_hardblocks.h block: fix signed int overflow in Amiga partition support block: add capacity validation in bdev_add_partition() block: fine-granular CAP_SYS_ADMIN for Persistent Reservation block: disallow Persistent Reservation on partitions reiserfs: fix blkdev_put() warning from release_journal_dev() block: fix wrong mode for blkdev_get_by_dev() from disk_scan_partitions() block: document the holder argument to blkdev_get_by_path ...	2023-06-26 12:47:20 -07:00
David Howells	d2fe21077d	scsi: target: iscsi: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage Use sendmsg() with MSG_SPLICE_PAGES rather than sendpage. This allows multiple pages and multipage folios to be passed through. TODO: iscsit_fe_sendpage_sg() should perhaps set up a bio_vec array for the entire set of pages it's going to transfer plus two for the header and trailer and page fragments to hold the header and trailer - and then call sendmsg once for the entire message. Signed-off-by: David Howells <dhowells@redhat.com> cc: Mike Christie <michael.christie@oracle.com> cc: Maurizio Lombardi <mlombard@redhat.com> cc: "James E.J. Bottomley" <jejb@linux.ibm.com> cc: "Martin K. Petersen" <martin.petersen@oracle.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> cc: Al Viro <viro@zeniv.linux.org.uk> cc: open-iscsi@googlegroups.com Link: https://lore.kernel.org/r/20230623225513.2732256-13-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-24 15:50:13 -07:00
Azeem Shaikh	4b2e28758d	scsi: target: tcmu: Replace strlcpy() with strscpy() strlcpy() reads the entire source buffer first. This read may exceed the destination size limit. This is both inefficient and can lead to linear read overflows if a source string is not NUL-terminated [1]. In an effort to remove strlcpy() completely [2], replace strlcpy() here with strscpy(). No return values were used, so direct replacement is safe. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy [2] https://github.com/KSPP/linux/issues/89 Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com> Link: https://lore.kernel.org/r/20230621030033.3800351-3-azeemshaikh38@gmail.com Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-06-21 21:13:00 -04:00
Bob Pearson	9127169922	scsi: target: core: Fix error path in target_setup_session() In the error exits in target_setup_session(), if a branch is taken to free_sess: transport_free_session() may call to target_free_cmd_counter() and then fall through to call target_free_cmd_counter() a second time. This can, and does, sometimes cause seg faults since the data field in cmd_cnt->refcnt has been freed in the first call. Fix this problem by simply returning after the call to transport_free_session(). The second call is redundant for those cases. Fixes: `4edba7e4a8` ("scsi: target: Move cmd counter allocation") Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Link: https://lore.kernel.org/r/20230613144259.12890-1-rpearsonhpe@gmail.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-06-14 21:54:35 -04:00
Christoph Hellwig	05bdb99653	block: replace fmode_t with a block-specific type for block open flags The only overlap between the block open flags mapped into the fmode_t and other uses of fmode_t are FMODE_READ and FMODE_WRITE. Define a new blk_mode_t instead for use in blkdev_get_by_{dev,path}, ->open and ->ioctl and stop abusing fmode_t. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jack Wang <jinpu.wang@ionos.com> [rnbd] Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-28-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2023-06-12 08:04:05 -06:00
Christoph Hellwig	2736e8eeb0	block: use the holder as indication for exclusive opens The current interface for exclusive opens is rather confusing as it requires both the FMODE_EXCL flag and a holder. Remove the need to pass FMODE_EXCL and just key off the exclusive open off a non-NULL holder. For blkdev_put this requires adding the holder argument, which provides better debug checking that only the holder actually releases the hold, but at the same time allows removing the now superfluous mode argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Acked-by: David Sterba <dsterba@suse.com> [btrfs] Acked-by: Jack Wang <jinpu.wang@ionos.com> [rnbd] Link: https://lore.kernel.org/r/20230608110258.189493-16-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2023-06-12 08:04:04 -06:00
Christoph Hellwig	0718afd47f	block: introduce holder ops Add a new blk_holder_ops structure, which is passed to blkdev_get_by_* and installed in the block_device for exclusive claims. It will be used to allow the block layer to call back into the user of the block device for thing like notification of a removed device or a device resize. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Link: https://lore.kernel.org/r/20230601094459.1350643-10-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2023-06-05 10:53:04 -06:00
Martin K. Petersen	7907ad748b	Merge patch series "Use block pr_ops in LIO" Mike Christie <michael.christie@oracle.com> says: The patches in this thread allow us to use the block pr_ops with LIO's target_core_iblock module to support cluster applications in VMs. They were built over Linus's tree. They also apply over linux-next and Martin's tree and Jens's trees. Currently, to use windows clustering or linux clustering (pacemaker + cluster labs scsi fence agents) in VMs with LIO and vhost-scsi, you have to use tcmu or pscsi or use a cluster aware FS/framework for the LIO pr file. Setting up a cluster FS/framework is pain and waste when your real backend device is already a distributed device, and pscsi and tcmu are nice for specific use cases, but iblock gives you the best performance and allows you to use stacked devices like dm-multipath. So these patches allow iblock to work like pscsi/tcmu where they can pass a PR command to the backend module. And then iblock will use the pr_ops to pass the PR command to the real devices similar to what we do for unmap today. The patches are separated in the following groups: Patch 1 - 2: - Add block layer callouts for reading reservations and rename reservation error code. Patch 3 - 5: - SCSI support for new callouts. Patch 6: - DM support for new callouts. Patch 7 - 13: - NVMe support for new callouts. Patch 14 - 18: - LIO support for new callouts. This patchset has been tested with the libiscsi PGR ops and with window's failover cluster verification test. Note that for scsi backend devices we need this patchset: https://lore.kernel.org/linux-scsi/20230123221046.125483-1-michael.christie@oracle.com/T/#m4834a643ffb5bac2529d65d40906d3cfbdd9b1b7 to handle UAs. To reduce the size of this patchset that's being done separately to make reviewing easier. And to make merging easier this patchset and the one above do not have any conflicts so can be merged in different trees. Link: https://lore.kernel.org/r/20230407200551.12660-1-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-22 16:35:02 -04:00
Maurizio Lombardi	2a737d3b8c	scsi: target: iscsi: Prevent login threads from racing between each other The tpg->np_login_sem is a semaphore that is used to serialize the login process when multiple login threads run concurrently against the same target portal group. The iscsi_target_locate_portal() function finds the tpg, calls iscsit_access_np() against the np_login_sem semaphore and saves the tpg pointer in conn->tpg; If iscsi_target_locate_portal() fails, the caller will check for the conn->tpg pointer and, if it's not NULL, then it will assume that iscsi_target_locate_portal() called iscsit_access_np() on the semaphore. Make sure that conn->tpg gets initialized only if iscsit_access_np() was successful, otherwise iscsit_deaccess_np() may end up being called against a semaphore we never took, allowing more than one thread to access the same tpg. Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Link: https://lore.kernel.org/r/20230508162219.1731964-4-mlombard@redhat.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-22 16:29:39 -04:00
Maurizio Lombardi	13247018d6	scsi: target: iscsi: Fix hang in the iSCSI login code If the initiator suddenly stops sending data during a login while keeping the TCP connection open, the login_work won't be scheduled and will never release the login semaphore; concurrent login operations will therefore get stuck and fail. The bug is due to the inability of the login timeout code to properly handle this particular case. Fix the problem by replacing the old per-NP login timer with a new per-connection timer. The timer is started when an initiator connects to the target; if it expires, it sends a SIGINT signal to the thread pointed at by the conn->login_kworker pointer. conn->login_kworker is set by calling the iscsit_set_login_timer_kworker() helper, initially it will point to the np thread; When the login operation's control is in the process of being passed from the NP-thread to login_work, the conn->login_worker pointer is set to NULL. Finally, login_kworker will be changed to point to the worker thread executing the login_work job. If conn->login_kworker is NULL when the timer expires, it means that the login operation hasn't been completed yet but login_work isn't running, in this case the timer will mark the login process as failed and will schedule login_work so the latter will be forced to free the resources it holds. Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Link: https://lore.kernel.org/r/20230508162219.1731964-2-mlombard@redhat.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-22 16:29:39 -04:00
Azeem Shaikh	0871237a94	scsi: target: Replace all non-returning strlcpy() with strscpy() strlcpy() reads the entire source buffer first. This read may exceed the destination size limit. This is both inefficient and can lead to linear read overflows if a source string is not NUL-terminated [1]. In an effort to remove strlcpy() completely [2], replace strlcpy() here with strscpy(). No return values were used, so direct replacement is safe. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy [2] https://github.com/KSPP/linux/issues/89 Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com> Link: https://lore.kernel.org/r/20230516025322.2804923-1-azeemshaikh38@gmail.com Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-16 21:39:44 -04:00
Mike Christie	394f811848	scsi: target: Add block PR support to iblock This adds support for the block PR callouts to target_core_iblock. This patch doesn't attempt to implement the entire spec because there's no way support it all like SPEC_I_PT and ALL_TG_PT. This only supports exporting the iblock device from one path on the local target. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230407200551.12660-19-michael.christie@oracle.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-11 21:55:36 -04:00
Mike Christie	8455799d2d	scsi: target: Report and detect unsupported PR commands The backend modules don't know about ports and I_T nexuses and the pr_ops callouts the modules will use don't support the old RESERVE/RELEASE commands. This patch has us report we don't support those types of commands and fail them. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230407200551.12660-18-michael.christie@oracle.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-11 21:55:36 -04:00
Mike Christie	d9b3275bdd	scsi: target: Pass struct target_opcode_descriptor to enabled The iblock pr_ops support does not support commands that require port or I_T Nexus info. This adds a struct target_opcode_descriptor as an argument to the enabled callout so we can still have the common tcm_is_pr_enabled and tcm_is_scsi2_reservations_enabled functions and also determine if the command is supported based on the command and service action and device settings. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230407200551.12660-17-michael.christie@oracle.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-11 21:55:36 -04:00
Mike Christie	53062ace0b	scsi: target: Allow backends to hook into PR handling For the cases where you want to export a device to a VM via a single I_T nexus and want to passthrough the PR handling to the physical/real device you have to use pscsi or tcmu. Both are good for specific uses however for the case where you want good performance, and are not using SCSI devices directly (using DM/MD RAID or multipath devices) then we are out of luck. The following patches allow iblock to mimimally hook into the LIO PR code and then pass the PR handling to the physical device. Note that like with the tcmu an pscsi cases it's only supported when you export the device via one I_T nexus. This patch adds the initial LIO callouts. The next patch will modify iblock. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230407200551.12660-16-michael.christie@oracle.com Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-11 21:55:36 -04:00
Mike Christie	0217da08c1	scsi: target: Rename sbc_ops to exec_cmd_ops The next patches allow us to call the block layer's pr_ops from the backends. This will require allowing the backends to hook into the cmd processing for SPC commands, so this renames sbc_ops to a more generic exec_cmd_ops. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230407200551.12660-15-michael.christie@oracle.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-11 21:55:36 -04:00
Maurizio Lombardi	a0fde512f7	scsi: target: core: Fix invalid memory access nr_attrs should start counting from zero, otherwise we will end up dereferencing an invalid memory address. $ targetcli /loopback create general protection fault RIP: 0010:configfs_create_file+0x12/0x70 Call Trace: <TASK> configfs_attach_item.part.0+0x5f/0x150 configfs_attach_group.isra.0+0x49/0x120 configfs_mkdir+0x24f/0x4d0 vfs_mkdir+0x192/0x240 do_mkdirat+0x131/0x160 __x64_sys_mkdir+0x48/0x70 do_syscall_64+0x5c/0x90 Fixes: `31177b7479` ("scsi: target: core: Add RTPI attribute for target port") Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Link: https://lore.kernel.org/r/20230407130033.556644-1-mlombard@redhat.com Acked-by: Dmitry Bogdanov <d.bogdanov@yadro.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-11 20:49:43 -04:00
Tom Rix	aa4d7812cf	scsi: target: core: Remove unused 'prod_len' variable clang with W=1 reports: drivers/target/target_core_spc.c:229:6: error: variable 'prod_len' set but not used [-Werror,-Wunused-but-set-variable] u32 prod_len; ^ This variable is not used so remove it. Signed-off-by: Tom Rix <trix@redhat.com> Link: https://lore.kernel.org/r/20230329132421.1809362-1-trix@redhat.com Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-04-02 21:35:52 -04:00
Martin K. Petersen	f467b865cf	Merge branch '6.3/scsi-fixes' into 6.4/scsi-staging Pull in the fixes branch to resolve an mpi3mr conflict reported by sfr. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-31 21:45:14 -04:00
Lizhe	882f4adac9	scsi: target: tcm_loop: Remove redundant driver match function If there is no driver match function, the driver core assumes that each candidate pair (driver, device) matches. See driver_match_device(). pseudo_lld_bus_match() always returns 1 and is therefore equivalent to not registering a match function. Remove it. Signed-off-by: Lizhe <sensor1010@163.com> Link: https://lore.kernel.org/r/20230319043518.297490-1-sensor1010@163.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-24 20:44:14 -04:00
Martin K. Petersen	62d15dba0a	Merge patch series "Constify most SCSI host templates" Bart Van Assche <bvanassche@acm.org> says: It helps humans and the compiler if it is made explicit that SCSI host templates are not modified. Hence this patch series that constifies most SCSI host templates. Please consider this patch series for the next merge window. Link: https://lore.kernel.org/r/20230322195515.1267197-1-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-24 20:13:03 -04:00
Bart Van Assche	8e2ab8cda5	scsi: target: tcm-loop: Declare SCSI host template const Make it explicit that the SCSI host template is not modified. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20230322195515.1267197-79-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-24 19:20:00 -04:00
Martin K. Petersen	ae2fb3cb0f	Merge patch series "target: TMF and recovery fixes" Mike Christie <michael.christie@oracle.com> says: The following patches apply over Martin's 6.4 branches and Linus's tree. They fix a couple regressions in iscsit that occur when there are TMRs executing and a connection is closed. It also includes Dimitry's fixes in related code paths for cmd cleanup when ERL2 is used and the write pending hang during conn cleanup. This version of the patchset brings it back to just regressions and fixes for bugs we have a lot of users hitting. I'm going to fix isert and get it hooked into iscsit properly in a second patchset, because this one was getting so large. I've also moved my cleanup type of patches for a 3rd patchset. Link: https://lore.kernel.org/r/20230319015620.96006-1-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-24 17:39:15 -04:00
Dmitry Bogdanov	ea87981a0e	scsi: target: iscsi: Handle abort for WRITE_PENDING cmds Sometimes an initiator does not send data for a WRITE command and tries to abort it. The abort hangs waiting for frontend driver completion. iSCSI driver waits for data and that timeout eventually initiates connection reinstatment. The connection closing releases the commands in the connection, but those aborted commands still did not handle the abort and did not decrease a command ref counter. Because of that the connection reinstatement hangs indefinitely and prevents re-login for that initiator. Add handling in TCM of the abort for the WRITE_PENDING commands at connection closing moment to make it possible to release them. Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com> [mnc: Rebase and expand comment] Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230319015620.96006-10-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-24 17:32:23 -04:00
Mike Christie	cc79da306e	scsi: target: iscsit: Fix TAS handling during conn cleanup Fix a bug added in commit `f36199355c` ("scsi: target: iscsi: Fix cmd abort fabric stop race"). If CMD_T_TAS is set on the se_cmd we must call iscsit_free_cmd() to do the last put on the cmd and free it, because the connection is down and we will not up sending the response and doing the put from the normal I/O path. Add a check for CMD_T_TAS in iscsit_release_commands_from_conn() so we now detect this case and run iscsit_free_cmd(). Fixes: `f36199355c` ("scsi: target: iscsi: Fix cmd abort fabric stop race") Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230319015620.96006-9-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-24 17:32:23 -04:00
Mike Christie	673db054d7	scsi: target: Fix multiple LUN_RESET handling This fixes a bug where an initiator thinks a LUN_RESET has cleaned up running commands when it hasn't. The bug was added in commit `51ec502a32` ("target: Delete tmr from list before processing"). The problem occurs when: 1. We have N I/O cmds running in the target layer spread over 2 sessions. 2. The initiator sends a LUN_RESET for each session. 3. session1's LUN_RESET loops over all the running commands from both sessions and moves them to its local drain_task_list. 4. session2's LUN_RESET does not see the LUN_RESET from session1 because the commit above has it remove itself. session2 also does not see any commands since the other reset moved them off the state lists. 5. sessions2's LUN_RESET will then complete with a successful response. 6. sessions2's inititor believes the running commands on its session are now cleaned up due to the successful response and cleans up the running commands from its side. It then restarts them. 7. The commands do eventually complete on the backend and the target starts to return aborted task statuses for them. The initiator will either throw a invalid ITT error or might accidentally lookup a new task if the ITT has been reallocated already. Fix the bug by reverting the patch, and serialize the execution of LUN_RESETs and Preempt and Aborts. Also prevent us from waiting on LUN_RESETs in core_tmr_drain_tmr_list, because it turns out the original patch fixed a bug that was not mentioned. For LUN_RESET1 core_tmr_drain_tmr_list can see a second LUN_RESET and wait on it. Then the second reset will run core_tmr_drain_tmr_list and see the first reset and wait on it resulting in a deadlock. Fixes: `51ec502a32` ("target: Delete tmr from list before processing") Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230319015620.96006-8-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-24 17:32:23 -04:00
Dmitry Bogdanov	d8990b5a4d	scsi: target: iscsit: Free cmds before session free Commands from recovery entries are freed after session has been closed. That leads to use-after-free at command free or NPE with such call trace: Time2Retain timer expired for SID: 1, cleaning up iSCSI session. BUG: kernel NULL pointer dereference, address: 0000000000000140 RIP: 0010:sbitmap_queue_clear+0x3a/0xa0 Call Trace: target_release_cmd_kref+0xd1/0x1f0 [target_core_mod] transport_generic_free_cmd+0xd1/0x180 [target_core_mod] iscsit_free_cmd+0x53/0xd0 [iscsi_target_mod] iscsit_free_connection_recovery_entries+0x29d/0x320 [iscsi_target_mod] iscsit_close_session+0x13a/0x140 [iscsi_target_mod] iscsit_check_post_dataout+0x440/0x440 [iscsi_target_mod] call_timer_fn+0x24/0x140 Move cleanup of recovery enrties to before session freeing. Reported-by: Forza <forza@tnonline.net> Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230319015620.96006-7-michael.christie@oracle.com Reviewed-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-24 17:32:23 -04:00
Mike Christie	395cee83d0	scsi: target: iscsit: Stop/wait on cmds during conn close This fixes a bug added in commit `f36199355c` ("scsi: target: iscsi: Fix cmd abort fabric stop race"). If we have multiple sessions to the same se_device we can hit a race where a LUN_RESET on one session cleans up the se_cmds from under another session which is being closed. This results in the closing session freeing its conn/session structs while they are still in use. The bug is: 1. Session1 has IO se_cmd1. 2. Session2 can also have se_cmds for I/O and optionally TMRs for ABORTS but then gets a LUN_RESET. 3. The LUN_RESET on session2 sees the se_cmds on session1 and during the drain stages marks them all with CMD_T_ABORTED. 4. session1 is now closed so iscsit_release_commands_from_conn() only sees se_cmds with the CMD_T_ABORTED bit set and returns immediately even though we have outstanding commands. 5. session1's connection and session are freed. 6. The backend request for se_cmd1 completes and it accesses the freed connection/session. This hooks the iscsit layer into the cmd counter code, so we can wait for all outstanding se_cmds before freeing the connection. Fixes: `f36199355c` ("scsi: target: iscsi: Fix cmd abort fabric stop race") Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230319015620.96006-6-michael.christie@oracle.com Reviewed-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-24 17:32:23 -04:00
Mike Christie	6d256bee60	scsi: target: iscsit: isert: Alloc per conn cmd counter This has iscsit allocate a per conn cmd counter and converts iscsit/isert to use it instead of the per session one. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230319015620.96006-5-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-24 17:32:23 -04:00
Mike Christie	8e288be860	scsi: target: Pass in cmd counter to use during cmd setup Allow target_get_sess_cmd() users to pass in the cmd counter they want to use. Right now we pass in the session's cmd counter but in a subsequent commit iSCSI will switch from per session to per conn. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230319015620.96006-4-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-24 17:32:23 -04:00
Mike Christie	4edba7e4a8	scsi: target: Move cmd counter allocation iSCSI needs to allocate its cmd counter per connection for MCS support where we need to stop and wait on commands running on a connection instead of per session. This moves the cmd counter allocation to target_setup_session() which is used by drivers that need the stop+wait behavior per session. xcopy doesn't need stop+wait at all, so we will be OK moving the cmd counter allocation outside of transport_init_session(). Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230319015620.96006-3-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-24 17:32:23 -04:00
Mike Christie	becd9be606	scsi: target: Move sess cmd counter to new struct iSCSI needs to wait on outstanding commands like how SRP and the FC/FCoE drivers do. It can't use target_stop_session() because for MCS support we can't stop the entire session during recovery because if other connections are OK then we want to be able to continue to execute I/O on them. Move the per session cmd counters to a new struct so iSCSI can allocate them per connection. The xcopy code can also just not allocate in the future since it doesn't need to track commands. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230319015620.96006-2-michael.christie@oracle.com Reviewed-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-03-24 17:32:22 -04:00

1 2 3 4 5 ...

2453 Commits