linux

Author	SHA1	Message	Date
Arun Easi	0bc17251df	scsi: qla2xxx: Fix flash update in 28XX adapters on big endian machines Flash update failed due to missing endian conversion in FLT region access as well as in checksum computation. Link: https://lore.kernel.org/r/20201202132312.19966-12-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-12-09 11:34:18 -05:00
Saurav Kashyap	f795f96e72	scsi: qla2xxx: Handle aborts correctly for port undergoing deletion Call trace observed while shutting down the adapter ports (LINK DOWN). Handle aborts correctly. localhost kernel: INFO: task nvme:44209 blocked for more than 120 seconds. localhost kernel: "echo 0 >/proc/sys/kernel/hung_task_timeout_secs" disables this message. localhost kernel: nvme D ffff88b45fb5acc0 0 44209 1 0x00000080 localhost kernel: Call Trace: localhost kernel: [<ffffffffbd187169>] schedule+0x29/0x70 localhost kernel: [<ffffffffbd184c51>] schedule_timeout+0x221/0x2d0 localhost kernel: [<ffffffffbcad7229>] ? ttwu_do_wakeup+0x19/0xe0 localhost kernel: [<ffffffffbcad735f>] ? ttwu_do_activate+0x6f/0x80 localhost kernel: [<ffffffffbcada830>] ? try_to_wake_up+0x190/0x390 localhost kernel: [<ffffffffbd18751d>] wait_for_completion+0xfd/0x140 localhost kernel: [<ffffffffbcadaaf0>] ? wake_up_state+0x20/0x20 localhost kernel: [<ffffffffbcabe3da>] flush_work+0x10a/0x1b0 localhost kernel: [<ffffffffbcabb0f0>] ? move_linked_works+0x90/0x90 localhost kernel: [<ffffffffbcabe6cf>] flush_delayed_work+0x3f/0x50 localhost kernel: [<ffffffffc0452767>] nvme_fc_init_ctrl+0x657/0x6a0 [nvme_fc] localhost kernel: [<ffffffffc045293a>] nvme_fc_create_ctrl+0x18a/0x210 [nvme_fc] localhost kernel: [<ffffffffc028962f>] nvmf_dev_write+0x98f/0xb35 [nvme_fabrics] localhost kernel: [<ffffffffbcd08927>] ? security_file_permission+0x27/0xa0 localhost kernel: [<ffffffffbcc4db50>] vfs_write+0xc0/0x1f0 localhost kernel: [<ffffffffbcc4e92f>] SyS_write+0x7f/0xf0 localhost kernel: [<ffffffffbd193f92>] system_call_fastpath+0x25/0x2a Link: https://lore.kernel.org/r/20201202132312.19966-11-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-12-09 11:34:18 -05:00
Quinn Tran	07a5f69248	scsi: qla2xxx: Fix N2N and NVMe connect retry failure FC-NVMe target discovery failed when initiator wwpn < target wwpn in an N2N (Direct Attach) config, where the driver was stuck on FCP PRLI mode and failed to retry with NVMe PRLI. Link: https://lore.kernel.org/r/20201202132312.19966-10-njavali@marvell.com Fixes: `84ed362ac4` ("scsi: qla2xxx: Dual FCP-NVMe target port support") Fixes: `983f127603` ("scsi: qla2xxx: Retry PLOGI on FC-NVMe PRLI failure") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-12-09 11:34:18 -05:00
Arun Easi	8a78dd6ed1	scsi: qla2xxx: Fix FW initialization error on big endian machines Some fields are not correctly byte swapped causing failure during initialization. As probe() returns failure, HBAs will not be claimed when this happens. qla2xxx [0007:01:00.0]-ffff:3: Secure Flash Update in FW: Supported qla2xxx [0007:01:00.0]-ffff:3: SCM in FW: Supported qla2xxx [0007:01:00.0]-00d2:3: Init Firmware ** FAILED **. qla2xxx [0007:01:00.0]-00d6:3: Failed to initialize adapter - Adapter flags 2. qla2xxx 0007:01:00.1: enabling device (0140 -> 0142) qla2xxx [0007:01:00.1]-011c: : MSI-X vector count: 128. qla2xxx [0007:01:00.1]-001d: : Found an ISP2289 irq 18 iobase 0xd000080080004000. qla2xxx 0007:01:00.1: Using 64-bit direct DMA at offset 800000000000000 BUG: Bad page state in process insmod pfn:67118 page:f00000000168bd40 count:-1 mapcount:0 mapping: (null) index:0x0 page flags: 0x3ffff800000000() page dumped because: nonzero _count Modules linked in: qla2xxx(OE+) nvme_fc nvme_fabrics nvme_core scsi_transport_fc scsi_tgt nls_utf8 isofs ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter nx_crypto ses enclosure scsi_transport_sas pseries_rng sg ip_tables xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif crct10dif_generic crct10dif_common usb_storage ipr libata tg3 ptp pps_core dm_mirror dm_region_hash dm_log dm_mod CPU: 32 PID: 8560 Comm: insmod Kdump: loaded Tainted: G OE ------------ 3.10.0-957.el7.ppc64 #1 Call Trace: [c0000006dd7caa70] [c00000000001cca8] .show_stack+0x88/0x330 (unreliable) [c0000006dd7cab30] [c000000000ac3d88] .dump_stack+0x28/0x3c [c0000006dd7caba0] [c00000000029e48c] 
.bad_page+0x15c/0x1c0 [c0000006dd7cac40] [c00000000029f938] .get_page_from_freelist+0x11e8/0x1ea0 [c0000006dd7caf40] [c0000000002a1d30] .__alloc_pages_nodemask+0x1c0/0xc70 [c0000006dd7cb140] [c00000000002ba0c] .__dma_direct_alloc_coherent+0x8c/0x170 [c0000006dd7cb1e0] [d000000010a94688] .qla2x00_mem_alloc+0x10f8/0x1370 [qla2xxx] [c0000006dd7cb2d0] [d000000010a9c790] .qla2x00_probe_one+0xb60/0x22e0 [qla2xxx] [c0000006dd7cb540] [c0000000005de764] .pci_device_probe+0x204/0x300 [c0000006dd7cb600] [c0000000006ca61c] .driver_probe_device+0x2cc/0x6f0 [c0000006dd7cb6b0] [c0000000006cabec] .__driver_attach+0x10c/0x110 [c0000006dd7cb740] [c0000000006c5f04] .bus_for_each_dev+0x94/0x100 [c0000006dd7cb7e0] [c0000000006c94f4] .driver_attach+0x34/0x50 [c0000006dd7cb860] [c0000000006c8f58] .bus_add_driver+0x298/0x3b0 [c0000006dd7cb900] [c0000000006cb6e0] .driver_register+0xb0/0x1a0 [c0000006dd7cb980] [c0000000005dc474] .__pci_register_driver+0xc4/0xf0 [c0000006dd7cba10] [d000000010b94e20] .qla2x00_module_init+0x2a8/0x328 [qla2xxx] [c0000006dd7cbaa0] [c00000000000c130] .do_one_initcall+0x130/0x2e0 [c0000006dd7cbb50] [c0000000001b2e8c] .load_module+0x1afc/0x2340 [c0000006dd7cbd40] [c0000000001b3920] .SyS_finit_module+0xd0/0x130 [c0000006dd7cbe30] [c00000000000a284] system_call+0x38/0xfc Link: https://lore.kernel.org/r/20201202132312.19966-9-njavali@marvell.com Fixes: `9f2475fe74` ("scsi: qla2xxx: SAN congestion management implementation") Fixes: `cf3c54fb49` ("scsi: qla2xxx: Add SLER and PI control support”) Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-12-09 11:34:18 -05:00
Arun Easi	8de309e729	scsi: qla2xxx: Fix crash during driver load on big endian machines Crash stack: [576544.715489] Unable to handle kernel paging request for data at address 0xd00000000f970000 [576544.715497] Faulting instruction address: 0xd00000000f880f64 [576544.715503] Oops: Kernel access of bad area, sig: 11 [#1] [576544.715506] SMP NR_CPUS=2048 NUMA pSeries : [576544.715703] NIP [d00000000f880f64] .qla27xx_fwdt_template_valid+0x94/0x100 [qla2xxx] [576544.715722] LR [d00000000f7952dc] .qla24xx_load_risc_flash+0x2fc/0x590 [qla2xxx] [576544.715726] Call Trace: [576544.715731] [c0000004d0ffb000] [c0000006fe02c350] 0xc0000006fe02c350 (unreliable) [576544.715750] [c0000004d0ffb080] [d00000000f7952dc] .qla24xx_load_risc_flash+0x2fc/0x590 [qla2xxx] [576544.715770] [c0000004d0ffb170] [d00000000f7aa034] .qla81xx_load_risc+0x84/0x1a0 [qla2xxx] [576544.715789] [c0000004d0ffb210] [d00000000f79f7c8] .qla2x00_setup_chip+0xc8/0x910 [qla2xxx] [576544.715808] [c0000004d0ffb300] [d00000000f7a631c] .qla2x00_initialize_adapter+0x4dc/0xb00 [qla2xxx] [576544.715826] [c0000004d0ffb3e0] [d00000000f78ce28] .qla2x00_probe_one+0xf08/0x2200 [qla2xxx] Link: https://lore.kernel.org/r/20201202132312.19966-8-njavali@marvell.com Fixes: `f73cb695d3` ("[SCSI] qla2xxx: Add support for ISP2071.") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-12-09 11:34:17 -05:00
Arun Easi	aceba54ba0	scsi: qla2xxx: Fix compilation issue in PPC systems Fix compile time errors reported on PPC systems, qla_gbl.h:991:20: error: inlining failed in call to always_inline ‘qla_nvme_abort_set_option’: function body not available Link: https://lore.kernel.org/r/20201202132312.19966-7-njavali@marvell.com Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-12-09 11:34:17 -05:00
Saurav Kashyap	0ce8ab50a6	scsi: qla2xxx: Don't check for fw_started while posting NVMe command NVMe commands can come only after successful addition of rport and NVMe connect, and rport is only registered after FW started bit is set. Remove the redundant check. Link: https://lore.kernel.org/r/20201202132312.19966-6-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-12-09 11:34:17 -05:00
Quinn Tran	e4fc78f48d	scsi: qla2xxx: Tear down session if FW say it is down The completion status 0x28 (ppc = be = 0x2800) below indicates session is not there, trigger session deletion. qla2xxx [000b:04:00.1]-8009:8: DEVICE RESET ISSUED nexus=8:1:51 cmd=c000001432d0f600. qla2xxx [000b:04:00.1]-5039:8: Async-tmf error - hdl=67b completion status(2800). qla2xxx [000b:04:00.1]-8030:8: TM IOCB failed (102). qla2xxx [000b:04:00.1]-800c:8: do_reset failed for cmd=c000001432d0f600. qla2xxx [000b:04:00.1]-800f:8: DEVICE RESET FAILED: Task management failed nexus=8:1:51 cmd=c000001432d0f600. qla2xxx [000b:04:00.1]-8009:8: DEVICE RESET ISSUED nexus=8:1:52 cmd=c000001432d0c200. qla2xxx [000b:04:00.1]-5039:8: Async-tmf error - hdl=67c completion status(2800). qla2xxx [000b:04:00.1]-8030:8: TM IOCB failed (102). Link: https://lore.kernel.org/r/20201202132312.19966-5-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-12-09 11:34:17 -05:00
Quinn Tran	a6dcfe0848	scsi: qla2xxx: Limit interrupt vectors to number of CPUs Driver created too many QPairs(126) with 28xx adapter. Limit to the number of CPUs to minimize wasted resources. Link: https://lore.kernel.org/r/20201202132312.19966-4-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-12-09 11:34:17 -05:00
Saurav Kashyap	c1599657d4	scsi: qla2xxx: Change post del message from debug level to log level Change the message debug level. Link: https://lore.kernel.org/r/20201202132312.19966-3-njavali@marvell.com Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-12-09 11:34:17 -05:00
Daniel Wagner	305c16ce26	scsi: qla2xxx: Return EBUSY on fcport deletion When the fcport is about to be deleted we should return EBUSY instead of ENODEV. Only for EBUSY will the request be requeued in a multipath setup. Also return EBUSY when the firmware has not yet started to avoid dropping the request. Link: https://lore.kernel.org/r/20201014073048.36219-1-dwagner@suse.de Link: https://lore.kernel.org/r/20201202132312.19966-2-njavali@marvell.com Reviewed-by: Arun Easi <aeasi@marvell.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-12-09 11:34:17 -05:00
Colin Ian King	3a5b9fa2cc	scsi: qla4xxx: Remove redundant assignment to variable rval The variable rval is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Link: https://lore.kernel.org/r/20201204191810.1150995-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Addresses-Coverity: ("Unused value")	2020-12-09 11:34:17 -05:00
Arnd Bergmann	e7e499ee8a	Merge tag 'imx-soc-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/soc i.MX SoC update for 5.11: - Add revision detection support for i.MX7ULP revision 2.2. - Add a little document for i.MX7ULP B2 silicon version. - Add serial number support for i.MX23, i.MX28 SoCs through soc_device. - Improve the identifying of i.MX6QP SoCs. * tag 'imx-soc-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: ARM: mxs: Add serial number support for i.MX23, i.MX28 SoCs ARM: imx: mach-imx6q: correctly identify i.MX6QP SoCs ARM: imx: imx7ulp: Add a comment explaining the B2 silicon version ARM: imx: Add revision support for i.MX7ULP revision 2.2 Link: https://lore.kernel.org/r/20201202142717.9262-2-shawnguo@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2020-12-09 17:27:35 +01:00
Arnd Bergmann	b760bfbcbf	Merge tag 'amlogic-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic into arm/soc arm64: Kconfig.platform: amlogic updates for v5.11 - ship only the necessary clock controllers * tag 'amlogic-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic: arm64: meson: ship only the necessary clock controllers Link: https://lore.kernel.org/r/7hlfehjgv8.fsf@baylibre.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2020-12-09 17:26:34 +01:00
Arnd Bergmann	4bdfafd6ff	Merge tag 'mvebu-arm-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu into arm/soc mvebu arm for 5.11 (part 1) Update MAINTAINER file: - add new 98DX3236 based boards - point new git repository for mvebu * tag 'mvebu-arm-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu: MAINTAINERS: switch mvebu tree to kernel.org MAINTAINERS: Add an entry for MikroTik CRS3xx 98DX3236 boards Link: https://lore.kernel.org/r/87k0u2j0n7.fsf@BL-laptop Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2020-12-09 17:25:43 +01:00
Simon Perron Caissy	5b13886da8	ice: Add space to unknown speed Add space to the end of 'Unknown' string in order to avoid concatenation with 'bps' string when formatting netdev log message. Signed-off-by: Simon Perron Caissy <simon.perron.caissy@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:55 -08:00
Jacob Keller	9228d8b261	ice: join format strings to same line as ice_debug When printing messages with ice_debug, align the printed string to the origin line of the message in order to ease debugging and tracking messages back to their source. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:55 -08:00
Bruce Allan	34d8461a65	ice: silence static analysis warning sparse warns about cast to/from restricted types which is not an actual problem; silence the warning. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:55 -08:00
Bruce Allan	32e6deb297	ice: cleanup misleading comment The maximum Admin Queue buffer size and NVM shadow RAM sector size are both 4 Kilobytes. Some comments refer to those as 4Kb which can be confused with 4 Kilobits. Update the comments to use the commonly used KB symbol instead. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:55 -08:00
Nick Nunley	bcf68ea1e5	ice: Remove vlan_ena from vsi structure vlan_ena was introduced to track whether VLAN filters are enabled on the device, but 1) checking for num_vlan > 1 already gives us this information, and is currently used in this way throughout the code 2) the logic for vlan_ena is broken when multiple VLANs are active Just remove vlan_ena and use num_vlan instead. Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:54 -08:00
Jeb Cramer	956542cae5	ice: Remove gate to OROM init Remove the gate that prevents the OROM and netlist info from being populated. The NVM now has the appropriate section for software to reference the versioning info. Signed-off-by: Jeb Cramer <jeb.j.cramer@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:54 -08:00
Jeb Cramer	c21125c997	ice: Enable Support for FW Override (E82X) The driver is able to override the firmware when it comes to supporting a more lenient link mode. This feature was limited to E810 devices. It is now extended to E82X devices. Signed-off-by: Jeb Cramer <jeb.j.cramer@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:54 -08:00
Paul M Stillwell Jr	f2651a91b9	ice: don't always return an error for Get PHY Abilities AQ command There are times when the driver shouldn't return an error when the Get PHY abilities AQ command (0x0600) returns an error. Instead the driver should log that the error occurred and continue on. This allows the driver to load even though the AQ command failed. The user can then later determine the reason for the failure and correct it. Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:54 -08:00
Bruce Allan	88dcfdb4cd	ice: cleanup stack hog In ice_flow_add_prof_sync(), struct ice_flow_prof_params has recently grown in size hogging stack space when allocated there. Hogging stack space should be avoided. Change allocation to be on the heap when needed. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Harikumar Bokkena <harikumarx.bokkena@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:13 -08:00
Kan Liang	c2208046bb	perf/x86/intel: Add Tremont Topdown support Tremont has four L1 Topdown events, TOPDOWN_FE_BOUND.ALL, TOPDOWN_BAD_SPECULATION.ALL, TOPDOWN_BE_BOUND.ALL and TOPDOWN_RETIRING.ALL. They are available on GP counters. Export them to sysfs and facilitate the perf stat tool. $perf stat --topdown -- sleep 1 Performance counter stats for 'sleep 1': retiring bad speculation frontend bound backend bound 24.9% 16.8% 31.7% 26.6% 1.001224610 seconds time elapsed 0.001150000 seconds user 0.000000000 seconds sys Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1607457952-3519-1-git-send-email-kan.liang@linux.intel.com	2020-12-09 17:08:59 +01:00
Gustavo A. R. Silva	bd11952b40	uprobes/x86: Fix fall-through warnings for Clang In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a break statement instead of letting the code fall through to the next case. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://github.com/KSPP/linux/issues/115	2020-12-09 17:08:59 +01:00
Gustavo A. R. Silva	b645957545	perf/x86: Fix fall-through warnings for Clang In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a fallthrough pseudo-keyword as a replacement for a /* fall through */ comment, instead of letting the code fall through to the next case. Notice that Clang doesn't recognize /* fall through */ comments as implicit fall-through markings. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://github.com/KSPP/linux/issues/115	2020-12-09 17:08:59 +01:00
Gustavo A. R. Silva	e689b300c9	kprobes/x86: Fix fall-through warnings for Clang In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a break statement instead of just letting the code fall through to the next case. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://github.com/KSPP/linux/issues/115	2020-12-09 17:08:58 +01:00
Kan Liang	f8129cd958	perf/x86/intel/lbr: Fix the return type of get_lbr_cycles() The cycle count of a timed LBR is always 1 in perf record -D. The cycle count is stored in the first 16 bits of the IA32_LBR_x_INFO register, but get_lbr_cycles() returns a Boolean type. Use u16 to replace the Boolean type. Fixes: `47125db27e` ("perf/x86/intel/lbr: Support Architectural LBR") Reported-by: Stephane Eranian <eranian@google.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20201125213720.15692-2-kan.liang@linux.intel.com	2020-12-09 17:08:58 +01:00
Kan Liang	46b72e1bf4	perf/x86/intel: Fix rtm_abort_event encoding on Ice Lake According to the event list from icelake_core_v1.09.json, the encoding of the RTM_RETIRED.ABORTED event on Ice Lake should be, "EventCode": "0xc9", "UMask": "0x04", "EventName": "RTM_RETIRED.ABORTED", Correct the wrong encoding. Fixes: `6017608936` ("perf/x86/intel: Add Icelake support") Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20201125213720.15692-1-kan.liang@linux.intel.com	2020-12-09 17:08:57 +01:00
Masami Hiramatsu	78ff2733ff	x86/kprobes: Restore BTF if the single-stepping is cancelled Fix to restore BTF if single-stepping causes a page fault and it is cancelled. Usually the BTF flag was restored when the single stepping is done (in resume_execution()). However, if a page fault happens on the single stepping instruction, the fault handler is invoked and the single stepping is cancelled. Thus, the BTF flag is not restored. Fixes: `1ecc798c67` ("x86: debugctlmsr kprobes") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/160389546985.106936.12727996109376240993.stgit@devnote2	2020-12-09 17:08:57 +01:00
peterz@infradead.org	78af4dc949	perf: Break deadlock involving exec_update_mutex Syzbot reported a lock inversion involving perf. The sore point being perf holding exec_update_mutex() for a very long time, specifically across a whole bunch of filesystem ops in pmu::event_init() (uprobes) and anon_inode_getfile(). This then inverts against procfs code trying to take exec_update_mutex. Move the permission checks later, such that we need to hold the mutex over less code. Reported-by: syzbot+db9cdf3dd1f64252c6ef@syzkaller.appspotmail.com Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>	2020-12-09 17:08:57 +01:00
Peter Zijlstra	e6e4f42eb7	sparc64/mm: Implement pXX_leaf_size() support Sparc64 has non-pagetable aligned large page support; wire up the pXX_leaf_size() functions to report the correct pagetable page size. This enables PERF_SAMPLE_{DATA,CODE}_PAGE_SIZE to report accurate pagetable leaf sizes. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201126121121.301768209@infradead.org	2020-12-09 17:08:56 +01:00
Peter Zijlstra	c5eecbb58f	powerpc/8xx: Implement pXX_leaf_size() support Christophe Leroy wrote: > I can help with powerpc 8xx. It is a 32 bits powerpc. The PGD has 1024 > entries, that means each entry maps 4M. > > Page sizes are 4k, 16k, 512k and 8M. > > For the 8M pages we use hugepd with a single entry. The two related PGD > entries point to the same hugepd. > > For the other sizes, they are in standard page tables. 16k pages appear > 4 times in the page table. 512k entries appear 128 times in the page > table. > > When the PGD entry has _PMD_PAGE_8M bits, the PMD entry points to a > hugepd which holds the single 8M entry. > > In the PTE, we have two bits: _PAGE_SPS and _PAGE_HUGE > > _PAGE_HUGE means it is a 512k page > _PAGE_SPS means it is not a 4k page > > The kernel can be built either with 4k pages as standard page size, or > 16k pages. It doesn't change the page table layout though. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201126121121.364451610@infradead.org	2020-12-09 17:08:56 +01:00
Ahmed S. Darwish	cb262935a1	seqlock: kernel-doc: Specify when preemption is automatically altered The kernel-doc annotations for sequence counters write side functions are incomplete: they do not specify when preemption is automatically disabled and re-enabled. This has confused a number of call-site developers. Fix it. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/CAHk-=wikhGExmprXgaW+MVXG1zsGpztBbVwOb23vetk41EtTBQ@mail.gmail.com	2020-12-09 17:08:49 +01:00
Ahmed S. Darwish	66bcfcdf89	seqlock: Prefix internal seqcount_t-only macros with a "do_" When the seqcount_LOCKNAME_t group of data types were introduced, two classes of seqlock.h sequence counter macros were added: - An external public API which can either take a plain seqcount_t or any of the seqcount_LOCKNAME_t variants. - An internal API which takes only a plain seqcount_t. To distinguish between the two groups, the "_seqcount_t_" pattern was used for the latter. This confused a number of mm/ call-site developers, and Linus also commented that it was not a standard practice for marking seqlock.h internal APIs. Distinguish the latter group of macros by prefixing a "do_". Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/CAHk-=wikhGExmprXgaW+MVXG1zsGpztBbVwOb23vetk41EtTBQ@mail.gmail.com	2020-12-09 17:08:49 +01:00
Ahmed S. Darwish	cf48647243	Documentation: seqlock: s/LOCKTYPE/LOCKNAME/g Sequence counters with an associated write serialization lock are called seqcount_LOCKNAME_t. Fix the documentation accordingly. While at it, remove a paragraph that inappropriately discussed a seqlock.h implementation detail. Fixes: `6dd699b13d` ("seqlock: seqcount_LOCKNAME_t: Standardize naming convention") Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20201206162143.14387-2-a.darwish@linutronix.de	2020-12-09 17:08:49 +01:00
Waiman Long	617f3ef951	locking/rwsem: Remove reader optimistic spinning Reader optimistic spinning is helpful when the reader critical section is short and there aren't that many readers around. It also improves the chance that a reader can get the lock as writer optimistic spinning disproportionately favors writers much more than readers. Since commit `d3681e269f` ("locking/rwsem: Wake up almost all readers in wait queue"), all the waiting readers are woken up so that they can all get the read lock and run in parallel. When the number of contending readers is large, allowing reader optimistic spinning will likely cause reader fragmentation where multiple smaller groups of readers can get the read lock in a sequential manner separated by writers. That reduces reader parallelism. One possible way to address that drawback is to limit the number of readers (preferably one) that can do optimistic spinning. These readers act as representatives of all the waiting readers in the wait queue as they will wake up all those waiting readers once they get the lock. Alternatively, as reader optimistic lock stealing has already enhanced fairness to readers, it may be easier to just remove reader optimistic spinning and simplify the optimistic spinning code as a result. 
Performance measurements (locking throughput in kops/s) were taken with a locking microbenchmark, using a 50/50 reader/writer distribution and turbo-boost disabled, on a 2-socket Cascade Lake system (48-core 96-thread) to see the impact of these changes:

1) Vanilla - 5.10-rc3 kernel
2) Before - 5.10-rc3 kernel with previous patches in this series
3) limit-rspin - 5.10-rc3 kernel with limited reader spinning patch
4) no-rspin - 5.10-rc3 kernel with reader spinning disabled

# of threads  CS Load  Vanilla  Before  limit-rspin  no-rspin
------------  -------  -------  ------  -----------  --------
           2        1    5,185   5,662        5,214     5,077
           4        1    5,107   4,983        5,188     4,760
           8        1    4,782   4,564        4,720     4,628
          16        1    4,680   4,053        4,567     3,402
          32        1    4,299   1,115        1,118     1,098
          64        1    3,218     983        1,001       957
          96        1    1,938     944          957       930
           2       20    2,008   2,128        2,264     1,665
           4       20    1,390   1,033        1,046     1,101
           8       20    1,472   1,155        1,098     1,213
          16       20    1,332   1,077        1,089     1,122
          32       20      967     914          917       980
          64       20      787     874          891       858
          96       20      730     836          847       844
           2      100      372     356          360       355
           4      100      492     425          434       392
           8      100      533     537          529       538
          16      100      548     572          568       598
          32      100      499     520          527       537
          64      100      466     517          526       512
          96      100      406     497          506       509

The column "CS Load" represents the number of pause instructions issued in the locking critical section. A CS load of 1 is extremely short and is not likely in real situations. Loads of 20 (moderate) and 100 (long) are more realistic. It can be seen that the previous patches in this series have reduced performance in general, except in highly contended cases with moderate or long critical sections, where performance improves a bit. This change is mostly caused by the "Prevent potential lock starvation" patch, which reduces reader optimistic spinning and hence reduces reader fragmentation. The patch that further limits reader optimistic spinning doesn't seem to have much impact on overall performance, as shown in the benchmark data. 
The patch that disables reader optimistic spinning shows reduced performance in lightly loaded cases, but comparable or slightly better performance with heavier contention. This patch just removes reader optimistic spinning for now. As readers are not going to do optimistic spinning anymore, we don't need to consider if the OSQ is empty or not when doing lock stealing. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Davidlohr Bueso <dbueso@suse.de> Link: https://lkml.kernel.org/r/20201121041416.12285-6-longman@redhat.com	2020-12-09 17:08:48 +01:00
Waiman Long	1a728dff85	locking/rwsem: Enable reader optimistic lock stealing If the optimistic spinning queue is empty and the rwsem does not have the handoff or write-lock bits set, it is actually not necessary to call rwsem_optimistic_spin() to spin on it. Instead, it can steal the lock directly as its reader bias is in the count already. If it is the first reader in this state, it will try to wake up other readers in the wait queue. With this patch applied, the following were the lock event counts after rebooting a 2-socket system and a "make -j96" kernel rebuild. rwsem_opt_rlock=4437 rwsem_rlock=29 rwsem_rlock_steal=19 So lock stealing represents about 0.4% of all the read locks acquired in the slow path. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Davidlohr Bueso <dbueso@suse.de> Link: https://lkml.kernel.org/r/20201121041416.12285-4-longman@redhat.com	2020-12-09 17:08:48 +01:00
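
The stealing condition described in 1a728dff85 can be sketched as a small userspace model. This is not the kernel code: the `MODEL_*` bit positions and all function names below are assumptions chosen for illustration, and the count word is a plain integer rather than an atomic. The sketch only shows the decision itself: steal when the spinning queue is empty and neither the write-lock nor the handoff bit is set, and wake queued readers only when the stealer is the first reader.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical layout of the count word; the real kernel layout may differ. */
#define MODEL_WRITER_LOCKED (1UL << 0)
#define MODEL_FLAG_WAITERS  (1UL << 1)
#define MODEL_FLAG_HANDOFF  (1UL << 2)
#define MODEL_READER_SHIFT  8
#define MODEL_READER_BIAS   (1UL << MODEL_READER_SHIFT)

/*
 * 'count' is the value observed right after the caller added its own
 * reader bias, so at least one reader is always accounted for.
 */
static bool model_reader_can_steal(unsigned long count, bool osq_empty)
{
    assert(count >= MODEL_READER_BIAS);

    /* Someone is already spinning: do not jump ahead of them. */
    if (!osq_empty)
        return false;

    /* A writer holds the lock, or the lock is being handed off. */
    return !(count & (MODEL_WRITER_LOCKED | MODEL_FLAG_HANDOFF));
}

/* The first reader in this state wakes up readers in the wait queue. */
static bool model_should_wake_waiters(unsigned long count)
{
    bool first_reader = (count >> MODEL_READER_SHIFT) == 1;

    return first_reader && (count & MODEL_FLAG_WAITERS);
}
```

In this model a reader that sees only its own bias plus the waiters flag both steals the lock and triggers a wakeup, while a second concurrent reader steals without waking anyone.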
Waiman Long	2f06f70292	locking/rwsem: Prevent potential lock starvation The lock handoff bit was added in commit `4f23dbc1e6` ("locking/rwsem: Implement lock handoff to prevent lock starvation") to avoid lock starvation. However, allowing readers to do optimistic spinning does introduce an unlikely scenario where lock starvation can happen. The lock handoff bit may only be set when a waiter is being woken up. In the case of reader unlock, wakeup happens only when the reader count reaches 0. If there is a continuous stream of incoming readers acquiring read lock via optimistic spinning, it is possible that the reader count may never reach 0 and so the handoff bit will never be asserted. One way to prevent this scenario from happening is to disallow optimistic spinning if the rwsem is currently owned by readers. If the previous or current owner is a writer, optimistic spinning will be allowed. If the previous owner is a reader but the reader count has reached 0 before, a wakeup should have been issued, so the handoff mechanism will kick in to prevent lock starvation. As a result, it should be OK to do optimistic spinning in this case. This patch may have some impact on reader performance as it reduces reader optimistic spinning, especially if the lock critical sections are short and the number of contending readers is small. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Davidlohr Bueso <dbueso@suse.de> Link: https://lkml.kernel.org/r/20201121041416.12285-3-longman@redhat.com	2020-12-09 17:08:48 +01:00
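
The spinning rule from 2f06f70292 can be modelled in a few lines. This is a minimal sketch under stated assumptions, not the kernel implementation: `struct model_rwsem`, `enum model_owner_kind` and `model_may_spin()` are hypothetical names, and the real code derives this state from the owner field and the atomic count rather than from separate struct members.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of who owns, or last owned, the lock. */
enum model_owner_kind { MODEL_OWNER_WRITER, MODEL_OWNER_READER };

struct model_rwsem {
    enum model_owner_kind owner; /* current or previous owner kind */
    unsigned long readers;       /* readers currently holding the lock */
};

/* Decide whether a new contender may spin optimistically. */
static bool model_may_spin(const struct model_rwsem *sem)
{
    assert(sem != NULL);

    /* Current or previous owner is a writer: spinning is allowed. */
    if (sem->owner == MODEL_OWNER_WRITER)
        return true;

    /*
     * Reader-owned: spinning would let a stream of readers keep the
     * count above zero forever, so only allow it once the count has
     * dropped to zero and a wakeup (with handoff) could have happened.
     */
    return sem->readers == 0;
}
```

The point of the rule is that a wakeup, and therefore the handoff bit, is only possible once the reader count reaches zero, so spinning is refused exactly in the state where it could postpone that moment indefinitely.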
Waiman Long	c8fe8b0564	locking/rwsem: Pass the current atomic count to rwsem_down_read_slowpath() The atomic count value right after reader count increment can be useful to determine the rwsem state at trylock time. So the count value is passed down to rwsem_down_read_slowpath() to be used when appropriate. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Davidlohr Bueso <dbueso@suse.de> Link: https://lkml.kernel.org/r/20201121041416.12285-2-longman@redhat.com	2020-12-09 17:08:47 +01:00
Peter Zijlstra	c995e638cc	locking/rwsem: Fold __down_{read,write}() There's a lot of needless duplication in __down_{read,write}(), cure that with a helper. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201207090243.GE3040@hirez.programming.kicks-ass.net	2020-12-09 17:08:47 +01:00
Peter Zijlstra	285c61aedf	locking/rwsem: Introduce rwsem_write_trylock() One copy of this logic is better than three. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201207090243.GE3040@hirez.programming.kicks-ass.net	2020-12-09 17:08:47 +01:00
Peter Zijlstra	3379116a0c	locking/rwsem: Better collate rwsem_read_trylock() All users of rwsem_read_trylock() do rwsem_set_reader_owned(sem) on success, move it into rwsem_read_trylock() proper. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201207090243.GE3040@hirez.programming.kicks-ass.net	2020-12-09 17:08:47 +01:00
Peter Zijlstra	2b3c99ee63	Merge branch 'locking/rwsem'	2020-12-09 17:08:45 +01:00
Eric W. Biederman	31784cff7e	rwsem: Implement down_read_interruptible In preparation for converting exec_update_mutex to a rwsem so that multiple readers can execute in parallel and not deadlock, add down_read_interruptible. This is needed for perf_event_open to be converted (with no semantic changes) from working on a mutex to working on a rwsem. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/87k0tybqfy.fsf@x220.int.ebiederm.org	2020-12-09 17:08:42 +01:00
Eric W. Biederman	0f9368b5bf	rwsem: Implement down_read_killable_nested In preparation for converting exec_update_mutex to a rwsem so that multiple readers can execute in parallel and not deadlock, add down_read_killable_nested. This is needed so that kcmp_lock can be converted from working on mutexes to working on rw_semaphores. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/87o8jabqh3.fsf@x220.int.ebiederm.org	2020-12-09 17:08:41 +01:00
Arnd Bergmann	2efc35dc43	Samsung mach/soc changes for v5.11 1. Do not use of_machine_is_compatible() in early CPU hotplug core. Full device tree walk causes "suspicious RCU usage" warnings. 2. Clear prefetch bits in default l2c_aux_val of L310 L2C - they are not needed. 3. Extend cpuidle support to P4 Note boards (Exynos4412). Merge tag 'samsung-soc-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into arm/soc * tag 'samsung-soc-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux: ARM: exynos: extend cpuidle support to P4 Note boards ARM: exynos: clear prefetch bits in default l2c_aux_val ARM: exynos: Simplify code in Exynos3250 CPU core restart path Link: https://lore.kernel.org/r/20201201204404.22675-4-krzk@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2020-12-09 16:58:48 +01:00
Daniel Borkmann	08c6a2f620	Merge branch 'bpf-xsk-selftests' Weqaar Janjua says: ==================== This patch set adds AF_XDP selftests based on veth to selftests/bpf.

# Topology:
# ---------
#            -----------
#          _ | Process | _
#         /  -----------  \
#        /        |        \
#       /         |         \
#  -----------    |    -----------
#  | Thread1 |    |    | Thread2 |
#  -----------    |    -----------
#       |         |         |
#  -----------    |    -----------
#  |  xskX   |    |    |  xskY   |
#  -----------    |    -----------
#       |         |         |
#  -----------    |      ----------
#  |  vethX  | --------- |  vethY |
#  -----------   peer    ----------
#       |         |         |
#  namespaceX     |    namespaceY

These selftests test AF_XDP SKB and Native/DRV modes using veth Virtual Ethernet interfaces. The test program contains two threads, each thread is single socket with a unique UMEM. It validates in-order packet delivery and packet content by sending packets to each other.

Prerequisites setup by script test_xsk.sh:

Set up veth interfaces as per the topology shown ^^:
* setup two veth interfaces and one namespace
   veth<xxxx> in root namespace
   veth<yyyy> in af_xdp<xxxx> namespace
   ** namespace af_xdp<xxxx>
* create a spec file veth.spec that includes this run-time configuration
   *** xxxx and yyyy are randomly generated 4 digit numbers used to avoid conflict with any existing interface

Adds xsk framework test to validate veth xdp DRV and SKB modes.

The following tests are provided:

1. AF_XDP SKB mode
   Generic mode XDP is driver independent, used when the driver does not have support for XDP. Works on any netdevice using sockets and generic XDP path. XDP hook from netif_receive_skb().
   a. nopoll - soft-irq processing
   b. poll - using poll() syscall
   c. Socket Teardown
      Create a Tx and a Rx socket, Tx from one socket, Rx on another. Destroy both sockets, then repeat multiple times. Only nopoll mode is used
   d. Bi-directional Sockets
      Configure sockets as bi-directional tx/rx sockets, sets up fill and completion rings on each socket, tx/rx in both directions. Only nopoll mode is used

2. AF_XDP DRV/Native mode
   Works on any netdevice with XDP_REDIRECT support, driver dependent. Processes packets before SKB allocation. Provides better performance than SKB. Driver hook available just after DMA of buffer descriptor.
   a. nopoll
   b. poll
   c. Socket Teardown
   d. Bi-directional Sockets
   * Only copy mode is supported because veth does not currently support zero-copy mode

Total tests: 8

Flow:
* Single process spawns two threads: Tx and Rx
* Each of these two threads attach to a veth interface within their assigned namespaces
* Each thread creates one AF_XDP socket connected to a unique umem for each veth interface
* Tx thread transmits 10k packets from veth<xxxx> to veth<yyyy>
* Rx thread verifies if all 10k packets were received and delivered in-order, and have the right content

v2 changes:
* Move selftests/xsk to selftests/bpf
* Remove Makefiles under selftests/xsk, and utilize selftests/bpf/Makefile

v3 changes:
* merge all test scripts test_xsk_.sh into test_xsk.sh

v4 changes:
* merge xsk_env.sh into xsk_prereqs.sh
* test_xsk.sh add cliarg -c for color-coded output
* test_xsk.sh PREREQUISITES disables IPv6 on veth interfaces
* test_xsk.sh PREREQUISITES adds xsk framework test
* test_xsk.sh is independently executable
* xdpxceiver.c Tx/Rx validates only IPv4 packets with TOS 0x9, ignores others

==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2020-12-09 16:44:50 +01:00
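
The Rx-side check in the flow above (all packets received, in order, with the right content) can be illustrated with a short sketch. This is not taken from xdpxceiver.c: `struct model_pkt`, the sequence-number field, and the payload pattern in `model_expected_payload()` are all assumptions made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet record: a sequence number plus one payload byte. */
struct model_pkt {
    uint32_t seq;
    uint8_t payload;
};

/* Assumed content rule: the payload is the low byte of the sequence number. */
static uint8_t model_expected_payload(uint32_t seq)
{
    return (uint8_t)(seq & 0xff);
}

/*
 * Return true only if exactly 'expected' packets arrived, their sequence
 * numbers run 0..expected-1 in arrival order, and each payload matches.
 */
static bool model_rx_validate(const struct model_pkt *pkts, size_t n,
                              size_t expected)
{
    assert(pkts != NULL || n == 0);

    if (n != expected)
        return false;

    for (size_t i = 0; i < n; i++) {
        if (pkts[i].seq != (uint32_t)i)
            return false; /* lost, duplicated, or reordered packet */
        if (pkts[i].payload != model_expected_payload(pkts[i].seq))
            return false; /* corrupted content */
    }
    return true;
}
```

Checking the count first keeps the loop simple: once the totals match, any gap or swap necessarily shows up as a sequence number that differs from its arrival index.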
Weqaar Janjua	7d20441eb0	selftests/bpf: Xsk selftests - Bi-directional Sockets - SKB, DRV Adds following tests: 1. AF_XDP SKB mode d. Bi-directional Sockets Configure sockets as bi-directional tx/rx sockets, sets up fill and completion rings on each socket, tx/rx in both directions. Only nopoll mode is used 2. AF_XDP DRV/Native mode d. Bi-directional Sockets * Only copy mode is supported because veth does not currently support zero-copy mode Signed-off-by: Weqaar Janjua <weqaar.a.janjua@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Yonghong Song <yhs@fb.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/20201207215333.11586-6-weqaar.a.janjua@intel.com	2020-12-09 16:44:45 +01:00

982974 Commits