linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-24 05:02:12 +00:00

Author	SHA1	Message	Date
Ido Schimmel	76c03bf8e2	nexthop: Do not flush blackhole nexthops when loopback goes down As far as user space is concerned, blackhole nexthops do not have a nexthop device and therefore should not be affected by the administrative or carrier state of any netdev. However, when the loopback netdev goes down all the blackhole nexthops are flushed. This happens because internally the kernel associates blackhole nexthops with the loopback netdev. This behavior is both confusing to those not familiar with kernel internals and also diverges from the legacy API where blackhole IPv4 routes are not flushed when the loopback netdev goes down: # ip route add blackhole 198.51.100.0/24 # ip link set dev lo down # ip route show 198.51.100.0/24 blackhole 198.51.100.0/24 Blackhole IPv6 routes are flushed, but at least user space knows that they are associated with the loopback netdev: # ip -6 route show 2001:db8:1::/64 blackhole 2001:db8:1::/64 dev lo metric 1024 pref medium Fix this by only flushing blackhole nexthops when the loopback netdev is unregistered. Fixes: `ab84be7e54` ("net: Initial nexthop code") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reported-by: Donald Sharp <sharpd@nvidia.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-04 14:04:49 -08:00
Drew Fustini	d93ef30164	net: sctp: trivial: fix typo in comment Fix typo of 'overflow' for comment in sctp_tsnmap_check(). Reported-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Drew Fustini <drew@beagleboard.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-04 13:48:32 -08:00
David S. Miller	e216674a5b	Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-03-03 This series contains updates to ixgbe and ixgbevf drivers. Bartosz Golaszewski does not error on -ENODEV from ixgbe_mii_bus_init() as this is valid for some devices with a shared bus for ixgbe. Antony Antony adds a check to fail for non transport mode SA with offload as this is not supported for ixgbe and ixgbevf. Dinghao Liu fixes a memory leak on failure to program a perfect filter for ixgbe. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-04 13:47:42 -08:00
Dinghao Liu	7a76638163	ixgbe: Fix memleak in ixgbe_configure_clsu32 When ixgbe_fdir_write_perfect_filter_82599() fails, input allocated by kzalloc() has not been freed, which leads to memleak. Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-04 11:05:06 -08:00
Antony Antony	d785e1fec6	ixgbe: fail to create xfrm offload of IPsec tunnel mode SA Based on talks and indirect references ixgbe IPsec offlod do not support IPsec tunnel mode offload. It can only support IPsec transport mode offload. Now explicitly fail when creating non transport mode SA with offload to avoid false performance expectations. Fixes: `63a67fe229` ("ixgbe: add ipsec offload add and remove SA") Signed-off-by: Antony Antony <antony@phenome.org> Acked-by: Shannon Nelson <snelson@pensando.io> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-03-04 11:05:05 -08:00
zhang kai	a9ecb0cbf0	rtnetlink: using dev_base_seq from target net Signed-off-by: zhang kai <zhangkaiheb@126.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-03 16:59:17 -08:00
Jisheng Zhang	d65614a01d	net: 9p: advance iov on empty read I met below warning when cating a small size(about 80bytes) txt file on 9pfs(msize=2097152 is passed to 9p mount option), the reason is we miss iov_iter_advance() if the read count is 0 for zerocopy case, so we didn't truncate the pipe, then iov_iter_pipe() thinks the pipe is full. Fix it by removing the exception for 0 to ensure to call iov_iter_advance() even on empty read for zerocopy case. [ 8.279568] WARNING: CPU: 0 PID: 39 at lib/iov_iter.c:1203 iov_iter_pipe+0x31/0x40 [ 8.280028] Modules linked in: [ 8.280561] CPU: 0 PID: 39 Comm: cat Not tainted 5.11.0+ #6 [ 8.281260] RIP: 0010:iov_iter_pipe+0x31/0x40 [ 8.281974] Code: 2b 42 54 39 42 5c 76 22 c7 07 20 00 00 00 48 89 57 18 8b 42 50 48 c7 47 08 b [ 8.283169] RSP: 0018:ffff888000cbbd80 EFLAGS: 00000246 [ 8.283512] RAX: 0000000000000010 RBX: ffff888000117d00 RCX: 0000000000000000 [ 8.283876] RDX: ffff88800031d600 RSI: 0000000000000000 RDI: ffff888000cbbd90 [ 8.284244] RBP: ffff888000cbbe38 R08: 0000000000000000 R09: ffff8880008d2058 [ 8.284605] R10: 0000000000000002 R11: ffff888000375510 R12: 0000000000000050 [ 8.284964] R13: ffff888000cbbe80 R14: 0000000000000050 R15: ffff88800031d600 [ 8.285439] FS: 00007f24fd8af600(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000 [ 8.285844] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8.286150] CR2: 00007f24fd7d7b90 CR3: 0000000000c97000 CR4: 00000000000406b0 [ 8.286710] Call Trace: [ 8.288279] generic_file_splice_read+0x31/0x1a0 [ 8.289273] ? do_splice_to+0x2f/0x90 [ 8.289511] splice_direct_to_actor+0xcc/0x220 [ 8.289788] ? pipe_to_sendpage+0xa0/0xa0 [ 8.290052] do_splice_direct+0x8b/0xd0 [ 8.290314] do_sendfile+0x1ad/0x470 [ 8.290576] do_syscall_64+0x2d/0x40 [ 8.290818] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 8.291409] RIP: 0033:0x7f24fd7dca0a [ 8.292511] Code: c3 0f 1f 80 00 00 00 00 4c 89 d2 4c 89 c6 e9 bd fd ff ff 0f 1f 44 00 00 31 8 [ 8.293360] RSP: 002b:00007ffc20932818 EFLAGS: 00000206 ORIG_RAX: 0000000000000028 [ 8.293800] RAX: ffffffffffffffda RBX: 0000000001000000 RCX: 00007f24fd7dca0a [ 8.294153] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 0000000000000001 [ 8.294504] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000 [ 8.294867] R10: 0000000001000000 R11: 0000000000000206 R12: 0000000000000003 [ 8.295217] R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000000 [ 8.295782] ---[ end trace 63317af81b3ca24b ]--- Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-03 16:57:59 -08:00
Hayes Wang	4b5dc1a94d	Revert "r8152: adjust the settings about MAC clock speed down for RTL8153" This reverts commit `134f98bcf1`. The r8153_mac_clk_spd() is used for RTL8153A only, because the register table of RTL8153B is different from RTL8153A. However, this function would be called when RTL8153B calls r8153_first_init() and r8153_enter_oob(). That causes RTL8153B becomes unstable when suspending and resuming. The worst case may let the device stop working. Besides, revert this commit to disable MAC clock speed down for RTL8153A. It would avoid the known issue when enabling U1. The data of the first control transfer may be wrong when exiting U1. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-03 16:56:03 -08:00
Matthias Schiffer	3e59e88567	net: l2tp: reduce log level of messages in receive path, add counter instead Commit `5ee759cda5` ("l2tp: use standard API for warning log messages") changed a number of warnings about invalid packets in the receive path so that they are always shown, instead of only when a special L2TP debug flag is set. Even with rate limiting these warnings can easily cause significant log spam - potentially triggered by a malicious party sending invalid packets on purpose. In addition these warnings were noticed by projects like Tunneldigger [1], which uses L2TP for its data path, but implements its own control protocol (which is sufficiently different from L2TP data packets that it would always be passed up to userspace even with future extensions of L2TP). Some of the warnings were already redundant, as l2tp_stats has a counter for these packets. This commit adds one additional counter for invalid packets that are passed up to userspace. Packets with unknown session are not counted as invalid, as there is nothing wrong with the format of these packets. With the additional counter, all of these messages are either redundant or benign, so we reduce them to pr_debug_ratelimited(). [1] https://github.com/wlanslovenija/tunneldigger/issues/160 Fixes: `5ee759cda5` ("l2tp: use standard API for warning log messages") Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-03 16:55:02 -08:00
Atish Patra	b12422362c	net: macb: Add default usrio config to default gem config There is no usrio config defined for default gem config leading to a kernel panic devices that don't define a data. This issue can be reprdouced with microchip polar fire soc where compatible string is defined as "cdns,macb". Fixes: `edac63861d` ("add userio bits as platform configuration") Signed-off-by: Atish Patra <atish.patra@wdc.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-03 16:53:45 -08:00
David S. Miller	ef9a6df09c	wireless-drivers fixes for v5.12 Second set of fixes for v5.12. Only three iwlwifi fixes this time, the crash with MVM being the most important one and reported by multiple people. iwlwifi * fix kernel crash regression when using LTO with MVM devices * fix printk format warnings * fix potential deadlock found by lockdep -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJgP8G0AAoJEG4XJFUm622b1ZoH/1wSyEwQB90iYlZO4w1pR2yD HHHJjQQE8JCkOOoMNTHLkCfFWc76c/e6BL3+U9DsbvJGuPccUAIiiUQUonrEMph/ QmNBDd7OB/yIYpmkIpciKpaKAg2Vg4qF2owq8xRDpnD5NosTfCUiacvkuMRB4Wzl NuGBEECNw6Dq/l/vYVe2pgTQ+rYxCGrvpU7GsYAa5vOsSxzS+4RhMc4gFV5Ae9rK RNWT4StB7wi8sLjhdEYR2hldyS2OSPnrFNhfJggHw6d/4aXPlmBe7tnd3D1P3kwF +9FlqjA5U5bcT+3DC2ucFlGtwrfumxi+ro5j8Va19O2/eTLturCIK26A2C+BtTQ= =cOCt -----END PGP SIGNATURE----- Merge tag 'wireless-drivers-2021-03-03' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for v5.12 Second set of fixes for v5.12. Only three iwlwifi fixes this time, the crash with MVM being the most important one and reported by multiple people. iwlwifi * fix kernel crash regression when using LTO with MVM devices * fix printk format warnings * fix potential deadlock found by lockdep ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-03 16:35:24 -08:00
Jakub Kicinski	dbbe7c962c	docs: networking: drop special stable handling Leave it to Greg. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-03 08:49:08 -08:00
Ong Boon Leong	879c348c35	net: stmmac: fix incorrect DMA channel intr enable setting of EQoS v4.10 We introduce dwmac410_dma_init_channel() here for both EQoS v4.10 and above which use different DMA_CH(n)_Interrupt_Enable bit definitions for NIE and AIE. Fixes: `48863ce594` ("stmmac: add DMA support for GMAC 4.xx") Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: Ramesh Babu B <ramesh.babu.b@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-03 08:48:12 -08:00
Michal Suchanek	6881b07fdd	ibmvnic: Fix possibly uninitialized old_num_tx_queues variable warning. GCC 7.5 reports: ../drivers/net/ethernet/ibm/ibmvnic.c: In function 'ibmvnic_reset_init': ../drivers/net/ethernet/ibm/ibmvnic.c:5373:51: warning: 'old_num_tx_queues' may be used uninitialized in this function [-Wmaybe-uninitialized] ../drivers/net/ethernet/ibm/ibmvnic.c:5373:6: warning: 'old_num_rx_queues' may be used uninitialized in this function [-Wmaybe-uninitialized] The variable is initialized only if(reset) and used only if(reset && something) so this is a false positive. However, there is no reason to not initialize the variables unconditionally avoiding the warning. Fixes: `635e442f4a` ("ibmvnic: merge ibmvnic_reset_init and ibmvnic_init") Signed-off-by: Michal Suchanek <msuchanek@suse.de> Reviewed-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-03 08:46:24 -08:00
Dan Carpenter	2378b2c9ec	octeontx2-af: cn10k: fix an array overflow in is_lmac_valid() The value of "lmac_id" can be controlled by the user and if it is larger then the number of bits in long then it reads outside the bitmap. The highest valid value is less than MAX_LMAC_PER_CGX (4). Fixes: `91c6945ea1` ("octeontx2-af: cn10k: Add RPM MAC support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-03 08:44:24 -08:00
Jiri Kosina	295d4cd82b	iwlwifi: don't call netif_napi_add() with rxq->lock held (was Re: Lockdep warning in iwl_pcie_rx_handle()) We can't call netif_napi_add() with rxq-lock held, as there is a potential for deadlock as spotted by lockdep (see below). rxq->lock is not protecting anything over the netif_napi_add() codepath anyway, so let's drop it just before calling into NAPI. ======================================================== WARNING: possible irq lock inversion dependency detected 5.12.0-rc1-00002-gbada49429032 #5 Not tainted -------------------------------------------------------- irq/136-iwlwifi/565 just changed the state of lock: ffff89f28433b0b0 (&rxq->lock){+.-.}-{2:2}, at: iwl_pcie_rx_handle+0x7f/0x960 [iwlwifi] but this lock took another, SOFTIRQ-unsafe lock in the past: (napi_hash_lock){+.+.}-{2:2} and interrupts could create inverse lock ordering between them. other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(napi_hash_lock); local_irq_disable(); lock(&rxq->lock); lock(napi_hash_lock); <Interrupt> lock(&rxq->lock); * DEADLOCK * 1 lock held by irq/136-iwlwifi/565: #0: ffff89f2b1440170 (sync_cmd_lockdep_map){+.+.}-{0:0}, at: iwl_pcie_irq_handler+0x5/0xb30 the shortest dependencies between 2nd lock and 1st lock: -> (napi_hash_lock){+.+.}-{2:2} { HARDIRQ-ON-W at: lock_acquire+0x277/0x3d0 _raw_spin_lock+0x2c/0x40 netif_napi_add+0x14b/0x270 e1000_probe+0x2fe/0xee0 [e1000e] local_pci_probe+0x42/0x90 pci_device_probe+0x10b/0x1c0 really_probe+0xef/0x4b0 driver_probe_device+0xde/0x150 device_driver_attach+0x4f/0x60 __driver_attach+0x9c/0x140 bus_for_each_dev+0x79/0xc0 bus_add_driver+0x18d/0x220 driver_register+0x5b/0xf0 do_one_initcall+0x5b/0x300 do_init_module+0x5b/0x21c load_module+0x1dae/0x22c0 __do_sys_finit_module+0xad/0x110 do_syscall_64+0x33/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae SOFTIRQ-ON-W at: lock_acquire+0x277/0x3d0 _raw_spin_lock+0x2c/0x40 netif_napi_add+0x14b/0x270 e1000_probe+0x2fe/0xee0 [e1000e] local_pci_probe+0x42/0x90 pci_device_probe+0x10b/0x1c0 really_probe+0xef/0x4b0 driver_probe_device+0xde/0x150 device_driver_attach+0x4f/0x60 __driver_attach+0x9c/0x140 bus_for_each_dev+0x79/0xc0 bus_add_driver+0x18d/0x220 driver_register+0x5b/0xf0 do_one_initcall+0x5b/0x300 do_init_module+0x5b/0x21c load_module+0x1dae/0x22c0 __do_sys_finit_module+0xad/0x110 do_syscall_64+0x33/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae INITIAL USE at: lock_acquire+0x277/0x3d0 _raw_spin_lock+0x2c/0x40 netif_napi_add+0x14b/0x270 e1000_probe+0x2fe/0xee0 [e1000e] local_pci_probe+0x42/0x90 pci_device_probe+0x10b/0x1c0 really_probe+0xef/0x4b0 driver_probe_device+0xde/0x150 device_driver_attach+0x4f/0x60 __driver_attach+0x9c/0x140 bus_for_each_dev+0x79/0xc0 bus_add_driver+0x18d/0x220 driver_register+0x5b/0xf0 do_one_initcall+0x5b/0x300 do_init_module+0x5b/0x21c load_module+0x1dae/0x22c0 __do_sys_finit_module+0xad/0x110 do_syscall_64+0x33/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae } ... key at: [<ffffffffae84ef38>] napi_hash_lock+0x18/0x40 ... acquired at: _raw_spin_lock+0x2c/0x40 netif_napi_add+0x14b/0x270 _iwl_pcie_rx_init+0x1f4/0x710 [iwlwifi] iwl_pcie_rx_init+0x1b/0x3b0 [iwlwifi] iwl_trans_pcie_start_fw+0x2ac/0x6a0 [iwlwifi] iwl_mvm_load_ucode_wait_alive+0x116/0x460 [iwlmvm] iwl_run_init_mvm_ucode+0xa4/0x3a0 [iwlmvm] iwl_op_mode_mvm_start+0x9ed/0xbf0 [iwlmvm] _iwl_op_mode_start.isra.4+0x42/0x80 [iwlwifi] iwl_opmode_register+0x71/0xe0 [iwlwifi] iwl_mvm_init+0x34/0x1000 [iwlmvm] do_one_initcall+0x5b/0x300 do_init_module+0x5b/0x21c load_module+0x1dae/0x22c0 __do_sys_finit_module+0xad/0x110 do_syscall_64+0x33/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae [ ... lockdep output trimmed .... ] Fixes: `25edc8f259` ("iwlwifi: pcie: properly implement NAPI") Signed-off-by: Jiri Kosina <jkosina@suse.cz> Acked-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/nycvar.YFH.7.76.2103021134060.12405@cbobk.fhfr.pm	2021-03-03 17:59:16 +02:00
Pierre-Louis Bossart	436b265671	iwlwifi: fix ARCH=i386 compilation warnings An unsigned long variable should rely on '%lu' format strings, not '%zd' Fixes: `a1a6a4cf49` ("iwlwifi: pnvm: implement reading PNVM from UEFI") Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210302011640.1276636-1-pierre-louis.bossart@linux.intel.com	2021-03-03 17:57:33 +02:00
Wei Yongjun	a22549f127	iwlwifi: mvm: add terminate entry for dmi_system_id tables Make sure dmi_system_id tables are NULL terminated. This crashed when LTO was enabled: BUG: KASAN: global-out-of-bounds in dmi_check_system+0x5a/0x70 Read of size 1 at addr ffffffffc16af750 by task NetworkManager/1913 CPU: 4 PID: 1913 Comm: NetworkManager Not tainted 5.12.0-rc1+ #10057 Hardware name: LENOVO 20THCTO1WW/20THCTO1WW, BIOS N2VET27W (1.12 ) 12/21/2020 Call Trace: dump_stack+0x90/0xbe print_address_description.constprop.0+0x1d/0x140 ? dmi_check_system+0x5a/0x70 ? dmi_check_system+0x5a/0x70 kasan_report.cold+0x7b/0xd4 ? dmi_check_system+0x5a/0x70 __asan_load1+0x4d/0x50 dmi_check_system+0x5a/0x70 iwl_mvm_up+0x1360/0x1690 [iwlmvm] ? iwl_mvm_send_recovery_cmd+0x270/0x270 [iwlmvm] ? setup_object.isra.0+0x27/0xd0 ? kasan_poison+0x20/0x50 ? ___slab_alloc.constprop.0+0x483/0x5b0 ? mempool_kmalloc+0x17/0x20 ? ftrace_graph_ret_addr+0x2a/0xb0 ? kasan_poison+0x3c/0x50 ? cfg80211_iftype_allowed+0x2e/0x90 [cfg80211] ? __kasan_check_write+0x14/0x20 ? mutex_lock+0x86/0xe0 ? __mutex_lock_slowpath+0x20/0x20 __iwl_mvm_mac_start+0x49/0x290 [iwlmvm] iwl_mvm_mac_start+0x37/0x50 [iwlmvm] drv_start+0x73/0x1b0 [mac80211] ieee80211_do_open+0x53e/0xf10 [mac80211] ? ieee80211_check_concurrent_iface+0x266/0x2e0 [mac80211] ieee80211_open+0xb9/0x100 [mac80211] __dev_open+0x1b8/0x280 Fixes: `a2ac0f48a0` ("iwlwifi: mvm: implement approved list for the PPAG feature") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Victor Michel <vic.michel.web@gmail.com> Acked-by: Luca Coelho <luciano.coelho@intel.com> [kvalo@codeaurora.org: improve commit log] Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210223140039.1708534-1-weiyongjun1@huawei.com	2021-03-03 17:56:11 +02:00
Biao Huang	95b39f07a1	net: ethernet: mtk-star-emac: fix wrong unmap in RX handling mtk_star_dma_unmap_rx() should unmap the dma_addr of old skb rather than that of new skb. Assign new_dma_addr to desc_data.dma_addr after all handling of old skb ends to avoid unexpected receive side error. Fixes: `f96e9641e9` ("net: ethernet: mtk-star-emac: fix error path in RX handling") Signed-off-by: Biao Huang <biao.huang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-02 15:28:07 -08:00
Wong Vee Khee	fa706dce2f	stmmac: intel: Fix mdio bus registration issue for TGL-H/ADL-S On Intel platforms which consist of two Ethernet Controllers such as TGL-H and ADL-S, a unique MDIO bus id is required for MDIO bus to be successful registered: [ 13.076133] sysfs: cannot create duplicate filename '/class/mdio_bus/stmmac-1' [ 13.083404] CPU: 8 PID: 1898 Comm: systemd-udevd Tainted: G U 5.11.0-net-next #106 [ 13.092410] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-S ADP-S DRR4 CRB, BIOS ADLIFSI1.R00.1494.B00.2012031421 12/03/2020 [ 13.105709] Call Trace: [ 13.108176] dump_stack+0x64/0x7c [ 13.111553] sysfs_warn_dup+0x56/0x70 [ 13.115273] sysfs_do_create_link_sd.isra.2+0xbd/0xd0 [ 13.120371] device_add+0x4df/0x840 [ 13.123917] ? complete_all+0x2a/0x40 [ 13.127636] __mdiobus_register+0x98/0x310 [libphy] [ 13.132572] stmmac_mdio_register+0x1c5/0x3f0 [stmmac] [ 13.137771] ? stmmac_napi_add+0xa5/0xf0 [stmmac] [ 13.142493] stmmac_dvr_probe+0x806/0xee0 [stmmac] [ 13.147341] intel_eth_pci_probe+0x1cb/0x250 [dwmac_intel] [ 13.152884] pci_device_probe+0xd2/0x150 [ 13.156897] really_probe+0xf7/0x4d0 [ 13.160527] driver_probe_device+0x5d/0x140 [ 13.164761] device_driver_attach+0x4f/0x60 [ 13.168996] __driver_attach+0xa2/0x140 [ 13.172891] ? device_driver_attach+0x60/0x60 [ 13.177300] bus_for_each_dev+0x76/0xc0 [ 13.181188] bus_add_driver+0x189/0x230 [ 13.185083] ? 0xffffffffc0795000 [ 13.188446] driver_register+0x5b/0xf0 [ 13.192249] ? 0xffffffffc0795000 [ 13.195577] do_one_initcall+0x4d/0x210 [ 13.199467] ? kmem_cache_alloc_trace+0x2ff/0x490 [ 13.204228] do_init_module+0x5b/0x21c [ 13.208031] load_module+0x2a0c/0x2de0 [ 13.211838] ? __do_sys_finit_module+0xb1/0x110 [ 13.216420] __do_sys_finit_module+0xb1/0x110 [ 13.220825] do_syscall_64+0x33/0x40 [ 13.224451] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 13.229515] RIP: 0033:0x7fc2b1919ccd [ 13.233113] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 93 31 0c 00 f7 d8 64 89 01 48 [ 13.251912] RSP: 002b:00007ffcea2e5b98 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 13.259527] RAX: ffffffffffffffda RBX: 0000560558920f10 RCX: 00007fc2b1919ccd [ 13.266706] RDX: 0000000000000000 RSI: 00007fc2b1a881e3 RDI: 0000000000000012 [ 13.273887] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000000 [ 13.281036] R10: 0000000000000012 R11: 0000000000000246 R12: 00007fc2b1a881e3 [ 13.288183] R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffcea2e5d58 [ 13.295389] libphy: mii_bus stmmac-1 failed to register Fixes: `88af9bd4ef` ("stmmac: intel: Add ADL-S 1Gbps PCI IDs") Fixes: `8450e23f14` ("stmmac: intel: Add PCI IDs for TGL-H platform") Signed-off-by: Wong Vee Khee <vee.khee.wong@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-02 15:27:14 -08:00
Eric Dumazet	8811f4a983	tcp: add sanity tests to TCP_QUEUE_SEQ Qingyu Li reported a syzkaller bug where the repro changes RCV SEQ _after_ restoring data in the receive queue. mprotect(0x4aa000, 12288, PROT_READ) = 0 mmap(0x1ffff000, 4096, PROT_NONE, MAP_PRIVATE\|MAP_FIXED\|MAP_ANONYMOUS, -1, 0) = 0x1ffff000 mmap(0x20000000, 16777216, PROT_READ\|PROT_WRITE\|PROT_EXEC, MAP_PRIVATE\|MAP_FIXED\|MAP_ANONYMOUS, -1, 0) = 0x20000000 mmap(0x21000000, 4096, PROT_NONE, MAP_PRIVATE\|MAP_FIXED\|MAP_ANONYMOUS, -1, 0) = 0x21000000 socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 3 setsockopt(3, SOL_TCP, TCP_REPAIR, [1], 4) = 0 connect(3, {sa_family=AF_INET6, sin6_port=htons(0), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 0 setsockopt(3, SOL_TCP, TCP_REPAIR_QUEUE, [1], 4) = 0 sendmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="0x0000000000000003\0\0", iov_len=20}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 20 setsockopt(3, SOL_TCP, TCP_REPAIR, [0], 4) = 0 setsockopt(3, SOL_TCP, TCP_QUEUE_SEQ, [128], 4) = 0 recvfrom(3, NULL, 20, 0, NULL, NULL) = -1 ECONNRESET (Connection reset by peer) syslog shows: [ 111.205099] TCP recvmsg seq # bug 2: copied 80, seq 0, rcvnxt 80, fl 0 [ 111.207894] WARNING: CPU: 1 PID: 356 at net/ipv4/tcp.c:2343 tcp_recvmsg_locked+0x90e/0x29a0 This should not be allowed. TCP_QUEUE_SEQ should only be used when queues are empty. This patch fixes this case, and the tx path as well. Fixes: `ee9952831c` ("tcp: Initial repair mode") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Pavel Emelyanov <xemul@parallels.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=212005 Reported-by: Qingyu Li <ieatmuttonchuan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 15:32:05 -08:00
Andrea Parri (Microsoft)	3946688edb	hv_netvsc: Fix validation in netvsc_linkstatus_callback() Contrary to the RNDIS protocol specification, certain (pre-Fe) implementations of Hyper-V's vSwitch did not account for the status buffer field in the length of an RNDIS packet; the bug was fixed in newer implementations. Validate the status buffer fields using the length of the 'vmtransfer_page' packet (all implementations), that is known/validated to be less than or equal to the receive section size and not smaller than the length of the RNDIS message. Reported-by: Dexuan Cui <decui@microsoft.com> Suggested-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Fixes: `505e3f00c3` ("hv_netvsc: Add (more) validation for untrusted Hyper-V values") Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 15:30:52 -08:00
DENG Qingfang	9200f515c4	net: dsa: tag_mtk: fix 802.1ad VLAN egress A different TPID bit is used for 802.1ad VLAN frames. Reported-by: Ilario Gelmetti <iochesonome@gmail.com> Fixes: `f0af34317f` ("net: dsa: mediatek: combine MediaTek tag with VLAN tag") Signed-off-by: DENG Qingfang <dqfext@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 15:29:43 -08:00
Willem de Bruijn	b228c9b058	net: expand textsearch ts_state to fit skb_seq_state The referenced commit expands the skb_seq_state used by skb_find_text with a 4B frag_off field, growing it to 48B. This exceeds container ts_state->cb, causing a stack corruption: [ 73.238353] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: skb_find_text+0xc5/0xd0 [ 73.247384] CPU: 1 PID: 376 Comm: nping Not tainted 5.11.0+ #4 [ 73.252613] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 [ 73.260078] Call Trace: [ 73.264677] dump_stack+0x57/0x6a [ 73.267866] panic+0xf6/0x2b7 [ 73.270578] ? skb_find_text+0xc5/0xd0 [ 73.273964] __stack_chk_fail+0x10/0x10 [ 73.277491] skb_find_text+0xc5/0xd0 [ 73.280727] string_mt+0x1f/0x30 [ 73.283639] ipt_do_table+0x214/0x410 The struct is passed between skb_find_text and its callbacks skb_prepare_seq_read, skb_seq_read and skb_abort_seq read through the textsearch interface using TS_SKB_CB. I assumed that this mapped to skb->cb like other .._SKB_CB wrappers. skb->cb is 48B. But it maps to ts_state->cb, which is only 40B. skb->cb was increased from 40B to 48B after ts_state was introduced, in commit `3e3850e989` ("[NETFILTER]: Fix xfrm lookup in ip_route_me_harder/ip6_route_me_harder"). Increase ts_state.cb[] to 48 to fit the struct. Also add a BUILD_BUG_ON to avoid a repeat. The alternative is to directly add a dependency from textsearch onto linux/skbuff.h, but I think the intent is textsearch to have no such dependencies on its callers. Link: https://bugzilla.kernel.org/show_bug.cgi?id=211911 Fixes: `97550f6fa5` ("net: compound page support in skb_seq_read") Reported-by: Kris Karas <bugs-a17@moonlit-rail.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 15:25:24 -08:00
Masanari Iida	2353db75c3	docs: networking: bonding.rst Fix a typo in bonding.rst This patch fixes a spelling typo in bonding.rst. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:53:21 -08:00
David S. Miller	2eb4898255	linux-can-fixes-for-5.12-20210301 -----BEGIN PGP SIGNATURE----- iQFHBAABCgAxFiEEK3kIWJt9yTYMP3ehqclaivrt76kFAmA8xfATHG1rbEBwZW5n dXRyb25peC5kZQAKCRCpyVqK+u3vqZzcB/9LC7BTXmD2S5YMUvbH8Dy1XQ1J/Ss5 VPiT+6eNCxFhl9YzsWXPhhvNg7WYLa5SbWid/yht7UAxgiibtB7COaePeuzTSmtQ OaTp1J0QcTnaXVWv2GdUnYu0S3xpIyHmvkUQND7AlIfdQg3evKiN+J3JhrO30mVx zLhf+gtZUU48fY/DSD7YgwsRv6+L8FFRgxcPlz5QXS3VJOo3Tic4VEjGBWaGrn0U xxuRsjBIPXGBYAhru3TOWuDarLWmEFDL2m6s+hPZG1YLAuuAqLiDKdXKzLjwI8mh yfgcnaSS8Mdrd4vjPyhSmr3pULi1Ac8BFhQkGPysJIad0/25eiuMBUZI =iiwZ -----END PGP SIGNATURE----- Merge tag 'linux-can-fixes-for-5.12-20210301' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2021-03-01 this is a pull request of 6 patches for net/master. The first 3 patches are by Joakim Zhang for the flexcan driver and fix the probing and starting of the chip. The next patch is by me, for the mcp251xfd driver and reverts the BQL support. BQL support got mainline with rc1 and assumes that CAN frames are always echoed, which is not the case. A proper fix requires changes more changes and will be rolled out via linux-can-next later. Oleksij Rempel's patch fixes the socket ref counting if socket was closed before setting skb ownership. Torin Cooper-Bennun's patch for the tcan4x5x driver fixes a race condition, where the chip is first attached the bus and then the MRAM is initialized, which may result in lost data. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:37:08 -08:00
David S. Miller	8a00946e1a	Merge branch 'enetc-fixes' Vladimir Oltean says: ==================== Fixes for NXP ENETC driver This contains an assorted set of fixes collected over the past 2 weeks on the enetc driver. Some are related to VLAN processing, some to physical link settings, some are fixups of previous hardware workarounds, and some are simply zero-day data path bugs that for some reason were never caught or at least identified. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:34:47 -08:00
Vladimir Oltean	3a5d12c9be	net: enetc: keep RX ring consumer index in sync with hardware The RX rings have a producer index owned by hardware, where newly received frame buffers are placed, and a consumer index owned by software, where newly allocated buffers are placed, in expectation of hardware being able to place frame data in them. Hardware increments the producer index when a frame is received, however it is not allowed to increment the producer index to match the consumer index (RBCIR) since the ring can hold at most RBLENR[LENGTH]-1 received BDs. Whenever the producer index matches the value of the consumer index, the ring has no unprocessed received frames and all BDs in the ring have been initialized/prepared by software, i.e. hardware owns all BDs in the ring. The code uses the next_to_clean variable to keep track of the producer index, and the next_to_use variable to keep track of the consumer index. The RX rings are seeded from enetc_refill_rx_ring, which is called from two places: 1. initially the ring is seeded until full with enetc_bd_unused(rx_ring), i.e. with 511 buffers. This will make next_to_clean=0 and next_to_use=511: .ndo_open -> enetc_open -> enetc_setup_bdrs -> enetc_setup_rxbdr -> enetc_refill_rx_ring 2. then during the data path processing, it is refilled with 16 buffers at a time: enetc_msix -> napi_schedule -> enetc_poll -> enetc_clean_rx_ring -> enetc_refill_rx_ring There is just one problem: the initial seeding done during .ndo_open updates just the producer index (ENETC_RBPIR) with 0, and the software next_to_clean and next_to_use variables. Notably, it will not update the consumer index to make the hardware aware of the newly added buffers. Wait, what? So how does it work? Well, the reset values of the producer index and of the consumer index of a ring are both zero. As per the description in the second paragraph, it means that the ring is full of buffers waiting for hardware to put frames in them, which by coincidence is almost true, because we have in fact seeded 511 buffers into the ring. But will the hardware attempt to access the 512th entry of the ring, which has an invalid BD in it? Well, no, because in order to do that, it would have to first populate the first 511 entries, and the NAPI enetc_poll will kick in by then. Eventually, after 16 processed slots have become available in the RX ring, enetc_clean_rx_ring will call enetc_refill_rx_ring and then will [ finally ] update the consumer index with the new software next_to_use variable. From now on, the next_to_clean and next_to_use variables are in sync with the producer and consumer ring indices. So the day is saved, right? Well, not quite. Freeing the memory allocated for the rings is done in: enetc_close -> enetc_clear_bdrs -> enetc_clear_rxbdr -> this just disables the ring -> enetc_free_rxtx_rings -> enetc_free_rx_ring -> sets next_to_clean and next_to_use to 0 but again, nothing is committed to the hardware producer and consumer indices (yay!). The assumption is that the ring is disabled, so the indices don't matter anyway, and it's the responsibility of the "open" code path to set those up. .. Except that the "open" code path does not set those up properly. While initially, things almost work, during subsequent enetc_close -> enetc_open sequences, we have problems. To be precise, the enetc_open that is subsequent to enetc_close will again refill the ring with 511 entries, but it will leave the consumer index untouched. Untouched means, of course, equal to the value it had before disabling the ring and draining the old buffers in enetc_close. But as mentioned, enetc_setup_rxbdr will at least update the producer index though, through this line of code: enetc_rxbdr_wr(hw, idx, ENETC_RBPIR, 0); so at this stage we'll have: next_to_clean=0 (in hardware 0) next_to_use=511 (in hardware we'll have the refill index prior to enetc_close) Again, the next_to_clean and producer index are in sync and set to correct values, so the driver manages to limp on. Eventually, 16 ring entries will be consumed by enetc_poll, and the savior enetc_clean_rx_ring will come and call enetc_refill_rx_ring, and then update the hardware consumer ring based upon the new next_to_use. So.. it works? Well, by coincidence, it almost does, but there's a circumstance where enetc_clean_rx_ring won't be there to save us. If the previous value of the consumer index was 15, there's a problem, because the NAPI poll sequence will only issue a refill when 16 or more buffers have been consumed. It's easiest to illustrate this with an example: ip link set eno0 up ip addr add 192.168.100.1/24 dev eno0 ping 192.168.100.1 -c 20 # ping this port from another board ip link set eno0 down ip link set eno0 up ping 192.168.100.1 -c 20 # ping it again from the same other board One by one: 1. ip link set eno0 up -> calls enetc_setup_rxbdr: -> calls enetc_refill_rx_ring(511 buffers) -> next_to_clean=0 (in hw 0) -> next_to_use=511 (in hw 0) 2. ping 192.168.100.1 -c 20 # ping this port from another board enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=1 next_to_clean 0 (in hw 1) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=2 next_to_clean 1 (in hw 2) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=3 next_to_clean 2 (in hw 3) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=4 next_to_clean 3 (in hw 4) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=5 next_to_clean 4 (in hw 5) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=6 next_to_clean 5 (in hw 6) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=7 next_to_clean 6 (in hw 7) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=8 next_to_clean 7 (in hw 8) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=9 next_to_clean 8 (in hw 9) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=10 next_to_clean 9 (in hw 10) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=11 next_to_clean 10 (in hw 11) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=12 next_to_clean 11 (in hw 12) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=13 next_to_clean 12 (in hw 13) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=14 next_to_clean 13 (in hw 14) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=15 next_to_clean 14 (in hw 15) next_to_use 511 (in hw 0) enetc_clean_rx_ring: enetc_refill_rx_ring(16) increments next_to_use by 16 (mod 512) and writes it to hw enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=0 next_to_clean 15 (in hw 16) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=1 next_to_clean 16 (in hw 17) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=2 next_to_clean 17 (in hw 18) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=3 next_to_clean 18 (in hw 19) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=4 next_to_clean 19 (in hw 20) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=5 next_to_clean 20 (in hw 21) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=6 next_to_clean 21 (in hw 22) next_to_use 15 (in hw 15) 20 packets transmitted, 20 packets received, 0% packet loss 3. ip link set eno0 down enetc_free_rx_ring: next_to_clean 0 (in hw 22), next_to_use 0 (in hw 15) 4. ip link set eno0 up -> calls enetc_setup_rxbdr: -> calls enetc_refill_rx_ring(511 buffers) -> next_to_clean=0 (in hw 0) -> next_to_use=511 (in hw 15) 5. ping 192.168.100.1 -c 20 # ping it again from the same other board enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=1 next_to_clean 0 (in hw 1) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=2 next_to_clean 1 (in hw 2) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=3 next_to_clean 2 (in hw 3) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=4 next_to_clean 3 (in hw 4) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=5 next_to_clean 4 (in hw 5) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=6 next_to_clean 5 (in hw 6) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=7 next_to_clean 6 (in hw 7) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=8 next_to_clean 7 (in hw 8) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=9 next_to_clean 8 (in hw 9) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=10 next_to_clean 9 (in hw 10) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=11 next_to_clean 10 (in hw 11) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=12 next_to_clean 11 (in hw 12) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=13 next_to_clean 12 (in hw 13) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=14 next_to_clean 13 (in hw 14) next_to_use 511 (in hw 15) 20 packets transmitted, 12 packets received, 40% packet loss And there it dies. No enetc_refill_rx_ring (because cleaned_cnt must be equal to 15 for that to happen), no nothing. The hardware enters the condition where the producer (14) + 1 is equal to the consumer (15) index, which makes it believe it has no more free buffers to put packets in, so it starts discarding them: ip netns exec ns0 ethtool -S eno0 \| grep -v ': 0' NIC statistics: Rx ring 0 discarded frames: 8 Summarized, if the interface receives between 16 and 32 (mod 512) frames and then there is a link flap, then the port will eventually die with no way to recover. If it receives less than 16 (mod 512) frames, then the initial NAPI poll [ before the link flap ] will not update the consumer index in hardware (it will remain zero) which will be ok when the buffers are later reinitialized. If more than 32 (mod 512) frames are received, the initial NAPI poll has the chance to refill the ring twice, updating the consumer index to at least 32. So after the link flap, the consumer index is still wrong, but the post-flap NAPI poll gets a chance to refill the ring once (because it passes through cleaned_cnt=15) and makes the consumer index be again back in sync with next_to_use. The solution to this problem is actually simple, we just need to write next_to_use into the hardware consumer index at enetc_open time, which always brings it back in sync after an initial buffer seeding process. The simpler thing would be to put the write to the consumer index into enetc_refill_rx_ring directly, but there are issues with the MDIO locking: in the NAPI poll code we have the enetc_lock_mdio() taken from top-level and we use the unlocked enetc_wr_reg_hot, whereas in enetc_open, the enetc_lock_mdio() is not taken at the top level, but instead by each individual enetc_wr_reg, so we are forced to put an additional enetc_wr_reg in enetc_setup_rxbdr. Better organization of the code is left as a refactoring exercise. Fixes: `d4fd0404c1` ("enetc: Introduce basic PF and VF ENETC ethernet drivers") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:34:47 -08:00
Vladimir Oltean	96a5223b91	net: enetc: remove bogus write to SIRXIDR from enetc_setup_rxbdr The Station Interface Receive Interrupt Detect Register (SIRXIDR) contains a 16-bit wide mask of 'interrupt detected' events for each ring associated with a port. Bit i is write-1-to-clean for RX ring i. I have no explanation whatsoever how this line of code came to be inserted in the blamed commit. I checked the downstream versions of that patch and none of them have it. The somewhat comical aspect of it is that we're writing a binary number to the SIRXIDR register, which is derived from enetc_bd_unused(rx_ring). Since the RX rings have 512 buffer descriptors, we end up writing 511 to this register, which is 0x1ff, so we are effectively clearing the 'interrupt detected' event for rings 0-8. This register is not what is used for interrupt handling though - it only provides a summary for the entire SI. The hardware provides one separate Interrupt Detect Register per RX ring, which auto-clears upon read. So there doesn't seem to be any adverse effect caused by this bogus write. There is, however, one reason why this should be handled as a bugfix: next_to_clean _should_ be committed to hardware, just not to that register, and this was obscuring the fact that it wasn't. This is fixed in the next patch, and removing the bogus line now allows the fix patch to be backported beyond that point. Fixes: `fd5736bf9f` ("enetc: Workaround for MDIO register access issue") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:34:47 -08:00
Vladimir Oltean	c76a97218d	net: enetc: force the RGMII speed and duplex instead of operating in inband mode The ENETC port 0 MAC supports in-band status signaling coming from a PHY when operating in RGMII mode, and this feature is enabled by default. It has been reported that RGMII is broken in fixed-link, and that is not surprising considering the fact that no PHY is attached to the MAC in that case, but a switch. This brings us to the topic of the patch: the enetc driver should have not enabled the optional in-band status signaling for RGMII unconditionally, but should have forced the speed and duplex to what was resolved by phylink. Note that phylink does not accept the RGMII modes as valid for in-band signaling, and these operate a bit differently than 1000base-x and SGMII (notably there is no clause 37 state machine so no ACK required from the MAC, instead the PHY sends extra code words on RXD[3:0] whenever it is not transmitting something else, so it should be safe to leave a PHY with this option unconditionally enabled even if we ignore it). The spec talks about this here: https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/138/RGMIIv1_5F00_3.pdf Fixes: `71b77a7a27` ("enetc: Migrate to PHYLINK and PCS_LYNX") Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:34:47 -08:00
Vladimir Oltean	a74dbce9d4	net: enetc: don't disable VLAN filtering in IFF_PROMISC mode Quoting from the blamed commit: In promiscuous mode, it is more intuitive that all traffic is received, including VLAN tagged traffic. It appears that it is necessary to set the flag in PSIPVMR for that to be the case, so VLAN promiscuous mode is also temporarily enabled. On exit from promiscuous mode, the setting made by ethtool is restored. Intuitive or not, there isn't any definition issued by a standards body which says that promiscuity has anything to do with VLAN filtering - it only has to do with accepting packets regardless of destination MAC address. In fact people are already trying to use this misunderstanding/bug of the enetc driver as a justification to transform promiscuity into something it never was about: accepting every packet (maybe that would be the "rx-all" netdev feature?): https://lore.kernel.org/netdev/20201110153958.ci5ekor3o2ekg3ky@ipetronik.com/ This is relevant because there are use cases in the kernel (such as tc-flower rules with the protocol 802.1Q and a vlan_id key) which do not (yet) use the vlan_vid_add API to be compatible with VLAN-filtering NICs such as enetc, so for those, disabling rx-vlan-filter is currently the only right solution to make these setups work: https://lore.kernel.org/netdev/CA+h21hoxwRdhq4y+w8Kwgm74d4cA0xLeiHTrmT-VpSaM7obhkg@mail.gmail.com/ The blamed patch has unintentionally introduced one more way for this to work, which is to enable IFF_PROMISC, however this is non-portable because port promiscuity is not meant to disable VLAN filtering. Therefore, it could invite people to write broken scripts for enetc, and then wonder why they are broken when migrating to other drivers that don't handle promiscuity in the same way. Fixes: `7070eea5e9` ("enetc: permit configuration of rx-vlan-filter with ethtool") Cc: Markus Blöchl <Markus.Bloechl@ipetronik.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:34:47 -08:00
Vladimir Oltean	827b6fd046	net: enetc: fix incorrect TPID when receiving 802.1ad tagged packets When the enetc ports have rx-vlan-offload enabled, they report a TPID of ETH_P_8021Q regardless of what was actually in the packet. When rx-vlan-offload is disabled, packets have the proper TPID. Fix this inconsistency by finishing the TODO left in the code. Fixes: `d4fd0404c1` ("enetc: Introduce basic PF and VF ENETC ethernet drivers") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:34:47 -08:00
Vladimir Oltean	6d36ecdbc4	net: enetc: take the MDIO lock only once per NAPI poll cycle The workaround for the ENETC MDIO erratum caused a performance degradation of 82 Kpps (seen with IP forwarding of two 1Gbps streams of 64B packets). This is due to excessive locking and unlocking in the fast path, which can be avoided. By taking the MDIO read-side lock only once per NAPI poll cycle, we are able to regain 54 Kpps (65%) of the performance hit. The rest of the performance degradation comes from the TX data path, but unfortunately it doesn't look like we can optimize that away easily, even with netdev_xmit_more(), there just isn't any skb batching done, to help with taking the MDIO lock less often than once per packet. We need to change the register accessor type for enetc_get_tx_tstamp, because it now runs under the enetc_lock_mdio as per the new call path detailed below: enetc_msix -> napi_schedule -> enetc_poll -> enetc_lock_mdio -> enetc_clean_tx_ring -> enetc_get_tx_tstamp -> enetc_clean_rx_ring -> enetc_unlock_mdio Fixes: `fd5736bf9f` ("enetc: Workaround for MDIO register access issue") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:34:47 -08:00
Vladimir Oltean	3222b5b613	net: enetc: initialize RFS/RSS memories for unused ports too Michael reports that since linux-next-20210211, the AER messages for ECC errors have started reappearing, and this time they can be reliably reproduced with the first ping on one of his LS1028A boards. $ ping 1[ 33.258069] pcieport 0000:00:1f.0: AER: Multiple Corrected error received: 0000:00:00.0 72.16.0.1 PING [ 33.267050] pcieport 0000:00:1f.0: AER: can't find device of ID0000 172.16.0.1 (172.16.0.1): 56 data bytes 64 bytes from 172.16.0.1: seq=0 ttl=64 time=17.124 ms 64 bytes from 172.16.0.1: seq=1 ttl=64 time=0.273 ms $ devmem 0x1f8010e10 32 0xC0000006 It isn't clear why this is necessary, but it seems that for the errors to go away, we must clear the entire RFS and RSS memory, not just for the ports in use. Sadly the code is structured in such a way that we can't have unified logic for the used and unused ports. For the minimal initialization of an unused port, we need just to enable and ioremap the PF memory space, and a control buffer descriptor ring. Unused ports must then free the CBDR because the driver will exit, but used ports can not pick up from where that code path left, since the CBDR API does not reinitialize a ring when setting it up, so its producer and consumer indices are out of sync between the software and hardware state. So a separate enetc_init_unused_port function was created, and it gets called right after the PF memory space is enabled. Fixes: `07bf34a50e` ("net: enetc: initialize the RFS and RSS memories") Reported-by: Michael Walle <michael@walle.cc> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Michael Walle <michael@walle.cc> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:34:47 -08:00
Vladimir Oltean	c646d10dda	net: enetc: don't overwrite the RSS indirection table when initializing After the blamed patch, all RX traffic gets hashed to CPU 0 because the hashing indirection table set up in: enetc_pf_probe -> enetc_alloc_si_resources -> enetc_configure_si -> enetc_setup_default_rss_table is overwritten later in: enetc_pf_probe -> enetc_init_port_rss_memory which zero-initializes the entire port RSS table in order to avoid ECC errors. The trouble really is that enetc_init_port_rss_memory really neads enetc_alloc_si_resources to be called, because it depends upon enetc_alloc_cbdr and enetc_setup_cbdr. But that whole enetc_configure_si thing could have been better thought out, it has nothing to do in a function called "alloc_si_resources", especially since its counterpart, "free_si_resources", does nothing to unwind the configuration of the SI. The point is, we need to pull out enetc_configure_si out of enetc_alloc_resources, and move it after enetc_init_port_rss_memory. This allows us to set up the default RSS indirection table after initializing the memory. Fixes: `07bf34a50e` ("net: enetc: initialize the RFS and RSS memories") Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:34:47 -08:00
Yejune Deng	8bd2a05527	inetpeer: use div64_ul() and clamp_val() calculate inet_peer_threshold In inet_initpeers(), struct inet_peer on IA32 uses 128 bytes in nowdays. Get rid of the cascade and use div64_ul() and clamp_val() calculate that will not need to be adjusted in the future as suggested by Eric Dumazet. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Yejune Deng <yejune.deng@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:32:12 -08:00
Pavel Skripkin	093b036aa9	net/qrtr: fix __netdev_alloc_skb call syzbot found WARNING in __alloc_pages_nodemask()[1] when order >= MAX_ORDER. It was caused by a huge length value passed from userspace to qrtr_tun_write_iter(), which tries to allocate skb. Since the value comes from the untrusted source there is no need to raise a warning in __alloc_pages_nodemask(). [1] WARNING in __alloc_pages_nodemask+0x5f8/0x730 mm/page_alloc.c:5014 Call Trace: __alloc_pages include/linux/gfp.h:511 [inline] __alloc_pages_node include/linux/gfp.h:524 [inline] alloc_pages_node include/linux/gfp.h:538 [inline] kmalloc_large_node+0x60/0x110 mm/slub.c:3999 __kmalloc_node_track_caller+0x319/0x3f0 mm/slub.c:4496 __kmalloc_reserve net/core/skbuff.c:150 [inline] __alloc_skb+0x4e4/0x5a0 net/core/skbuff.c:210 __netdev_alloc_skb+0x70/0x400 net/core/skbuff.c:446 netdev_alloc_skb include/linux/skbuff.h:2832 [inline] qrtr_endpoint_post+0x84/0x11b0 net/qrtr/qrtr.c:442 qrtr_tun_write_iter+0x11f/0x1a0 net/qrtr/tun.c:98 call_write_iter include/linux/fs.h:1901 [inline] new_sync_write+0x426/0x650 fs/read_write.c:518 vfs_write+0x791/0xa30 fs/read_write.c:605 ksys_write+0x12d/0x250 fs/read_write.c:658 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reported-by: syzbot+80dccaee7c6630fa9dcf@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Acked-by: Alexander Lobakin <alobakin@pm.me> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:24:03 -08:00
David S. Miller	5db4f74ec8	Merge branch 'sh_eth-masks' Sergey Shtylyov says: ==================== Fix TRSCER masks in the Ether driver Here are 3 patches against DaveM's 'net' repo. I'm fixing the TRSCER masks in the driver to match the manuals... ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:22:34 -08:00
Sergey Shtylyov	165bc5a4f3	sh_eth: fix TRSCER mask for R7S9210 According to the RZ/A2M Group User's Manual: Hardware, Rev. 2.00, the TRSCER register has bit 9 reserved, hence we can't use the driver's default TRSCER mask. Add the explicit initializer for sh_eth_cpu_data:: trscer_err_mask for R7S9210. Fixes: `6e0bb04d0e` ("sh_eth: Add R7S9210 support") Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:22:34 -08:00
Sergey Shtylyov	75be7fb7f9	sh_eth: fix TRSCER mask for R7S72100 According to the RZ/A1H Group, RZ/A1M Group User's Manual: Hardware, Rev. 4.00, the TRSCER register has bit 9 reserved, hence we can't use the driver's default TRSCER mask. Add the explicit initializer for sh_eth_cpu_data::trscer_err_mask for R7S72100. Fixes: `db893473d3` ("sh_eth: Add support for r7s72100") Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:22:34 -08:00
Sergey Shtylyov	8c91bc3d44	sh_eth: fix TRSCER mask for SH771x According to the SH7710, SH7712, SH7713 Group User's Manual: Hardware, Rev. 3.00, the TRSCER register actually has only bit 7 valid (and named differently), with all the other bits reserved. Apparently, this was not the case with some early revisions of the manual as we have the other bits declared (and set) in the original driver. Follow the suit and add the explicit sh_eth_cpu_data::trscer_err_mask initializer for SH771x... Fixes: `86a74ff21a` ("net: sh_eth: add support for Renesas SuperH Ethernet") Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:22:34 -08:00
Tong Zhang	a2bd45834e	atm: lanai: dont run lanai_dev_close if not open lanai_dev_open() can fail. When it fail, lanai->base is unmapped and the pci device is disabled. The caller, lanai_init_one(), then tries to run atm_dev_deregister(). This will subsequently call lanai_dev_close() and use the already released MMIO area. To fix this issue, set the lanai->base to NULL if open fail, and test the flag in lanai_dev_close(). [ 8.324153] lanai: lanai_start() failed, err=19 [ 8.324819] lanai(itf 0): shutting down interface [ 8.325211] BUG: unable to handle page fault for address: ffffc90000180024 [ 8.325781] #PF: supervisor write access in kernel mode [ 8.326215] #PF: error_code(0x0002) - not-present page [ 8.326641] PGD 100000067 P4D 100000067 PUD 100139067 PMD 10013a067 PTE 0 [ 8.327206] Oops: 0002 [#1] SMP KASAN NOPTI [ 8.327557] CPU: 0 PID: 95 Comm: modprobe Not tainted 5.11.0-rc7-00090-gdcc0b49040c7 #12 [ 8.328229] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-48-gd9c812dda519-4 [ 8.329145] RIP: 0010:lanai_dev_close+0x4f/0xe5 [lanai] [ 8.329587] Code: 00 48 c7 c7 00 d3 01 c0 e8 49 4e 0a c2 48 8d bd 08 02 00 00 e8 6e 52 14 c1 48 80 [ 8.330917] RSP: 0018:ffff8881029ef680 EFLAGS: 00010246 [ 8.331196] RAX: 000000000003fffe RBX: ffff888102fb4800 RCX: ffffffffc001a98a [ 8.331572] RDX: ffffc90000180000 RSI: 0000000000000246 RDI: ffff888102fb4000 [ 8.331948] RBP: ffff888102fb4000 R08: ffffffff8115da8a R09: ffffed102053deaa [ 8.332326] R10: 0000000000000003 R11: ffffed102053dea9 R12: ffff888102fb48a4 [ 8.332701] R13: ffffffffc00123c0 R14: ffff888102fb4b90 R15: ffff888102fb4b88 [ 8.333077] FS: 00007f08eb9056a0(0000) GS:ffff88815b400000(0000) knlGS:0000000000000000 [ 8.333502] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8.333806] CR2: ffffc90000180024 CR3: 0000000102a28000 CR4: 00000000000006f0 [ 8.334182] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 8.334557] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 8.334932] Call Trace: [ 8.335066] atm_dev_deregister+0x161/0x1a0 [atm] [ 8.335324] lanai_init_one.cold+0x20c/0x96d [lanai] [ 8.335594] ? lanai_send+0x2a0/0x2a0 [lanai] [ 8.335831] local_pci_probe+0x6f/0xb0 [ 8.336039] pci_device_probe+0x171/0x240 [ 8.336255] ? pci_device_remove+0xe0/0xe0 [ 8.336475] ? kernfs_create_link+0xb6/0x110 [ 8.336704] ? sysfs_do_create_link_sd.isra.0+0x76/0xe0 [ 8.336983] really_probe+0x161/0x420 [ 8.337181] driver_probe_device+0x6d/0xd0 [ 8.337401] device_driver_attach+0x82/0x90 [ 8.337626] ? device_driver_attach+0x90/0x90 [ 8.337859] __driver_attach+0x60/0x100 [ 8.338065] ? device_driver_attach+0x90/0x90 [ 8.338298] bus_for_each_dev+0xe1/0x140 [ 8.338511] ? subsys_dev_iter_exit+0x10/0x10 [ 8.338745] ? klist_node_init+0x61/0x80 [ 8.338956] bus_add_driver+0x254/0x2a0 [ 8.339164] driver_register+0xd3/0x150 [ 8.339370] ? 0xffffffffc0028000 [ 8.339550] do_one_initcall+0x84/0x250 [ 8.339755] ? trace_event_raw_event_initcall_finish+0x150/0x150 [ 8.340076] ? free_vmap_area_noflush+0x1a5/0x5c0 [ 8.340329] ? unpoison_range+0xf/0x30 [ 8.340532] ? ____kasan_kmalloc.constprop.0+0x84/0xa0 [ 8.340806] ? unpoison_range+0xf/0x30 [ 8.341014] ? unpoison_range+0xf/0x30 [ 8.341217] do_init_module+0xf8/0x350 [ 8.341419] load_module+0x3fe6/0x4340 [ 8.341621] ? vm_unmap_ram+0x1d0/0x1d0 [ 8.341826] ? ____kasan_kmalloc.constprop.0+0x84/0xa0 [ 8.342101] ? module_frob_arch_sections+0x20/0x20 [ 8.342358] ? __do_sys_finit_module+0x108/0x170 [ 8.342604] __do_sys_finit_module+0x108/0x170 [ 8.342841] ? __ia32_sys_init_module+0x40/0x40 [ 8.343083] ? file_open_root+0x200/0x200 [ 8.343298] ? do_sys_open+0x85/0xe0 [ 8.343491] ? filp_open+0x50/0x50 [ 8.343675] ? exit_to_user_mode_prepare+0xfc/0x130 [ 8.343935] do_syscall_64+0x33/0x40 [ 8.344132] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 8.344401] RIP: 0033:0x7f08eb887cf7 [ 8.344594] Code: 48 89 57 30 48 8b 04 24 48 89 47 38 e9 1d a0 02 00 48 89 f8 48 89 f7 48 89 d6 41 [ 8.345565] RSP: 002b:00007ffcd5c98ad8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 8.345962] RAX: ffffffffffffffda RBX: 00000000008fea70 RCX: 00007f08eb887cf7 [ 8.346336] RDX: 0000000000000000 RSI: 00000000008fd9e0 RDI: 0000000000000003 [ 8.346711] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000001 [ 8.347085] R10: 00007f08eb8eb300 R11: 0000000000000246 R12: 00000000008fd9e0 [ 8.347460] R13: 0000000000000000 R14: 00000000008fddd0 R15: 0000000000000001 [ 8.347836] Modules linked in: lanai(+) atm [ 8.348065] CR2: ffffc90000180024 [ 8.348244] ---[ end trace 7fdc1c668f2003e5 ]--- [ 8.348490] RIP: 0010:lanai_dev_close+0x4f/0xe5 [lanai] [ 8.348772] Code: 00 48 c7 c7 00 d3 01 c0 e8 49 4e 0a c2 48 8d bd 08 02 00 00 e8 6e 52 14 c1 48 80 [ 8.349745] RSP: 0018:ffff8881029ef680 EFLAGS: 00010246 [ 8.350022] RAX: 000000000003fffe RBX: ffff888102fb4800 RCX: ffffffffc001a98a [ 8.350397] RDX: ffffc90000180000 RSI: 0000000000000246 RDI: ffff888102fb4000 [ 8.350772] RBP: ffff888102fb4000 R08: ffffffff8115da8a R09: ffffed102053deaa [ 8.351151] R10: 0000000000000003 R11: ffffed102053dea9 R12: ffff888102fb48a4 [ 8.351525] R13: ffffffffc00123c0 R14: ffff888102fb4b90 R15: ffff888102fb4b88 [ 8.351918] FS: 00007f08eb9056a0(0000) GS:ffff88815b400000(0000) knlGS:0000000000000000 [ 8.352343] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8.352647] CR2: ffffc90000180024 CR3: 0000000102a28000 CR4: 00000000000006f0 [ 8.353022] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 8.353397] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 8.353958] modprobe (95) used greatest stack depth: 26216 bytes left Signed-off-by: Tong Zhang <ztong0001@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:18:54 -08:00
Tong Zhang	4deb550bc3	atm: eni: dont release is never initialized label err_eni_release is reachable when eni_start() fail. In eni_start() it calls dev->phy->start() in the last step, if start() fail we don't need to call phy->stop(), if start() is never called, we neither need to call phy->stop(), otherwise null-ptr-deref will happen. In order to fix this issue, don't call phy->stop() in label err_eni_release [ 4.875714] ================================================================== [ 4.876091] BUG: KASAN: null-ptr-deref in suni_stop+0x47/0x100 [suni] [ 4.876433] Read of size 8 at addr 0000000000000030 by task modprobe/95 [ 4.876778] [ 4.876862] CPU: 0 PID: 95 Comm: modprobe Not tainted 5.11.0-rc7-00090-gdcc0b49040c7 #2 [ 4.877290] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-48-gd94 [ 4.877876] Call Trace: [ 4.878009] dump_stack+0x7d/0xa3 [ 4.878191] kasan_report.cold+0x10c/0x10e [ 4.878410] ? __slab_free+0x2f0/0x340 [ 4.878612] ? suni_stop+0x47/0x100 [suni] [ 4.878832] suni_stop+0x47/0x100 [suni] [ 4.879043] eni_do_release+0x3b/0x70 [eni] [ 4.879269] eni_init_one.cold+0x1152/0x1747 [eni] [ 4.879528] ? _raw_spin_lock_irqsave+0x7b/0xd0 [ 4.879768] ? eni_ioctl+0x270/0x270 [eni] [ 4.879990] ? __mutex_lock_slowpath+0x10/0x10 [ 4.880226] ? eni_ioctl+0x270/0x270 [eni] [ 4.880448] local_pci_probe+0x6f/0xb0 [ 4.880650] pci_device_probe+0x171/0x240 [ 4.880864] ? pci_device_remove+0xe0/0xe0 [ 4.881086] ? kernfs_create_link+0xb6/0x110 [ 4.881315] ? sysfs_do_create_link_sd.isra.0+0x76/0xe0 [ 4.881594] really_probe+0x161/0x420 [ 4.881791] driver_probe_device+0x6d/0xd0 [ 4.882010] device_driver_attach+0x82/0x90 [ 4.882233] ? device_driver_attach+0x90/0x90 [ 4.882465] __driver_attach+0x60/0x100 [ 4.882671] ? device_driver_attach+0x90/0x90 [ 4.882903] bus_for_each_dev+0xe1/0x140 [ 4.883114] ? subsys_dev_iter_exit+0x10/0x10 [ 4.883346] ? klist_node_init+0x61/0x80 [ 4.883557] bus_add_driver+0x254/0x2a0 [ 4.883764] driver_register+0xd3/0x150 [ 4.883971] ? 0xffffffffc0038000 [ 4.884149] do_one_initcall+0x84/0x250 [ 4.884355] ? trace_event_raw_event_initcall_finish+0x150/0x150 [ 4.884674] ? unpoison_range+0xf/0x30 [ 4.884875] ? ____kasan_kmalloc.constprop.0+0x84/0xa0 [ 4.885150] ? unpoison_range+0xf/0x30 [ 4.885352] ? unpoison_range+0xf/0x30 [ 4.885557] do_init_module+0xf8/0x350 [ 4.885760] load_module+0x3fe6/0x4340 [ 4.885960] ? vm_unmap_ram+0x1d0/0x1d0 [ 4.886166] ? ____kasan_kmalloc.constprop.0+0x84/0xa0 [ 4.886441] ? module_frob_arch_sections+0x20/0x20 [ 4.886697] ? __do_sys_finit_module+0x108/0x170 [ 4.886941] __do_sys_finit_module+0x108/0x170 [ 4.887178] ? __ia32_sys_init_module+0x40/0x40 [ 4.887419] ? file_open_root+0x200/0x200 [ 4.887634] ? do_sys_open+0x85/0xe0 [ 4.887826] ? filp_open+0x50/0x50 [ 4.888009] ? fpregs_assert_state_consistent+0x4d/0x60 [ 4.888287] ? exit_to_user_mode_prepare+0x2f/0x130 [ 4.888547] do_syscall_64+0x33/0x40 [ 4.888739] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 4.889010] RIP: 0033:0x7ff62fcf1cf7 [ 4.889202] Code: 48 89 57 30 48 8b 04 24 48 89 47 38 e9 1d a0 02 00 48 89 f8 48 89 f71 [ 4.890172] RSP: 002b:00007ffe6644ade8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 4.890570] RAX: ffffffffffffffda RBX: 0000000000f2ca70 RCX: 00007ff62fcf1cf7 [ 4.890944] RDX: 0000000000000000 RSI: 0000000000f2b9e0 RDI: 0000000000000003 [ 4.891318] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000001 [ 4.891691] R10: 00007ff62fd55300 R11: 0000000000000246 R12: 0000000000f2b9e0 [ 4.892064] R13: 0000000000000000 R14: 0000000000f2bdd0 R15: 0000000000000001 [ 4.892439] ================================================================== Signed-off-by: Tong Zhang <ztong0001@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:17:36 -08:00
Guangbin Huang	d9032dba5a	net: phy: fix save wrong speed and duplex problem if autoneg is on If phy uses generic driver and autoneg is on, enter command "ethtool -s eth0 speed 50" will not change phy speed actually, but command "ethtool eth0" shows speed is 50Mb/s because phydev->speed has been set to 50 and no update later. And duplex setting has same problem too. However, if autoneg is on, phy only changes speed and duplex according to phydev->advertising, but not phydev->speed and phydev->duplex. So in this case, phydev->speed and phydev->duplex don't need to be set in function phy_ethtool_ksettings_set() if autoneg is on. Fixes: `51e2a3846e` ("PHY: Avoid unnecessary aneg restarts") Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:16:16 -08:00
Jason A. Donenfeld	4372339efc	net: always use icmp{,v6}_ndo_send from ndo_start_xmit There were a few remaining tunnel drivers that didn't receive the prior conversion to icmp{,v6}_ndo_send. Knowing now that this could lead to memory corrution (see `ee576c47db` ("net: icmp: pass zeroed opts from icmp{,v6}_ndo_send before sending") for details), there's even more imperative to have these all converted. So this commit goes through the remaining cases that I could find and does a boring translation to the ndo variety. The Fixes: line below is the merge that originally added icmp{,v6}_ ndo_send and converted the first batch of icmp{,v6}_send users. The rationale then for the change applies equally to this patch. It's just that these drivers were left out of the initial conversion because these network devices are hiding in net/ rather than in drivers/net/. Cc: Florian Westphal <fw@strlen.de> Cc: Willem de Bruijn <willemb@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: David Ahern <dsahern@kernel.org> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Steffen Klassert <steffen.klassert@secunet.com> Fixes: `803381f9f1` ("Merge branch 'icmp-account-for-NAT-when-sending-icmps-from-ndo-layer'") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:11:35 -08:00
DENG Qingfang	9eb8bc593a	net: dsa: tag_rtl4_a: fix egress tags Commit `86dd9868b8` has several issues, but was accepted too soon before anyone could take a look. - Double free. dsa_slave_xmit() will free the skb if the xmit function returns NULL, but the skb is already freed by eth_skb_pad(). Use __skb_put_padto() to avoid that. - Unnecessary allocation. It has been done by DSA core since commit `a3b0b64797`. - A u16 pointer points to skb data. It should be __be16 for network byte order. - Typo in comments. "numer" -> "number". Fixes: `86dd9868b8` ("net: dsa: tag_rtl4_a: Support also egress tags") Signed-off-by: DENG Qingfang <dqfext@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:09:55 -08:00
Jan Beulich	826d82170b	xen-netback: use local var in xenvif_tx_check_gop() instead of re-calculating shinfo already holds the result of skb_shinfo(skb) at this point - no need to re-invoke the construct even twice. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:05:08 -08:00
Ioana Ciornei	73f476aa19	net: phy: ti: take into account all possible interrupt sources The previous implementation of .handle_interrupt() did not take into account the fact that all the interrupt status registers should be acknowledged since multiple interrupt sources could be asserted. Fix this by reading all the status registers before exiting with IRQ_NONE or triggering the PHY state machine. Fixes: `1d1ae3c6ca` ("net: phy: ti: implement generic .handle_interrupt() callback") Reported-by: Sven Schuchmann <schuchmann@schleissheimer.de> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://lore.kernel.org/r/20210226153020.867852-1-ciorneiioana@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-03-01 11:46:55 -08:00
Torin Cooper-Bennun	2712625200	can: tcan4x5x: tcan4x5x_init(): fix initialization - clear MRAM before entering Normal Mode This patch prevents a potentially destructive race condition. The device is fully operational on the bus after entering Normal Mode, so zeroing the MRAM after entering this mode may lead to loss of information, e.g. new received messages. This patch fixes the problem by first initializing the MRAM, then bringing the device into Normale Mode. Fixes: `5443c226ba` ("can: tcan4x5x: Add tcan4x5x driver to the kernel") Link: https://lore.kernel.org/r/20210226163440.313628-1-torin@maxiluxsystems.com Suggested-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Torin Cooper-Bennun <torin@maxiluxsystems.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-03-01 11:45:15 +01:00
Oleksij Rempel	e940e0895a	can: skb: can_skb_set_owner(): fix ref counting if socket was closed before setting skb ownership There are two ref count variables controlling the free()ing of a socket: - struct sock::sk_refcnt - which is changed by sock_hold()/sock_put() - struct sock::sk_wmem_alloc - which accounts the memory allocated by the skbs in the send path. In case there are still TX skbs on the fly and the socket() is closed, the struct sock::sk_refcnt reaches 0. In the TX-path the CAN stack clones an "echo" skb, calls sock_hold() on the original socket and references it. This produces the following back trace: \| WARNING: CPU: 0 PID: 280 at lib/refcount.c:25 refcount_warn_saturate+0x114/0x134 \| refcount_t: addition on 0; use-after-free. \| Modules linked in: coda_vpu(E) v4l2_jpeg(E) videobuf2_vmalloc(E) imx_vdoa(E) \| CPU: 0 PID: 280 Comm: test_can.sh Tainted: G E 5.11.0-04577-gf8ff6603c617 #203 \| Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) \| Backtrace: \| [<80bafea4>] (dump_backtrace) from [<80bb0280>] (show_stack+0x20/0x24) r7:00000000 r6:600f0113 r5:00000000 r4:81441220 \| [<80bb0260>] (show_stack) from [<80bb593c>] (dump_stack+0xa0/0xc8) \| [<80bb589c>] (dump_stack) from [<8012b268>] (__warn+0xd4/0x114) r9:00000019 r8:80f4a8c2 r7:83e4150c r6:00000000 r5:00000009 r4:80528f90 \| [<8012b194>] (__warn) from [<80bb09c4>] (warn_slowpath_fmt+0x88/0xc8) r9:83f26400 r8:80f4a8d1 r7:00000009 r6:80528f90 r5:00000019 r4:80f4a8c2 \| [<80bb0940>] (warn_slowpath_fmt) from [<80528f90>] (refcount_warn_saturate+0x114/0x134) r8:00000000 r7:00000000 r6:82b44000 r5:834e5600 r4:83f4d540 \| [<80528e7c>] (refcount_warn_saturate) from [<8079a4c8>] (__refcount_add.constprop.0+0x4c/0x50) \| [<8079a47c>] (__refcount_add.constprop.0) from [<8079a57c>] (can_put_echo_skb+0xb0/0x13c) \| [<8079a4cc>] (can_put_echo_skb) from [<8079ba98>] (flexcan_start_xmit+0x1c4/0x230) r9:00000010 r8:83f48610 r7:0fdc0000 r6:0c080000 r5:82b44000 r4:834e5600 \| [<8079b8d4>] (flexcan_start_xmit) from [<80969078>] (netdev_start_xmit+0x44/0x70) r9:814c0ba0 r8:80c8790c r7:00000000 r6:834e5600 r5:82b44000 r4:82ab1f00 \| [<80969034>] (netdev_start_xmit) from [<809725a4>] (dev_hard_start_xmit+0x19c/0x318) r9:814c0ba0 r8:00000000 r7:82ab1f00 r6:82b44000 r5:00000000 r4:834e5600 \| [<80972408>] (dev_hard_start_xmit) from [<809c6584>] (sch_direct_xmit+0xcc/0x264) r10:834e5600 r9:00000000 r8:00000000 r7:82b44000 r6:82ab1f00 r5:834e5600 r4:83f27400 \| [<809c64b8>] (sch_direct_xmit) from [<809c6c0c>] (__qdisc_run+0x4f0/0x534) To fix this problem, only set skb ownership to sockets which have still a ref count > 0. Fixes: `0ae89beb28` ("can: add destructor for self generated skbs") Cc: Oliver Hartkopp <socketcan@hartkopp.net> Cc: Andre Naujoks <nautsch2@gmail.com> Link: https://lore.kernel.org/r/20210226092456.27126-1-o.rempel@pengutronix.de Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-03-01 11:45:04 +01:00

1 2 3 4 5 ...

995518 Commits