Commit Graph

999697 Commits

Author SHA1 Message Date
Mike Marciniszyn
5de61a47eb IB/hfi1: Fix probe time panic when AIP is enabled with a buggy BIOS
A panic can result when AIP is enabled:

  BUG: unable to handle kernel NULL pointer dereference at 000000000000000
  PGD 0 P4D 0
  Oops: 0000 1 SMP PTI
  CPU: 70 PID: 981 Comm: systemd-udevd Tainted: G OE --------- - - 4.18.0-240.el8.x86_64 
  Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.01.01.0005.101720141054 10/17/2014
  RIP: 0010:__bitmap_and+0x1b/0x70
  RSP: 0018:ffff99aa0845f9f0 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffff8d5a6fc18000 RCX: 0000000000000048
  RDX: 0000000000000000 RSI: ffffffffc06336f0 RDI: ffff8d5a8fa67750
  RBP: 0000000000000079 R08: 0000000fffffffff R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000001 R12: ffffffffc06336f0
  R13: 00000000000000a0 R14: ffff8d5a6fc18000 R15: 0000000000000003
  FS: 00007fec137a5980(0000) GS:ffff8d5a9fa80000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 0000000a04b48002 CR4: 00000000001606e0
  Call Trace:
  hfi1_num_netdev_contexts+0x7c/0x110 [hfi1]
  hfi1_init_dd+0xd7f/0x1a90 [hfi1]
  ? pci_bus_read_config_dword+0x49/0x70
  ? pci_mmcfg_read+0x3e/0xe0
  do_init_one.isra.18+0x336/0x640 [hfi1]
  local_pci_probe+0x41/0x90
  pci_device_probe+0x105/0x1c0
  really_probe+0x212/0x440
  driver_probe_device+0x49/0xc0
  device_driver_attach+0x50/0x60
  __driver_attach+0x61/0x130
  ? device_driver_attach+0x60/0x60
  bus_for_each_dev+0x77/0xc0
  ? klist_add_tail+0x3b/0x70
  bus_add_driver+0x14d/0x1e0
  ? dev_init+0x10b/0x10b [hfi1]
  driver_register+0x6b/0xb0
  ? dev_init+0x10b/0x10b [hfi1]
  hfi1_mod_init+0x1e6/0x20a [hfi1]
  do_one_initcall+0x46/0x1c3
  ? free_unref_page_commit+0x91/0x100
  ? _cond_resched+0x15/0x30
  ? kmem_cache_alloc_trace+0x140/0x1c0
  do_init_module+0x5a/0x220
  load_module+0x14b4/0x17e0
  ? __do_sys_finit_module+0xa8/0x110
  __do_sys_finit_module+0xa8/0x110
  do_syscall_64+0x5b/0x1a0

The issue happens when pcibus_to_node() returns NO_NUMA_NODE.

Fix this issue by moving the initialization of dd->node to hfi1_devdata
allocation and remove the other pcibus_to_node() calls in the probe path
and use dd->node instead.

Affinity logic is adjusted to use a new field dd->affinity_entry as a
guard instead of dd->node.

Fixes: 4730f4a6c6 ("IB/hfi1: Activate the dummy netdev")
Link: https://lore.kernel.org/r/1617025700-31865-4-git-send-email-dennis.dalessandro@cornelisnetworks.com
Cc: stable@vger.kernel.org
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07 15:31:59 -03:00
Potnuri Bharat Teja
603c4690b0 RDMA/cxgb4: check for ipv6 address properly while destroying listener
ipv6 bit is wrongly set by the below which causes fatal adapter lookup
engine errors for ipv4 connections while destroying a listener.  Fix it to
properly check the local address for ipv6.

Fixes: 3408be145a ("RDMA/cxgb4: Fix adapter LE hash errors while destroying ipv6 listening server")
Link: https://lore.kernel.org/r/20210331135715.30072-1-bharat@chelsio.com
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07 15:31:45 -03:00
Frank Rowand
649cab56de of: properly check for error returned by fdt_get_name()
fdt_get_name() returns error values via a parameter pointer
instead of in function return.  Fix check for this error value
in populate_node() and callers of populate_node().

Chasing up the caller tree showed callers of various functions
failing to initialize the value of pointer parameters that
can return error values.  Initialize those values to NULL.

The bug was introduced by
commit e6a6928c3e ("of/fdt: Convert FDT functions to use libfdt")
but this patch can not be backported directly to that commit
because the relevant code has further been restructured by
commit dfbd4c6eff ("drivers/of: Split unflatten_dt_node()")

The bug became visible by triggering a crash on openrisc with:
commit 79edff1206 ("scripts/dtc: Update to upstream version v1.6.0-51-g183df9e9c2b9")
as reported in:
https://lore.kernel.org/lkml/20210327224116.69309-1-linux@roeck-us.net/

Fixes: 79edff1206 ("scripts/dtc: Update to upstream version v1.6.0-51-g183df9e9c2b9")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210405032845.1942533-1-frowand.list@gmail.com
Signed-off-by: Rob Herring <robh@kernel.org>
2021-04-07 13:07:30 -05:00
Vitaly Kuznetsov
fa26d0c778 ACPI: processor: Fix build when CONFIG_ACPI_PROCESSOR=m
Commit 8cdddd182b ("ACPI: processor: Fix CPU0 wakeup in
acpi_idle_play_dead()") tried to fix CPU0 hotplug breakage by copying
wakeup_cpu0() + start_cpu0() logic from hlt_play_dead()//mwait_play_dead()
into acpi_idle_play_dead(). The problem is that these functions are not
exported to modules so when CONFIG_ACPI_PROCESSOR=m build fails.

The issue could've been fixed by exporting both wakeup_cpu0()/start_cpu0()
(the later from assembly) but it seems putting the whole pattern into a
new function and exporting it instead is better.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 8cdddd182b ("CPI: processor: Fix CPU0 wakeup in acpi_idle_play_dead()")
Cc: <stable@vger.kernel.org> # 5.10+
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-07 19:02:43 +02:00
Linus Torvalds
3a22981230 ARM SoC fixes for v5.12, part 2
Most of the changes again are devicetree fixes, but there are also five
 trivial build fixes for issues I found when test building with gcc-11 or
 when running 'make W=1', and some OMAP platform specific code fixups.
 
 Broadcom
   - One revert for a Raspberry pi interrupt controller change that
     caused a regression.
 
 TI OMAP:
   - Remove unused duplicate sha2md5_fck clock node that can race with the
     OMAP4_SHA2MD5_CLKCTRL clock node for disable for unused clocks
 
   - Add aliases for omap4/5 mmc to put the slots back into the right
     order again
 
   - Fix typo for bionic voltage controllers that accidentally use mpu
     for all instances instead of mpu, core and iva
 
   - Fix random hangs for droid4 caused by missing fix from TI Android
     kernel tree to do a dummy smc call on cpuidle wakeup path
 
 NXP i.MX:
   - Fix a system failure on imx6qdl-phytec-pfla02 board when booting from
     SD, by adding missing vmmc supply for SD interfaces.
 
   - Fix address typo in i.MX8MM/Q IOMUXC_SD1_DATA0_GPIO2_IO2 definition.
 
 Marvell mvebu:
   - Fix storm interrupt on Turris Omnia
 
   - Enable hardware buffer management as it should be
 
 Build fixes for PXA, Freescale, Marvell, OMAP1 an Keystone.
 
 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmBs0OgACgkQYKtH/8kJ
 UifdEBAAiD3JebS8a1jsgL+/va/ptOuBZP2l4sCH3P/bczsNKeAn+BvwAy4jNJ4b
 C55ZFnz6tX37CGY7e1Pe7LC8WhVd1LGfCm/gSreKUTkETZd/87PoR1xM4GxbhmBQ
 8HNJOVDBSes6tHgWTAgQ7rHGQQ71JoRYc9FJPOH2JDsk8SaeL8Z+Bjay3O3nlBQw
 RU0zoWv/khkdRvzt4oDTmW6pPDQh5c9twv2ORZM92+tXhSeF2AAY08GdAAmiZL5W
 Lq30YozGSJHPcIYSN+jSWPJNtzmrF3oZVTqDzqTN/aIVoH+8MFZHSmCd3iM1RWkT
 wkanNiqF7CRYAdLmC00YTToJUQxsbOYugfUMWYC04VocVbeEDAhnITFVF1zrJLZ4
 q4E/S5WSZjLPUsiDhSK+d0S2bFVrEyQUaDaFWrC6Aet5wA6pI/8X0Q3ZSMV7jzq+
 NkZYuA2oKoW0vwnH+7432/1g33CpCxKRVr/zBhesjCpB3Ymj0OWfqGeHA2fyjFQq
 fNvUnG6LyXE+NBgIfgZTGbBr1gCT/XHqd0GcYrBy4v0L3x8qJSh1ClA0qlpWr+Zl
 mY5jMC6MrGGuHXEhqIoS38mO0RTyx9i2iDjge2CrAMmRxdVR453Z4VIbDnSwGDAe
 K8lASQKHEyvRzdmJDVhaesHqwU9BDtWULY8Q2+3jKqv3wwf6d0I=
 =YY35
 -----END PGP SIGNATURE-----

Merge tag 'arm-fixes-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
 "Most of the changes again are devicetree fixes, but there are also
  five trivial build fixes for issues I found when test building with
  gcc-11 or when running 'make W=1', and some OMAP platform specific
  code fixups.

  Broadcom:
   - One revert for a Raspberry pi interrupt controller change that
     caused a regression.

  TI OMAP:
   - Remove unused duplicate sha2md5_fck clock node that can race with
     the OMAP4_SHA2MD5_CLKCTRL clock node for disable for unused clocks

   - Add aliases for omap4/5 mmc to put the slots back into the right
     order again

   - Fix typo for bionic voltage controllers that accidentally use mpu
     for all instances instead of mpu, core and iva

   - Fix random hangs for droid4 caused by missing fix from TI Android
     kernel tree to do a dummy smc call on cpuidle wakeup path

  NXP i.MX:
   - Fix a system failure on imx6qdl-phytec-pfla02 board when booting
     from SD, by adding missing vmmc supply for SD interfaces.

   - Fix address typo in i.MX8MM/Q IOMUXC_SD1_DATA0_GPIO2_IO2
     definition.

  Marvell mvebu:
   - Fix storm interrupt on Turris Omnia

   - Enable hardware buffer management as it should be

  ... and build fixes for PXA, Freescale, Marvell, OMAP1 and Keystone"

* tag 'arm-fixes-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  ARM: dts: turris-omnia: configure LED[2]/INTn pin as interrupt pin
  ARM: dts: turris-omnia: fix hardware buffer management
  Revert "arm64: dts: marvell: armada-cp110: Switch to per-port SATA interrupts"
  ARM: mvebu: avoid clang -Wtautological-constant warning
  ARM: pxa: mainstone: avoid -Woverride-init warning
  ARM: omap1: fix building with clang IAS
  soc/fsl: qbman: fix conflicting alignment attributes
  ARM: keystone: fix integer overflow warning
  ARM: dts: imx6: pbab01: Set vmmc supply for both SD interfaces
  arm64: dts: imx8mm/q: Fix pad control of SD1_DATA0
  ARM: OMAP4: PM: update ROM return address for OSWR and OFF
  ARM: OMAP4: Fix PMIC voltage domains for bionic
  ARM: dts: Fix moving mmc devices with aliases for omap4 & 5
  ARM: dts: Drop duplicate sha2md5_fck to fix clk_disable race
  Revert "ARM: dts: bcm2711: Add the BSC interrupt controller"
2021-04-07 09:26:50 -07:00
Linus Torvalds
dbaa5d1c25 Merge branch 'parisc-5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
 "One link error fix found by the kernel test robot, one sparse warning
  fix, remove a duplicate declaration and some spelling fixes"

* 'parisc-5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: math-emu: Few spelling fixes in the file fpu.h
  parisc: avoid a warning on u8 cast for cmpxchg on u8 pointers
  parisc: parisc-agp requires SBA IOMMU driver
  parisc: Remove duplicate struct task_struct declaration
2021-04-07 09:20:07 -07:00
Linus Torvalds
5ba091db93 platform-drivers-x86 for v5.12-3
A single bugfix (on top of platform-drivers-x86-v5.12-2) to fix spurious
 wakeups from suspend caused by recent intel-hid driver changes.
 
 The following is an automated git shortlog grouped by driver:
 
 intel-hid:
  -  Fix spurious wakeups caused by tablet-mode events during suspend
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmBtg4YUHGhkZWdvZWRl
 QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9x6DQf+MpmnVmDoJ5gB6yvICybA080oz4Vr
 7y9QHHP98ErpS1jROLXxUyrbKksOvc7JBKEEk06soGLX4M7+rv0GgngE43EaK1O/
 7VSW0i59j9wCvMrav8IQL/br/CvJt8oSuJ3YGP4LuM6bgSCzOSyiHYqAcfH8/xhs
 4UZQQWTnhTqmNOOGUOGRqHnT0GqF8S1DD3/dyBlX68TRV0+iNsy0/WCUQCfB9alY
 yqzHqHydFCpApVgaT4MsX4nJSrKDT/X5wBXBFDkBNevCrK2SOp/B4STfbSdB6GRR
 q6hsSIMPo86EHQWY4hkMk1+hi7G572RTyJ6vOi5DJe1bD3cFPJSrm2uO/Q==
 =aUK1
 -----END PGP SIGNATURE-----

Merge tag 'platform-drivers-x86-v5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fix from Hans de Goede:
 "A single bugfix to fix spurious wakeups from suspend caused by recent
  intel-hid driver changes"

* tag 'platform-drivers-x86-v5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: intel-hid: Fix spurious wakeups caused by tablet-mode events during suspend
2021-04-07 09:14:04 -07:00
Linus Torvalds
e3bb2f4f96 regulator: bd9571mwv fixes for v5.12
A set of driver specific fixes here, the main one is a fix to not try to
 set unsupported voltages on this device.  The other two patches clean up
 the error handling and eliminate the possibility that we could overflow
 the page when writing sysfs output (which AFAICT wasn't an issue but
 better to be sure).
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmBsnnAACgkQJNaLcl1U
 h9AZYAgAg2kEX44MHe5P7fOgSI7rTQum+Q55zGNIk0jdtThLjq62qSQuq3QJp19e
 +xVdNA76CtVONz3LYxMRRuyxXW29jvc+RCcCYbMQ/mlKQXIcia8nYzRqzhswZI4p
 AY3rVtSXf8WcL0cP30LeyB7+ynu0t8P4pZ+IlXCNc5gWJK0D06iG2zldHJFgy4Oa
 B/0eP/NxVgy3EzLWB2dhyQOeP0Yi+U24XFwY6qkMZtvcLfHllo5L8ts9GT+0PD5T
 ozXgZxD08V4zReFY0Y1Ptnt2ETh/yNLDhSp+NQcb7xrDjQ0qcKLSU7aV1I7N2vva
 6tG8nO1I+HoHjldHUmcRF/4hqr4uoQ==
 =eD6Z
 -----END PGP SIGNATURE-----

Merge tag 'regulator-fix-v5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
 "bd9571mwv regulator fixes for v5.12.

  A set of driver specific fixes here, the main one is a fix to not try
  to set unsupported voltages on this device. The other two patches
  clean up the error handling and eliminate the possibility that we
  could overflow the page when writing sysfs output (which AFAICT wasn't
  an issue but better to be sure)"

* tag 'regulator-fix-v5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: bd9571mwv: Convert device attribute to sysfs_emit()
  regulator: bd9571mwv: Fix regulator name printed on registration failure
  regulator: bd9571mwv: Fix AVS and DVFS voltage range
2021-04-07 09:08:36 -07:00
Luca Fancellu
d120198bd5 xen/evtchn: Change irq_info lock to raw_spinlock_t
Unmask operation must be called with interrupt disabled,
on preempt_rt spin_lock_irqsave/spin_unlock_irqrestore
don't disable/enable interrupts, so use raw_* implementation
and change lock variable in struct irq_info from spinlock_t
to raw_spinlock_t

Cc: stable@vger.kernel.org
Fixes: 25da4618af ("xen/events: don't unmask an event channel when an eoi is pending")
Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/r/20210406105105.10141-1-luca.fancellu@arm.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-04-07 08:33:28 -05:00
Takashi Iwai
9c3195778c ASoC: Fixes for v5.12
A fairly small batch of driver specific fixes, mainly for various x86
 systems with the biggest set being fixes to power down DSPs properly on
 x86 SOF systems.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmBsoLcACgkQJNaLcl1U
 h9A+yAf+OvyzGS1gejvjZNs7KLfDs+ujjnVhPW+PX+1FOZmwcCQn+BirFL8iWUPG
 /8JxHDR87TZPk01mW1/o0m2MzmiwwE25IXoyH5qCr4h8azDloOcPz+AwEo7E/yBU
 pxY2m/GNeLRzQB4kXAs6XtA2HTK0pz4qtbLAVVfroXeOPx2sBN79jypN73nIDgku
 cLJAP5J+1e8BBZA0C8zupVRcs65nC7RN1AdaFYaEKCxyn/0Qd7LA+h7p5nJZvJmF
 v6vY0YUmA+cFOA9ExHW7Rt6QIn8iIwp53i3BrtEbuLkxUv+2GckDP//u9C0WnRJq
 BDPN9VXcmPRD3xjOdmkf79BJ7ptByw==
 =BzCL
 -----END PGP SIGNATURE-----

Merge tag 'asoc-fix-v5.12-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v5.12

A fairly small batch of driver specific fixes, mainly for various x86
systems with the biggest set being fixes to power down DSPs properly on
x86 SOF systems.
2021-04-07 15:00:33 +02:00
Heiko Carstens
ad31a8c051 s390/setup: use memblock_free_late() to free old stack
Use memblock_free_late() to free the old machine check stack to the
buddy allocator instead of leaking it.

Fixes: b61b159512 ("s390: add stack for machine check handler")
Cc: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-04-07 14:37:28 +02:00
Jonas Holmberg
168632a495 ALSA: aloop: Fix initialization of controls
Add a control to the card before copying the id so that the numid field
is initialized in the copy. Otherwise the numid field of active_id,
format_id, rate_id and channels_id will be the same (0) and
snd_ctl_notify() will not queue the events properly.

Signed-off-by: Jonas Holmberg <jonashg@axis.com>
Reviewed-by: Jaroslav Kysela <perex@perex.cz>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210407075428.2666787-1-jonashg@axis.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-04-07 10:21:25 +02:00
Marc Kleine-Budde
c7eb923c3c can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
MCP251XFD_REG_TBC is the time base counter register. It increments
once per SYS clock tick, which is 20 or 40 MHz. Observation shows that
if the lowest byte (which is transferred first on the SPI bus) of that
register is 0x00 or 0x80 the calculated CRC doesn't always match the
transferred one.

To reproduce this problem let the driver read the TBC register in a
high frequency. This can be done by attaching only the mcp251xfd CAN
controller to a valid terminated CAN bus and send a single CAN frame.
As there are no other CAN controller on the bus, the sent CAN frame is
not ACKed and the mcp251xfd repeats it. If user space enables the bus
error reporting, each of the NACK errors is reported with a time
stamp (which is read from the TBC register) to user space.

$ ip link set can0 down
$ ip link set can0 up type can bitrate 500000 berr-reporting on
$ cansend can0 4FF#ff.01.00.00.00.00.00.00

This leads to several error messages per second:

| mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4, data=00 3a 86 da, CRC=0x7753) retrying.
| mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4, data=80 01 b4 da, CRC=0x5830) retrying.
| mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4, data=00 e9 23 db, CRC=0xa723) retrying.
| mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4, data=00 8a 30 db, CRC=0x4a9c) retrying.
| mcp251xfd spi0.0 can0: CRC read error at address 0x0010 (length=4, data=80 f3 43 db, CRC=0x66d2) retrying.

If the highest bit in the lowest byte is flipped the transferred CRC
matches the calculated one. We assume for now the CRC calculation in
the chip works on wrong data and the transferred data is correct.

This patch implements the following workaround:

- If a CRC read error on the TBC register is detected and the lowest
  byte is 0x00 or 0x80, the highest bit of the lowest byte is flipped
  and the CRC is calculated again.
- If the CRC now matches, the _original_ data is passed to the reader.
  For now we assume transferred data was OK.

Link: https://lore.kernel.org/r/20210406110617.1865592-5-mkl@pengutronix.de
Cc: Manivannan Sadhasivam <mani@kernel.org>
Cc: Thomas Kopp <thomas.kopp@microchip.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-04-07 09:31:28 +02:00
Marc Kleine-Budde
ef7a8c3e75 can: mcp251xfd: mcp251xfd_regmap_crc_read_one(): Factor out crc check into separate function
This patch factors out the crc check into a separate function. This is
preparation for the next patch.

Link: https://lore.kernel.org/r/20210406110617.1865592-4-mkl@pengutronix.de
Cc: Manivannan Sadhasivam <mani@kernel.org>
Cc: Thomas Kopp <thomas.kopp@microchip.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-04-07 09:31:28 +02:00
Marc Kleine-Budde
0084e298ac can: mcp251xfd: add BQL support
This patch re-adds BQL support to the driver. Support for
netdev_xmit_more() will be added in a separate patch series.

Link: https://lore.kernel.org/r/20210406110617.1865592-3-mkl@pengutronix.de
Cc: Manivannan Sadhasivam <mani@kernel.org>
Cc: Thomas Kopp <thomas.kopp@microchip.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-04-07 09:31:28 +02:00
Marc Kleine-Budde
8dc987519a can: c_can: remove unused enum BOSCH_C_CAN_PLATFORM
This patch removes the unused enum BOSCH_C_CAN_PLATFORM.

Link: https://lore.kernel.org/r/20210406110617.1865592-2-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-04-07 09:31:28 +02:00
Marc Kleine-Budde
644022b1de can: m_can: m_can_receive_skb(): add missing error handling to can_rx_offload_queue_sorted() call
In commit 1be37d3b04 ("can: m_can: fix periph RX path: use
rx-offload to ensure skbs are sent from softirq context") the RX path
for peripherals (i.e. SPI based m_can controllers) was converted to
the rx-offload infrastructure. However, the error handling for
can_rx_offload_queue_sorted() was forgotten.
can_rx_offload_queue_sorted() will return with an error if the
internal queue is full.

This patch adds the missing error handling, by increasing the
rx_fifo_errors.

Fixes: 1be37d3b04 ("can: m_can: fix periph RX path: use rx-offload to ensure skbs are sent from softirq context")
Link: https://lore.kernel.org/r/20210401084515.1455013-1-mkl@pengutronix.de
Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 1503583 ("Error handling issues")
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Torin Cooper-Bennun <torin@maxiluxsystems.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-04-07 09:31:28 +02:00
Marc Kleine-Budde
c812948744 can: skb: alloc_can{,fd}_skb(): set "cf" to NULL if skb allocation fails
The handling of CAN bus errors typically consist of allocating a CAN
error SKB using alloc_can_err_skb() followed by stats handling and
filling the error details in the newly allocated CAN error SKB. Even
if the allocation of the SKB fails the stats handling should not be
skipped.

The common pattern in CAN drivers is to allocate the skb and work on
the struct can_frame pointer "cf", if it has been assigned by
alloc_can_err_skb().

|	skb = alloc_can_err_skb(priv->ndev, &cf);
|
| 	/* RX errors */
| 	if (bdiag1 & (MCP251XFD_REG_BDIAG1_DCRCERR |
| 		      MCP251XFD_REG_BDIAG1_NCRCERR)) {
| 		netdev_dbg(priv->ndev, "CRC error\n");
|
| 		stats->rx_errors++;
| 		if (cf)
| 			cf->data[3] |= CAN_ERR_PROT_LOC_CRC_SEQ;
| 	}

In case of an OOM alloc_can_err_skb() returns NULL, but doesn't set
"cf" to NULL as well. For the above pattern to work the "cf" has to be
initialized to NULL, which is easily forgotten.

To solve this kind of problems, set "cf" to NULL if
alloc_can_err_skb() returns NULL.

Link: https://lore.kernel.org/r/20210402102245.1512583-1-mkl@pengutronix.de
Suggested-by: Vincent MAILHOL <mailhol.vincent@wanadoo.fr>
Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-04-07 09:31:19 +02:00
Chris Mi
f94d6389f6 net/mlx5e: TC, Add support to offload sample action
The following diagram illustrates the hardware model for tc sample action:

        +---------------------+
        + original flow table +
        +---------------------+
        +   original match    +
        +---------------------+
                   |
                   v
+------------------------------------------------+
+                Flow Sampler Object             +
+------------------------------------------------+
+                    sample ratio                +
+------------------------------------------------+
+    sample table id    |    default table id    +
+------------------------------------------------+
           |                            |
           v                            v
+-----------------------------+  +----------------------------------------+
+        sample table         +  + default table per <vport, chain, prio> +
+-----------------------------+  +----------------------------------------+
+ forward to management vport +  +            original match              +
+-----------------------------+  +----------------------------------------+
                                 +            other actions               +
                                 +----------------------------------------+

The sample action is translated to a goto flow table object
destination which samples packets according to the provided
sample ratio. Sampled packets are duplicated. One copy is
processed by a termination table, named the sample table,
which sends the packet to the eswitch manager port (that will
be processed by software).

The second copy is processed by the default table which executes
the subsequent actions. The default table is created per <vport,
chain, prio> tuple as rules with different prios and chains may
overlap.

Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-06 21:36:05 -07:00
Chris Mi
be9dc00474 net/mlx5e: TC, Handle sampled packets
Mark the sampled packets with a sample restore object. Send sampled
packets using the psample api.

Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-06 21:36:04 -07:00
Chris Mi
7319a1cc3c net/mlx5e: TC, Refactor tc update skb function
As a pre-step to process sampled packet in this function.

Signed-off-by: Chris Mi <cmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-06 21:36:04 -07:00
Chris Mi
36a3196256 net/mlx5e: TC, Add sampler restore handle API
Use common object pool to create an object ID to map sample parameters.
Allocate a modify header action to write the object ID to reg_c0 lower
16 bits. Create a restore rule to pass the object ID to software. So
software can identify sampled packets via the object ID and send it to
userspace.

Aggregate the modify header action, restore rule and object ID to a
sample restore handle. Re-use identical sample restore handle for
the same object ID.

Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-06 21:36:04 -07:00
Chris Mi
11ecd6c60b net/mlx5e: TC, Add sampler object API
In order to offload sample action, HW introduces sampler object. The
sampler object samples packets according to the provided sample ratio.
Sampled packets are duplicated. One copy is processed by a termination
table, named the sample table, which sends the packet up to software.
The second copy is processed by the default table.

Instantiate sampler object. Re-use identical sampler object for
the same sample ratio, sample table and default table as a prestep for
offloading tc sample actions.

Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-06 21:36:03 -07:00
Chris Mi
2a9ab10a56 net/mlx5e: TC, Add sampler termination table API
Sampled packets are sent to software using termination tables. There
is only one rule in that table that is to forward sampled packets to
the e-switch management vport.

Create a sampler termination table and rule for each eswitch.

Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-06 21:36:03 -07:00
Chris Mi
41c2fd9498 net/mlx5e: TC, Parse sample action
Parse TC sample action and save sample parameters in flow attribute
data structure.

Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-06 21:36:03 -07:00
Chris Mi
c935568271 net/mlx5: Instantiate separate mapping objects for FDB and NIC tables
Currently, the u32 chain id is mapped to u16 value which is stored on
the lower 16 bits of reg_c0 for FDB and reg_b for NIC tables. The
mapping is internally maintained by the chains object. However, with
the introduction of reg_c0 objects the fdb may store more than just
the chain id on reg_c0. This is not relevant for NIC tables.

Separate the chains mapping instantiation for FDB and NIC tables.
Remove the mapping from the chains object. For FDB tables, create
the mapping per eswitch. For NIC tables, create the mapping per tc
table. Pass the corresponding mapping pointer when creating the
chains object.

Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-06 21:36:02 -07:00
Chris Mi
a91d98a0a2 net/mlx5: Map register values to restore objects
Currently reg_c0 lower 16 bits and reg_b are used to store the chain
id that missed in FDB and NIC tables accordingly. However, the
registers' values may index a restore object, rather than a single u32
value. Different object types can be used to restore mutually exclusive
contexts such as chain id and sample group id.

Use the mapping object to associate an index with a restore object
as a prestep for supporting additional restore types.

Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-06 21:36:02 -07:00
Chris Mi
c1904360dd net/mlx5: E-switch, Set per vport table default group number
Different per voprt table is created using a different per vport table
namespace. Because we can't use variable to set the namespace member
value.  If max group number is 0 in the namespace, use the eswitch
default max group number.

Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-06 21:36:02 -07:00
Chris Mi
c796bb7cd2 net/mlx5: E-switch, Generalize per vport table API
Currently, per vport table was used only for port mirroring actions.
However, sample action will also require a per vport table instance.

Generalize the vport table API to work with multiple namespaces where
each namespace manages its own vport table instance.

Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-06 21:36:01 -07:00
Chris Mi
0a9e230787 net/mlx5: E-switch, Rename functions to follow naming convention.
Public api starts with mlx5 and remove mlx5 for non-public api.

Signed-off-by: Chris Mi <cmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-06 21:36:01 -07:00
Chris Mi
4c7f40287a net/mlx5: E-switch, Move vport table functions to a new file
Currently, the vport table functions are in common eswitch offload
file. This file is too big. Move the vport table create, delete and
lookup functions to a separate file. Put the file in esw directory.

Pre-step for generalizing its functionality for serving both the
mirroring and the sample features.

Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-06 21:36:01 -07:00
Xiaoming Ni
d5f9b005c3 net/mlx5: fix kfree mismatch in indir_table.c
Memory allocated by kvzalloc() should be freed by kvfree().

Fixes: 34ca65352d ("net/mlx5: E-Switch, Indirect table infrastructur")
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-06 21:04:36 -07:00
Aya Levin
534b1204ca net/mlx5: Fix PBMC register mapping
Add reserved mapping to cover all the register in order to avoid setting
arbitrary values to newer FW which implements the reserved fields.

Fixes: 50b4a3c236 ("net/mlx5: PPTB and PBMC register firmware command support")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-06 21:04:36 -07:00
Aya Levin
ce28f0fd67 net/mlx5: Fix PPLM register mapping
Add reserved mapping to cover all the register in order to avoid
setting arbitrary values to newer FW which implements the reserved
fields.

Fixes: a58837f52d ("net/mlx5e: Expose FEC feilds and related capability bit")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-06 21:04:35 -07:00
Raed Salem
a14587dfc5 net/mlx5: Fix placement of log_max_flow_counter
The cited commit wrongly placed log_max_flow_counter field of
mlx5_ifc_flow_table_prop_layout_bits, align it to the HW spec intended
placement.

Fixes: 16f1c5bb3e ("net/mlx5: Check device capability for maximum flow counters")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-06 21:04:35 -07:00
Eli Cohen
1a73704c82 net/mlx5: Fix HW spec violation configuring uplink
Make sure to modify uplink port to follow only if the uplink_follow
capability is set as required by the HW spec. Failure to do so causes
traffic to the uplink representor net device to cease after switching to
switchdev mode.

Fixes: 7d0314b11c ("net/mlx5e: Modify uplink state on interface up/down")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-06 21:04:35 -07:00
Al Viro
4f0ed93fb9 LOOKUP_MOUNTPOINT: we are cleaning "jumped" flag too late
That (and traversals in case of umount .) should be done before
complete_walk().  Either a braino or mismerge damage on queue
reorders - either way, I should've spotted that much earlier.

Fucked-up-by: Al Viro <viro@zeniv.linux.org.uk>
X-Paperbag: Brown
Fixes: 161aff1d93 "LOOKUP_MOUNTPOINT: fold path_mountpointat() into path_lookupat()"
Cc: stable@vger.kernel.org # v5.7+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2021-04-06 20:33:00 -04:00
Jakub Kicinski
0b35e0deb5 docs: ethtool: correct quotes
Quotes to backticks. All commands use backticks since the names
are constants.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-06 16:56:58 -07:00
Jakub Kicinski
5219d6012d docs: ethtool: fix some copy-paste errors
Fix incorrect documentation. Mostly referring to other objects,
likely because the text was copied and not adjusted.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-06 16:55:41 -07:00
Phillip Potter
cca8ea3b05 net: tun: set tun->dev->addr_len during TUNSETLINK processing
When changing type with TUNSETLINK ioctl command, set tun->dev->addr_len
to match the appropriate type, using new tun_get_addr_len utility function
which returns appropriate address length for given type. Fixes a
KMSAN-found uninit-value bug reported by syzbot at:
https://syzkaller.appspot.com/bug?id=0766d38c656abeace60621896d705743aeefed51

Reported-by: syzbot+001516d86dbe88862cec@syzkaller.appspotmail.com
Diagnosed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Phillip Potter <phil@philpotter.co.uk>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-06 16:52:21 -07:00
Peng Zhang
631a44ed25 nfp: flower: add support for packet-per-second policing
Allow hardware offload of a policer action attached to a matchall filter
which enforces a packets-per-second rate-limit.

e.g.
tc filter add dev tap1 parent ffff: u32 match \
        u32 0 0 police pkts_rate 3000 pkts_burst 1000

Signed-off-by: Peng Zhang <peng.zhang@corigine.com>
Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-06 16:47:46 -07:00
Wong Vee Khee
63cf323899 ethtool: fix incorrect datatype in set_eee ops
The member 'tx_lpi_timer' is defined with __u32 datatype in the ethtool
header file. Hence, we should use ethnl_update_u32() in set_eee ops.

Fixes: fd77be7bd4 ("ethtool: set EEE settings with EEE_SET request")
Cc: <stable@vger.kernel.org> # 5.10.x
Cc: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-06 16:42:25 -07:00
Guangbin Huang
ed7bedd2c3 net: hns3: clear VF down state bit before request link status
Currently, the VF down state bit is cleared after VF sending
link status request command. There is problem that when VF gets
link status replied from PF, the down state bit may still set
as 1. In this case, the link status replied from PF will be
ignored and always set VF link status to down.

To fix this problem, clear VF down state bit before VF requests
link status.

Fixes: e2cb1dec97 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-06 16:40:06 -07:00
David S. Miller
5106efe6ed Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following batch contains Netfilter/IPVS updates for your net-next tree:

1) Simplify log infrastructure modularity: Merge ipv4, ipv6, bridge,
   netdev and ARP families to nf_log_syslog.c. Add module softdeps.
   This fixes a rare deadlock condition that might occur when log
   module autoload is required. From Florian Westphal.

2) Moves part of netfilter related pernet data from struct net to
   net_generic() infrastructure. All of these users can be modules,
   so if they are not loaded there is no need to waste space. Size
   reduction is 7 cachelines on x86_64, also from Florian.

2) Update nftables audit support to report events once per table,
   to get it aligned with iptables. From Richard Guy Briggs.

3) Check for stale routes from the flowtable garbage collector path.
   This is fixing IPv6 which breaks due missing check for the dst_cookie.

4) Add a nfnl_fill_hdr() function to simplify netlink + nfnetlink
   headers setup.

5) Remove documentation on several statified functions.

6) Remove printk on netns creation for the FTP IPVS tracker,
   from Florian Westphal.

7) Remove unnecessary nf_tables_destroy_list_lock spinlock
   initialization, from Yang Yingliang.

7) Remove a duplicated forward declaration in ipset,
   from Wan Jiabing.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-06 16:36:41 -07:00
David S. Miller
f57796a4b8 linux-can-fixes-for-5.12-20210406
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEK3kIWJt9yTYMP3ehqclaivrt76kFAmBsOJcTHG1rbEBwZW5n
 dXRyb25peC5kZQAKCRCpyVqK+u3vqWR9B/4zU7gK2vJ1HfN8zk4pHrCde5KGuqyJ
 8Neh73b6p6u/4fssUtKigmMmdcEDbjAM7fWxrIA13RVbUXgpye98ikZAQPQRlwLY
 WWAEY9RZ97uEWUKAx5J4TKxSbq81c649t7NHvnhJqLWdHZ9WPhTIqo3AuLhrSqP8
 ezszCZbiiu+AJIJzpcMeWFWowYo7/uFbrc/F5nz+2Xb9jp83n2yUYcp6S6XYIsZw
 I1lZqlQX6PQ6cRJKugxX5DkXy8VO2Xi+sm+YwE5Bj3NLTC70DgEIEpNDPch3g6mu
 uKChSuyjCj2emCCBbnkGK3azdnW4wbtbhF913puw/crb38DNlCi/XlTi
 =JN6W
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-fixes-for-5.12-20210406' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2021-04-06

this is a pull request of 1 patch for net/master.

The patch is by me and fixes the SPI half duplex support in the
mcp251x CAN driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-06 16:34:11 -07:00
Andy Shevchenko
a460513ed4 time64.h: Consolidated PSEC_PER_SEC definition
We have currently three users of the PSEC_PER_SEC each of them defining it
individually. Instead, move it to time64.h to be available for everyone.

There is a new user coming with the same constant in use. It will also
make its life easier.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-06 16:32:17 -07:00
Andy Shevchenko
3036ec035c stmmac: intel: Drop duplicate ID in the list of PCI device IDs
The PCI device IDs are defined with a prefix PCI_DEVICE_ID.
There is no need to repeat the ID part at the end of each definition.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-06 16:31:18 -07:00
Guenter Roeck
66c3f05ddc pcnet32: Use pci_resource_len to validate PCI resource
pci_resource_start() is not a good indicator to determine if a PCI
resource exists or not, since the resource may start at address 0.
This is seen when trying to instantiate the driver in qemu for riscv32
or riscv64.

pci 0000:00:01.0: reg 0x10: [io  0x0000-0x001f]
pci 0000:00:01.0: reg 0x14: [mem 0x00000000-0x0000001f]
...
pcnet32: card has no PCI IO resources, aborting

Use pci_resouce_len() instead.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-06 16:30:17 -07:00
John Fastabend
144748eb0c bpf, sockmap: Fix incorrect fwd_alloc accounting
Incorrect accounting fwd_alloc can result in a warning when the socket
is torn down,

 [18455.319240] WARNING: CPU: 0 PID: 24075 at net/core/stream.c:208 sk_stream_kill_queues+0x21f/0x230
 [...]
 [18455.319543] Call Trace:
 [18455.319556]  inet_csk_destroy_sock+0xba/0x1f0
 [18455.319577]  tcp_rcv_state_process+0x1b4e/0x2380
 [18455.319593]  ? lock_downgrade+0x3a0/0x3a0
 [18455.319617]  ? tcp_finish_connect+0x1e0/0x1e0
 [18455.319631]  ? sk_reset_timer+0x15/0x70
 [18455.319646]  ? tcp_schedule_loss_probe+0x1b2/0x240
 [18455.319663]  ? lock_release+0xb2/0x3f0
 [18455.319676]  ? __release_sock+0x8a/0x1b0
 [18455.319690]  ? lock_downgrade+0x3a0/0x3a0
 [18455.319704]  ? lock_release+0x3f0/0x3f0
 [18455.319717]  ? __tcp_close+0x2c6/0x790
 [18455.319736]  ? tcp_v4_do_rcv+0x168/0x370
 [18455.319750]  tcp_v4_do_rcv+0x168/0x370
 [18455.319767]  __release_sock+0xbc/0x1b0
 [18455.319785]  __tcp_close+0x2ee/0x790
 [18455.319805]  tcp_close+0x20/0x80

This currently happens because on redirect case we do skb_set_owner_r()
with the original sock. This increments the fwd_alloc memory accounting
on the original sock. Then on redirect we may push this into the queue
of the psock we are redirecting to. When the skb is flushed from the
queue we give the memory back to the original sock. The problem is if
the original sock is destroyed/closed with skbs on another psocks queue
then the original sock will not have a way to reclaim the memory before
being destroyed. Then above warning will be thrown

  sockA                          sockB

  sk_psock_strp_read()
   sk_psock_verdict_apply()
     -- SK_REDIRECT --
     sk_psock_skb_redirect()
                                skb_queue_tail(psock_other->ingress_skb..)

  sk_close()
   sock_map_unref()
     sk_psock_put()
       sk_psock_drop()
         sk_psock_zap_ingress()

At this point we have torn down our own psock, but have the outstanding
skb in psock_other. Note that SK_PASS doesn't have this problem because
the sk_psock_drop() logic releases the skb, its still associated with
our psock.

To resolve lets only account for sockets on the ingress queue that are
still associated with the current socket. On the redirect case we will
check memory limits per 6fa9201a89, but will omit fwd_alloc accounting
until skb is actually enqueued. When the skb is sent via skb_send_sock_locked
or received with sk_psock_skb_ingress memory will be claimed on psock_other.

Fixes: 6fa9201a89 ("bpf, sockmap: Avoid returning unneeded EAGAIN when redirecting to self")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/161731444013.68884.4021114312848535993.stgit@john-XPS-13-9370
2021-04-07 01:29:06 +02:00
John Fastabend
1c84b33101 bpf, sockmap: Fix sk->prot unhash op reset
In '4da6a196f93b1' we fixed a potential unhash loop caused when
a TLS socket in a sockmap was removed from the sockmap. This
happened because the unhash operation on the TLS ctx continued
to point at the sockmap implementation of unhash even though the
psock has already been removed. The sockmap unhash handler when a
psock is removed does the following,

 void sock_map_unhash(struct sock *sk)
 {
	void (*saved_unhash)(struct sock *sk);
	struct sk_psock *psock;

	rcu_read_lock();
	psock = sk_psock(sk);
	if (unlikely(!psock)) {
		rcu_read_unlock();
		if (sk->sk_prot->unhash)
			sk->sk_prot->unhash(sk);
		return;
	}
        [...]
 }

The unlikely() case is there to handle the case where psock is detached
but the proto ops have not been updated yet. But, in the above case
with TLS and removed psock we never fixed sk_prot->unhash() and unhash()
points back to sock_map_unhash resulting in a loop. To fix this we added
this bit of code,

 static inline void sk_psock_restore_proto(struct sock *sk,
                                          struct sk_psock *psock)
 {
       sk->sk_prot->unhash = psock->saved_unhash;

This will set the sk_prot->unhash back to its saved value. This is the
correct callback for a TLS socket that has been removed from the sock_map.
Unfortunately, this also overwrites the unhash pointer for all psocks.
We effectively break sockmap unhash handling for any future socks.
Omitting the unhash operation will leave stale entries in the map if
a socket transition through unhash, but does not do close() op.

To fix set unhash correctly before calling into tls_update. This way the
TLS enabled socket will point to the saved unhash() handler.

Fixes: 4da6a196f9 ("bpf: Sockmap/tls, during free we may call tcp_bpf_unhash() in loop")
Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
Reported-by: Lorenz Bauer <lmb@cloudflare.com>
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/161731441904.68884.15593917809745631972.stgit@john-XPS-13-9370
2021-04-07 01:29:06 +02:00