linux

Author	SHA1	Message	Date
Ian Abbott	9fea3a40f6	staging: comedi: ni_routes: allow partial routing information This patch fixes a regression on setting up asynchronous commands to use external trigger sources when board-specific routing information is missing. `ni_find_device_routes()` (called via `ni_assign_device_routes()`) finds the table of register values for the device family and the set of valid routes for the specific board. If both are found, `tables->route_values` is set to point to the table of register values for the device family and `tables->valid_routes` is set to point to the list of valid routes for the specific board. If either is not found, both `tables->route_values` and `tables->valid_routes` are left set at their initial null values (initialized by `ni_assign_device_routes()`) and the function returns `-ENODATA`. Returning an error results in some routing functionality being disabled. Unfortunately, leaving `table->route_values` set to `NULL` also breaks the setting up of asynchronous commands that are configured to use external trigger sources. Calls to `ni_check_trigger_arg()` or `ni_check_trigger_arg_roffs()` while checking the asynchronous command set-up would result in a null pointer dereference if `table->route_values` is `NULL`. The null pointer dereference is fixed in another patch, but it now results in failure to set up the asynchronous command. That is a regression from the behavior prior to commit `347e244884` ("staging: comedi: tio: implement global tio/ctr routing") and commit `56d0b826d3` ("staging: comedi: ni_mio_common: implement new routing for TRIG_EXT"). Change `ni_find_device_routes()` to set `tables->route_values` and/or `tables->valid_routes` to valid information even if the other one can only be set to `NULL` due to missing information. The function will still return an error in that case. This should result in `tables->valid_routes` being valid for all currently supported device families even if the board-specific routing information is missing. That should be enough to fix the regression on setting up asynchronous commands to use external triggers for boards with missing routing information. Fixes: `347e244884` ("staging: comedi: tio: implement global tio/ctr routing") Fixes: `56d0b826d3` ("staging: comedi: ni_mio_common: implement new routing for TRIG_EXT"). Cc: <stable@vger.kernel.org> # 4.20+ Cc: Spencer E. Olson <olsonse@umich.edu> Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Link: https://lore.kernel.org/r/20200114182532.132058-3-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-15 13:30:09 +01:00
Ian Abbott	01e20b664f	staging: comedi: ni_routes: fix null dereference in ni_find_route_source() In `ni_find_route_source()`, `tables->route_values` gets dereferenced. However it is possible that `tables->route_values` is `NULL`, leading to a null pointer dereference. `tables->route_values` will be `NULL` if the call to `ni_assign_device_routes()` during board initialization returned an error due to missing device family routing information or missing board-specific routing information. For example, there is currently no board-specific routing information provided for the PCIe-6251 board and several other boards, so those are affected by this bug. The bug is triggered when `ni_find_route_source()` is called via `ni_check_trigger_arg()` or `ni_check_trigger_arg_roffs()` when checking the arguments for setting up asynchronous commands. Fix it by returning `-EINVAL` if `tables->route_values` is `NULL`. Even with this fix, setting up asynchronous commands to use external trigger sources for boards with missing routing information will still fail gracefully. Since `ni_find_route_source()` only depends on the device family routing information, it would be better if that was made available even if the board-specific routing information is missing. That will be addressed by another patch. Fixes: `4bb90c87ab` ("staging: comedi: add interface to ni routing table information") Cc: <stable@vger.kernel.org> # 4.20+ Cc: Spencer E. Olson <olsonse@umich.edu> Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Link: https://lore.kernel.org/r/20200114182532.132058-2-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-15 13:30:08 +01:00
David S. Miller	8b792f84c6	Merge branch 'mlxsw-Various-fixes' Ido Schimmel says: ==================== mlxsw: Various fixes This patch set contains various fixes for mlxsw. Patch #1 splits the init() callback between Spectrum-2 and Spectrum-3 in order to avoid enforcing the same firmware version for both ASICs, as this can't possibly work. Without this patch the driver cannot boot with the Spectrum-3 ASIC. Patches #2-#3 fix a long standing race condition that was recently exposed while testing the driver on an emulator, which is very slow compared to the actual hardware. The problem is explained in detail in the commit messages. Patch #4 fixes a selftest. Patch #5 prevents offloaded qdiscs from presenting a non-zero backlog to the user when the netdev is down. This is done by clearing the cached backlog in the driver when the netdev goes down. Patch #6 fixes qdisc statistics (backlog and tail drops) to also take into account the multicast traffic classes. v2: * Patches #2-#3: use skb_cow_head() instead of skb_unshare() as suggested by Jakub. Remove unnecessary check regarding headroom * Patches #5-#6: new ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 04:16:30 -08:00
Petr Machata	85005b82e5	mlxsw: spectrum_qdisc: Include MC TCs in Qdisc counters mlxsw configures Spectrum in such a way that BUM traffic is passed not through its nominal traffic class TC, but through its MC counterpart TC+8. However, when collecting statistics, Qdiscs only look at the nominal TC and ignore the MC TC. Add two helpers to compute the value for logical TC from the constituents, one for backlog, the other for tail drops. Use them throughout instead of going through the xstats pointer directly. Counters for TX bytes and packets are deduced from packet priority counters, and therefore already include BUM traffic. wred_drop counter is irrelevant on MC TCs, because RED is not enabled on them. Fixes: `7b81953066` ("mlxsw: spectrum: Configure MC-aware mode on mlxsw ports") Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 04:16:30 -08:00
Petr Machata	ca7609ff36	mlxsw: spectrum: Wipe xstats.backlog of down ports Per-port counter cache used by Qdiscs is updated periodically, unless the port is down. The fact that the cache is not updated for down ports is no problem for most counters, which are relative in nature. However, backlog is absolute in nature, and if there is a non-zero value in the cache around the time that the port goes down, that value just stays there. This value then leaks to offloaded Qdiscs that report non-zero backlog even if there (obviously) is no traffic. The HW does not keep backlog of a downed port, so do likewise: as the port goes down, wipe the backlog value from xstats. Fixes: `075ab8adaf` ("mlxsw: spectrum: Collect tclass related stats periodically") Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 04:16:30 -08:00
Petr Machata	fef6d67049	selftests: mlxsw: qos_mc_aware: Fix mausezahn invocation Mausezahn does not recognize "own" as a keyword on source IP address. As a result, the MC stream is not running at all, and therefore no UC degradation can be observed even in principle. Fix the invocation, and tighten the test: due to the minimum shaper configured at the MC TCs, we always expect about 20% degradation. Fail the test if it is lower. Fixes: `573363a68f` ("selftests: mlxsw: Add qos_lib.sh") Signed-off-by: Petr Machata <petrm@mellanox.com> Reported-by: Amit Cohen <amitc@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 04:16:30 -08:00
Ido Schimmel	63963d0f9d	mlxsw: switchx2: Do not modify cloned SKBs during xmit The driver needs to prepend a Tx header to each packet it is transmitting. The header includes information such as the egress port and traffic class. The addition of the header requires the driver to modify the SKB's header and therefore it must not be shared. Otherwise, we risk hitting various race conditions. For example, when a packet is flooded (cloned) by the bridge driver to two switch ports swp1 and swp2: t0 - mlxsw_sp_port_xmit() is called for swp1. Tx header is prepended with swp1's port number t1 - mlxsw_sp_port_xmit() is called for swp2. Tx header is prepended with swp2's port number, overwriting swp1's port number t2 - The device processes data buffer from t0. Packet is transmitted via swp2 t3 - The device processes data buffer from t1. Packet is transmitted via swp2 Usually, the device is fast enough and transmits the packet before its Tx header is overwritten, but this is not the case in emulated environments. Fix this by making sure the SKB's header is writable by calling skb_cow_head(). Since the function ensures we have headroom to push the Tx header, the check further in the function can be removed. v2: * Use skb_cow_head() instead of skb_unshare() as suggested by Jakub * Remove unnecessary check regarding headroom Fixes: `31557f0f97` ("mlxsw: Introduce Mellanox SwitchX-2 ASIC support") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Shalom Toledo <shalomt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 04:16:30 -08:00
Ido Schimmel	2da51ce75d	mlxsw: spectrum: Do not modify cloned SKBs during xmit The driver needs to prepend a Tx header to each packet it is transmitting. The header includes information such as the egress port and traffic class. The addition of the header requires the driver to modify the SKB's header and therefore it must not be shared. Otherwise, we risk hitting various race conditions. For example, when a packet is flooded (cloned) by the bridge driver to two switch ports swp1 and swp2: t0 - mlxsw_sp_port_xmit() is called for swp1. Tx header is prepended with swp1's port number t1 - mlxsw_sp_port_xmit() is called for swp2. Tx header is prepended with swp2's port number, overwriting swp1's port number t2 - The device processes data buffer from t0. Packet is transmitted via swp2 t3 - The device processes data buffer from t1. Packet is transmitted via swp2 Usually, the device is fast enough and transmits the packet before its Tx header is overwritten, but this is not the case in emulated environments. Fix this by making sure the SKB's header is writable by calling skb_cow_head(). Since the function ensures we have headroom to push the Tx header, the check further in the function can be removed. v2: * Use skb_cow_head() instead of skb_unshare() as suggested by Jakub * Remove unnecessary check regarding headroom Fixes: `56ade8fe3f` ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Shalom Toledo <shalomt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 04:16:30 -08:00
Ido Schimmel	d58c35ca52	mlxsw: spectrum: Do not enforce same firmware version for multiple ASICs In commit `a72afb6879` ("mlxsw: Enforce firmware version for Spectrum-2") I added a required firmware version for Spectrum-2, but missed the fact that mlxsw_sp2_init() is used by both Spectrum-2 and Spectrum-3. This means that the same firmware version will be used for both, which is wrong. Fix this by creating a new init() callback for Spectrum-3. Fixes: `a72afb6879` ("mlxsw: Enforce firmware version for Spectrum-2") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Tested-by: Shalom Toledo <shalomt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 04:16:30 -08:00
Keiya Nobuta	9c06ac4c83	usb: core: hub: Improved device recognition on remote wakeup If hub_activate() is called before D+ has stabilized after remote wakeup, the following situation might occur: __ ___________________ / \ / D+ __/ \__/ Hub _______________________________ \| ^ ^ ^ \| \| \| \| Host _____v__\|___\|___________\|______ \| \| \| \| \| \| \| \-- Interrupt Transfer (3) \| \| \-- ClearPortFeature (2) \| \-- GetPortStatus (1) \-- Host detects remote wakeup - D+ goes high, Host starts running by remote wakeup - D+ is not stable, goes low - Host requests GetPortStatus at (1) and gets the following hub status: - Current Connect Status bit is 0 - Connect Status Change bit is 1 - D+ stabilizes, goes high - Host requests ClearPortFeature and thus Connect Status Change bit is cleared at (2) - After waiting 100 ms, Host starts the Interrupt Transfer at (3) - Since the Connect Status Change bit is 0, Hub returns NAK. In this case, port_event() is not called in hub_event() and Host cannot recognize device. To solve this issue, flag change_bits even if only Connect Status Change bit is 1 when got in the first GetPortStatus. This issue occurs rarely because it only if D+ changes during a very short time between GetPortStatus and ClearPortFeature. However, it is fatal if it occurs in embedded system. Signed-off-by: Keiya Nobuta <nobuta.keiya@fujitsu.com> Cc: stable <stable@vger.kernel.org> Acked-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/20200109051448.28150-1-nobuta.keiya@fujitsu.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-15 13:14:28 +01:00
David S. Miller	eb507906fe	A few fixes: * -O3 enablement fallout, thanks to Arnd who ran this * fixes for a few leaks, thanks to Felix * channel 12 regulatory fix for custom regdomains * check for a crash reported by syzbot (NULL function is called on drivers that don't have it) * fix TKIP replay protection after setup with some APs (from Jouni) * restrict obtaining some mesh data to avoid WARN_ONs * fix deadlocks with auto-disconnect (socket owner) * fix radar detection events with multiple devices -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAl4e4voACgkQB8qZga/f l8RVFg/+Ngjzje8xU6Ymh7VSwvBBz6CmSVwH/nRSsIAszDrQLoDCo7+eWeUHnAKG 2ViFQ4Lf5CP2J9kSgi7pbD0yGQRDGxH1CVSxMB64uZZRErlOCZYLRTIeBn7HGb8w JhqAVXBjskzP9iB5jUZIkOgJ6L2ESM2KYKQVGtgn8p9fFdHKitec8UEKiPU3tcX1 zx94EbKldynuvDNb7VkTm4MdoY8LK5KI8ovVdVKwJ05qCtLvc6xvEBYhm33Fl6Pe 66cWklCrU093jQXUDTk0yBkwlcXGxJc143y1nXuurI8ZLH6q2wdDUAMHUEKdL6mz ktHxABK4JA4Z/ylphR2Tp1TNh8xnz/ZZ5IdAe4P9K0/PerrjH8Yk7zf/48lrHTSv LKb86Oaoq4MY4k5UNyjDUdgPB/q8CgxN2xDQaeQX69u740wrlxYtrwgHmqHrKI69 ppCzL7a3wmJzAz/97TMFhSseCH1WEQfWY+tlzktrvRofSgOGsIUtcPP10NmvGZHj zdF79E9TiUPX4KVWssRb9A2MjBbvKA31hCQbH/OVKSWt/fItoBUbMQKtPxfnHcam /W/gCWrOpw5Qqhc/BdFbYYsgY9J06KzifsV5e2Z6he+6Ex+rtdZNoLXc6/4QH3+9 HzsVYBP+ZUaqrMJi64KDN59C3ni4jDVs9zfYnVkC9LhzHASI5EE= =i20C -----END PGP SIGNATURE----- Merge tag 'mac80211-for-net-2020-01-15' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== A few fixes: * -O3 enablement fallout, thanks to Arnd who ran this * fixes for a few leaks, thanks to Felix * channel 12 regulatory fix for custom regdomains * check for a crash reported by syzbot (NULL function is called on drivers that don't have it) * fix TKIP replay protection after setup with some APs (from Jouni) * restrict obtaining some mesh data to avoid WARN_ONs * fix deadlocks with auto-disconnect (socket owner) * fix radar detection events with multiple devices ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 04:12:00 -08:00
Ulrich Weber	4e4362d2bf	xfrm: support output_mark for offload ESP packets Commit `9b42c1f179` ("xfrm: Extend the output_mark") added output_mark support but missed ESP offload support. xfrm_smark_get() is not called within xfrm_input() for packets coming from esp4_gro_receive() or esp6_gro_receive(). Therefore call xfrm_smark_get() directly within these functions. Fixes: `9b42c1f179` ("xfrm: Extend the output_mark to support input direction and masking.") Signed-off-by: Ulrich Weber <ulrich.weber@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2020-01-15 12:18:35 +01:00
Chuansheng Liu	978370956d	x86/mce/therm_throt: Do not access uninitialized therm_work It is relatively easy to trigger the following boot splat on an Ice Lake client platform. The call stack is like: kernel BUG at kernel/timer/timer.c:1152! Call Trace: __queue_delayed_work queue_delayed_work_on therm_throt_process intel_thermal_interrupt ... The reason is that a CPU's thermal interrupt is enabled prior to executing its hotplug onlining callback which will initialize the throttling workqueues. Such a race can lead to therm_throt_process() accessing an uninitialized therm_work, leading to the above BUG at a very early bootup stage. Therefore, unmask the thermal interrupt vector only after having setup the workqueues completely. [ bp: Heavily massage commit message and correct comment formatting. ] Fixes: `f6656208f0` ("x86/mce/therm_throt: Optimize notifications of thermal throttle") Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200107004116.59353-1-chuansheng.liu@intel.com	2020-01-15 11:31:33 +01:00
Kevin Hao	a564ac35d6	Revert "gpio: thunderx: Switch to GPIOLIB_IRQCHIP" This reverts commit `a7fc89f9d5` because there are some bugs in this commit, and we don't have a simple way to fix these bugs. So revert this commit to make the thunderx gpio work on the stable kernel at least. We will switch to GPIOLIB_IRQCHIP for thunderx gpio by following patches. Fixes: `a7fc89f9d5` ("gpio: thunderx: Switch to GPIOLIB_IRQCHIP") Signed-off-by: Kevin Hao <haokexin@gmail.com> Link: https://lore.kernel.org/r/20200114082821.14015-2-haokexin@gmail.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2020-01-15 11:17:21 +01:00
Eric Dumazet	de95a991bb	tick/sched: Annotate lockless access to last_jiffies_update syzbot (KCSAN) reported a data-race in tick_do_update_jiffies64(): BUG: KCSAN: data-race in tick_do_update_jiffies64 / tick_do_update_jiffies64 write to 0xffffffff8603d008 of 8 bytes by interrupt on cpu 1: tick_do_update_jiffies64+0x100/0x250 kernel/time/tick-sched.c:73 tick_sched_do_timer+0xd4/0xe0 kernel/time/tick-sched.c:138 tick_sched_timer+0x43/0xe0 kernel/time/tick-sched.c:1292 __run_hrtimer kernel/time/hrtimer.c:1514 [inline] __hrtimer_run_queues+0x274/0x5f0 kernel/time/hrtimer.c:1576 hrtimer_interrupt+0x22a/0x480 kernel/time/hrtimer.c:1638 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1110 [inline] smp_apic_timer_interrupt+0xdc/0x280 arch/x86/kernel/apic/apic.c:1135 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830 arch_local_irq_restore arch/x86/include/asm/paravirt.h:756 [inline] kcsan_setup_watchpoint+0x1d4/0x460 kernel/kcsan/core.c:436 check_access kernel/kcsan/core.c:466 [inline] __tsan_read1 kernel/kcsan/core.c:593 [inline] __tsan_read1+0xc2/0x100 kernel/kcsan/core.c:593 kallsyms_expand_symbol.constprop.0+0x70/0x160 kernel/kallsyms.c:79 kallsyms_lookup_name+0x7f/0x120 kernel/kallsyms.c:170 insert_report_filterlist kernel/kcsan/debugfs.c:155 [inline] debugfs_write+0x14b/0x2d0 kernel/kcsan/debugfs.c:256 full_proxy_write+0xbd/0x100 fs/debugfs/file.c:225 __vfs_write+0x67/0xc0 fs/read_write.c:494 vfs_write fs/read_write.c:558 [inline] vfs_write+0x18a/0x390 fs/read_write.c:542 ksys_write+0xd5/0x1b0 fs/read_write.c:611 __do_sys_write fs/read_write.c:623 [inline] __se_sys_write fs/read_write.c:620 [inline] __x64_sys_write+0x4c/0x60 fs/read_write.c:620 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x44/0xa9 read to 0xffffffff8603d008 of 8 bytes by task 0 on cpu 0: tick_do_update_jiffies64+0x2b/0x250 kernel/time/tick-sched.c:62 tick_nohz_update_jiffies kernel/time/tick-sched.c:505 [inline] tick_nohz_irq_enter kernel/time/tick-sched.c:1257 [inline] tick_irq_enter+0x139/0x1c0 kernel/time/tick-sched.c:1274 irq_enter+0x4f/0x60 kernel/softirq.c:354 entering_irq arch/x86/include/asm/apic.h:517 [inline] entering_ack_irq arch/x86/include/asm/apic.h:523 [inline] smp_apic_timer_interrupt+0x55/0x280 arch/x86/kernel/apic/apic.c:1133 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830 native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:60 arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:571 default_idle_call+0x1e/0x40 kernel/sched/idle.c:94 cpuidle_idle_call kernel/sched/idle.c:154 [inline] do_idle+0x1af/0x280 kernel/sched/idle.c:263 cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:355 rest_init+0xec/0xf6 init/main.c:452 arch_call_rest_init+0x17/0x37 start_kernel+0x838/0x85e init/main.c:786 x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:490 x86_64_start_kernel+0x72/0x76 arch/x86/kernel/head64.c:471 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-rc7+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Use READ_ONCE() and WRITE_ONCE() to annotate this expected race. Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20191205045619.204946-1-edumazet@google.com	2020-01-15 10:54:12 +01:00
Felix Fietkau	81c044fc3b	cfg80211: fix page refcount issue in A-MSDU decap The fragments attached to a skb can be part of a compound page. In that case, page_ref_inc will increment the refcount for the wrong page. Fix this by using get_page instead, which calls page_ref_inc on the compound head and also checks for overflow. Fixes: `2b67f944f8` ("cfg80211: reuse existing page fragments in A-MSDU rx") Cc: stable@vger.kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20200113182107.20461-1-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-01-15 09:53:35 +01:00
Johannes Berg	24953de0a5	cfg80211: check for set_wiphy_params Check if set_wiphy_params is assigned and return an error if not, some drivers (e.g. virt_wifi where syzbot reported it) don't have it. Reported-by: syzbot+e8a797964a4180eb57d5@syzkaller.appspotmail.com Reported-by: syzbot+34b582cf32c1db008f8e@syzkaller.appspotmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20200113125358.ac07f276efff.Ibd85ee1b12e47b9efb00a2adc5cd3fac50da791a@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-01-15 09:53:24 +01:00
Felix Fietkau	df16737d43	cfg80211: fix memory leak in cfg80211_cqm_rssi_update The per-tid statistics need to be released after the call to rdev_get_station Cc: stable@vger.kernel.org Fixes: `8689c051a2` ("cfg80211: dynamically allocate per-tid stats for station info") Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20200108170630.33680-2-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-01-15 09:53:12 +01:00
Felix Fietkau	2a279b3416	cfg80211: fix memory leak in nl80211_probe_mesh_link The per-tid statistics need to be released after the call to rdev_get_station Cc: stable@vger.kernel.org Fixes: `5ab92e7fe4` ("cfg80211: add support to probe unexercised mesh link") Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20200108170630.33680-1-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-01-15 09:53:02 +01:00
Markus Theil	5a128a088a	cfg80211: fix deadlocks in autodisconnect work Use methods which do not try to acquire the wdev lock themselves. Cc: stable@vger.kernel.org Fixes: `37b1c00468` ("cfg80211: Support all iftypes in autodisconnect_wk") Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de> Link: https://lore.kernel.org/r/20200108115536.2262-1-markus.theil@tu-ilmenau.de Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-01-15 09:52:49 +01:00
Arnd Bergmann	e16119655c	wireless: wext: avoid gcc -O3 warning After the introduction of CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3, the wext code produces a bogus warning: In function 'iw_handler_get_iwstats', inlined from 'ioctl_standard_call' at net/wireless/wext-core.c:1015:9, inlined from 'wireless_process_ioctl' at net/wireless/wext-core.c:935:10, inlined from 'wext_ioctl_dispatch.part.8' at net/wireless/wext-core.c:986:8, inlined from 'wext_handle_ioctl': net/wireless/wext-core.c:671:3: error: argument 1 null where non-null expected [-Werror=nonnull] memcpy(extra, stats, sizeof(struct iw_statistics)); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from arch/x86/include/asm/string.h:5, net/wireless/wext-core.c: In function 'wext_handle_ioctl': arch/x86/include/asm/string_64.h:14:14: note: in a call to function 'memcpy' declared here The problem is that ioctl_standard_call() sometimes calls the handler with a NULL argument that would cause a problem for iw_handler_get_iwstats. However, iw_handler_get_iwstats never actually gets called that way. Marking that function as noinline avoids the warning and leads to slightly smaller object code as well. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20200107200741.3588770-1-arnd@arndb.de Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-01-15 09:52:20 +01:00
Jouni Malinen	6f60126521	mac80211: Fix TKIP replay protection immediately after key setup TKIP replay protection was skipped for the very first frame received after a new key is configured. While this is potentially needed to avoid dropping a frame in some cases, this does leave a window for replay attacks with group-addressed frames at the station side. Any earlier frame sent by the AP using the same key would be accepted as a valid frame and the internal RSC would then be updated to the TSC from that frame. This would allow multiple previously transmitted group-addressed frames to be replayed until the next valid new group-addressed frame from the AP is received by the station. Fix this by limiting the no-replay-protection exception to apply only for the case where TSC=0, i.e., when this is for the very first frame protected using the new key, and the local RSC had not been set to a higher value when configuring the key (which may happen with GTK). Signed-off-by: Jouni Malinen <j@w1.fi> Link: https://lore.kernel.org/r/20200107153545.10934-1-j@w1.fi Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-01-15 09:52:12 +01:00
Orr Mazor	26ec17a1dc	cfg80211: Fix radar event during another phy CAC In case a radar event of CAC_FINISHED or RADAR_DETECTED happens during another phy is during CAC we might need to cancel that CAC. If we got a radar in a channel that another phy is now doing CAC on then the CAC should be canceled there. If, for example, 2 phys doing CAC on the same channels, or on comptable channels, once on of them will finish his CAC the other might need to cancel his CAC, since it is no longer relevant. To fix that the commit adds an callback and implement it in mac80211 to end CAC. This commit also adds a call to said callback if after a radar event we see the CAC is no longer relevant Signed-off-by: Orr Mazor <Orr.Mazor@tandemg.com> Reviewed-by: Sergey Matyukevich <sergey.matyukevich.os@quantenna.com> Link: https://lore.kernel.org/r/20191222145449.15792-1-Orr.Mazor@tandemg.com [slightly reformat/reword commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-01-15 09:50:48 +01:00
Ganapathi Bhat	c4b9d655e4	wireless: fix enabling channel 12 for custom regulatory domain Commit `e33e2241e2` ("Revert "cfg80211: Use 5MHz bandwidth by default when checking usable channels"") fixed a broken regulatory (leaving channel 12 open for AP where not permitted). Apply a similar fix to custom regulatory domain processing. Signed-off-by: Cathy Luo <xiaohua.luo@nxp.com> Signed-off-by: Ganapathi Bhat <ganapathi.bhat@nxp.com> Link: https://lore.kernel.org/r/1576836859-8945-1-git-send-email-ganapathi.bhat@nxp.com [reword commit message, fix coding style, add a comment] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-01-15 09:47:52 +01:00
Al Viro	508c877276	fix autofs regression caused by follow_managed() changes we need to reload ->d_flags after the call of ->d_manage() - the thing might've been called with dentry still negative and have the damn thing turned positive while we'd waited. Fixes: `d41efb522e` "fs/namei.c: pull positivity check into follow_managed()" Reported-by: Ian Kent <raven@themaw.net> Tested-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-01-15 01:36:46 -05:00
Al Viro	c64cd6e34e	reimplement path_mountpoint() with less magic ... and get rid of a bunch of bugs in it. Background: the reason for path_mountpoint() is that umount() really doesn't want attempts to revalidate the root of what it's trying to umount. The thing we want to avoid actually happen from complete_walk(); solution was to do something parallel to normal path_lookupat() and it both went overboard and got the boilerplate subtly (and not so subtly) wrong. A better solution is to do pretty much what the normal path_lookupat() does, but instead of complete_walk() do unlazy_walk(). All it takes to avoid that ->d_weak_revalidate() call... mountpoint_last() goes away, along with everything it got wrong, and so does the magic around LOOKUP_NO_REVAL. Another source of bugs is that when we traverse mounts at the final location (and we need to do that - umount . expects to get whatever's overmounting ., if any, out of the lookup) we really ought to take care of ->d_manage() - as it is, manual umount of autofs automount in progress can lead to unpleasant surprises for the daemon. Easily solved by using handle_lookup_down() instead of follow_mount(). Tested-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-01-15 01:36:06 -05:00
Jens Axboe	78912934f4	io_uring: be consistent in assigning next work from handler If we pass back dependent work in case of links, we need to always ensure that we call the link setup and work prep handler. If not, we might be missing some setup for the next work item. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-01-14 22:09:06 -07:00
Jens Axboe	e0bbb3461a	io-wq: cancel work if we fail getting a mm reference If we require mm and user context, mark the request for cancellation if we fail to acquire the desired mm. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-01-14 22:06:11 -07:00
Ley Foon Tan	051d75d3bb	MAINTAINERS: Update Ley Foon Tan's email address @altera.com email is going to removed. Change to @intel.com email. Signed-off-by: Ley Foon Tan <ley.foon.tan@intel.com>	2020-01-15 11:11:22 +08:00
Lorenzo Bianconi	8c4df83fbe	net: mvneta: fix dma sync size in mvneta_run_xdp Page pool API will start syncing (if requested) starting from page->dma_addr + pool->p.offset. Fix dma sync length in mvneta_run_xdp since we do not need to account xdp headroom Fixes: `07e13edbb6` ("net: mvneta: get rid of huge dma sync in mvneta_rx_refill") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-14 18:48:08 -08:00
Johan Hovold	86f3f4cd53	r8152: add missing endpoint sanity check Add missing endpoint sanity check to probe in order to prevent a NULL-pointer dereference (or slab out-of-bounds access) when retrieving the interrupt-endpoint bInterval on ndo_open() in case a device lacks the expected endpoints. Fixes: `40a82917b1` ("net/usb/r8152: enable interrupt transfer") Cc: hayeswang <hayeswang@realtek.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-14 18:43:21 -08:00
Masami Hiramatsu	aeed8aa387	tracing: trigger: Replace unneeded RCU-list traversals With CONFIG_PROVE_RCU_LIST, I had many suspicious RCU warnings when I ran ftracetest trigger testcases. ----- # dmesg -c > /dev/null # ./ftracetest test.d/trigger ... # dmesg \| grep "RCU-list traversed" \| cut -f 2 -d ] \| cut -f 2 -d " " kernel/trace/trace_events_hist.c:6070 kernel/trace/trace_events_hist.c:1760 kernel/trace/trace_events_hist.c:5911 kernel/trace/trace_events_trigger.c:504 kernel/trace/trace_events_hist.c:1810 kernel/trace/trace_events_hist.c:3158 kernel/trace/trace_events_hist.c:3105 kernel/trace/trace_events_hist.c:5518 kernel/trace/trace_events_hist.c:5998 kernel/trace/trace_events_hist.c:6019 kernel/trace/trace_events_hist.c:6044 kernel/trace/trace_events_trigger.c:1500 kernel/trace/trace_events_trigger.c:1540 kernel/trace/trace_events_trigger.c:539 kernel/trace/trace_events_trigger.c:584 ----- I investigated those warnings and found that the RCU-list traversals in event trigger and hist didn't need to use RCU version because those were called only under event_mutex. I also checked other RCU-list traversals related to event trigger list, and found that most of them were called from event_hist_trigger_func() or hist_unregister_trigger() or register/unregister functions except for a few cases. Replace these unneeded RCU-list traversals with normal list traversal macro and lockdep_assert_held() to check the event_mutex is held. Link: http://lkml.kernel.org/r/157680910305.11685.15110237954275915782.stgit@devnote2 Cc: stable@vger.kernel.org Fixes: `30350d65ac` ("tracing: Add variable support to hist triggers") Reviewed-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2020-01-14 17:12:04 -05:00
Linus Torvalds	95e20af9fb	NFS Client Bugfixes for Linux 5.5-rc7 xprtrdma bugfixes: - Fix create_qp crash on device unload - Fix completion wait during device removal - Fix oops in receive handler after device removal -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAl4eJKUACgkQ18tUv7Cl QOuZsRAAxmW8hxaAFySFxiRI1o2u0heahZfWY0F0hsMTCk1YpAXUf70+cBh+5Poe V6/5bCmHKdFYSstwuJOxLvhsCMxfAUc5JJMrBCawWOKbuo3tWj9yUmBoBicj3s9w 1uomBDOl+o1G3SVJ44on0QU+b84HAcvEyCMuNYGCCrItud2vogEZcnsr1emgBHSL hvv1RRQWVT8wJEj68mf2sN0B1SvGDzTEtyJj4iXTq502L3e4WfeMDNxJx9c7tiqE auZR0jrms5pLlMFtXqYjkCYuMSPzdtWFu36SLLMAELbMKSOxnJ+y42uNOSBMBTf1 SL6aXJloMI/vWqjyQhg63s9tW3SULTLd4wKW4JthJnAaQ0P3afMtxCrMv8bHOOnf 0EHHyl0lMT18UtVz1GLO+21MlY3iGeA9qV2uJrf4KnPig/OYhu6Y02MQoseN4fSI u1DMDkZKLIHgse1Yb9H57GhjA9EIMF6kaSi5BC+pSMssQ+hBxYeIPSj/RRBn9q2Z XVjXw1EWVEkBuUe245Uo5wkZnufIWIeM6rVCKf15+bS3UD/bv6nWXXR+XA3yXd68 MeoTIyRf0uhYIeaB3fFRhZaDtU2Uxitm4LXhdPOvdNEqm/A1uvYm2E6G/jdnA+Bk DDVabgP7DSQ13eXcfvEmwp3xzXesU3EUUK98++f82E30LMCGzIs= =Hw2/ -----END PGP SIGNATURE----- Merge tag 'nfs-for-5.5-2' of git://git.linux-nfs.org/projects/anna/linux-nfs Pull NFS client bugfixes from Anna Schumaker: "Three NFS over RDMA fixes for bugs Chuck found that can be hit during device removal: - Fix create_qp crash on device unload - Fix completion wait during device removal - Fix oops in receive handler after device removal" * tag 'nfs-for-5.5-2' of git://git.linux-nfs.org/projects/anna/linux-nfs: xprtrdma: Fix oops in Receive handler after device removal xprtrdma: Fix completion wait during device removal xprtrdma: Fix create_qp crash on device unload	2020-01-14 13:33:14 -08:00
Masami Hiramatsu	99c9a923e9	tracing/uprobe: Fix double perf_event linking on multiprobe uprobe Fix double perf_event linking to trace_uprobe_filter on multiple uprobe event by moving trace_uprobe_filter under trace_probe_event. In uprobe perf event, trace_uprobe_filter data structure is managing target mm filters (in perf_event) related to each uprobe event. Since commit `60d53e2c3b` ("tracing/probe: Split trace_event related data from trace_probe") left the trace_uprobe_filter data structure in trace_uprobe, if a trace_probe_event has multiple trace_uprobe (multi-probe event), a perf_event is added to different trace_uprobe_filter on each trace_uprobe. This leads a linked list corruption. To fix this issue, move trace_uprobe_filter to trace_probe_event and link it once on each event instead of each probe. Link: http://lkml.kernel.org/r/157862073931.1800.3800576241181489174.stgit@devnote2 Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: "Naveen N . Rao" <naveen.n.rao@linux.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: "David S . Miller" <davem@davemloft.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: =?utf-8?q?Toke_H=C3=B8iland-J?= =?utf-8?b?w7hyZ2Vuc2Vu?= <thoiland@redhat.com> Cc: Jean-Tsung Hsiao <jhsiao@redhat.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: stable@vger.kernel.org Fixes: `60d53e2c3b` ("tracing/probe: Split trace_event related data from trace_probe") Link: https://lkml.kernel.org/r/20200108171611.GA8472@kernel.org Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2020-01-14 15:57:59 -05:00
Changbin Du	d0695e2351	tracing: xen: Ordered comparison of function pointers Just as commit `0566e40ce7` ("tracing: initcall: Ordered comparison of function pointers"), this patch fixes another remaining one in xen.h found by clang-9. In file included from arch/x86/xen/trace.c:21: In file included from ./include/trace/events/xen.h:475: In file included from ./include/trace/define_trace.h:102: In file included from ./include/trace/trace_events.h:473: ./include/trace/events/xen.h:69:7: warning: ordered comparison of function \ pointers ('xen_mc_callback_fn_t' (aka 'void ()(void )') and 'xen_mc_callback_fn_t') [-Wordered-compare-function-pointers] __field(xen_mc_callback_fn_t, fn) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ./include/trace/trace_events.h:421:29: note: expanded from macro '__field' ^ ./include/trace/trace_events.h:407:6: note: expanded from macro '__field_ext' is_signed_type(type), filter_type); \ ^ ./include/linux/trace_events.h:554:44: note: expanded from macro 'is_signed_type' ^ Fixes: `c796f213a6` ("xen/trace: add multicall tracing") Signed-off-by: Changbin Du <changbin.du@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2020-01-14 15:55:57 -05:00
Ming Lei	4a2f704eb2	block: fix get_max_segment_size() overflow on 32bit arch Commit `429120f3df` starts to take account of segment's start dma address when computing max segment size, and data type of 'unsigned long' is used to do that. However, the segment mask may be 0xffffffff, so the figured out segment size may be overflowed in case of zero physical address on 32bit arch. Fix the issue by returning queue_max_segment_size() directly when that happens. Fixes: `429120f3df` ("block: fix splitting segments on boundary masks") Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Cc: Christoph Hellwig <hch@lst.de> Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-01-14 13:37:40 -07:00
Sunil Muthuswamy	c742c59e1f	hv_sock: Remove the accept port restriction Currently, hv_sock restricts the port the guest socket can accept connections on. hv_sock divides the socket port namespace into two parts for server side (listening socket), 0-0x7FFFFFFF & 0x80000000-0xFFFFFFFF (there are no restrictions on client port namespace). The first part (0-0x7FFFFFFF) is reserved for sockets where connections can be accepted. The second part (0x80000000-0xFFFFFFFF) is reserved for allocating ports for the peer (host) socket, once a connection is accepted. This reservation of the port namespace is specific to hv_sock and not known by the generic vsock library (ex: af_vsock). This is problematic because auto-binds/ephemeral ports are handled by the generic vsock library and it has no knowledge of this port reservation and could allocate a port that is not compatible with hv_sock (and legitimately so). The issue hasn't surfaced so far because the auto-bind code of vsock (__vsock_bind_stream) prior to the change 'VSOCK: bind to random port for VMADDR_PORT_ANY' would start walking up from LAST_RESERVED_PORT (1023) and start assigning ports. That will take a large number of iterations to hit 0x7FFFFFFF. But, after the above change to randomize port selection, the issue has started coming up more frequently. There has really been no good reason to have this port reservation logic in hv_sock from the get go. Reserving a local port for peer ports is not how things are handled generally. Peer ports should reflect the peer port. This fixes the issue by lifting the port reservation, and also returns the right peer port. Since the code converts the GUID to the peer port (by using the first 4 bytes), there is a possibility of conflicts, but that seems like a reasonable risk to take, given this is limited to vsock and that only applies to all local sockets. Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-14 11:50:53 -08:00
Eric Dumazet	f8d7408a4d	net: usb: lan78xx: limit size of local TSO packets lan78xx_tx_bh() makes sure to not exceed MAX_SINGLE_PACKET_SIZE bytes in the aggregated packets it builds, but does nothing to prevent large GSO packets being submitted. Pierre-Francois reported various hangs when/if TSO is enabled. For localy generated packets, we can use netif_set_gso_max_size() to limit the size of TSO packets. Note that forwarded packets could still hit the issue, so a complete fix might require implementing .ndo_features_check for this driver, forcing a software segmentation if the size of the TSO packet exceeds MAX_SINGLE_PACKET_SIZE. Fixes: `55d7de9de6` ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: RENARD Pierre-Francois <pfrenard@gmail.com> Tested-by: RENARD Pierre-Francois <pfrenard@gmail.com> Cc: Stefan Wahren <stefan.wahren@i2se.com> Cc: Woojung Huh <woojung.huh@microchip.com> Cc: Microchip Linux Driver Support <UNGLinuxDriver@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-14 11:04:42 -08:00
Vladis Dronov	75718584cb	ptp: free ptp device pin descriptors properly There is a bug in ptp_clock_unregister(), where ptp_cleanup_pin_groups() first frees ptp->pin_{,dev_}attr, but then posix_clock_unregister() needs them to destroy a related sysfs device. These functions can not be just swapped, as posix_clock_unregister() frees ptp which is needed in the ptp_cleanup_pin_groups(). Fix this by calling ptp_cleanup_pin_groups() in ptp_clock_release(), right before ptp is freed. This makes this patch fix an UAF bug in a patch which fixes an UAF bug. Reported-by: Antti Laakso <antti.laakso@intel.com> Fixes: `a33121e548` ("ptp: fix the race between the release of ptp_clock and cdev") Link: https://lore.kernel.org/netdev/3d2bd09735dbdaf003585ca376b7c1e5b69a19bd.camel@intel.com/ Signed-off-by: Vladis Dronov <vdronov@redhat.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-14 10:58:57 -08:00
Wayne Lin	7617e9621b	drm/dp_mst: clear time slots for ports invalid [Why] When change the connection status in a MST topology, mst device which detect the event will send out CONNECTION_STATUS_NOTIFY messgae. e.g. src-mst-mst-sst => src-mst (unplug) mst-sst Currently, under the above case of unplugging device, ports which have been allocated payloads and are no longer in the topology still occupy time slots and recorded in proposed_vcpi[] of topology manager. If we don't clean up the proposed_vcpi[], when code flow goes to try to update payload table by calling drm_dp_update_payload_part1(), we will fail at checking port validation due to there are ports with proposed time slots but no longer in the mst topology. As the result of that, we will also stop updating the DPCD payload table of down stream port. [How] While handling the CONNECTION_STATUS_NOTIFY message, add a detection to see if the event indicates that a device is unplugged to an output port. If the detection is true, then iterrate over all proposed_vcpi[] to see whether a port of the proposed_vcpi[] is still in the topology or not. If the port is invalid, set its num_slots to 0. Thereafter, when try to update payload table by calling drm_dp_update_payload_part1(), we can successfully update the DPCD payload table of down stream port and clear the proposed_vcpi[] to NULL. Changes since v1:(https://patchwork.kernel.org/patch/11275801/) * Invert the conditional to reduce the indenting Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Lyude Paul <lyude@redhat.com> [removed cc for stable - there's too many patches this depends on for this to backport cleanly] Link: https://patchwork.freedesktop.org/patch/msgid/20200106102158.28261-1-Wayne.Lin@amd.com	2020-01-14 13:57:24 -05:00
Chuck Lever	671c450b6f	xprtrdma: Fix oops in Receive handler after device removal Since v5.4, a device removal occasionally triggered this oops: Dec 2 17:13:53 manet kernel: BUG: unable to handle page fault for address: 0000000c00000219 Dec 2 17:13:53 manet kernel: #PF: supervisor read access in kernel mode Dec 2 17:13:53 manet kernel: #PF: error_code(0x0000) - not-present page Dec 2 17:13:53 manet kernel: PGD 0 P4D 0 Dec 2 17:13:53 manet kernel: Oops: 0000 [#1] SMP Dec 2 17:13:53 manet kernel: CPU: 2 PID: 468 Comm: kworker/2:1H Tainted: G W 5.4.0-00050-g53717e43af61 #883 Dec 2 17:13:53 manet kernel: Hardware name: Supermicro SYS-6028R-T/X10DRi, BIOS 1.1a 10/16/2015 Dec 2 17:13:53 manet kernel: Workqueue: ib-comp-wq ib_cq_poll_work [ib_core] Dec 2 17:13:53 manet kernel: RIP: 0010:rpcrdma_wc_receive+0x7c/0xf6 [rpcrdma] Dec 2 17:13:53 manet kernel: Code: 6d 8b 43 14 89 c1 89 45 78 48 89 4d 40 8b 43 2c 89 45 14 8b 43 20 89 45 18 48 8b 45 20 8b 53 14 48 8b 30 48 8b 40 10 48 8b 38 <48> 8b 87 18 02 00 00 48 85 c0 75 18 48 8b 05 1e 24 c4 e1 48 85 c0 Dec 2 17:13:53 manet kernel: RSP: 0018:ffffc900035dfe00 EFLAGS: 00010246 Dec 2 17:13:53 manet kernel: RAX: ffff888467290000 RBX: ffff88846c638400 RCX: 0000000000000048 Dec 2 17:13:53 manet kernel: RDX: 0000000000000048 RSI: 00000000f942e000 RDI: 0000000c00000001 Dec 2 17:13:53 manet kernel: RBP: ffff888467611b00 R08: ffff888464e4a3c4 R09: 0000000000000000 Dec 2 17:13:53 manet kernel: R10: ffffc900035dfc88 R11: fefefefefefefeff R12: ffff888865af4428 Dec 2 17:13:53 manet kernel: R13: ffff888466023000 R14: ffff88846c63f000 R15: 0000000000000010 Dec 2 17:13:53 manet kernel: FS: 0000000000000000(0000) GS:ffff88846fa80000(0000) knlGS:0000000000000000 Dec 2 17:13:53 manet kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Dec 2 17:13:53 manet kernel: CR2: 0000000c00000219 CR3: 0000000002009002 CR4: 00000000001606e0 Dec 2 17:13:53 manet kernel: Call Trace: Dec 2 17:13:53 manet kernel: __ib_process_cq+0x5c/0x14e [ib_core] Dec 2 17:13:53 manet kernel: ib_cq_poll_work+0x26/0x70 [ib_core] Dec 2 17:13:53 manet kernel: process_one_work+0x19d/0x2cd Dec 2 17:13:53 manet kernel: ? cancel_delayed_work_sync+0xf/0xf Dec 2 17:13:53 manet kernel: worker_thread+0x1a6/0x25a Dec 2 17:13:53 manet kernel: ? cancel_delayed_work_sync+0xf/0xf Dec 2 17:13:53 manet kernel: kthread+0xf4/0xf9 Dec 2 17:13:53 manet kernel: ? kthread_queue_delayed_work+0x74/0x74 Dec 2 17:13:53 manet kernel: ret_from_fork+0x24/0x30 The proximal cause is that this rpcrdma_rep has a rr_rdmabuf that is still pointing to the old ib_device, which has been freed. The only way that is possible is if this rpcrdma_rep was not destroyed by rpcrdma_ia_remove. Debugging showed that was indeed the case: this rpcrdma_rep was still in use by a completing RPC at the time of the device removal, and thus wasn't on the rep free list. So, it was not found by rpcrdma_reps_destroy(). The fix is to introduce a list of all rpcrdma_reps so that they all can be found when a device is removed. That list is used to perform only regbuf DMA unmapping, replacing that call to rpcrdma_reps_destroy(). Meanwhile, to prevent corruption of this list, I've moved the destruction of temp rpcrdma_rep objects to rpcrdma_post_recvs(). rpcrdma_xprt_drain() ensures that post_recvs (and thus rep_destroy) is not invoked while rpcrdma_reps_unmap is walking rb_all_reps, thus protecting the rb_all_reps list. Fixes: `b0b227f071` ("xprtrdma: Use an llist to manage free rpcrdma_reps") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2020-01-14 13:30:24 -05:00
Chuck Lever	13cb886c59	xprtrdma: Fix completion wait during device removal I've found that on occasion, "rmmod <dev>" will hang while if an NFS is under load. Ensure that ri_remove_done is initialized only just before the transport is woken up to force a close. This avoids the completion possibly getting initialized again while the CM event handler is waiting for a wake-up. Fixes: `bebd031866` ("xprtrdma: Support unplugging an HCA from under an NFS mount") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2020-01-14 13:30:24 -05:00
Chuck Lever	b32b9ed493	xprtrdma: Fix create_qp crash on device unload On device re-insertion, the RDMA device driver crashes trying to set up a new QP: Nov 27 16:32:06 manet kernel: BUG: kernel NULL pointer dereference, address: 00000000000001c0 Nov 27 16:32:06 manet kernel: #PF: supervisor write access in kernel mode Nov 27 16:32:06 manet kernel: #PF: error_code(0x0002) - not-present page Nov 27 16:32:06 manet kernel: PGD 0 P4D 0 Nov 27 16:32:06 manet kernel: Oops: 0002 [#1] SMP Nov 27 16:32:06 manet kernel: CPU: 1 PID: 345 Comm: kworker/u28:0 Tainted: G W 5.4.0 #852 Nov 27 16:32:06 manet kernel: Hardware name: Supermicro SYS-6028R-T/X10DRi, BIOS 1.1a 10/16/2015 Nov 27 16:32:06 manet kernel: Workqueue: xprtiod xprt_rdma_connect_worker [rpcrdma] Nov 27 16:32:06 manet kernel: RIP: 0010:atomic_try_cmpxchg+0x2/0x12 Nov 27 16:32:06 manet kernel: Code: ff ff 48 8b 04 24 5a c3 c6 07 00 0f 1f 40 00 c3 31 c0 48 81 ff 08 09 68 81 72 0c 31 c0 48 81 ff 83 0c 68 81 0f 92 c0 c3 8b 06 <f0> 0f b1 17 0f 94 c2 84 d2 75 02 89 06 88 d0 c3 53 ba 01 00 00 00 Nov 27 16:32:06 manet kernel: RSP: 0018:ffffc900035abbf0 EFLAGS: 00010046 Nov 27 16:32:06 manet kernel: RAX: 0000000000000000 RBX: 00000000000001c0 RCX: 0000000000000000 Nov 27 16:32:06 manet kernel: RDX: 0000000000000001 RSI: ffffc900035abbfc RDI: 00000000000001c0 Nov 27 16:32:06 manet kernel: RBP: ffffc900035abde0 R08: 000000000000000e R09: ffffffffffffc000 Nov 27 16:32:06 manet kernel: R10: 0000000000000000 R11: 000000000002e800 R12: ffff88886169d9f8 Nov 27 16:32:06 manet kernel: R13: ffff88886169d9f4 R14: 0000000000000246 R15: 0000000000000000 Nov 27 16:32:06 manet kernel: FS: 0000000000000000(0000) GS:ffff88846fa40000(0000) knlGS:0000000000000000 Nov 27 16:32:06 manet kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Nov 27 16:32:06 manet kernel: CR2: 00000000000001c0 CR3: 0000000002009006 CR4: 00000000001606e0 Nov 27 16:32:06 manet kernel: Call Trace: Nov 27 16:32:06 manet kernel: do_raw_spin_lock+0x2f/0x5a Nov 27 16:32:06 manet kernel: create_qp_common.isra.47+0x856/0xadf [mlx4_ib] Nov 27 16:32:06 manet kernel: ? slab_post_alloc_hook.isra.60+0xa/0x1a Nov 27 16:32:06 manet kernel: ? __kmalloc+0x125/0x139 Nov 27 16:32:06 manet kernel: mlx4_ib_create_qp+0x57f/0x972 [mlx4_ib] The fix is to copy the qp_init_attr struct that was just created by rpcrdma_ep_create() instead of using the one from the previous connection instance. Fixes: `98ef77d1aa` ("xprtrdma: Send Queue size grows after a reconnect") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2020-01-14 13:30:24 -05:00
Linus Torvalds	452424cdcb	Merge branch 'parisc-5.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fixes from Helge Deller: "A boot crash fix by Mike Rapoport and a printk fix by Krzysztof Kozlowski" * 'parisc-5.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: fix map_pages() to actually populate upper directory parisc: Use proper printk format for resource_size_t	2020-01-14 10:22:10 -08:00
Linus Torvalds	67373994d2	asm-generic: fixes for v5.5 Here are two bugfixes from Mike Rapoport, both fixing compile-time errors for the nds32 architecture that were recently introduced. Signed-off-by: Arnd Bergmann <arnd@arndb.de> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJeHehEAAoJEGCrR//JCVIncBIQAKmOG0xrS7YnUiyi0MqddSaj wszJg5Y1edyMhrrYXMkMz3G0AJ3esLAXpwxk/d3thPjKwgY5TaWGNgcpYKiNG0Mm ehSaeJHdXlXiAZIfjxM2POuJZ1na98CQJwBN+8jpWyXmlvzTAm3YJZyE65gZBAmP VYZtaLCD1lAs10Pp8101r7Z0gQ+Fhv06Unf144eJ0ZF33sG5/xxJW57LCv/Yfy00 XeiSbN7xJ/2gPuz8JcsaJUF8VwJzf1UhNganFUUSoaQ9n03saytbuS2UhhWDoc5z S9felivdKqo6l8ASc21ymW8WeSMwCO+mAOPuWg3T1K51Q/hasZ/CmGPBceHi0Pf9 zXtAigBc/GGANhu9w7kKl1IdKe5nTugk7EL+wm/4HAR8Z8JmdZT5SkE54SjB3maw E1tvIMF8TS/KjXHJCLhyksB/kmV3+TKqwQl0y+ZI4hmdVECE301PaTO3xNhb4oS4 dOnEMPsIPQutqmKOyoN+S6T0IYaBL0XJ+5UXeFVRC2LGUWfGx096jScMYqCwadDy MA49oK8xtKhXxdPa2E7yEzQIKx91EHCpavIy5Pg2WF1nvejGfH/amFgpvYaaJ1N1 uKYgavDJb6TeIylYfnaVawkB/WwM5uwWBANByPen86V9XqGhED3x10ceyvwNT6WF jD27rkrHRT9VN/x38VOb =ZMNN -----END PGP SIGNATURE----- Merge tag 'asm-generic-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground Pull asm-generic fixes from Arnd Bergmann: "Here are two bugfixes from Mike Rapoport, both fixing compile-time errors for the nds32 architecture that were recently introduced" * tag 'asm-generic-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground: nds32: fix build failure caused by page table folding updates asm-generic/nds32: don't redefine cacheflush primitives	2020-01-14 10:17:15 -08:00
Linus Torvalds	c21ed4d9a6	SCSI fixes on 20200114 Two simple fixes in the upper drivers (so both fairly core), one in enclosures, which fixes replugging a device into an enclosure slot and one in the disk driver which fixes revalidating a drive with protection information (PI) to make it a non-PI drive ... previously we were still remembering the old PI state. Both fixed issues are quite rare in the field. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXh3ociYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishUvhAQDcb5gL fuNT0jNkQ54sKUjVKvvJP1ArmfJ1ZIub4bvkMwEA0D+Ho3iE28KOSW1NRtgTe5mz 4Rrq64iJcAnt1PQ776U= =+ANJ -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two simple fixes in the upper drivers (so both fairly core), one in enclosures, which fixes replugging a device into an enclosure slot and one in the disk driver which fixes revalidating a drive with protection information (PI) to make it a non-PI drive ... previously we were still remembering the old PI state. Both fixed issues are quite rare in the field" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: enclosure: Fix stale device oops with hot replug scsi: sd: Clear sdkp->protection_type if disk is reformatted without PI	2020-01-14 10:14:06 -08:00
Linus Torvalds	e033e7d4a8	Merge branch 'dhowells' (patches from DavidH) Merge misc fixes from David Howells. Two afs fixes and a key refcounting fix. * dhowells: afs: Fix afs_lookup() to not clobber the version on a new dentry afs: Fix use-after-loss-of-ref keys: Fix request_key() cache	2020-01-14 09:56:31 -08:00
David Howells	f52b83b0b1	afs: Fix afs_lookup() to not clobber the version on a new dentry Fix afs_lookup() to not clobber the version set on a new dentry by afs_do_lookup() - especially as it's using the wrong version of the version (we need to use the one given to us by whatever op the dir contents correspond to rather than what's in the afs_vnode). Fixes: `9dd0b82ef5` ("afs: Fix missing dentry data version updating") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-01-14 09:40:06 -08:00
David Howells	40a708bd62	afs: Fix use-after-loss-of-ref afs_lookup() has a tracepoint to indicate the outcome of d_splice_alias(), passing it the inode to retrieve the fid from. However, the function gave up its ref on that inode when it called d_splice_alias(), which may have failed and dropped the inode. Fix this by caching the fid. Fixes: `80548b0399` ("afs: Add more tracepoints") Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-01-14 09:40:06 -08:00
David Howells	8379bb84be	keys: Fix request_key() cache When the key cached by request_key() and co. is cleaned up on exit(), the code looks in the wrong task_struct, and so clears the wrong cache. This leads to anomalies in key refcounting when doing, say, a kernel build on an afs volume, that then trigger kasan to report a use-after-free when the key is viewed in /proc/keys. Fix this by making exit_creds() look in the passed-in task_struct rather than in current (the task_struct cleanup code is deferred by RCU and potentially run in another task). Fixes: `7743c48e54` ("keys: Cache result of request_key*() temporarily in task_struct") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-01-14 09:40:06 -08:00

... 5 6 7 8 9 ...

888890 Commits