Don't print a misleading header length mismatch error if the i2c call
returns an error. Instead just return the error code without any error
message.
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are packets which doesn't have a payload. In that case, the second
i2c_master_read() will have a zero length. But because the NFC
controller doesn't have any data left, it will NACK the I2C read and
-ENXIO will be returned. In case there is no payload, just skip the
second i2c master read.
Fixes: 6be88670fc ("NFC: nxp-nci_i2c: Add I2C support to NXP NCI driver")
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Free sk in case tipc_sk_insert() fails.
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Reviewed-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
As of commit 5801f064e3 ("net: ipv6: unexport __init-annotated seg6_hmac_init()"),
EXPORT_SYMBOL and __init is a bad combination because the .init.text
section is freed up after the initialization. Hence, modules cannot
use symbols annotated __init. The access to a freed symbol may end up
with kernel panic.
This remove the EXPORT_SYMBOL to fix modpost warning:
WARNING: modpost: vmlinux.o(___ksymtab+seg6_hmac_net_init+0x0): Section mismatch in reference from the variable __ksymtab_seg6_hmac_net_init to the function .init.text:seg6_hmac_net_init()
The symbol seg6_hmac_net_init is exported and annotated __init
Fix this by removing the __init annotation of seg6_hmac_net_init or drop the export.
Fixes: bf355b8d2c ("ipv6: sr: add core files for SR HMAC support")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20220628033134.21088-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When kcalloc fails, ipip6_tunnel_get_prl() should return -ENOMEM.
Move the position of label "out" to return correctly.
Addresses-Coverity: ("Unused value")
Fixes: 300aaeeaab ("[IPV6] SIT: Add SIOCGETPRL ioctl to get/dump PRL.")
Signed-off-by: katrinzhou <katrinzhou@tencent.com>
Reviewed-by: Eric Dumazet<edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220628035030.1039171-1-zys.zljxml@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mat Martineau says:
====================
mptcp: Fixes for 5.19
Several categories of fixes from the mptcp tree:
Patches 1-3 are fixes related to MP_FAIL and FASTCLOSE, to make sure
MIBs are accurate, and to handle MP_FAIL transmission and responses at
the correct times. sk_timer conflicts are also resolved.
Patches 4 and 6 handle two separate race conditions, one at socket
shutdown and one with unaccepted subflows.
Patch 5 makes sure read operations are not blocked during fallback to
TCP.
Patch 7 improves the diag selftest, which were incorrectly failing on
slow machines (like the VMs used for CI testing).
Patch 8 avoids possible symbol redefinition errors in the userspace
mptcp.h file.
Patch 9 fixes a selftest build issue with gcc 12.
====================
Link: https://lore.kernel.org/r/20220628010243.166605-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In a few MPTCP selftest tools, gcc 12 complains that the 'sock' variable
might be used uninitialized. This is a false positive because the only
code path that could lead to uninitialized access is where getaddrinfo()
fails, but the local xgetaddrinfo() wrapper exits if such a failure
occurs.
Initialize the 'sock' variable anyway to allow the tools to build with
gcc 12.
Fixes: 048d19d444 ("mptcp: add basic kselftest for mptcp")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Including <linux/mptcp.h> before the C library <netinet/in.h> header
causes symbol redefinition errors at compile-time due to duplicate
declarations and definitions in the <linux/in.h> header included by
<linux/mptcp.h>.
Explicitly include <netinet/in.h> before <linux/in.h> in
<linux/mptcp.h> when __KERNEL__ is not defined so that the C library
compatibility logic in <linux/libc-compat.h> is enabled when including
<linux/mptcp.h> in user space code.
Fixes: c11c5906bc ("mptcp: add MPTCP_SUBFLOW_ADDRS getsockopt support")
Signed-off-by: Ossama Othman <ossama.othman@intel.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The mentioned test-case still use an hard-coded-len sleep to
wait for a relative large number of connection to be established.
On very slow VM and with debug build such timeout could be exceeded,
causing failures in our CI.
Address the issue polling for the expected condition several times,
up to an unreasonable high amount of time. On reasonably fast system
the self-tests will be faster then before, on very slow one we will
still catch the correct condition.
Fixes: df62f2ec3d ("selftests/mptcp: add diag interface tests")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When the listener socket owning the relevant request is closed,
it frees the unaccepted subflows and that causes later deletion
of the paired MPTCP sockets.
The mptcp socket's worker can run in the time interval between such delete
operations. When that happens, any access to msk->first will cause an UaF
access, as the subflow cleanup did not cleared such field in the mptcp
socket.
Address the issue explicitly traversing the listener socket accept
queue at close time and performing the needed cleanup on the pending
msk.
Note that the locking is a bit tricky, as we need to acquire the msk
socket lock, while still owning the subflow socket one.
Fixes: 86e39e0448 ("mptcp: keep track of local endpoint still available for each msk")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When the MPTCP receive path reach a non fatal fall-back condition, e.g.
when the MPC sockets must fall-back to TCP, the existing code is a little
self-inconsistent: it reports that new data is available - return true -
but sets the MPC flag to the opposite value.
As the consequence read operations in some exceptional scenario may block
unexpectedly.
Address the issue setting the correct MPC read status. Additionally avoid
some code duplication in the fatal fall-back scenario.
Fixes: 9c81be0dbc ("mptcp: add MP_FAIL response support")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If the MPTCP socket shutdown happens before a fallback
to TCP, and all the pending data have been already spooled,
we never close the TCP connection.
Address the issue explicitly checking for critical condition
at fallback time.
Fixes: 1e39e5a32a ("mptcp: infinite mapping sending")
Fixes: 0348c690ed ("mptcp: add the fallback check")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
mptcp_mp_fail_no_response shouldn't be invoked on each worker run, it
should be invoked only when MP_FAIL response timeout occurs.
This patch refactors the MP_FAIL response logic.
It leverages the fact that only the MPC/first subflow can gracefully
fail to avoid unneeded subflows traversal: the failing subflow can
be only msk->first.
A new 'fail_tout' field is added to the subflow context to record the
MP_FAIL response timeout and use such field to reliably share the
timeout timer between the MP_FAIL event and the MPTCP socket close
timeout.
Finally, a new ack is generated to send out MP_FAIL notification as soon
as we hit the relevant condition, instead of waiting a possibly unbound
time for the next data packet.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/281
Fixes: d9fb797046 ("mptcp: Do not traverse the subflow connection list without lock")
Co-developed-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This allow moving a couple of conditional out of the fast path,
making the code more easy to follow and will simplify the next
patch.
Fixes: ae66fb2ba6 ("mptcp: Do TCP fallback on early DSS checksum failure")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The current accounting for MP_FAIL and FASTCLOSE is not very
accurate: both can be increased even when the related option is
not really sent. Move the accounting into the correct place.
Fixes: eb7f33654d ("mptcp: add the mibs for MP_FAIL")
Fixes: 1e75629cb9 ("mptcp: add the mibs for MP_FASTCLOSE")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When a XSK pool gets mapped, xp_check_dma_contiguity() adds bit 0x1
to pages' DMA addresses that go in ascending order and at 4K stride.
The problem is that the bit does not get cleared before doing unmap.
As a result, a lot of warnings from iommu_dma_unmap_page() are seen
in dmesg, which indicates that lookups by iommu_iova_to_phys() fail.
Fixes: 2b43470add ("xsk: Introduce AF_XDP buffer allocation API")
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20220628091848.534803-1-ivan.malov@oktetlabs.ru
After system resume the hp-wmi driver may complain:
[ 702.620180] hp_wmi: Unknown event_id - 23 - 0x0
According to HP it means 'Sanitization Mode' and it's harmless to just
ignore the event.
Cc: Jorge Lopez <jorge.lopez2@hp.com>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Link: https://lore.kernel.org/r/20220628123726.250062-1-kai.heng.feng@canonical.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
PSC platform profile mode is only supported on Linux for AMD platforms.
Some older Intel platforms (e.g T490) are advertising it's capability
as Windows uses it - but on Linux we should only be using MMC profile
for Intel systems.
Add a check to prevent it being enabled incorrectly.
Signed-off-by: Mark Pearson <markpearson@lenovo.com>
Link: https://lore.kernel.org/r/20220627181449.3537-1-markpearson@lenovo.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Currently the active mode (PSC/MMC) is stored in an enum and queried
throughout the driver.
Other driver changes will enumerate additional submodes that are relevant
to be tracked, so instead track PSC/MMC in a single integer variable.
Co-developed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mark Pearson <markpearson@lenovo.com>
Link: https://lore.kernel.org/r/20220603170212.164963-1-markpearson@lenovo.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
On some Panasonic models the volume up/down/mute keypresses get
reported both through the Panasonic ACPI HKEY interface as well as
through the atkbd device.
Filter out the atkbd scan-codes for these to avoid reporting presses
twice.
Note normally we would leave the filtering of these to userspace by mapping
the scan-codes to KEY_UNKNOWN through /lib/udev/hwdb.d/60-keyboard.hwdb.
However in this case that would cause regressions since we were filtering
the Panasonic ACPI HKEY events before, so filter these in the kernel.
Fixes: ed83c91718 ("platform/x86: panasonic-laptop: Resolve hotkey double trigger bug")
Reported-and-tested-by: Stefan Seyfried <seife+kernel@b1-systems.com>
Reported-and-tested-by: Kenneth Chan <kenneth.t.chan@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220624112340.10130-7-hdegoede@redhat.com
The brightness key-presses might also get reported by the ACPI video bus,
check for this and in this case don't report the presses to avoid reporting
2 presses for a single key-press.
Fixes: ed83c91718 ("platform/x86: panasonic-laptop: Resolve hotkey double trigger bug")
Reported-and-tested-by: Stefan Seyfried <seife+kernel@b1-systems.com>
Reported-and-tested-by: Kenneth Chan <kenneth.t.chan@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220624112340.10130-6-hdegoede@redhat.com
In hindsight blindly throwing away most of the key-press events is not
a good idea. So revert commit ed83c91718 ("platform/x86:
panasonic-laptop: Resolve hotkey double trigger bug").
Fixes: ed83c91718 ("platform/x86: panasonic-laptop: Resolve hotkey double trigger bug")
Reported-and-tested-by: Stefan Seyfried <seife+kernel@b1-systems.com>
Reported-and-tested-by: Kenneth Chan <kenneth.t.chan@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220624112340.10130-5-hdegoede@redhat.com
Sort includes alphabetically, small cleanup patch in preparation of
further changes.
Fixes: ed83c91718 ("platform/x86: panasonic-laptop: Resolve hotkey double trigger bug")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220624112340.10130-4-hdegoede@redhat.com
In the definition of panasonic_keymap[] the key codes are given in
decimal, later checks are done with hexadecimal values, which does
not help in understanding the code.
Additionally use two helper variables to shorten the code and make
the logic more obvious.
Fixes: ed83c91718 ("platform/x86: panasonic-laptop: Resolve hotkey double trigger bug")
Signed-off-by: Stefan Seyfried <seife+kernel@b1-systems.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220624112340.10130-3-hdegoede@redhat.com
Some systems have an ACPI video bus but not ACPI video devices with
backlight capability. On these devices brightness key-presses are
(logically) not reported through the ACPI video bus.
Change how acpi_video_handles_brightness_key_presses() determines if
brightness key-presses are handled by the ACPI video driver to avoid
vendor specific drivers/platform/x86 drivers filtering out their
brightness key-presses even though they are the only ones reporting
these presses.
Fixes: ed83c91718 ("platform/x86: panasonic-laptop: Resolve hotkey double trigger bug")
Reported-and-tested-by: Stefan Seyfried <seife+kernel@b1-systems.com>
Reported-and-tested-by: Kenneth Chan <kenneth.t.chan@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220624112340.10130-2-hdegoede@redhat.com
The BPF subsystem consists of a large number of pieces. There is not a
single person that understands it all. Yet reviews are crucially important
for the BPF community to provide productive quality feedback to contributors
in a timely manner and therefore to ultimately expand the number of active
developers in the community.
So far, the BPF community had a two-stage review system, that is, a weekly
rotation among 7 developers (Alexei, Daniel, Andrii, Martin, Song, Yonghong,
John) as a first-level review of all inbound patches accompanied by a BPF CI
system which runs the in-tree BPF selftests to check for regressions for
every new patch, and then, a final check by Alexei, Daniel, Andrii to apply
the patches to either bpf or bpf-next trees.
This system worked well for the last ~3.5 years, but clearly reaches its
limits these days as it does not scale enough. Especially, as we also need
to allow enough room for every developer to contribute patches themselves,
integrate with their day to day job, and in particular avoid burnout. We
want to better scale both horizontally and vertically going forward.
On the horizontal scale, we are adding more developers (KP, Stan, Hao, Jiri)
to the overall core reviewer team, thus growing to 11 people in total. The
weekly rotation for the horizontal oncall reviewer is shortened to 1/2 week
(Mo - Wed and Thur - Fri). Instead of just patches, the coverage however
extends also generally to triage and reply to mailing list traffic (e.g. RFCs,
questions, etc).
On the vertical scale, there is clearly a need for deep expertise areas to
assign dedicated maintainer/reviewer teams that are responsible for code
reviews and help with design of individual building blocks. To some degree
we have been doing this implicitly, but the point is to formalize the teams
and commitment.
There is an overlap between areas and boundaries are intentionally grey. These
additional entries provide a guidance on who has to look at the patches. The
patch series which span multiple areas will be looked at by multiple people.
The vertical review with areas of deep expertise are bundled at the same time
with the horizontal side.
This patch cleans up a bit the BPF entries, adds mentioned developers to
the horizontal scale and creates new sub-entries with teams for developers
committing to the above outlined vertical scale. Also, pw.git tools we use
for BPF tree maintenance have been updated with a new pw-schedule script to
semi-automate vertical oncall review rotation.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <martin.lau@linux.dev>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Mykola Lysenko <mykolal@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: KP Singh <kpsingh@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Hao Luo <haoluo@google.com>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://git.kernel.org/pub/scm/linux/kernel/git/dborkman/pw.git
Link: https://lore.kernel.org/bpf/5bdc73e7f5a087299589944fa074563cdf2c2c1a.1656353995.git.daniel@iogearbox.net
Commit ceaf69f8ea ("fanotify: do not allow setting dirent events in
mask of non-dir") added restrictions about setting dirent events in the
mask of a non-dir inode mark, which does not make any sense.
For backward compatibility, these restictions were added only to new
(v5.17+) APIs.
It also does not make any sense to set the flags FAN_EVENT_ON_CHILD or
FAN_ONDIR in the mask of a non-dir inode. Add these flags to the
dir-only restriction of the new APIs as well.
Move the check of the dir-only flags for new APIs into the helper
fanotify_events_supported(), which is only called for FAN_MARK_ADD,
because there is no need to error on an attempt to remove the dir-only
flags from non-dir inode.
Fixes: ceaf69f8ea ("fanotify: do not allow setting dirent events in mask of non-dir")
Link: https://lore.kernel.org/linux-fsdevel/20220627113224.kr2725conevh53u4@quack3.lan/
Link: https://lore.kernel.org/r/20220627174719.2838175-1-amir73il@gmail.com
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
When routes corresponding to addresses are restored by
fixup_permanent_addr(), the dst_nopolicy parameter was not set.
The typical use case is a user that configures an address on a down
interface and then put this interface up.
Let's take care of this flag in addrconf_f6i_alloc(), so that every callers
benefit ont it.
CC: stable@kernel.org
CC: David Forster <dforster@brocade.com>
Fixes: df789fe752 ("ipv6: Provide ipv6 version of "disable_policy" sysctl")
Reported-by: Siwar Zitouni <siwar.zitouni@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220623120015.32640-1-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We should respect link partner capabilities and not force flow control
support on every link. Even more, in current state the MAC driver do not
advertises pause support so we should not keep flow control enabled at
all.
Fixes: e532a096be ("net: usb: asix: ax88772: add phylib support")
Reported-by: Anton Lundin <glance@acc.umu.se>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Tested-by: Anton Lundin <glance@acc.umu.se>
Link: https://lore.kernel.org/r/20220624075139.3139300-2-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If cable is attached after probe sequence, the usbnet framework would
not automatically start processing RX packets except at least one
packet was transmitted.
On systems with any kind of address auto configuration this issue was
not detected, because some packets are send immediately after link state
is changed to "running".
With this patch we will notify usbnet about link status change provided by the
PHYlib.
Fixes: e532a096be ("net: usb: asix: ax88772: add phylib support")
Reported-by: Anton Lundin <glance@acc.umu.se>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Tested-by: Anton Lundin <glance@acc.umu.se>
Link: https://lore.kernel.org/r/20220624075139.3139300-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mails to Charles get an auto reply, that he is no longer working at
Eff'Innov technologies. Drop the entry and mark the driver as orphaned.
Signed-off-by: Michael Walle <michael@walle.cc>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220626200039.4062784-1-michael@walle.cc
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Victor Nogueira says:
====================
Notify user space if any actions were flushed before error
This patch series fixes the behaviour of actions flush so that the
kernel always notifies user space whenever it deletes actions during a
flush operation, even if it didn't flush all the actions. This series
also introduces tdc tests to verify this new behaviour.
====================
Link: https://lore.kernel.org/r/20220623140742.684043-1-victor@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add tdc test cases to verify new flush behaviour is correct, which do
the following:
- Try to flush only one action which is being referenced by a filter
- Try to flush three actions where the last one (index 3) is being
referenced by a filter
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If during an action flush operation one of the actions is still being
referenced, the flush operation is aborted and the kernel returns to
user space with an error. However, if the kernel was able to flush, for
example, 3 actions and failed on the fourth, the kernel will not notify
user space that it deleted 3 actions before failing.
This patch fixes that behaviour by notifying user space of how many
actions were deleted before flush failed and by setting extack with a
message describing what happened.
Fixes: 55334a5db5 ("net_sched: act: refuse to remove bound action outside")
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
epic_close() calls epic_rx() and uses dma buffer, but in epic_remove_one()
we already freed the dma buffer. To fix this issue, reorder function calls
like in the .probe function.
BUG: KASAN: use-after-free in epic_rx+0xa6/0x7e0 [epic100]
Call Trace:
epic_rx+0xa6/0x7e0 [epic100]
epic_close+0xec/0x2f0 [epic100]
unregister_netdev+0x18/0x20
epic_remove_one+0xaa/0xf0 [epic100]
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: Yilun Wu <yiluwu@cs.stonybrook.edu>
Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Reviewed-by: Francois Romieu <romieu@fr.zoreil.com>
Link: https://lore.kernel.org/r/20220627043351.25615-1-ztong0001@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
While looking at a syzbot report I noticed the NAPI only gets
disabled before it's deleted. I think that user can detach
the queue before destroying the device and the NAPI will never
be stopped.
Fixes: 943170998b ("tun: enable NAPI for TUN/TAP driver")
Acked-by: Petar Penkov <ppenkov@aviatrix.com>
Link: https://lore.kernel.org/r/20220623042105.2274812-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fixes all over the place, most notably we are disabling
IRQ hardening (again!).
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmK5naIPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpwPYIAJpkb2XCs/NzPHFhO6LbjGH0CsRWQJm/T2NP
cG0MQFFK3kIRUlRQXMDOE+Rv40QOnr0WqwlyeVYApmiYCbHvl0Nu/IIzzjwEL6w8
Yq9fWJ+l8SMYrOJ5IVl/SDr2U8HP9qZZGj/OokEVA7fZX6whWZJ8+540ua4CAOM3
mb5+ujaa2pjIR1AJG3yW4hOMtLD+c382efFZu7noewJ1+TCORCOPxyBc1/yFRxvZ
qmeGTqDx93kzNI2UEf+hroQSlpQEZJRktS35CGFBplZ93UxPSeNcq6M4jtPv8ZHc
AX+5mHkOvuw5gZbECzSRHo2y/MfJ28KrFLysxvfw/mM0Rcshj1E=
=6wpG
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio fixes from Michael Tsirkin:
"Fixes all over the place, most notably we are disabling
IRQ hardening (again!)"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio_ring: make vring_create_virtqueue_split prettier
vhost-vdpa: call vhost_vdpa_cleanup during the release
virtio_mmio: Restore guest page size on resume
virtio_mmio: Add missing PM calls to freeze/restore
caif_virtio: fix race between virtio_device_ready() and ndo_open()
virtio-net: fix race between ndo_open() and virtio_device_ready()
virtio: disable notification hardening by default
virtio: Remove unnecessary variable assignments
virtio_ring : keep used_wrap_counter in vq->last_used_idx
vduse: Tie vduse mgmtdev and its device
vdpa/mlx5: Initialize CVQ vringh only once
vdpa/mlx5: Update Control VQ callback information
EXPORT_SYMBOL and __init is a bad combination because the .init.text
section is freed up after the initialization. Hence, modules cannot
use symbols annotated __init. The access to a freed symbol may end up
with kernel panic.
modpost used to detect it, but it had been broken for a decade.
Commit 28438794ab ("modpost: fix section mismatch check for exported
init/exit sections") fixed it so modpost started to warn it again, then
this showed up:
MODPOST vmlinux.symvers
WARNING: modpost: vmlinux.o(___ksymtab_gpl+tick_nohz_full_setup+0x0): Section mismatch in reference from the variable __ksymtab_tick_nohz_full_setup to the function .init.text:tick_nohz_full_setup()
The symbol tick_nohz_full_setup is exported and annotated __init
Fix this by removing the __init annotation of tick_nohz_full_setup or drop the export.
Drop the export because tick_nohz_full_setup() is only called from the
built-in code in kernel/sched/isolation.c.
Fixes: ae9e557b5b ("time: Export tick start/stop functions for rcutorture")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When br_netfilter module is loaded, skbs may be diverted to the
ipv4/ipv6 hooks, just like as if we were routing.
Unfortunately, bridge filter hooks with priority 0 may be skipped
in this case.
Example:
1. an nftables bridge ruleset is loaded, with a prerouting
hook that has priority 0.
2. interface is added to the bridge.
3. no tcp packet is ever seen by the bridge prerouting hook.
4. flush the ruleset
5. load the bridge ruleset again.
6. tcp packets are processed as expected.
After 1) the only registered hook is the bridge prerouting hook, but its
not called yet because the bridge hasn't been brought up yet.
After 2), hook order is:
0 br_nf_pre_routing // br_netfilter internal hook
0 chain bridge f prerouting // nftables bridge ruleset
The packet is diverted to br_nf_pre_routing.
If call-iptables is off, the nftables bridge ruleset is called as expected.
But if its enabled, br_nf_hook_thresh() will skip it because it assumes
that all 0-priority hooks had been called previously in bridge context.
To avoid this, check for the br_nf_pre_routing hook itself, we need to
resume directly after it, even if this hook has a priority of 0.
Unfortunately, this still results in different packet flow.
With this fix, the eval order after in 3) is:
1. br_nf_pre_routing
2. ip(6)tables (if enabled)
3. nftables bridge
but after 5 its the much saner:
1. nftables bridge
2. br_nf_pre_routing
3. ip(6)tables (if enabled)
Unfortunately I don't see a solution here:
It would be possible to move br_nf_pre_routing to a higher priority
so that it will be called later in the pipeline, but this also impacts
ebtables evaluation order, and would still result in this very ordering
problem for all nftables-bridge hooks with the same priority as the
br_nf_pre_routing one.
Searching back through the git history I don't think this has
ever behaved in any other way, hence, no fixes-tag.
Reported-by: Radim Hrazdil <rhrazdil@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
When verdict is NF_STOLEN, the skb might have been freed.
When tracing is enabled, this can result in a use-after-free:
1. access to skb->nf_trace
2. access to skb->mark
3. computation of trace id
4. dump of packet payload
To avoid 1, keep a cached copy of skb->nf_trace in the
trace state struct.
Refresh this copy whenever verdict is != STOLEN.
Avoid 2 by skipping skb->mark access if verdict is STOLEN.
3 is avoided by precomputing the trace id.
Only dump the packet when verdict is not "STOLEN".
Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch fixes a race condition.
nft_rhash_update() might fail for two reasons:
- Element already exists in the hashtable.
- Another packet won race to insert an entry in the hashtable.
In both cases, new() has already bumped the counter via atomic_add_unless(),
therefore, decrement the set element counter.
Fixes: 22fe54d5fe ("netfilter: nf_tables: add support for dynamic set updates")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add some spaces to vring_alloc_queue(make it look prettier).
Signed-off-by: Deming Wang <wangdeming@inspur.com>
Message-Id: <20220622192306.4371-1-wangdeming@inspur.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Before commit 3d56987938 ("vhost-vdpa: introduce asid based IOTLB")
we call vhost_vdpa_iotlb_free() during the release to clean all regions
mapped in the iotlb.
That commit removed vhost_vdpa_iotlb_free() and added vhost_vdpa_cleanup()
to do some cleanup, including deleting all mappings, but we forgot to call
it in vhost_vdpa_release().
This causes that if an application does not remove all mappings explicitly
(or it crashes), the mappings remain in the iotlb and subsequent
applications may fail if they map the same addresses.
Calling vhost_vdpa_cleanup() also fixes a memory leak since we are not
freeing `v->vdev.vqs` during the release from the same commit.
Since vhost_vdpa_cleanup() calls vhost_dev_cleanup() we can remove its
call from vhost_vdpa_release().
Fixes: 3d56987938 ("vhost-vdpa: introduce asid based IOTLB")
Cc: gautam.dawar@xilinx.com
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220622151407.51232-1-sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Virtio devices might lose their state when the VMM is restarted
after a suspend to disk (hibernation) cycle. This means that the
guest page size register must be restored for the virtio_mmio legacy
interface, since otherwise the virtio queues are not functional.
This is particularly problematic for QEMU that currently still defaults
to using the legacy interface for virtio_mmio. Write the guest page
size register again in virtio_mmio_restore() to make legacy virtio_mmio
devices work correctly after hibernation.
Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
Message-Id: <20220621110621.3638025-3-stephan.gerhold@kernkonzept.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>