array_index_nospec ensures that an out-of-bounds value is set to zero
on the transient path. Decreasing the value by one afterwards causes
a transient integer underflow. vsa.console should be decreased first
and then sanitized with array_index_nospec.
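The ordering problem can be illustrated with a small user-space sketch. The helper index_nospec_sketch() is a hypothetical, simplified stand-in for array_index_nospec(): it only models the "clamp to zero when out of bounds" result and provides no speculation barrier. All function names and the example sizes are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for array_index_nospec(): yields index when
 * index < size and 0 otherwise, computed with a mask.
 * It is not the kernel implementation and is no speculation barrier.
 */
size_t index_nospec_sketch(size_t index, size_t size)
{
	/* All ones when index < size, all zeroes otherwise. */
	size_t mask = (size_t)0 - (size_t)(index < size);

	return index & mask;
}

/* Sanitize first, decrement second: a clamped 0 wraps to SIZE_MAX. */
size_t console_slot_sanitize_then_dec(size_t console, size_t nr_consoles)
{
	size_t idx = index_nospec_sketch(console, nr_consoles + 1);

	return idx - 1;
}

/* Decrement first, sanitize second: the result stays below nr_consoles. */
size_t console_slot_dec_then_sanitize(size_t console, size_t nr_consoles)
{
	return index_nospec_sketch(console - 1, nr_consoles);
}
```

With an out-of-bounds input such as console = 100 and nr_consoles = 63, the first variant returns SIZE_MAX while the second returns 0; for an in-bounds input such as 5, both return 4.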
Kasper Acknowledgements: Jakob Koschel, Brian Johannesmeyer, Kaveh
Razavi, Herbert Bos, Cristiano Giuffrida from the VUSec group at VU
Amsterdam.
Co-developed-by: Brian Johannesmeyer <bjohannesmeyer@gmail.com>
Signed-off-by: Brian Johannesmeyer <bjohannesmeyer@gmail.com>
Signed-off-by: Jakob Koschel <jakobkoschel@gmail.com>
Link: https://lore.kernel.org/r/20220127144406.3589293-1-jakobkoschel@gmail.com
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
UPF_MAGIC_MULTIPLIER is a bit that userspace can set or clear at any
time, so it makes no sense to rely on it always being present.
This reverts commit b4ccaf5aa2.
Note that the code did not work reliably before, so this implies
no functional change.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Fixes: b4ccaf5aa2 ("serial: 8250_pericom: Re-enable higher baud rates")
Link: https://lore.kernel.org/r/20220203150026.19087-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The polling loop for the register change in iommu_ga_log_enable() needs
to have a udelay() in it. Otherwise the CPU might be faster than the
IOMMU hardware and wrongly trigger the WARN_ON() further down the code
stream. Use 10us for udelay(), as there is some hardware where
activation of the GA log can take more than 100ms.
A future optimization should move the activation check of the GA log
to the point where it gets used for the first time. But that is a
bigger change and not suitable for a fix.
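A bounded polling loop with a per-iteration delay can be sketched in user space as follows. Everything here is hypothetical: struct fake_dev simulates a device whose status bit appears only after some time has passed, fake_udelay() merely advances a simulated clock instead of calling udelay(), and the iteration counts are example values, not the driver's.

```c
#include <stdbool.h>
#include <stdint.h>

#define STATUS_RUNNING 0x1u /* hypothetical status bit */

/* Simulated device: the status bit appears after ready_after_us. */
struct fake_dev {
	uint64_t ready_after_us;
	uint64_t elapsed_us;
};

static uint32_t fake_read_status(const struct fake_dev *dev)
{
	return dev->elapsed_us >= dev->ready_after_us ? STATUS_RUNNING : 0;
}

/* Stand-in for udelay(): only advances the simulated clock. */
static void fake_udelay(struct fake_dev *dev, unsigned int usec)
{
	dev->elapsed_us += usec;
}

/* Poll up to max_iters times, waiting delay_us between reads. */
bool wait_for_running(struct fake_dev *dev, unsigned int max_iters,
		      unsigned int delay_us)
{
	for (unsigned int i = 0; i < max_iters; i++) {
		if (fake_read_status(dev) & STATUS_RUNNING)
			return true;
		fake_udelay(dev, delay_us);
	}
	return false;
}

/* Convenience wrapper: run one simulation starting at time zero. */
bool simulate_enable(uint64_t ready_after_us, unsigned int max_iters,
		     unsigned int delay_us)
{
	struct fake_dev dev = { ready_after_us, 0 };

	return wait_for_running(&dev, max_iters, delay_us);
}
```

In this model a delay of 0 means the loop runs out of iterations before any time passes, which is a crude picture of a CPU outrunning the hardware. With a 10us delay, the total wait budget is max_iters * 10us, so the same iteration count tolerates hardware that needs well over 100ms.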
Fixes: 8bda0cfbdc ("iommu/amd: Detect and initialize guest vAPIC log")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20220204115537.3894-1-joro@8bytes.org
From 4.17 onwards the ixgbevf driver uses build_skb() to build an skb
around new data in the page buffer shared with the ixgbe PF.
This uses either a 2K or 3K buffer, and offsets the DMA mapping by
NET_SKB_PAD + NET_IP_ALIGN. When using a smaller buffer RXDCTL is set to
ensure the PF does not write a full 2K bytes into the buffer, which is
actually 2K minus the offset.
However on the 82599 virtual function, the RXDCTL mechanism is not
available. The driver attempts to work around this by using the SET_LPE
mailbox method to lower the maximum frame size, but the ixgbe PF driver
ignores this in order to keep the PF and all VFs in sync[0].
This means the PF will write up to the full 2K set in SRRCTL, causing it
to write NET_SKB_PAD + NET_IP_ALIGN bytes past the end of the buffer.
With 4K pages split into two buffers, this means it either writes
NET_SKB_PAD + NET_IP_ALIGN bytes past the first buffer (and into the
second), or NET_SKB_PAD + NET_IP_ALIGN bytes past the end of the DMA
mapping.
Avoid this by only enabling build_skb when using "large" buffers (3K).
These are placed in each half of an order-1 page, preventing the PF from
writing past the end of the mapping.
[0]: Technically it only ever raises the max frame size, see
ixgbe_set_vf_lpe() in ixgbe_sriov.c
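The overrun arithmetic described above can be written down as a tiny helper. This is only a sketch: rx_overrun_bytes() is a hypothetical name, and the headroom value used in the usage note (64 bytes for NET_SKB_PAD + NET_IP_ALIGN) is an assumed example, since the real value depends on the architecture and configuration.

```c
#include <stddef.h>

/*
 * Bytes written past the end of a receive slot when the hardware may
 * write up to hw_max_write bytes starting headroom bytes into the slot.
 */
size_t rx_overrun_bytes(size_t slot_size, size_t headroom, size_t hw_max_write)
{
	size_t end = headroom + hw_max_write;

	return end > slot_size ? end - slot_size : 0;
}
```

Assuming 4K pages and a 64-byte headroom, a 2K slot with a full 2K hardware write gives rx_overrun_bytes(2048, 64, 2048) = 64, which is the overrun by NET_SKB_PAD + NET_IP_ALIGN described above. A 3K buffer in one half of an order-1 page gives rx_overrun_bytes(4096, 64, 3072) = 0.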
Fixes: f15c5ba5b6 ("ixgbevf: add support for using order 1 pages to receive large frames")
Signed-off-by: Samuel Mendoza-Jonas <samjonas@amazon.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The recent overhaul of pci_irq_get_affinity() introduced a regression when
pci_irq_get_affinity() is called for an MSI-X interrupt which was not
allocated with affinity descriptor information.
The original code just returned a NULL pointer in that case, but the rework
added a WARN_ON() under the assumption that the corresponding WARN_ON() in
the MSI case can be applied to MSI-X as well.
In fact the MSI warning in the original code does not make sense either
because it's legitimate to invoke pci_irq_get_affinity() for a MSI
interrupt which was not allocated with affinity descriptor information.
Remove it and just return NULL as the original code did.
Fixes: f482359001 ("PCI/MSI: Simplify pci_irq_get_affinity()")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/87ee4n38sm.ffs@tglx
Since the correct GPIO pin is used for enabling the tf-io regulator, the
system no longer boots correctly after calling reboot.
[ 36.862443] reboot: Restarting system
bl31 reboot reason: 0xd
bl31 reboot reason: 0x0
system cmd 1.
SM1:BL:511f6b:81ca2f;FEAT:A0F83180:20282000;POC:B;RCY:0;SPINOR:0;CHK:1F;EMMC:800;NAND:81;SD?:0;SD:0;READ:0;0.0;CHK:0;
bl2_stage_init 0x01
bl2_stage_init 0x81
hw id:
SM1:BL:511f6b:81ca2f;FEAT:A0F83180:20282000;POC:B;RCY:0;SPINOR:0;CHK:1F;EMMC:800;NAND:81;SD?:0;SD:400;USB:8;LOOP:1;...
Setting the gpio to open drain solves the issue.
Fixes: 1f80a5cf74 ("arm64: dts: meson-sm1-odroid: add missing enable gpio and supply for tf_io regulator")
Signed-off-by: Lutz Koschorreck <theleks@ko-hh.de>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
[narmstrong: reduced serial log & removed invalid character in commit message]
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://lore.kernel.org/r/20220128193150.GA1304381@odroid-VirtualBox
The BL32/TEE reserved-memory region is now inherited from the common
family dtsi (meson-g12-common) so we can drop it from board files.
Signed-off-by: Christian Hewitt <christianshewitt@gmail.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://lore.kernel.org/r/20220126044954.19069-4-christianshewitt@gmail.com
Add an additional reserved memory region for the BL32 trusted firmware
present in many devices that boot from Amlogic vendor u-boot.
Signed-off-by: Christian Hewitt <christianshewitt@gmail.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://lore.kernel.org/r/20220126044954.19069-3-christianshewitt@gmail.com
Add an additional reserved memory region for the BL32 trusted firmware
present in many devices that boot from Amlogic vendor u-boot.
Suggested-by: Mateusz Krzak <kszaquitto@gmail.com>
Signed-off-by: Christian Hewitt <christianshewitt@gmail.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://lore.kernel.org/r/20220126044954.19069-2-christianshewitt@gmail.com
GPIOE_2 is in the AO domain, and "<&gpio GPIOE_2 ...>" changes the state of
TF_PWR_EN of 'FC8731' on BPI-M5.
Fixes: 976e920183 ("arm64: dts: meson-sm1: add Banana PI BPI-M5 board dts")
Signed-off-by: Dongjin Kim <tobetter@gmail.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://lore.kernel.org/r/20220127151656.GA2419733@paju
CPUID.(EAX=7,ECX=0):EBX.FDP_EXCPTN_ONLY[bit 6] and
CPUID.(EAX=7,ECX=0):EBX.ZERO_FCS_FDS[bit 13] are "defeature"
bits. Unlike most of the other CPUID feature bits, these bits are
clear if the features are present and set if the features are not
present. These bits should be reported in KVM_GET_SUPPORTED_CPUID,
because if these bits are set on hardware, they cannot be cleared in
the guest CPUID. Doing so would claim guest support for a feature that
the hardware doesn't support and that can't be efficiently emulated.
Of course, any software (e.g. WIN87EM.DLL) expecting these features to
be present likely predates these CPUID feature bits and therefore
doesn't know to check for them anyway.
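The constraint on the two "defeature" bits can be sketched as a small check. The macro and function names are hypothetical, and the sketch covers only the rule stated above (a bit that is set on hardware cannot be cleared for the guest). It is not KVM's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in CPUID.(EAX=7,ECX=0):EBX, as given in the text above. */
#define EBX_FDP_EXCPTN_ONLY (UINT32_C(1) << 6)
#define EBX_ZERO_FCS_FDS    (UINT32_C(1) << 13)
#define EBX_DEFEATURE_MASK  (EBX_FDP_EXCPTN_ONLY | EBX_ZERO_FCS_FDS)

/*
 * A defeature bit that is set on the host means the feature is absent,
 * so the guest value must keep that bit set as well.
 */
bool guest_defeature_bits_ok(uint32_t host_ebx, uint32_t guest_ebx)
{
	uint32_t must_be_set = host_ebx & EBX_DEFEATURE_MASK;

	return (guest_ebx & must_be_set) == must_be_set;
}
```

For example, a host with bit 6 set and a guest EBX of 0 fails the check, because the guest would claim a feature the hardware lacks.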
Aaron Lewis added the corresponding X86_FEATURE macros in
commit cbb99c0f58 ("x86/cpufeatures: Add FDP_EXCPTN_ONLY and
ZERO_FCS_FDS"), with the intention of reporting these bits in
KVM_GET_SUPPORTED_CPUID, but I was unable to find a proposed patch on
the kvm list.
Opportunistically reordered the CPUID_7_0_EBX capability bits from
least to most significant.
Cc: Aaron Lewis <aaronlewis@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Message-Id: <20220204001348.2844660-1-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
06f6c4c6c3 ("ata: libata: add missing ata_identify_page_supported() calls")
introduced additional calls to ata_identify_page_supported(), thus also
indirectly adding accesses to the device log directory log page through
ata_log_supported(). Reading this log page causes SATADOM-ML 3ME devices
to lock up.
Introduce the horkage flag ATA_HORKAGE_NO_LOG_DIR to prevent accesses to
the log directory in ata_log_supported() and add a blacklist entry
with this flag for "SATADOM-ML 3ME" devices.
Fixes: 636f6e2af4 ("libata: add horkage for missing Identify Device log")
Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: Anton Lundin <glance@acc.umu.se>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Add myself as a reviewer for the Renesas R-Car SATA driver -- I don't have
the hardware anymore (Geert Uytterhoeven does have a lot of hardware!) but
I do have the manuals still! :-)
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
When mounting with the cifs client, the following warning message can be seen:
CIFS: decode_ntlmssp_challenge: authentication has been weakened as server
does not support key exchange
To remove this warning message, add support for the key exchange feature to
ksmbd. This patch decrypts the 16-byte ciphertext value sent by the client
using RC4 with the session key. The decrypted value is the recovered secondary
key that will be used instead of the session key for signing and sealing.
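The decryption step can be sketched with a textbook RC4 routine in user space. This is an illustration under stated assumptions, not ksmbd's code: the kernel uses its own RC4 helpers, and the names rc4_crypt_sketch(), recover_secondary_key() and KEY_SIZE_SKETCH are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Textbook RC4: key schedule followed by keystream XOR.
 * key_len must be non-zero; in and out may point to the same buffer.
 */
void rc4_crypt_sketch(const uint8_t *key, size_t key_len,
		      const uint8_t *in, uint8_t *out, size_t len)
{
	uint8_t s[256];
	uint8_t i = 0, j = 0, tmp;

	for (unsigned int k = 0; k < 256; k++)
		s[k] = (uint8_t)k;

	/* Key-scheduling algorithm. */
	for (unsigned int k = 0; k < 256; k++) {
		j = (uint8_t)(j + s[k] + key[k % key_len]);
		tmp = s[k];
		s[k] = s[j];
		s[j] = tmp;
	}

	/* Keystream generation, XORed onto the input. */
	j = 0;
	for (size_t n = 0; n < len; n++) {
		i = (uint8_t)(i + 1);
		j = (uint8_t)(j + s[i]);
		tmp = s[i];
		s[i] = s[j];
		s[j] = tmp;
		out[n] = in[n] ^ s[(uint8_t)(s[i] + s[j])];
	}
}

#define KEY_SIZE_SKETCH 16 /* 16-byte keys, as described above */

/* Decrypt the client's 16-byte ciphertext with the session key. */
void recover_secondary_key(const uint8_t session_key[KEY_SIZE_SKETCH],
			   const uint8_t encrypted_key[KEY_SIZE_SKETCH],
			   uint8_t secondary_key[KEY_SIZE_SKETCH])
{
	rc4_crypt_sketch(session_key, KEY_SIZE_SKETCH,
			 encrypted_key, secondary_key, KEY_SIZE_SKETCH);
}
```

Because RC4 is a stream cipher, the same routine encrypts and decrypts: applying it twice with the same key returns the original bytes.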
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
ksmbd does not support more than one Buffer Descriptor V1 element in
an smbdirect protocol request. Reducing the maximum read/write size to
about 512KB allows interoperability with Windows over a wider variety
of RDMA NICs, as an interim workaround.
Reviewed-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
When checking smb2 query directory packets from other servers,
OutputBufferLength differs from what ksmbd sets. Other servers add an unaligned
next offset to OutputBufferLength for the last entry.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
ksmbd sets the inode number to UniqueId. However, the UniqueId for both the
dot and dotdot entries is set to the inode number of the parent inode.
This patch sets them using the current inode and the parent inode, respectively.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Check ChannelInfoOffset and ChannelInfoLength to validate buffer
descriptor structures, and add a debug log to print the structures'
content.
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Fix GitLab issue #4698: DP monitor through Type-C dock(Dell DA310) doesn't work.
Fixes for inconsistent engine busyness value and read timeout with GuC.
Fix to use ALLOW_FAIL for error capture buffer allocation. Don't use
interruptible lock on error path. Smatch fix to reject zero sized overlays.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YfuiG8SKMKP5V/Dm@jlahtine-mobl.ger.corp.intel.com
When userspace, e.g. conntrackd, inserts an entry with a specified helper,
it's possible that the helper is lost immediately after it's added:
ctnetlink_create_conntrack
-> nf_ct_helper_ext_add + assign helper
-> ctnetlink_setup_nat
-> ctnetlink_parse_nat_setup
-> parse_nat_setup -> nfnetlink_parse_nat_setup
-> nf_nat_setup_info
-> nf_conntrack_alter_reply
-> __nf_ct_try_assign_helper
... and __nf_ct_try_assign_helper will zero the helper again.
Set IPS_HELPER bit to bypass auto-assign logic; it's unwanted, just like
when helper is assigned via ruleset.
Dropped old 'not strictly necessary' comment, it referred to use of
rcu_assign_pointer() before it got replaced by RCU_INIT_POINTER().
NB: Fixes tag intentionally incorrect, this extends the referenced commit,
but this change won't build without IPS_HELPER introduced there.
Fixes: 6714cf5465 ("netfilter: nf_conntrack: fix explicit helper attachment and NAT")
Reported-by: Pham Thanh Tuyen <phamtyn@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
TCP conntrack assumes that a syn-ack retransmit is identical to the
previous syn-ack. This isn't correct and causes stuck 3whs in some more
esoteric scenarios. tcpdump to illustrate the problem:
client > server: Flags [S] seq 1365731894, win 29200, [mss 1460,sackOK,TS val 2083035583 ecr 0,wscale 7]
server > client: Flags [S.] seq 145824453, ack 643160523, win 65535, [mss 8952,wscale 5,TS val 3215367629 ecr 2082921663]
Note the invalid/outdated synack ack number.
Conntrack marks this syn-ack as out-of-window/invalid, but it did
initialize the reply direction parameters based on this packet's content.
client > server: Flags [S] seq 1365731894, win 29200, [mss 1460,sackOK,TS val 2083036623 ecr 0,wscale 7]
... retransmit...
server > client: Flags [S.], seq 145824453, ack 643160523, win 65535, [mss 8952,wscale 5,TS val 3215368644 ecr 2082921663]
and another bogus synack. This repeats, then client re-uses for a new
attempt:
client > server: Flags [S], seq 2375731741, win 29200, [mss 1460,sackOK,TS val 2083100223 ecr 0,wscale 7]
server > client: Flags [S.], seq 145824453, ack 643160523, win 65535, [mss 8952,wscale 5,TS val 3215430754 ecr 2082921663]
... but still gets an invalid syn-ack.
This repeats until:
server > client: Flags [S.], seq 145824453, ack 643160523, win 65535, [mss 8952,wscale 5,TS val 3215437785 ecr 2082921663]
server > client: Flags [R.], seq 145824454, ack 643160523, win 65535, [mss 8952,wscale 5,TS val 3215443451 ecr 2082921663]
client > server: Flags [S], seq 2375731741, win 29200, [mss 1460,sackOK,TS val 2083115583 ecr 0,wscale 7]
server > client: Flags [S.], seq 162602410, ack 2375731742, win 65535, [mss 8952,wscale 5,TS val 3215445754 ecr 2083115583]
This syn-ack has the correct ack number, but conntrack flags it as
invalid: The internal state was created from the first syn-ack seen
so the sequence number of the syn-ack is treated as being outside of
the announced window.
Don't assume that retransmitted syn-ack is identical to previous one.
Treat it like the first syn-ack and reinit state.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
It seems more readable to use a common helper in the followup fix rather
than copypaste or goto.
No functional change intended. The function is only called for syn-ack
or syn in reply direction in case of simultaneous open.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Loads relative to ->thoff naturally expect that this points to the
transport header, but this is only true if pkt->fragoff == 0.
This has little effect for rulesets with connection tracking/nat because
these enable IP defrag. For other rulesets this prevents false matches.
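The guard can be sketched as follows. The structure and function are hypothetical simplifications (struct pkt_sketch is not the nftables packet info structure); the point is only that a load relative to thoff is refused unless the packet is unfragmented or the first fragment.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified packet description. */
struct pkt_sketch {
	const uint8_t *data;   /* start of the packet */
	size_t len;            /* bytes available in data */
	size_t thoff;          /* offset of the transport header */
	unsigned int fragoff;  /* fragment offset, 0 for the first fragment */
};

/*
 * Load one byte at 'offset' past the transport header.
 * Returns 0 on success, -1 if no transport header is present at thoff
 * (non-first fragment) or the access would be out of bounds.
 */
int load_from_transport_header(const struct pkt_sketch *pkt, size_t offset,
			       uint8_t *dst)
{
	if (pkt->fragoff != 0)
		return -1;
	if (pkt->thoff > pkt->len || offset >= pkt->len - pkt->thoff)
		return -1;

	*dst = pkt->data[pkt->thoff + offset];
	return 0;
}
```

Without the fragoff check, a later fragment would have payload bytes interpreted as transport header fields, which is the kind of false match the message refers to.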
Fixes: 96518518cc ("netfilter: add nftables")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Vivek Thrivikraman reported:
An SCTP server application is accessed continuously by a client
application.
When the session disconnects, the client retries to establish a connection.
After a restart of the SCTP server application the session is not established
because of a stale conntrack entry with connection state CLOSED as below.
(Removing this entry manually allowed a new connection to be established.)
sctp 9 CLOSED src=10.141.189.233 [..] [ASSURED]
Just skip timeout update of closed entries, we don't want them to
stay around forever.
Reported-and-tested-by: Vivek Thrivikraman <vivek.thrivikraman@est.tech>
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1579
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Prior to the ztailpacking feature, it's enough that each lcluster has
two pclusters at most, and the last pcluster should be turned into
an uncompressed pcluster when necessary. For example,
_________________________________________________
|_ pcluster n-2 _|_ pcluster n-1 _|____ EOFed ____|
which should be converted into:
_________________________________________________
|_ pcluster n-2 _|_ pcluster n-1 (uncompressed)' _|
That is fine since either pcluster n-1 or (uncompressed)' takes one
physical block.
However, after ztailpacking was supported, the game is changed since
the last pcluster can be inlined now. And such case above is quite
common for inlining small files. Therefore, in order to inline more
effectively, special EOF lclusters are now supported which can have
three parts at most, as illustrated below:
_________________________________________________
|_ pcluster n-2 _|_ pcluster n-1 _|____ EOFed ____|
^ i_size
Actually similar code exists in Yue Hu's original patchset [1], but I
removed this part on purpose. After evaluating more real cases with
small files, I've changed my mind.
[1] https://lore.kernel.org/r/20211215094449.15162-1-huyue2@yulong.com
Link: https://lore.kernel.org/r/20220203190203.30794-1-xiang@kernel.org
Fixes: ab92184ff8 ("erofs: add on-disk compressed tail-packing inline support")
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Commit 309a62fa3a ("bio-integrity: bio_integrity_advance must update
integrity seed") added code to update the integrity seed value when
advancing a bio. However, it failed to take into account that the
integrity interval might be larger than the 512-byte block layer
sector size. This broke bio splitting on PI devices with 4KB logical
blocks.
The seed value should be advanced by bio_integrity_intervals() and not
the number of sectors.
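The difference between advancing by sectors and by integrity intervals can be shown with a short sketch. The function names are hypothetical; the shift mirrors the idea of converting 512-byte sectors into intervals of 2^interval_exp bytes, and interval_exp is assumed to be at least 9.

```c
#include <stdint.h>

#define SECTOR_SHIFT_SKETCH 9 /* 512-byte block layer sectors */

/* Number of integrity intervals covered by 'sectors' 512-byte sectors. */
uint64_t integrity_intervals_sketch(unsigned int interval_exp, uint64_t sectors)
{
	return sectors >> (interval_exp - SECTOR_SHIFT_SKETCH);
}

/* Advance the seed by intervals, not by sectors. */
uint64_t advance_seed_sketch(uint64_t seed, unsigned int interval_exp,
			     uint64_t bytes_done)
{
	uint64_t sectors = bytes_done >> SECTOR_SHIFT_SKETCH;

	return seed + integrity_intervals_sketch(interval_exp, sectors);
}
```

Advancing 8192 bytes on a device with 4KB intervals (interval_exp = 12) moves the seed by 2, whereas counting sectors would move it by 16. With 512-byte intervals (interval_exp = 9) the two agree.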
Cc: Dmitry Monakhov <dmonakhov@openvz.org>
Cc: stable@vger.kernel.org
Fixes: 309a62fa3a ("bio-integrity: bio_integrity_advance must update integrity seed")
Tested-by: Dmitry Ivanov <dmitry.ivanov2@hpe.com>
Reported-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220204034209.4193-1-martin.petersen@oracle.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This problem was found with Sparx5 when the tcpdump tool requests the
do_get_stats64 (sparx5_get_stats64) statistic.
The portstats pointer was incorrectly incremented when fetching priority
based statistics.
Fixes: af4b11022e ("net: sparx5: add ethtool configuration and statistics support")
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Link: https://lore.kernel.org/r/20220203102900.528987-1-steen.hegelund@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
While the stackleak plugin was already using notrace, objtool is now a
bit more picky. Update the notrace uses to noinstr. Silences the
following objtool warnings when building with:
CONFIG_DEBUG_ENTRY=y
CONFIG_STACK_VALIDATION=y
CONFIG_VMLINUX_VALIDATION=y
CONFIG_GCC_PLUGIN_STACKLEAK=y
vmlinux.o: warning: objtool: do_syscall_64()+0x9: call to stackleak_track_stack() leaves .noinstr.text section
vmlinux.o: warning: objtool: do_int80_syscall_32()+0x9: call to stackleak_track_stack() leaves .noinstr.text section
vmlinux.o: warning: objtool: exc_general_protection()+0x22: call to stackleak_track_stack() leaves .noinstr.text section
vmlinux.o: warning: objtool: fixup_bad_iret()+0x20: call to stackleak_track_stack() leaves .noinstr.text section
vmlinux.o: warning: objtool: do_machine_check()+0x27: call to stackleak_track_stack() leaves .noinstr.text section
vmlinux.o: warning: objtool: .text+0x5346e: call to stackleak_erase() leaves .noinstr.text section
vmlinux.o: warning: objtool: .entry.text+0x143: call to stackleak_erase() leaves .noinstr.text section
vmlinux.o: warning: objtool: .entry.text+0x10eb: call to stackleak_erase() leaves .noinstr.text section
vmlinux.o: warning: objtool: .entry.text+0x17f9: call to stackleak_erase() leaves .noinstr.text section
Note that the plugin's addition of calls to stackleak_track_stack() from
noinstr functions is expected to be safe, as it isn't runtime
instrumentation and is self-contained.
Cc: Alexander Popov <alex.popov@linux.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'net-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf, netfilter, and ieee802154.
Current release - regressions:
- Partially revert "net/smc: Add netlink net namespace support", fix
uABI breakage
- netfilter:
- nft_ct: fix use after free when attaching zone template
- nft_byteorder: track register operations
Previous releases - regressions:
- ipheth: fix EOVERFLOW in ipheth_rcvbulk_callback
- phy: qca8081: fix speeds lower than 2.5Gb/s
- sched: fix use-after-free in tc_new_tfilter()
Previous releases - always broken:
- tcp: fix mem under-charging with zerocopy sendmsg()
- tcp: add missing tcp_skb_can_collapse() test in
tcp_shift_skb_data()
- neigh: do not trigger immediate probes on NUD_FAILED from
neigh_managed_work, avoid a deadlock
- bpf: use VM_MAP instead of VM_ALLOC for ringbuf, avoid KASAN
false-positives
- netfilter: nft_reject_bridge: fix for missing reply from prerouting
- smc: forward wakeup to smc socket waitqueue after fallback
- ieee802154:
- return meaningful error codes from the netlink helpers
- mcr20a: fix lifs/sifs periods
- at86rf230, ca8210: stop leaking skbs on error paths
- macsec: add missing un-offload call for NETDEV_UNREGISTER of parent
- ax25: add refcount in ax25_dev to avoid UAF bugs
- eth: mlx5e:
- fix SFP module EEPROM query
- fix broken SKB allocation in HW-GRO
- IPsec offload: fix tunnel mode crypto for non-TCP/UDP flows
- eth: amd-xgbe:
- fix skb data length underflow
- ensure reset of the tx_timer_active flag, avoid Tx timeouts
- eth: stmmac: fix runtime pm use in stmmac_dvr_remove()
- eth: e1000e: handshake with CSME starts from Alder Lake platforms"
* tag 'net-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits)
ax25: fix reference count leaks of ax25_dev
net: stmmac: ensure PTP time register reads are consistent
net: ipa: request IPA register values be retained
dt-bindings: net: qcom,ipa: add optional qcom,qmp property
tools/resolve_btfids: Do not print any commands when building silently
bpf: Use VM_MAP instead of VM_ALLOC for ringbuf
net, neigh: Do not trigger immediate probes on NUD_FAILED from neigh_managed_work
tcp: add missing tcp_skb_can_collapse() test in tcp_shift_skb_data()
net: sparx5: do not refer to skb after passing it on
Partially revert "net/smc: Add netlink net namespace support"
net/mlx5e: Avoid field-overflowing memcpy()
net/mlx5e: Use struct_group() for memcpy() region
net/mlx5e: Avoid implicit modify hdr for decap drop rule
net/mlx5e: IPsec: Fix tunnel mode crypto offload for non TCP/UDP traffic
net/mlx5e: IPsec: Fix crypto offload for non TCP/UDP encapsulated traffic
net/mlx5e: Don't treat small ceil values as unlimited in HTB offload
net/mlx5: E-Switch, Fix uninitialized variable modact
net/mlx5e: Fix handling of wrong devices during bond netevent
net/mlx5e: Fix broken SKB allocation in HW-GRO
net/mlx5e: Fix wrong calculation of header index in HW_GRO
...
Merge tag 'selinux-pr-20220203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux fix from Paul Moore:
"One small SELinux patch to ensure that a policy structure field is
properly reset after freeing so that we don't inadvertently do a
double-free on certain error conditions"
* tag 'selinux-pr-20220203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: fix double free of cond_list on error paths
Merge tag 'linux-kselftest-fixes-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest fixes from Shuah Khan:
"Important fixes to several tests and documentation clarification on
running mainline kselftest on stable releases. A few notable fixes:
- fix kselftest run hang due to child processes that haven't been
terminated. The fix signals all child processes
- fix false pass/fail results from vdso_test_abi, openat2, mincore
- build failures when using -j (multiple jobs) option
- exec test build failure due to incorrect build rule for a run-time
created "pipe"
- zram test fixes related to interaction with zram-generator to make
sure zram test to coordinate deleted with zram-generator
- zram test compression ratio calculation fix and skipping
max_comp_streams.
- increasing rtc test timeout
- cpufreq test to write test results to stdout which will be necessary
on automated test systems"
* tag 'linux-kselftest-fixes-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kselftest: Fix vdso_test_abi return status
selftests: skip mincore.check_file_mmap when fs lacks needed support
selftests: openat2: Skip testcases that fail with EOPNOTSUPP
selftests: openat2: Add missing dependency in Makefile
selftests: openat2: Print also errno in failure messages
selftests: futex: Use variable MAKE instead of make
selftests/exec: Remove pipe from TEST_GEN_FILES
selftests/zram: Adapt the situation that /dev/zram0 is being used
selftests/zram01.sh: Fix compression ratio calculation
selftests/zram: Skip max_comp_streams interface on newer kernel
docs/kselftest: clarify running mainline tests on stables
kselftest: signal all child processes
selftests: cpufreq: Write test output to stdout as well
selftests: rtc: Increase test timeout so that all tests run
The previous commit d01ffb9eee ("ax25: add refcount in ax25_dev
to avoid UAF bugs") introduces refcount into ax25_dev, but there
are reference leak paths in ax25_ctl_ioctl(), ax25_fwd_ioctl(),
ax25_rt_add(), ax25_rt_del() and ax25_rt_opt().
This patch uses ax25_dev_put() and adjusts the position of
ax25_addr_ax25dev() to fix reference count leaks of ax25_dev.
Fixes: d01ffb9eee ("ax25: add refcount in ax25_dev to avoid UAF bugs")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20220203150811.42256-1-duoming@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Even if protected from preemption and interrupts, a small time window
remains when the 2 register reads could return inconsistent values,
each time the "seconds" register changes. This could lead to an error
of about 1 second in the reported time.
Add logic to ensure the "seconds" and "nanoseconds" values are consistent.
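A minimal userspace sketch of the idea, not the driver code itself: the
"seconds" register is re-read until it has the same value before and after
the "nanoseconds" read. The fake_clock type and the read_*() helpers are
hypothetical stand-ins for the hardware registers; every simulated register
read advances time, so a rollover can land between two reads.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical stand-in for the hardware clock: each register read
 * advances the simulated time by step_ns. */
struct fake_clock {
	uint64_t now_ns;   /* total simulated time in nanoseconds */
	uint64_t step_ns;  /* time that elapses per register read */
};

static uint32_t read_seconds(struct fake_clock *c)
{
	uint32_t v = (uint32_t)(c->now_ns / NSEC_PER_SEC);

	c->now_ns += c->step_ns;
	return v;
}

static uint32_t read_nanoseconds(struct fake_clock *c)
{
	uint32_t v = (uint32_t)(c->now_ns % NSEC_PER_SEC);

	c->now_ns += c->step_ns;
	return v;
}

/* Naive read: two independent register reads, no consistency check. */
static uint64_t get_time_naive(struct fake_clock *c)
{
	uint64_t ns = read_nanoseconds(c);
	uint64_t sec = read_seconds(c);

	return sec * NSEC_PER_SEC + ns;
}

/* Consistent read: retry while "seconds" changed around the
 * "nanoseconds" read. */
static uint64_t get_time_consistent(struct fake_clock *c)
{
	uint32_t sec0, sec1, ns;

	sec1 = read_seconds(c);
	do {
		sec0 = sec1;
		ns = read_nanoseconds(c);
		sec1 = read_seconds(c);
	} while (sec0 != sec1);

	return (uint64_t)sec1 * NSEC_PER_SEC + ns;
}
```

With the simulated clock started 5 ns before a seconds rollover, the naive
read is off by a full second, while the retry loop returns a value that lies
between the start and end of the read sequence.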
Fixes: 92ba688851 ("stmmac: add the support for PTP hw clock driver")
Signed-off-by: Yannick Vignon <yannick.vignon@nxp.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20220203160025.750632-1-yannick.vignon@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Daniel Borkmann says:
====================
pull-request: bpf 2022-02-03
We've added 6 non-merge commits during the last 10 day(s) which contain
a total of 7 files changed, 11 insertions(+), 236 deletions(-).
The main changes are:
1) Fix BPF ringbuf to allocate its area with VM_MAP instead of VM_ALLOC
flag which otherwise trips over KASAN, from Hou Tao.
2) Fix unresolved symbol warning in resolve_btfids due to LSM callback
rename, from Alexei Starovoitov.
3) Fix a possible race in inc_misses_counter() when IRQ would trigger
during counter update, from He Fengqing.
4) Fix tooling infra for cross-building with clang upon probing whether
gcc provides the standard libraries, from Jean-Philippe Brucker.
5) Fix silent mode build for resolve_btfids, from Nathan Chancellor.
6) Drop unneeded and outdated lirc.h header copy from tooling infra as
BPF does not require it anymore, from Sean Young.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
tools/resolve_btfids: Do not print any commands when building silently
bpf: Use VM_MAP instead of VM_ALLOC for ringbuf
tools: Ignore errors from `which' when searching a GCC toolchain
tools headers UAPI: remove stale lirc.h
bpf: Fix possible race in inc_misses_counter
bpf: Fix renaming task_getsecid_subj->current_getsecid_subj.
====================
Link: https://lore.kernel.org/r/20220203155815.25689-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- fix a use-after-free in rdma and tcp controller reset (Sagi Grimberg)
- fix the state check in nvmf_ctlr_matches_baseopts (Uday Shankar)
Merge tag 'nvme-5.17-2022-02-03' of git://git.infradead.org/nvme into block-5.17
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 5.17
- fix a use-after-free in rdma and tcp controller reset (Sagi Grimberg)
- fix the state check in nvmf_ctlr_matches_baseopts (Uday Shankar)"
* tag 'nvme-5.17-2022-02-03' of git://git.infradead.org/nvme:
nvme-fabrics: fix state check in nvmf_ctlr_matches_baseopts()
nvme-rdma: fix possible use-after-free in transport error_recovery work
nvme-tcp: fix possible use-after-free in transport error_recovery work
nvme: fix a possible use-after-free in controller reset during load
The move of proc_dointvec_minmax_sysadmin() from kernel/sysctl.c to
kernel/printk/sysctl.c introduced an incorrect __user attribute to the
buffer argument. Both I (see [1]) and the kernel test robot spotted this
change. Revert it to please sparse:
kernel/printk/sysctl.c:20:51: warning: incorrect type in argument 3 (different address spaces)
kernel/printk/sysctl.c:20:51: expected void *
kernel/printk/sysctl.c:20:51: got void [noderef] __user *buffer
Fixes: faaa357a55 ("printk: move printk sysctl to printk/sysctl.c")
Link: https://lore.kernel.org/r/20220104155024.48023-2-mic@digikod.net [1]
Reported-by: kernel test robot <lkp@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Xiaoming Ni <nixiaoming@huawei.com>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Link: https://lore.kernel.org/r/20220203145029.272640-1-mic@digikod.net
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This reverts commit 774a1221e8.
We need to finish all async code before the module init sequence is
done. In the reverted commit the PF_USED_ASYNC flag was added to mark a
thread that called async_schedule(). Then the PF_USED_ASYNC flag was
used to determine whether or not async_synchronize_full() needs to be
invoked. This works when the modprobe thread calls async_schedule(),
but it does not work if the module dispatches init code to a worker
thread which then calls async_schedule().
For example, PCI driver probing is invoked from a worker thread based on
the node where the device is attached:
        if (cpu < nr_cpu_ids)
                error = work_on_cpu(cpu, local_pci_probe, &ddi);
        else
                error = local_pci_probe(&ddi);
We end up in a situation where a worker thread gets the PF_USED_ASYNC
flag set instead of the modprobe thread. As a result,
async_synchronize_full() is not invoked and modprobe completes without
waiting for the async code to finish.
The issue was discovered while loading the pm80xx driver:
(scsi_mod.scan=async)
modprobe pm80xx                      worker
...
  do_init_module()
  ...
    pci_call_probe()
      work_on_cpu(local_pci_probe)
                                     local_pci_probe()
                                       pm8001_pci_probe()
                                         scsi_scan_host()
                                           async_schedule()
                                           worker->flags |= PF_USED_ASYNC;
                                     ...
      < return from worker >
  ...
  if (current->flags & PF_USED_ASYNC) <--- false
        async_synchronize_full();
Commit 21c3c5d280 ("block: don't request module during elevator init")
fixed the deadlock issue which the reverted commit 774a1221e8
("module, async: async_synchronize_full() on module init iff async is
used") tried to fix.
Since commit 0fdff3ec6d ("async, kmod: warn on synchronous
request_module() from async workers") synchronous module loading from
async is not allowed.
Given that the original deadlock issue is fixed and it is no longer
allowed to call synchronous request_module() from async we can remove
PF_USED_ASYNC flag to make module init consistently invoke
async_synchronize_full() unless async module probe is requested.
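The difference between the two behaviours can be illustrated with a small
userspace simulation. This is only a sketch under stated assumptions: struct
task, the *_sim() helpers, the pending_async counter and the flag value are
all hypothetical and merely model which task gets flagged; none of this is
the kernel implementation.

```c
#include <stdbool.h>

#define PF_USED_ASYNC 0x1u	/* illustrative value only */

struct task {
	unsigned int flags;
};

/* Simulated number of async jobs scheduled but not yet finished. */
static int pending_async;

static void async_schedule_sim(struct task *current_task)
{
	pending_async++;
	/* The flag lands on whichever task made the call. */
	current_task->flags |= PF_USED_ASYNC;
}

static void async_synchronize_full_sim(void)
{
	pending_async = 0;	/* pretend we waited for every job */
}

/* The probe runs on a worker, so the worker is the calling task. */
static void probe_on_worker(struct task *worker)
{
	async_schedule_sim(worker);
}

/* Flag-based behaviour: synchronize only if the loader is flagged.
 * Returns the number of jobs still pending when init returns. */
static int module_init_flag_based(struct task *modprobe, struct task *worker)
{
	probe_on_worker(worker);
	if (modprobe->flags & PF_USED_ASYNC)
		async_synchronize_full_sim();
	return pending_async;
}

/* Behaviour without the flag: always synchronize unless async probe
 * was requested. */
static int module_init_always_sync(struct task *worker, bool async_probe)
{
	probe_on_worker(worker);
	if (!async_probe)
		async_synchronize_full_sim();
	return pending_async;
}
```

In this model the flag-based variant returns with one job still pending,
because only the worker task carries the flag, while the unconditional
variant returns with none.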
Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Reviewed-by: Changyuan Lyu <changyuanl@google.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull MD fix from Song:
"Please consider pulling the following fix on top of your block-5.17
branch. It fixes a NULL ptr deref case with nowait."
* 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
md: fix NULL pointer deref with nowait but no mddev->queue
Pull cgroup fixes from Tejun Heo:
- Eric's fix for a long standing cgroup1 permission issue where it only
checks for uid 0 instead of CAP which inadvertently allows
unprivileged userns roots to modify release_agent userhelper
- Fixes for the fallout from Waiman's recent cpuset work
* 'for-5.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup/cpuset: Fix "suspicious RCU usage" lockdep warning
cgroup-v1: Require capabilities to set release_agent
cpuset: Fix the bug that subpart_cpus updated wrongly in update_cpumask()
cgroup/cpuset: Make child cpusets restrict parents on v1 hierarchy
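As a rough illustration of the release_agent permission issue described
above, the sketch below contrasts a uid-only check with a capability-based
one. The struct cred_sim type and both helper functions are hypothetical,
and the choice of CAP_SYS_ADMIN in the initial user namespace is an
assumption made for the example; the pull message itself only says "CAP".

```c
#include <stdbool.h>

/* Hypothetical, simplified view of a task's credentials. */
struct cred_sim {
	unsigned int uid;	/* uid as seen inside the task's user namespace */
	bool in_init_userns;	/* task is in the initial user namespace */
	bool cap_sys_admin;	/* CAP_SYS_ADMIN held in the task's own namespace */
};

/* uid-only check: any namespace root passes, privileged or not. */
static bool may_set_release_agent_uid_only(const struct cred_sim *c)
{
	return c->uid == 0;
}

/* Capability check (assumed form): the capability must be held in the
 * initial user namespace. */
static bool may_set_release_agent_cap(const struct cred_sim *c)
{
	return c->in_init_userns && c->cap_sys_admin;
}
```

A root user inside an unprivileged user namespace has uid 0 and holds
capabilities only within that namespace, so it passes the first check but
not the second.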
Alex Elder says:
====================
net: ipa: enable register retention
With runtime power management in place, we sometimes need to issue
a command to enable retention of IPA register values before power
collapse. This requires a new Device Tree property, whose presence
will also be used to signal that the command is required.
====================
Link: https://lore.kernel.org/r/20220201150205.468403-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In some cases, the IPA hardware needs to request the always-on
subsystem (AOSS) to coordinate with the IPA microcontroller to
retain IPA register values at power collapse. This is done by
issuing a QMP request to the AOSS microcontroller. A similar
request undoes that request.
We must get and hold the "QMP" handle early, because we might get
back EPROBE_DEFER for that. But the actual request should be sent
while we know the IPA clock is active, and when we know the
microcontroller is operational.
Fixes: 1aac309d32 ("net: ipa: use autosuspend")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For some systems, the IPA driver must make a request to ensure that
its registers are retained across power collapse of the IPA hardware.
On such systems, we'll use the existence of the "qcom,qmp" property
as a signal that this request is required.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
It was found that a "suspicious RCU usage" lockdep warning was issued
with the rcu_read_lock() call in update_sibling_cpumasks(). This is
because the update_cpumasks_hier() function may sleep. So we have
to release the RCU lock, call update_cpumasks_hier() and reacquire
it afterward.
Also add a percpu_rwsem_assert_held() in update_sibling_cpumasks()
instead of stating that in the comment.
Fixes: 4716909cc5 ("cpuset: Track cpusets that use parent's effective_cpus")
Signed-off-by: Waiman Long <longman@redhat.com>
Tested-by: Phil Auld <pauld@redhat.com>
Reviewed-by: Phil Auld <pauld@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>