Pull vfs fix from Al Viro:
"Fix a use-after-free in do_last() handling of sysctl_protected_...
checks.
The use-after-free normally doesn't happen there, but race with
rename() and it becomes possible"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
do_last(): fetch directory ->i_mode and ->i_uid before it's too late
The afs filesystem needs to prohibit certain characters from cell names,
such as '/', as these are used to form filenames in procfs, leading to
the following warning being generated:
WARNING: CPU: 0 PID: 3489 at fs/proc/generic.c:178
Fix afs_alloc_cell() to disallow nonprintable characters, '/', '@' and
names that begin with a dot.
Remove the check for "@cell" as that is then redundant.
This can be tested by running:
echo add foo/.bar 1.2.3.4 >/proc/fs/afs/cells
Note that we will also need to deal with:
- Names ending in ".invalid" shouldn't be passed to the DNS.
- Names that contain non-valid domainname chars shouldn't be passed to
the DNS.
- DNS replies that say "your-dns-needs-immediate-attention.<gTLD>" and
replies containing A records that say 127.0.53.53 should be
considered invalid.
[https://www.icann.org/en/system/files/files/name-collision-mitigation-01aug14-en.pdf]
but these need to be dealt with by the kafs-client DNS program rather
than the kernel.
Reported-by: syzbot+b904ba7c947a37b4b291@syzkaller.appspotmail.com
Cc: stable@kernel.org
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
may_create_in_sticky() call is done when we already have dropped the
reference to dir.
Fixes: 30aba6656f (namei: allow restricted O_CREAT of FIFOs and regular files)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pull networking fixes from David Miller:
1) Off by one in mt76 airtime calculation, from Dan Carpenter.
2) Fix TLV fragment allocation loop condition in iwlwifi, from Luca
Coelho.
3) Don't confirm neigh entries when doing ipsec pmtu updates, from Xu
Wang.
4) More checks to make sure we only send TSO packets to lan78xx chips
that they can actually handle. From James Hughes.
5) Fix ip_tunnel namespace move, from William Dauchy.
6) Fix unintended packet reordering due to cooperation between
listification done by GRO and non-GRO paths. From Maxim
Mikityanskiy.
7) Add Jakub Kicincki formally as networking co-maintainer.
8) Info leak in airo ioctls, from Michael Ellerman.
9) IFLA_MTU attribute needs validation during rtnl_create_link(), from
Eric Dumazet.
10) Use after free during reload in mlxsw, from Ido Schimmel.
11) Dangling pointers are possible in tp->highest_sack, fix from Eric
Dumazet.
12) Missing *pos++ in various networking seq_next handlers, from Vasily
Averin.
13) CHELSIO_GET_MEM operation neds CAP_NET_ADMIN check, from Michael
Ellerman.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (109 commits)
firestream: fix memory leaks
net: cxgb3_main: Add CAP_NET_ADMIN check to CHELSIO_GET_MEM
net: bcmgenet: Use netif_tx_napi_add() for TX NAPI
tipc: change maintainer email address
net: stmmac: platform: fix probe for ACPI devices
net/mlx5e: kTLS, Do not send decrypted-marked SKBs via non-accel path
net/mlx5e: kTLS, Remove redundant posts in TX resync flow
net/mlx5e: kTLS, Fix corner-case checks in TX resync flow
net/mlx5e: Clear VF config when switching modes
net/mlx5: DR, use non preemptible call to get the current cpu number
net/mlx5: E-Switch, Prevent ingress rate configuration of uplink rep
net/mlx5: DR, Enable counter on non-fwd-dest objects
net/mlx5: Update the list of the PCI supported devices
net/mlx5: Fix lowest FDB pool size
net: Fix skb->csum update in inet_proto_csum_replace16().
netfilter: nf_tables: autoload modules from the abort path
netfilter: nf_tables: add __nft_chain_type_get()
netfilter: nf_tables_offload: fix check the chain offload flag
netfilter: conntrack: sctp: use distinct states for new SCTP connections
ipv6_route_seq_next should increase position index
...
A couple of fixes have come in that would be good to include in this
release:
- A fix for amount of memory on Beaglebone Black. Surfaced now since
GRUB2 doesn't update memory size in the booted kernel.
- A fix to make SPI interfaces work on am43x-epos-evm.
- Small Kconfig fix for OPTEE (adds a depend on MMU) to avoid build
failures.
-----BEGIN PGP SIGNATURE-----
iQJDBAABCAAtFiEElf+HevZ4QCAJmMQ+jBrnPN6EHHcFAl4stI8PHG9sb2ZAbGl4
b20ubmV0AAoJEIwa5zzehBx3AtAP/1Z8PYhL6nXlCvXZLi0SttwZagKl6ZYL+VkW
5a+8vu0gQXTNCmfux0S0gLKPJUe7nVc7vJswz5KAtDKCYFdjppPch22GrvjFXFJM
+kGmb9jgyPhpAu4Ja4TEM8Ovqcg+n+mcFfzQQkxpg+OjXTQ8PNX3i18220KScc32
2bmezmxMh/b54tC9mtXZeihFghWeVmApGYmcsp/fuxKo3Q/v/DxceEqyT6lNGVS2
Z1gpp7TWi8eh1etsT/++jdX7IJCDfcJv3kYt72NqJ9yUk4gAkY/c8ezc2UW+JAfa
VsxFfQw9jQ3pzTgnoxrTk2g53XnxKUGK/PKuhg5h6+ZH1dz+oc0BoKHIZciwboi0
djfD6Gv5v0yQMClmSThDbLaocXsHtvv1dDnm9i1AK68WnDnd5hVYxBiwVKMUO/UL
FiXMKqB5npSLShs2YuFBOEB8W2DA1HtFylr5EGz4+OtITmkv7G4rKYlsdhwyeBIF
rm27zwiN0wI7ft1uKezw0tsuxZbl8dxcDkinjAuYfJFJZSgKTkwUrfeiztW9m9M7
ifDO0SYc/w1W/DWYoWJYd0OY7ZwPMRXqgVcPGOckGVYQj/TuavV96gV81IsUlh+t
QQVnmAqxvYV4AwoaCMKo1WKg3hLlFNvC5NOGSgALG77ZmJpF7kKhR25THzJStt8a
ATUgh1An
=xQ63
-----END PGP SIGNATURE-----
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Olof Johansson:
"A couple of fixes have come in that would be good to include in this
release:
- A fix for amount of memory on Beaglebone Black. Surfaced now since
GRUB2 doesn't update memory size in the booted kernel.
- A fix to make SPI interfaces work on am43x-epos-evm.
- Small Kconfig fix for OPTEE (adds a depend on MMU) to avoid build
failures"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
ARM: dts: am43x-epos-evm: set data pin directions for spi0 and spi1
tee: optee: Fix compilation issue with nommu
ARM: dts: am335x-boneblack-common: fix memory size
In fs_open(), 'vcc' is allocated through kmalloc() and assigned to
'atm_vcc->dev_data.' In the following execution, if an error occurs, e.g.,
there is no more free channel, an error code EBUSY or ENOMEM will be
returned. However, 'vcc' is not deallocated, leading to memory leaks. Note
that, in normal cases where fs_open() returns 0, 'vcc' will be deallocated
in fs_close(). But, if fs_open() fails, there is no guarantee that
fs_close() will be invoked.
To fix this issue, deallocate 'vcc' before the error code is returned.
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Missing netlink attribute sanity check for NFTA_OSF_DREG,
from Florian Westphal.
2) Use bitmap infrastructure in ipset to fix KASAN slab-out-of-bounds
reads, from Jozsef Kadlecsik.
3) Missing initial CLOSED state in new sctp connection through
ctnetlink events, from Jiri Wiesner.
4) Missing check for NFT_CHAIN_HW_OFFLOAD in nf_tables offload
indirect block infrastructure, from wenxu.
5) Add __nft_chain_type_get() to sanity check family and chain type.
6) Autoload modules from the nf_tables abort path to fix races
reported by syzbot.
7) Remove unnecessary skb->csum update on inet_proto_csum_replace16(),
from Praveen Chaudhary.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAl4sLasACgkQxWXV+ddt
WDsegg/8CBQ1/pGj+8mvf+ws6f71Av8jspY2Ebr+HCjaGhD2MG3HI1kA5gC9Qnbb
fQVd12M5ma2BTrIcszxwm+VMIMlDotRFzfAp8uuFJtW0aAEGMCboX6VRYWa/4I0o
SmgJg0RYh926VL73qSe3S72pfIYjar30RwjVIVTmsHxL/D/lEkrHg6IGKRCe/MaN
eQipth3iuFtcWmGm1+DxEySsOs7AMPg3wL8KVnQcYoDI2kg3BXFH9a4wTE6VmWsU
ZjonJBA/Rl8oA2YOVDum4mL5j2c5RulWEymdVKyo1oH+8kLDOQ8snd7Bxp3qtJ1C
gdVbS8gi7gT5/C+yex+ZWlAdfmCSGWj7dr7jjiELZhTrsBhtS7y+GM52GivSrJ3z
TciNQtF/Y0SrZGprPMgVGAHuIKWWwSmWJPmkRB4zv/5efFFdKg8/UmcRmh6dMo83
IF4VPEBQgJLj3ja9Wns5yvW9asKNcynGeFK7aV+BlGW/wuvBW9o017c4Q04dXSAK
iFpipJaR/6ZGmXlRQLa1uyKWVHNIfSFT47WJqa6Dbo6iWRE/S/MhfkZU42z2A3H9
O2qMWmZikZnPCkha6fWyNJEDxF3imC+/LBsYoEuVPR7kZ/irDnI1cJNsTocOlyj1
kgFtL5MnCBHCop9/tPGiVdin9ilHJs3q2kAkR5BNCSEqhC8mo4g=
=IPUk
-----END PGP SIGNATURE-----
Merge tag 'for-5.5-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
"Here's a last minute fix for a regression introduced in this
development cycle.
There's a small chance of a silent corruption when device replace and
NOCOW data writes happen at the same time in one block group. Metadata
or COW data writes are unaffected.
The extra fixup patch is there to silence an unnecessary warning"
* tag 'for-5.5-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: dev-replace: remove warning for unknown return codes when finished
btrfs: scrub: Require mandatory block group RO for dev-replace
that makes the interrupts work properly on it.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAl4sHZcACgkQQRCzN7AZ
XXMoBxAAsgUg2OpxtGyyevERgDMfsPHuhhniIP4NoVuZWHqch61hWubSds529Vka
Ly4ikV94WW7BwxK9ndOTqy8tUgjcDWGn1+oO8nWbSby/pX3G6MA+q5R/BJRFBrE2
THvyIT1fPhfDTmdvqJEULh+FrqLe3Bz6v1G/VqLaYgy2yCg0gNxHnaQwAcvV2Mlz
GRhHULoPfIvwJG1P40+R27Q7YQUV+yxpoaITvqUfOlN7unD01Thb89uCYyzGOFRh
gi3Q2UnOeW7W32R8pwnmXzJv6dMTucM10xE7k5Clsrot8wUtT1APbHeWPnY/Li+W
NPWBxfSRzjsbRsRbLz1sBpcv72hYsMz0ELYqQMlY++fF3fs9sD2sPArgEnwxvk2U
t7JQ7IWd16l1DVVksVyp6QNv/Q8oLdNRjbp8feBV7SIvuPGCrLZ9q7Ao5v2p7aMT
M5WNpBqZqziObtq3kaR0ospg00UfNtt/YFpEWhwzR+cfj/YsvYgitjMf+z9olWhy
QEnzws3yT6TTQ2Ccvtc21VxHp6mDx69SZQhlD3+G2Y5V1XtSpl1oiyDXW1/eToGQ
UONRkjD7dbvRlunL6FxGAIMndhIRQ5WXcvvlzCzTYX1l8HO8V98c6jjxign+jbXt
bzv2x+fgH7YF8A941yhpwYej/jxVg3X0C6RQCeizQfBY8Nq7nhI=
=0GUt
-----END PGP SIGNATURE-----
Merge tag 'pinctrl-v5.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fix from Linus Walleij:
"A single fix for the Intel Sunrisepoint pin controller that makes the
interrupts work properly on it"
* tag 'pinctrl-v5.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: sunrisepoint: Add missing Interrupt Status register offset
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl4rTdoACgkQSD+KveBX
+j5wiAgAtlGxIVKMpbJN7DIJNQ3rLGwTjPdVUFSRvHB+krVV9/ym02bIBIEphSPu
ZEF53eJMtJ0XIdUjvAyKA+zJg9p12vewbGpYV/LdjBNikHatMPGvR4grDgB9pnqT
tIxH6b4vzh5p/bMcP5l4yuVG6pf/ERdxQqsa93Op9xJzOkMH+A2KA/G32VHL2OmH
619hrWRVA/UVG0wUcOaX6ra+QA5EQ3lhcrvqL702k4JhIZPtG7Ndmm0+Z0nrDiIm
oXfc1236nc3emkijQczEZ+6oYiuYQ7WSKxZeS+8Bnmc3ZZ2w2RZrC7UCOadGzu12
+0XYfMsD2FWwEim52IZK/VMPtykjZw==
=i5Au
-----END PGP SIGNATURE-----
Merge tag 'mlx5-fixes-2020-01-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2020-01-24
This series introduces some fixes to mlx5 driver.
Please pull and let me know if there is any problem.
Merge conflict: once merge with net-next, a contextual conflict will
appear in drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
since the code moved in net-next.
To resolve, just delete ALL of the conflicting hunk from net.
So sorry for the small mess ..
For -stable v5.4:
('net/mlx5: Update the list of the PCI supported devices')
('net/mlx5: Fix lowest FDB pool size')
('net/mlx5e: kTLS, Fix corner-case checks in TX resync flow')
('net/mlx5e: kTLS, Do not send decrypted-marked SKBs via non-accel path')
('net/mlx5: Eswitch, Prevent ingress rate configuration of uplink rep')
('net/mlx5e: kTLS, Remove redundant posts in TX resync flow')
('net/mlx5: DR, Enable counter on non-fwd-dest objects')
('net/mlx5: DR, use non preemptible call to get the current cpu number')
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The fstests btrfs/011 triggered a warning at the end of device replace,
[ 1891.998975] BTRFS warning (device vdd): failed setting block group ro: -28
[ 1892.038338] BTRFS error (device vdd): btrfs_scrub_dev(/dev/vdd, 1, /dev/vdb) failed -28
[ 1892.059993] ------------[ cut here ]------------
[ 1892.063032] WARNING: CPU: 2 PID: 2244 at fs/btrfs/dev-replace.c:506 btrfs_dev_replace_start.cold+0xf9/0x140 [btrfs]
[ 1892.074346] CPU: 2 PID: 2244 Comm: btrfs Not tainted 5.5.0-rc7-default+ #942
[ 1892.079956] RIP: 0010:btrfs_dev_replace_start.cold+0xf9/0x140 [btrfs]
[ 1892.096576] RSP: 0018:ffffbb58c7b3fd10 EFLAGS: 00010286
[ 1892.098311] RAX: 00000000ffffffe4 RBX: 0000000000000001 RCX: 8888888888888889
[ 1892.100342] RDX: 0000000000000001 RSI: ffff9e889645f5d8 RDI: ffffffff92821080
[ 1892.102291] RBP: ffff9e889645c000 R08: 000001b8878fe1f6 R09: 0000000000000000
[ 1892.104239] R10: ffffbb58c7b3fd08 R11: 0000000000000000 R12: ffff9e88a0017000
[ 1892.106434] R13: ffff9e889645f608 R14: ffff9e88794e1000 R15: ffff9e88a07b5200
[ 1892.108642] FS: 00007fcaed3f18c0(0000) GS:ffff9e88bda00000(0000) knlGS:0000000000000000
[ 1892.111558] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1892.113492] CR2: 00007f52509ff420 CR3: 00000000603dd002 CR4: 0000000000160ee0
[ 1892.115814] Call Trace:
[ 1892.116896] btrfs_dev_replace_by_ioctl+0x35/0x60 [btrfs]
[ 1892.118962] btrfs_ioctl+0x1d62/0x2550 [btrfs]
caused by the previous patch ("btrfs: scrub: Require mandatory block
group RO for dev-replace"). Hitting ENOSPC is possible and could happen
when the block group is set read-only, preventing NOCOW writes to the
area that's being accessed by dev-replace.
This has happend with scratch devices of size 12G but not with 5G and
20G, so this is depends on timing and other activity on the filesystem.
The whole replace operation is restartable, the space state should be
examined by the user in any case.
The error code is propagated back to the ioctl caller so the kernel
warning is causing false alerts.
Signed-off-by: David Sterba <dsterba@suse.com>
The cxgb3 driver for "Chelsio T3-based gigabit and 10Gb Ethernet
adapters" implements a custom ioctl as SIOCCHIOCTL/SIOCDEVPRIVATE in
cxgb_extension_ioctl().
One of the subcommands of the ioctl is CHELSIO_GET_MEM, which appears
to read memory directly out of the adapter and return it to userspace.
It's not entirely clear what the contents of the adapter memory
contains, but the assumption is that it shouldn't be accessible to all
users.
So add a CAP_NET_ADMIN check to the CHELSIO_GET_MEM case. Put it after
the is_offload() check, which matches two of the other subcommands in
the same function which also check for is_offload() and CAP_NET_ADMIN.
Found by Ilja by code inspection, not tested as I don't have the
required hardware.
Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before commit 7587935cfa ("net: bcmgenet: move NAPI initialization to
ring initialization") moved the code, this used to be
netif_tx_napi_add(), but we lost that small semantic change in the
process, restore that.
Fixes: 7587935cfa ("net: bcmgenet: move NAPI initialization to ring initialization")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use generic device API to get phy mode to fix probe failure
with ACPI based devices.
Signed-off-by: Ajay Gupta <ajayg@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Looks like we have wrong default memory size for beaglebone black,
it has at least 512 MB of RAM and not 256 MB. This causes an issue
when booted with GRUB2 that does not seem to pass memory info to
the kernel.
And for am43x-epos-evm the SPI pin directions need to be configured
for SPI to work.
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEkgNvrZJU/QSQYIcQG9Q+yVyrpXMFAl4rSQkRHHRvbnlAYXRv
bWlkZS5jb20ACgkQG9Q+yVyrpXMSeg//f607jg+6RqkJ6dMczxBSHpkAYSdGBjpB
nwa8O7jNbHfjtUGXTm+9Xx7sKXTR+600lobgMYguRlitTYNtC2jQG5ux92/6sTU/
5qB6m0qkhMqOKDOI2AZ1wIcc6PP067M7x+il0Rs9/PRdIaDNUMHsNZ6+sxEWUR7N
1vcYMz3PL8KW/ucAWxBO4NeMDwg5vi6dIILBl63sBlrH6AgYTVT8LqNhSysZWmXY
Gj5JNIt+FjvhKGYNnD1jeFYX/+TQH10B8Ouepj6ixdcXaA4v0553S7S7R3HYR+l8
+AFcWB1TMuvlNv5vbnVd+PUXhhyByVDQv15/92i+rKMXUxxdLZDuKeqa3/GbL/vI
oU5ShNXWQ7C7g+SrxYfHT0OQz38j7TUOKrRJKBxFYykneoC33uwpSaAUTSN0PuXT
MA7RJVBtjHe8tfbkII3UqqvHxmVB1TAkj9TE9y21Dznlq2fyq7w/13UNS67VCZ57
4vO722ULOHVLnTmwH6gA+e1Cjhiz17fJJq3wpDTqqn0As4cuLOAio7PmHgNcm2YG
Qg9MVQV2Rj3YGdxGVDKCfTOLt4tQzJ8qbN0o7XIp0CgyK7Sg2djcjzUMtJDx++Ez
6tRmyfcY/GD5ykPfEUXvJcDWw0tIFSRUQMtXMi3vnuNh5mYFQ2ciQB5KkPBhyTST
CaAAg8vCvek=
=XXZc
-----END PGP SIGNATURE-----
Merge tag 'omap-for-fixes-whenever-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes
Few minor fixes for omaps
Looks like we have wrong default memory size for beaglebone black,
it has at least 512 MB of RAM and not 256 MB. This causes an issue
when booted with GRUB2 that does not seem to pass memory info to
the kernel.
And for am43x-epos-evm the SPI pin directions need to be configured
for SPI to work.
* tag 'omap-for-fixes-whenever-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: am43x-epos-evm: set data pin directions for spi0 and spi1
ARM: dts: am335x-boneblack-common: fix memory size
Link: https://lore.kernel.org/r/pull-1579895109-287828@atomide.com
Signed-off-by: Olof Johansson <olof@lixom.net>
When TCP out-of-order is identified (unexpected tcp seq mismatch), driver
analyzes the packet and decides what handling should it get:
1. go to accelerated path (to be encrypted in HW),
2. go to regular xmit path (send w/o encryption),
3. drop.
Packets marked with skb->decrypted by the TLS stack in the TX flow skips
SW encryption, and rely on the HW offload.
Verify that such packets are never sent un-encrypted on the wire.
Add a WARN to catch such bugs, and prefer dropping the packet in these cases.
Fixes: 46a3ea9807 ("net/mlx5e: kTLS, Enhance TX resync flow")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The call to tx_post_resync_params() is done earlier in the flow,
the post of the control WQEs is unnecessarily repeated. Remove it.
Fixes: 700ec49742 ("net/mlx5e: kTLS, Fix missing SQ edge fill")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
There are the following cases:
1. Packet ends before start marker: bypass offload.
2. Packet starts before start marker and ends after it: drop,
not supported, breaks contract with kernel.
3. packet ends before tls record info starts: drop,
this packet was already acknowledged and its record info
was released.
Add the above as comment in code.
Mind possible wraparounds of the TCP seq, replace the simple comparison
with a call to the TCP before() method.
In addition, remove logic that handles negative sync_len values,
as it became impossible.
Fixes: d2ead1f360 ("net/mlx5e: Add kTLS TX HW offload support")
Fixes: 46a3ea9807 ("net/mlx5e: kTLS, Enhance TX resync flow")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently VF in LEGACY mode are not able to go up. Also in OFFLOADS
mode, when switching to it first time, VF can go up independently to
his representor, which is not expected.
Perform clearing of VF config when switching modes and set link state
to AUTO as default value. Also, when switching to OFFLOADS mode set
link state to DOWN, which allow VF link state to be controlled by its
REP.
Fixes: 1ab2068a4c ("net/mlx5: Implement vports admin state backup/restore")
Fixes: 556b9d16d3 ("net/mlx5: Clear VF's configuration on disabling SRIOV")
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Use raw_smp_processor_id instead of smp_processor_id() otherwise we will
get the following trace in debug-kernel:
BUG: using smp_processor_id() in preemptible [00000000] code: devlink
caller is dr_create_cq.constprop.2+0x31d/0x970 [mlx5_core]
Call Trace:
dump_stack+0x9a/0xf0
debug_smp_processor_id+0x1f3/0x200
dr_create_cq.constprop.2+0x31d/0x970
genl_family_rcv_msg+0x5fd/0x1170
genl_rcv_msg+0xb8/0x160
netlink_rcv_skb+0x11e/0x340
Fixes: 297cccebdc ("net/mlx5: DR, Expose an internal API to issue RDMA operations")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Since the implementation relies on limiting the VF transmit rate to
simulate ingress rate limiting, and since either uplink representor or
ecpf are not associated with a VF, we limit the rate limit configuration
for those ports.
Fixes: fcb64c0f56 ("net/mlx5: E-Switch, add ingress rate support")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The current code handles only counters that attached to dest, we still
have the cases where we have counter on non-dest, like over drop etc.
Fixes: 6a48faeeca ("net/mlx5: Add direct rule fs_cmd implementation")
Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add the upcoming ConnectX-7 device ID.
Fixes: 85327a9c41 ("net/mlx5: Update the list of the PCI supported devices")
Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The pool sizes represent the pool sizes in the fw. when we request
a pool size from fw, it will return the next possible group.
We track how many pools the fw has left and start requesting groups
from the big to the small.
When we start request 4k group, which doesn't exists in fw, fw
wants to allocate the next possible size, 64k, but will fail since
its exhausted. The correct smallest pool size in fw is 128 and not 4k.
Fixes: e52c280240 ("net/mlx5: E-Switch, Add chains and priorities")
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
skb->csum is updated incorrectly, when manipulation for
NF_NAT_MANIP_SRC\DST is done on IPV6 packet.
Fix:
There is no need to update skb->csum in inet_proto_csum_replace16(),
because update in two fields a.) IPv6 src/dst address and b.) L4 header
checksum cancels each other for skb->csum calculation. Whereas
inet_proto_csum_replace4 function needs to update skb->csum, because
update in 3 fields a.) IPv4 src/dst address, b.) IPv4 Header checksum
and c.) L4 header checksum results in same diff as L4 Header checksum
for skb->csum calculation.
[ pablo@netfilter.org: a few comestic documentation edits ]
Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
Signed-off-by: Andy Stracner <astracner@linkedin.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch introduces a list of pending module requests. This new module
list is composed of nft_module_request objects that contain the module
name and one status field that tells if the module has been already
loaded (the 'done' field).
In the first pass, from the preparation phase, the netlink command finds
that a module is missing on this list. Then, a module request is
allocated and added to this list and nft_request_module() returns
-EAGAIN. This triggers the abort path with the autoload parameter set on
from nfnetlink, request_module() is called and the module request enters
the 'done' state. Since the mutex is released when loading modules from
the abort phase, the module list is zapped so this is iteration occurs
over a local list. Therefore, the request_module() calls happen when
object lists are in consistent state (after fulling aborting the
transaction) and the commit list is empty.
On the second pass, the netlink command will find that it already tried
to load the module, so it does not request it again and
nft_request_module() returns 0. Then, there is a look up to find the
object that the command was missing. If the module was successfully
loaded, the command proceeds normally since it finds the missing object
in place, otherwise -ENOENT is reported to userspace.
This patch also updates nfnetlink to include the reason to enter the
abort phase, which is required for this new autoload module rationale.
Fixes: ec7470b834 ("netfilter: nf_tables: store transaction list locally while requesting module")
Reported-by: syzbot+29125d208b3dae9a7019@syzkaller.appspotmail.com
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This new helper function validates that unknown family and chain type
coming from userspace do not trigger an out-of-bound array access. Bail
out in case __nft_chain_type_get() returns NULL from
nft_chain_parse_hook().
Fixes: 9370761c56 ("netfilter: nf_tables: convert built-in tables/chains to chain types")
Reported-by: syzbot+156a04714799b1d480bc@syzkaller.appspotmail.com
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
In the nft_indr_block_cb the chain should check the flag with
NFT_CHAIN_HW_OFFLOAD.
Fixes: 9a32669fec ("netfilter: nf_tables_offload: support indr block call")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Two Fixes:
- Fix NULL-ptr dereference bug in Intel IOMMU driver
- Properly safe and restore AMD IOMMU performance counter
registers when testing if they are writable.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAl4rKJ0ACgkQK/BELZcB
GuM8BxAAse7i4EAB0DZ26aDVU/ZZAGBTc5pH+7IOvuGFWpu3Ik7siltMZKTa56XD
5reQl1hUHyBesdwNcd+ZQz/0QzgMu0zEcxJtzLD5nbk2Rp1pDYab84+wA0oqftmN
7TBqPJjqGYfnfkobPW+4SaiCSxM6XKEXYaJvXqF6dsz5fLHHqTVjGyEXuu0pWey5
1AG1nNWey5TobR39ZUNVIGUEf4ftN8tOGutjGKch29OmMg1UkHZjDyYpVJurjSUl
1YDf0WA2X5cUWzRg519o9Uxs+5y/tBItS2qMJ/Uwh1fp/kOYrGoEPXRH9brU3owh
j8kVKyz/cNlvhdJN/QPVq/A+GBgzAq+u9FUGH68LBVx05cuCRGWb6qP/blEZWUSA
zQvbHPIPqPBK0pYtbVCgJFCpOS5C6X33RMgbRr3XncAItmfEp2FH7tcHIsDZUzmO
EGBgSPrhRFJrgfa3hopc93L4OEHmGy+20o7oT4kk0SBcHVDSyDZVBu1lX+/gAZKd
ZoV5FyF9KNB+qQQ6E+vpL5gLh7BWiBgOI41Bdw5ut09I0gYURPdFWUPHRFwHK1/1
KTSWLyuLkS4VxdhcwhA30eVmSrbfocPPwuUxcGn2p1YnaycFkDNoJnhifvFiOS6w
tFOJ3wEVL2OX7On6U3jviVy7MQapX3MvWrWKlgg/UoMv36dbqRA=
=ViDp
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes-v5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
"Two fixes:
- Fix NULL-ptr dereference bug in Intel IOMMU driver
- Properly save and restore AMD IOMMU performance counter registers
when testing if they are writable"
* tag 'iommu-fixes-v5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/amd: Fix IOMMU perf counter clobbering during init
iommu/vt-d: Call __dmar_remove_one_dev_info with valid pointer
Fix our hash MMU code to avoid having overlapping ids between user and kernel,
which isn't as bad as it sounds but led to crashes on some machines.
A fix for the Power9 XIVE interrupt code, which could return the wrong interrupt
state in obscure error conditions.
A minor Kconfig fix for the recently added CONFIG_PPC_UV code.
Thanks to:
Aneesh Kumar K.V, Bharata B Rao, Cédric Le Goater, Frederic Barrat.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl4q3iMTHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgM1fEAC7YefQZsuIstV2bNH9xFJe/oEOSLWt
DqZYfWZRBRZRdRMAP7+hmXpsLfHAB3QSGv4yEc2Pgm5Y5vYuv8MWqEKkpJoivIPW
/Fu7mNGI06IWinMyl59dLA5fXPNeOzSNcdPnBdIUC/62XU414IGsy273X3LN4YiM
hfXSpPA3tpSjVBTeqfKTJowXWMOvbkwiHMs2ziAu4Y9GDauJsU0UiOxMkl2MYT7Z
PULO/SrjYSbP8+MazD/YdJ5k7F4rNV3bVDPUCU305tKIkrqRmBs+8DXjeUAMWVjj
y2huUZ/0XXsr8QZyyGfMhEC1sJKAD3colG4LHUysbWz1oInY93CvXELdPAMXhqrH
xXTzx9ae74Xk7t7WaGgq5xJGHJCDSg1aA2HPkUg6Gbk6iJIztOCkKMdSPvFMo18S
p+Kjis8db83RPXQm3ooW7s/xhp6AaDiXd8khQ7A2efgt4SDTSPFvgv2q4CMDC1lP
njchgzgPJ635MSd3CC/u6i7eViuUylb6SfhFVYY0DqOOvNJVrj1W6M4N7tpWSrRS
PXJI2/NwlL3WIajKO3KfqTRI7QaSLtccTrXGLiGBxx/ooAowP94/WL3CYJAeuO2F
NmA1GXH6/WXvzMMNm5NB0kCOUOOhtJk/MFuO7Agly6YL9p36fqVz4WIJVAqtwJLO
uPu0xASJCgp9Mg==
=r/RM
-----END PGP SIGNATURE-----
Merge tag 'powerpc-5.5-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Some more powerpc fixes for 5.5:
- Fix our hash MMU code to avoid having overlapping ids between user
and kernel, which isn't as bad as it sounds but led to crashes on
some machines.
- A fix for the Power9 XIVE interrupt code, which could return the
wrong interrupt state in obscure error conditions.
- A minor Kconfig fix for the recently added CONFIG_PPC_UV code.
Thanks to Aneesh Kumar K.V, Bharata B Rao, Cédric Le Goater, Frederic
Barrat"
* tag 'powerpc-5.5-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm/hash: Fix sharing context ids between kernel & userspace
powerpc/xive: Discard ESB load value when interrupt is invalid
powerpc: Ultravisor: Fix the dependencies for CONFIG_PPC_UV
core/mst:
- Fix SST branch device handling
amdgpu:
- enable renoir outside experimental
i915:
- Avoid overflow with huge userptr objects
- uAPI fix to correctly handle negative values in
engine->uabi_class/instance (cc: stable)
panfrost:
- Fix mapping of globally visible BO's (Boris)
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJeKoezAAoJEAx081l5xIa+hRMP/jC45Om4YDtsoH+75v41mYfO
yUt1jBsCDXHAvPjBbRcoM2kbZU+ZcaDR4EZS7lA7QTW5CAosqzxrGdkzGobjmj5v
TN/0BWWyQYtQ+lbt/pUxM+kktaV9IkjZIAUlW3msCwKsYE+nPIWfXDnW2ioFpitk
sXk3vZYC4Lh6IVJ5PG879nz5XPpo2eb4sYYQ4PTFj65u436NEd8AFqmcDzp64Jhb
xh1jLSxeS/JROyb/AmTHt0PgrGB5B08U8V7CAJXhIgf7zM28iiA3ynanoO3PauYb
3uPg8eVdP2tOqcpplV5CBkiw/in2sPgRprMEg66cKYzo9rieWu7nPL+1zmaxD16s
NIXSmov4Sm7yOS4LAGAckS12ejniy7L6OcP5N2S2Gz0HpIMXluBfLw3d4ZiIW5Q5
yKdpmrwTfl0IfTiblweaHi8uT4K19vSqoXUf4bFa4UxqeYOtipHnNP907+nDs0JN
GiHjIvrQJ4KH/A/JAkjEyNzHRAqJhJke84E519UP+mI+8paqzftsbDvf0ADA7nzZ
q6TbZpTbEPV+NqnaH6n8volfh/WougW2BpxozMO3UP/474AZq1bXJIXi45ksatRW
Cq0KQP8iI//C0xhCDPs7LQ/XzgkcQtDPZ1Nx6vWjrn6O3DMjKCm0RxZgfKj5CfbJ
PovYqZQXJ01989DfXfQv
=K7TY
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-2020-01-24' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"This one has a core mst fix and two i915 fixes. amdgpu just enables
some hw outside experimental.
The panfrost fix is a little bigger than I'd like at this stage but it
fixes a fairly fundamental problem with global shared buffers in that
driver, and since it's confined to that driver and I've taken a look
at it, I think it's fine to get into the tree now, so it can get
stable propagated as well.
core/mst:
- Fix SST branch device handling
amdgpu:
- enable renoir outside experimental
i915:
- Avoid overflow with huge userptr objects
- uAPI fix to correctly handle negative values in
engine->uabi_class/instance (cc: stable)
panfrost:
- Fix mapping of globally visible BO's (Boris)"
* tag 'drm-fixes-2020-01-24' of git://anongit.freedesktop.org/drm/drm:
drm/amdgpu: remove the experimental flag for renoir
drm/panfrost: Add the panfrost_gem_mapping concept
drm/i915: Align engine->uabi_class/instance with i915_drm.h
drm/i915/userptr: fix size calculation
drm/dp_mst: Handle SST-only branch device case
The range passed to user_access_begin() by strncpy_from_user() and
strnlen_user() starts at 'src' and goes up to the limit of userspace
although reads will be limited by the 'count' param.
On 32 bits powerpc (book3s/32) access has to be granted for each
256Mbytes segment and the cost increases with the number of segments to
unlock.
Limit the range with 'count' param.
Fixes: 594cc251fd ("make 'user_access_begin()' do 'access_ok()'")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The netlink notifications triggered by the INIT and INIT_ACK chunks
for a tracked SCTP association do not include protocol information
for the corresponding connection - SCTP state and verification tags
for the original and reply direction are missing. Since the connection
tracking implementation allows user space programs to receive
notifications about a connection and then create a new connection
based on the values received in a notification, it makes sense that
INIT and INIT_ACK notifications should contain the SCTP state
and verification tags available at the time when a notification
is sent. The missing verification tags cause a newly created
netfilter connection to fail to verify the tags of SCTP packets
when this connection has been created from the values previously
received in an INIT or INIT_ACK notification.
A PROTOINFO event is cached in sctp_packet() when the state
of a connection changes. The CLOSED and COOKIE_WAIT state will
be used for connections that have seen an INIT and INIT_ACK chunk,
respectively. The distinct states will cause a connection state
change in sctp_packet().
Signed-off-by: Jiri Wiesner <jwiesner@suse.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
init_iommu_perf_ctr() clobbers the register when it checks write access
to IOMMU perf counters and fails to restore when they are writable.
Add save and restore to fix it.
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Fixes: 30861ddc9c ("perf/x86/amd: Add IOMMU Performance Counter resource management")
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
It is possible for archdata.iommu to be set to
DEFER_DEVICE_DOMAIN_INFO or DUMMY_DEVICE_DOMAIN_INFO so check for
those values before calling __dmar_remove_one_dev_info. Without a
check it can result in a null pointer dereference. This has been seen
while booting a kdump kernel on an HP dl380 gen9.
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: stable@vger.kernel.org # 5.3+
Cc: linux-kernel@vger.kernel.org
Fixes: ae23bfb68f ("iommu/vt-d: Detach domain before using a private one")
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
[BUG]
For dev-replace test cases with fsstress, like btrfs/06[45] btrfs/071,
looped runs can lead to random failure, where scrub finds csum error.
The possibility is not high, around 1/20 to 1/100, but it's causing data
corruption.
The bug is observable after commit b12de52896 ("btrfs: scrub: Don't
check free space before marking a block group RO")
[CAUSE]
Dev-replace has two source of writes:
- Write duplication
All writes to source device will also be duplicated to target device.
Content: Not yet persisted data/meta
- Scrub copy
Dev-replace reused scrub code to iterate through existing extents, and
copy the verified data to target device.
Content: Previously persisted data and metadata
The difference in contents makes the following race possible:
Regular Writer | Dev-replace
-----------------------------------------------------------------
^ |
| Preallocate one data extent |
| at bytenr X, len 1M |
v |
^ Commit transaction |
| Now extent [X, X+1M) is in |
v commit root |
================== Dev replace starts =========================
| ^
| | Scrub extent [X, X+1M)
| | Read [X, X+1M)
| | (The content are mostly garbage
| | since it's preallocated)
^ | v
| Write back happens for |
| extent [X, X+512K) |
| New data writes to both |
| source and target dev. |
v |
| ^
| | Scrub writes back extent [X, X+1M)
| | to target device.
| | This will over write the new data in
| | [X, X+512K)
| v
This race can only happen for nocow writes. Thus metadata and data cow
writes are safe, as COW will never overwrite extents of previous
transaction (in commit root).
This behavior can be confirmed by disabling all fallocate related calls
in fsstress (*), then all related tests can pass a 2000 run loop.
*: FSSTRESS_AVOID="-f fallocate=0 -f allocsp=0 -f zero=0 -f insert=0 \
-f collapse=0 -f punch=0 -f resvsp=0"
I didn't expect resvsp ioctl will fallback to fallocate in VFS...
[FIX]
Make dev-replace to require mandatory block group RO, and wait for current
nocow writes before calling scrub_chunk().
This patch will mostly revert commit 76a8efa171 ("btrfs: Continue replace
when set_block_ro failed") for dev-replace path.
The side effect is, dev-replace can be more strict on avaialble space, but
definitely worth to avoid data corruption.
Reported-by: Filipe Manana <fdmanana@suse.com>
Fixes: 76a8efa171 ("btrfs: Continue replace when set_block_ro failed")
Fixes: b12de52896 ("btrfs: scrub: Don't check free space before marking a block group RO")
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Vasily Averin says:
====================
netdev: seq_file .next functions should increase position index
In Aug 2018 NeilBrown noticed
commit 1f4aace60b ("fs/seq_file.c: simplify seq_file iteration code and interface")
"Some ->next functions do not increment *pos when they return NULL...
Note that such ->next functions are buggy and should be fixed.
A simple demonstration is
dd if=/proc/swaps bs=1000 skip=1
Choose any block size larger than the size of /proc/swaps. This will
always show the whole last line of /proc/swaps"
Described problem is still actual. If you make lseek into middle of last output line
following read will output end of last line and whole last line once again.
$ dd if=/proc/swaps bs=1 # usual output
Filename Type Size Used Priority
/dev/dm-0 partition 4194812 97536 -2
104+0 records in
104+0 records out
104 bytes copied
$ dd if=/proc/swaps bs=40 skip=1 # last line was generated twice
dd: /proc/swaps: cannot skip to specified offset
v/dm-0 partition 4194812 97536 -2
/dev/dm-0 partition 4194812 97536 -2
3+1 records in
3+1 records out
131 bytes copied
There are lot of other affected files, I've found 30+ including
/proc/net/ip_tables_matches and /proc/sysvipc/*
This patch-set fixes files related to netdev@
https://bugzilla.kernel.org/show_bug.cgi?id=206283
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
if seq_file .next fuction does not change position index,
read after some lseek can generate unexpected output.
https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
if seq_file .next fuction does not change position index,
read after some lseek can generate unexpected output.
https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
if seq_file .next fuction does not change position index,
read after some lseek can generate unexpected output.
https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
if seq_file .next fuction does not change position index,
read after some lseek can generate unexpected output.
https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
if seq_file .next fuction does not change position index,
read after some lseek can generate unexpected output.
https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
if seq_file .next fuction does not change position index,
read after some lseek can generate unexpected output.
https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Latest commit 853697504d ("tcp: Fix highest_sack and highest_sack_seq")
apparently allowed syzbot to trigger various crashes in TCP stack [1]
I believe this commit only made things easier for syzbot to find
its way into triggering use-after-frees. But really the bugs
could lead to bad TCP behavior or even plain crashes even for
non malicious peers.
I have audited all calls to tcp_rtx_queue_unlink() and
tcp_rtx_queue_unlink_and_free() and made sure tp->highest_sack would be updated
if we are removing from rtx queue the skb that tp->highest_sack points to.
These updates were missing in three locations :
1) tcp_clean_rtx_queue() [This one seems quite serious,
I have no idea why this was not caught earlier]
2) tcp_rtx_queue_purge() [Probably not a big deal for normal operations]
3) tcp_send_synack() [Probably not a big deal for normal operations]
[1]
BUG: KASAN: use-after-free in tcp_highest_sack_seq include/net/tcp.h:1864 [inline]
BUG: KASAN: use-after-free in tcp_highest_sack_seq include/net/tcp.h:1856 [inline]
BUG: KASAN: use-after-free in tcp_check_sack_reordering+0x33c/0x3a0 net/ipv4/tcp_input.c:891
Read of size 4 at addr ffff8880a488d068 by task ksoftirqd/1/16
CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 5.5.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x197/0x210 lib/dump_stack.c:118
print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
__kasan_report.cold+0x1b/0x41 mm/kasan/report.c:506
kasan_report+0x12/0x20 mm/kasan/common.c:639
__asan_report_load4_noabort+0x14/0x20 mm/kasan/generic_report.c:134
tcp_highest_sack_seq include/net/tcp.h:1864 [inline]
tcp_highest_sack_seq include/net/tcp.h:1856 [inline]
tcp_check_sack_reordering+0x33c/0x3a0 net/ipv4/tcp_input.c:891
tcp_try_undo_partial net/ipv4/tcp_input.c:2730 [inline]
tcp_fastretrans_alert+0xf74/0x23f0 net/ipv4/tcp_input.c:2847
tcp_ack+0x2577/0x5bf0 net/ipv4/tcp_input.c:3710
tcp_rcv_established+0x6dd/0x1e90 net/ipv4/tcp_input.c:5706
tcp_v4_do_rcv+0x619/0x8d0 net/ipv4/tcp_ipv4.c:1619
tcp_v4_rcv+0x307f/0x3b40 net/ipv4/tcp_ipv4.c:2001
ip_protocol_deliver_rcu+0x5a/0x880 net/ipv4/ip_input.c:204
ip_local_deliver_finish+0x23b/0x380 net/ipv4/ip_input.c:231
NF_HOOK include/linux/netfilter.h:307 [inline]
NF_HOOK include/linux/netfilter.h:301 [inline]
ip_local_deliver+0x1e9/0x520 net/ipv4/ip_input.c:252
dst_input include/net/dst.h:442 [inline]
ip_rcv_finish+0x1db/0x2f0 net/ipv4/ip_input.c:428
NF_HOOK include/linux/netfilter.h:307 [inline]
NF_HOOK include/linux/netfilter.h:301 [inline]
ip_rcv+0xe8/0x3f0 net/ipv4/ip_input.c:538
__netif_receive_skb_one_core+0x113/0x1a0 net/core/dev.c:5148
__netif_receive_skb+0x2c/0x1d0 net/core/dev.c:5262
process_backlog+0x206/0x750 net/core/dev.c:6093
napi_poll net/core/dev.c:6530 [inline]
net_rx_action+0x508/0x1120 net/core/dev.c:6598
__do_softirq+0x262/0x98c kernel/softirq.c:292
run_ksoftirqd kernel/softirq.c:603 [inline]
run_ksoftirqd+0x8e/0x110 kernel/softirq.c:595
smpboot_thread_fn+0x6a3/0xa40 kernel/smpboot.c:165
kthread+0x361/0x430 kernel/kthread.c:255
ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
Allocated by task 10091:
save_stack+0x23/0x90 mm/kasan/common.c:72
set_track mm/kasan/common.c:80 [inline]
__kasan_kmalloc mm/kasan/common.c:513 [inline]
__kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:486
kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:521
slab_post_alloc_hook mm/slab.h:584 [inline]
slab_alloc_node mm/slab.c:3263 [inline]
kmem_cache_alloc_node+0x138/0x740 mm/slab.c:3575
__alloc_skb+0xd5/0x5e0 net/core/skbuff.c:198
alloc_skb_fclone include/linux/skbuff.h:1099 [inline]
sk_stream_alloc_skb net/ipv4/tcp.c:875 [inline]
sk_stream_alloc_skb+0x113/0xc90 net/ipv4/tcp.c:852
tcp_sendmsg_locked+0xcf9/0x3470 net/ipv4/tcp.c:1282
tcp_sendmsg+0x30/0x50 net/ipv4/tcp.c:1432
inet_sendmsg+0x9e/0xe0 net/ipv4/af_inet.c:807
sock_sendmsg_nosec net/socket.c:652 [inline]
sock_sendmsg+0xd7/0x130 net/socket.c:672
__sys_sendto+0x262/0x380 net/socket.c:1998
__do_sys_sendto net/socket.c:2010 [inline]
__se_sys_sendto net/socket.c:2006 [inline]
__x64_sys_sendto+0xe1/0x1a0 net/socket.c:2006
do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 10095:
save_stack+0x23/0x90 mm/kasan/common.c:72
set_track mm/kasan/common.c:80 [inline]
kasan_set_free_info mm/kasan/common.c:335 [inline]
__kasan_slab_free+0x102/0x150 mm/kasan/common.c:474
kasan_slab_free+0xe/0x10 mm/kasan/common.c:483
__cache_free mm/slab.c:3426 [inline]
kmem_cache_free+0x86/0x320 mm/slab.c:3694
kfree_skbmem+0x178/0x1c0 net/core/skbuff.c:645
__kfree_skb+0x1e/0x30 net/core/skbuff.c:681
sk_eat_skb include/net/sock.h:2453 [inline]
tcp_recvmsg+0x1252/0x2930 net/ipv4/tcp.c:2166
inet_recvmsg+0x136/0x610 net/ipv4/af_inet.c:838
sock_recvmsg_nosec net/socket.c:886 [inline]
sock_recvmsg net/socket.c:904 [inline]
sock_recvmsg+0xce/0x110 net/socket.c:900
__sys_recvfrom+0x1ff/0x350 net/socket.c:2055
__do_sys_recvfrom net/socket.c:2073 [inline]
__se_sys_recvfrom net/socket.c:2069 [inline]
__x64_sys_recvfrom+0xe1/0x1a0 net/socket.c:2069
do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x49/0xbe
The buggy address belongs to the object at ffff8880a488d040
which belongs to the cache skbuff_fclone_cache of size 456
The buggy address is located 40 bytes inside of
456-byte region [ffff8880a488d040, ffff8880a488d208)
The buggy address belongs to the page:
page:ffffea0002922340 refcount:1 mapcount:0 mapping:ffff88821b057000 index:0x0
raw: 00fffe0000000200 ffffea00022a5788 ffffea0002624a48 ffff88821b057000
raw: 0000000000000000 ffff8880a488d040 0000000100000006 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff8880a488cf00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff8880a488cf80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff8880a488d000: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
^
ffff8880a488d080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880a488d100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Fixes: 853697504d ("tcp: Fix highest_sack and highest_sack_seq")
Fixes: 50895b9de1 ("tcp: highest_sack fix")
Fixes: 737ff31456 ("tcp: use sequence distance to detect reordering")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Cambda Zhu <cambda@linux.alibaba.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a spelling mistake in a printk message. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a spelling mistake in a pr_warn message. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>