Commit Graph

1121050 Commits

Author SHA1 Message Date
Alexei Starovoitov
cb15c73487 Merge branch 'Fix incorrect pruning for ARG_CONST_ALLOC_SIZE_OR_ZERO'
Kumar Kartikeya Dwivedi says:

====================

A fix for a missing mark_chain_precision call that leads to eager pruning and
loading of invalid programs when the more permissive case is in the straight
line exploration. Please see the commit log for details, and selftest for an
example.
====================

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-08-25 12:07:51 -07:00
Kumar Kartikeya Dwivedi
1800b2ac96 selftests/bpf: Add regression test for pruning fix
Add a test to ensure we do mark_chain_precision for the argument type
ARG_CONST_ALLOC_SIZE_OR_ZERO. For other argument types, this was already
done, but propagation for missing for this case. Without the fix, this
test case loads successfully.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220823185500.467-1-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-08-25 12:07:45 -07:00
Kumar Kartikeya Dwivedi
2fc31465c5 bpf: Do mark_chain_precision for ARG_CONST_ALLOC_SIZE_OR_ZERO
Precision markers need to be propagated whenever we have an ARG_CONST_*
style argument, as the verifier cannot consider imprecise scalars to be
equivalent for the purposes of states_equal check when such arguments
refine the return value (in this case, set mem_size for PTR_TO_MEM). The
resultant mem_size for the R0 is derived from the constant value, and if
the verifier incorrectly prunes states considering them equivalent where
such arguments exist (by seeing that both registers have reg->precise as
false in regsafe), we can end up with invalid programs passing the
verifier which can do access beyond what should have been the correct
mem_size in that explored state.

To show a concrete example of the problem:

0000000000000000 <prog>:
       0:       r2 = *(u32 *)(r1 + 80)
       1:       r1 = *(u32 *)(r1 + 76)
       2:       r3 = r1
       3:       r3 += 4
       4:       if r3 > r2 goto +18 <LBB5_5>
       5:       w2 = 0
       6:       *(u32 *)(r1 + 0) = r2
       7:       r1 = *(u32 *)(r1 + 0)
       8:       r2 = 1
       9:       if w1 == 0 goto +1 <LBB5_3>
      10:       r2 = -1

0000000000000058 <LBB5_3>:
      11:       r1 = 0 ll
      13:       r3 = 0
      14:       call bpf_ringbuf_reserve
      15:       if r0 == 0 goto +7 <LBB5_5>
      16:       r1 = r0
      17:       r1 += 16777215
      18:       w2 = 0
      19:       *(u8 *)(r1 + 0) = r2
      20:       r1 = r0
      21:       r2 = 0
      22:       call bpf_ringbuf_submit

00000000000000b8 <LBB5_5>:
      23:       w0 = 0
      24:       exit

For the first case, the single line execution's exploration will prune
the search at insn 14 for the branch insn 9's second leg as it will be
verified first using r2 = -1 (UINT_MAX), while as w1 at insn 9 will
always be 0 so at runtime we don't get error for being greater than
UINT_MAX/4 from bpf_ringbuf_reserve. The verifier during regsafe just
sees reg->precise as false for both r2 registers in both states, hence
considers them equal for purposes of states_equal.

If we propagated precise markers using the backtracking support, we
would use the precise marking to then ensure that old r2 (UINT_MAX) was
within the new r2 (1) and this would never be true, so the verification
would rightfully fail.

The end result is that the out of bounds access at instruction 19 would
be permitted without this fix.

Note that reg->precise is always set to true when user does not have
CAP_BPF (or when subprog count is greater than 1 (i.e. use of any static
or global functions)), hence this is only a problem when precision marks
need to be explicitly propagated (i.e. privileged users with CAP_BPF).

A simplified test case has been included in the next patch to prevent
future regressions.

Fixes: 457f44363a ("bpf: Implement BPF ring buffer and verifier support for it")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220823185300.406-2-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-08-25 12:07:45 -07:00
Kuniyuki Iwashima
0947ae1121 bpf: Fix a data-race around bpf_jit_limit.
While reading bpf_jit_limit, it can be changed concurrently via sysctl,
WRITE_ONCE() in __do_proc_doulongvec_minmax(). The size of bpf_jit_limit
is long, so we need to add a paired READ_ONCE() to avoid load-tearing.

Fixes: ede95a63b5 ("bpf: add bpf_jit_limit knob to restrict unpriv allocations")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220823215804.2177-1-kuniyu@amazon.com
2022-08-24 00:27:14 +02:00
Pu Lehui
7d6620f107 bpf, cgroup: Fix kernel BUG in purge_effective_progs
Syzkaller reported a triggered kernel BUG as follows:

  ------------[ cut here ]------------
  kernel BUG at kernel/bpf/cgroup.c:925!
  invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 1 PID: 194 Comm: detach Not tainted 5.19.0-14184-g69dac8e431af #8
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
  rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
  RIP: 0010:__cgroup_bpf_detach+0x1f2/0x2a0
  Code: 00 e8 92 60 30 00 84 c0 75 d8 4c 89 e0 31 f6 85 f6 74 19 42 f6 84
  28 48 05 00 00 02 75 0e 48 8b 80 c0 00 00 00 48 85 c0 75 e5 <0f> 0b 48
  8b 0c5
  RSP: 0018:ffffc9000055bdb0 EFLAGS: 00000246
  RAX: 0000000000000000 RBX: ffff888100ec0800 RCX: ffffc900000f1000
  RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff888100ec4578
  RBP: 0000000000000000 R08: ffff888100ec0800 R09: 0000000000000040
  R10: 0000000000000000 R11: 0000000000000000 R12: ffff888100ec4000
  R13: 000000000000000d R14: ffffc90000199000 R15: ffff888100effb00
  FS:  00007f68213d2b80(0000) GS:ffff88813bc80000(0000)
  knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 000055f74a0e5850 CR3: 0000000102836000 CR4: 00000000000006e0
  Call Trace:
   <TASK>
   cgroup_bpf_prog_detach+0xcc/0x100
   __sys_bpf+0x2273/0x2a00
   __x64_sys_bpf+0x17/0x20
   do_syscall_64+0x3b/0x90
   entry_SYSCALL_64_after_hwframe+0x63/0xcd
  RIP: 0033:0x7f68214dbcb9
  Code: 08 44 89 e0 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 48 89 f8 48 89
  f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01
  f0 ff8
  RSP: 002b:00007ffeb487db68 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
  RAX: ffffffffffffffda RBX: 000000000000000b RCX: 00007f68214dbcb9
  RDX: 0000000000000090 RSI: 00007ffeb487db70 RDI: 0000000000000009
  RBP: 0000000000000003 R08: 0000000000000012 R09: 0000000b00000003
  R10: 00007ffeb487db70 R11: 0000000000000246 R12: 00007ffeb487dc20
  R13: 0000000000000004 R14: 0000000000000001 R15: 000055f74a1011b0
   </TASK>
  Modules linked in:
  ---[ end trace 0000000000000000 ]---

Repetition steps:

For the following cgroup tree,

  root
   |
  cg1
   |
  cg2

  1. attach prog2 to cg2, and then attach prog1 to cg1, both bpf progs
     attach type is NONE or OVERRIDE.
  2. write 1 to /proc/thread-self/fail-nth for failslab.
  3. detach prog1 for cg1, and then kernel BUG occur.

Failslab injection will cause kmalloc fail and fall back to
purge_effective_progs. The problem is that cg2 have attached another prog,
so when go through cg2 layer, iteration will add pos to 1, and subsequent
operations will be skipped by the following condition, and cg will meet
NULL in the end.

  `if (pos && !(cg->bpf.flags[atype] & BPF_F_ALLOW_MULTI))`

The NULL cg means no link or prog match, this is as expected, and it's not
a bug. So here just skip the no match situation.

Fixes: 4c46091ee9 ("bpf: Fix KASAN use-after-free Read in compute_effective_progs")
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220813134030.1972696-1-pulehui@huawei.com
2022-08-18 23:27:33 +02:00
Eyal Birger
7ec9fce4b3 ip_tunnel: Respect tunnel key's "flow_flags" in IP tunnels
Commit 451ef36bd2 ("ip_tunnels: Add new flow flags field to ip_tunnel_key")
added a "flow_flags" member to struct ip_tunnel_key which was later used by
the commit in the fixes tag to avoid dropping packets with sources that
aren't locally configured when set in bpf_set_tunnel_key().

VXLAN and GENEVE were made to respect this flag, ip tunnels like IPIP and GRE
were not.

This commit fixes this omission by making ip_tunnel_init_flow() receive
the flow flags from the tunnel key in the relevant collect_md paths.

Fixes: b8fff74852 ("bpf: Set flow flag to allow any source IP in bpf_tunnel_key")
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Paul Chaignon <paul@isovalent.com>
Link: https://lore.kernel.org/bpf/20220818074118.726639-1-eyal.birger@gmail.com
2022-08-18 21:18:28 +02:00
YiFei Zhu
14b20b784f bpf: Restrict bpf_sys_bpf to CAP_PERFMON
The verifier cannot perform sufficient validation of any pointers passed
into bpf_attr and treats them as integers rather than pointers. The helper
will then read from arbitrary pointers passed into it. Restrict the helper
to CAP_PERFMON since the security model in BPF of arbitrary kernel read is
CAP_BPF + CAP_PERFMON.

Fixes: af2ac3e13e ("bpf: Prepare bpf syscall to be used from kernel and user space.")
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220816205517.682470-1-zhuyifei@google.com
2022-08-18 00:27:49 +02:00
Daniel Borkmann
3024d95a4c bpf: Partially revert flexible-array member replacement
Partially revert 94dfc73e7c ("treewide: uapi: Replace zero-length arrays
with flexible-array members") given it breaks BPF UAPI.

For example, BPF CI run reveals build breakage under LLVM:

  [...]
    CLNG-BPF [test_maps] map_ptr_kern.o
    CLNG-BPF [test_maps] btf__core_reloc_arrays___diff_arr_val_sz.o
    CLNG-BPF [test_maps] test_bpf_cookie.o
  progs/map_ptr_kern.c:314:26: error: field 'trie_key' with variable sized type 'struct bpf_lpm_trie_key' not at the end of a struct or class is a GNU extension [-Werror,-Wgnu-variable-sized-type-not-at-end]
           struct bpf_lpm_trie_key trie_key;
                                   ^
    CLNG-BPF [test_maps] btf__core_reloc_type_based___diff.o
  1 error generated.
  make: *** [Makefile:521: /tmp/runner/work/bpf/bpf/tools/testing/selftests/bpf/map_ptr_kern.o] Error 1
  make: *** Waiting for unfinished jobs....
  [...]

Typical usage of the bpf_lpm_trie_key is that the struct gets embedded into
a user defined key for the LPM BPF map, from the selftest example:

  struct bpf_lpm_trie_key {                 <-- UAPI exported struct
         __u32   prefixlen;
         __u8    data[];
  };

  struct lpm_key {                          <-- BPF program defined struct
         struct bpf_lpm_trie_key trie_key;
         __u32 data;
  };

Undo this for BPF until a different solution can be found. It's the only flexible-
array member case in the UAPI header.

This was discovered in BPF CI after Dave reported that the include/uapi/linux/bpf.h
header was out of sync with tools/include/uapi/linux/bpf.h after 94dfc73e7c. And
the subsequent sync attempt failed CI.

Fixes: 94dfc73e7c ("treewide: uapi: Replace zero-length arrays with flexible-array members")
Reported-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/bpf/22aebc88-da67-f086-e620-dd4a16e2bc69@iogearbox.net
2022-08-17 23:45:47 +02:00
Liu Jian
583585e48d skmsg: Fix wrong last sg check in sk_msg_recvmsg()
Fix one kernel NULL pointer dereference as below:

[  224.462334] Call Trace:
[  224.462394]  __tcp_bpf_recvmsg+0xd3/0x380
[  224.462441]  ? sock_has_perm+0x78/0xa0
[  224.462463]  tcp_bpf_recvmsg+0x12e/0x220
[  224.462494]  inet_recvmsg+0x5b/0xd0
[  224.462534]  __sys_recvfrom+0xc8/0x130
[  224.462574]  ? syscall_trace_enter+0x1df/0x2e0
[  224.462606]  ? __do_page_fault+0x2de/0x500
[  224.462635]  __x64_sys_recvfrom+0x24/0x30
[  224.462660]  do_syscall_64+0x5d/0x1d0
[  224.462709]  entry_SYSCALL_64_after_hwframe+0x65/0xca

In commit 9974d37ea7 ("skmsg: Fix invalid last sg check in
sk_msg_recvmsg()"), we change last sg check to sg_is_last(),
but in sockmap redirection case (without stream_parser/stream_verdict/
skb_verdict), we did not mark the end of the scatterlist. Check the
sk_msg_alloc, sk_msg_page_add, and bpf_msg_push_data functions, they all
do not mark the end of sg. They are expected to use sg.end for end
judgment. So the judgment of '(i != msg_rx->sg.end)' is added back here.

Fixes: 9974d37ea7 ("skmsg: Fix invalid last sg check in sk_msg_recvmsg()")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20220809094915.150391-1-liujian56@huawei.com
2022-08-17 22:33:20 +02:00
Magnus Karlsson
58ca14ed98 xsk: Fix corrupted packets for XDP_SHARED_UMEM
Fix an issue in XDP_SHARED_UMEM mode together with aligned mode where
packets are corrupted for the second and any further sockets bound to
the same umem. In other words, this does not affect the first socket
bound to the umem. The culprit for this bug is that the initialization
of the DMA addresses for the pre-populated xsk buffer pool entries was
not performed for any socket but the first one bound to the umem. Only
the linear array of DMA addresses was populated. Fix this by populating
the DMA addresses in the xsk buffer pool for every socket bound to the
same umem.

Fixes: 94033cd8e7 ("xsk: Optimize for aligned case")
Reported-by: Alasdair McWilliam <alasdair.mcwilliam@outlook.com>
Reported-by: Intrusion Shield Team <dnevil@intrusion.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alasdair McWilliam <alasdair.mcwilliam@outlook.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/xdp-newbies/6205E10C-292E-4995-9D10-409649354226@outlook.com/
Link: https://lore.kernel.org/bpf/20220812113259.531-1-magnus.karlsson@gmail.com
2022-08-15 17:26:07 +02:00
Daniel Müller
27e23836ce selftests/bpf: Add lru_bug to s390x deny list
The lru_bug BPF selftest is failing execution on s390x machines. The
failure is due to program attachment failing in turn, similar to a bunch
of other tests. Those other tests have already been deny-listed and with
this change we do the same for the lru_bug test, adding it to the
corresponding file.

Fixes: de7b992710 ("selftests/bpf: Add test for prealloc_lru_pop bug")
Signed-off-by: Daniel Müller <deso@posteo.net>
Acked-by: Mykola Lysenko <mykolal@fb.com>
Link: https://lore.kernel.org/r/20220810200710.1300299-1-deso@posteo.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-08-12 09:21:28 -07:00
Li Qiong
40b4ac880e net: lan966x: fix checking for return value of platform_get_irq_byname()
The platform_get_irq_byname() returns non-zero IRQ number
or negative error number. "if (irq)" always true, chang it
to "if (irq > 0)"

Signed-off-by: Li Qiong <liqiong@nfschina.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-12 11:29:46 +01:00
Jason Wang
75d8620d46 net: cxgb3: Fix comment typo
The double `the' is duplicated in the comment, remove one.

Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-12 11:28:19 +01:00
Jason Wang
0619d0fa6c bnx2x: Fix comment typo
The double `the' is duplicated in the comment, remove one.

Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-12 11:28:19 +01:00
Jason Wang
9221b2898a net: ipa: Fix comment typo
The double `is' is duplicated in the comment, remove one.

Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-12 11:28:14 +01:00
Michael S. Tsirkin
95bb633048 virtio_net: fix endian-ness for RSS
Using native endian-ness for device supplied fields is wrong
on BE platforms. Sparse warns about this.

Fixes: 91f41f01d2 ("drivers/net/virtio_net: Added RSS hash report.")
Cc: "Andrew Melnychenko" <andrew@daynix.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-12 11:26:19 +01:00
Xin Xiong
bfc48f1b05 net/sunrpc: fix potential memory leaks in rpc_sysfs_xprt_state_change()
The issue happens on some error handling paths. When the function
fails to grab the object `xprt`, it simply returns 0, forgetting to
decrease the reference count of another object `xps`, which is
increased by rpc_sysfs_xprt_kobj_get_xprt_switch(), causing refcount
leaks. Also, the function forgets to check whether `xps` is valid
before using it, which may result in NULL-dereferencing issues.

Fix it by adding proper error handling code when either `xprt` or
`xps` is NULL.

Fixes: 5b7eb78486 ("SUNRPC: take a xprt offline using sysfs")
Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-12 11:21:28 +01:00
Jilin Yuan
86d2155e48 skfp/h: fix repeated words in comments
Delete the redundant word 'the'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-12 11:20:29 +01:00
Mikulas Patocka
9f414eb409 rds: add missing barrier to release_refill
The functions clear_bit and set_bit do not imply a memory barrier, thus it
may be possible that the waitqueue_active function (which does not take
any locks) is moved before clear_bit and it could miss a wakeup event.

Fix this bug by adding a memory barrier after clear_bit.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-12 10:46:01 +01:00
Linus Torvalds
7ebfc85e2c Including fixes from bluetooth, bpf, can and netfilter.
A little longer PR than usual but it's all fixes, no late features.
 It's long partially because of timing, and partially because of
 follow ups to stuff that got merged a week or so before the merge
 window and wasn't as widely tested. Maybe the Bluetooth fixes are
 a little alarming so we'll address that, but the rest seems okay
 and not scary.
 
 Notably we're including a fix for the netfilter Kconfig [1], your
 WiFi warning [2] and a bluetooth fix which should unblock syzbot [3].
 
 Current release - regressions:
 
  - Bluetooth:
    - don't try to cancel uninitialized works [3]
    - L2CAP: fix use-after-free caused by l2cap_chan_put
 
  - tls: rx: fix device offload after recent rework
 
  - devlink: fix UAF on failed reload and leftover locks in mlxsw
 
 Current release - new code bugs:
 
  - netfilter:
    - flowtable: fix incorrect Kconfig dependencies [1]
    - nf_tables: fix crash when nf_trace is enabled
 
  - bpf:
    - use proper target btf when exporting attach_btf_obj_id
    - arm64: fixes for bpf trampoline support
 
  - Bluetooth:
    - ISO: unlock on error path in iso_sock_setsockopt()
    - ISO: fix info leak in iso_sock_getsockopt()
    - ISO: fix iso_sock_getsockopt for BT_DEFER_SETUP
    - ISO: fix memory corruption on iso_pinfo.base
    - ISO: fix not using the correct QoS
    - hci_conn: fix updating ISO QoS PHY
 
  - phy: dp83867: fix get nvmem cell fail
 
 Previous releases - regressions:
 
  - wifi: cfg80211: fix validating BSS pointers in
    __cfg80211_connect_result [2]
 
  - atm: bring back zatm uAPI after ATM had been removed
 
  - properly fix old bug making bonding ARP monitor mode not being
    able to work with software devices with lockless Tx
 
  - tap: fix null-deref on skb->dev in dev_parse_header_protocol
 
  - revert "net: usb: ax88179_178a needs FLAG_SEND_ZLP" it helps
    some devices and breaks others
 
  - netfilter:
    - nf_tables: many fixes rejecting cross-object linking
      which may lead to UAFs
    - nf_tables: fix null deref due to zeroed list head
    - nf_tables: validate variable length element extension
 
  - bgmac: fix a BUG triggered by wrong bytes_compl
 
  - bcmgenet: indicate MAC is in charge of PHY PM
 
 Previous releases - always broken:
 
  - bpf:
    - fix bad pointer deref in bpf_sys_bpf() injected via test infra
    - disallow non-builtin bpf programs calling the prog_run command
    - don't reinit map value in prealloc_lru_pop
    - fix UAFs during the read of map iterator fd
    - fix invalidity check for values in sk local storage map
    - reject sleepable program for non-resched map iterator
 
  - mptcp:
    - move subflow cleanup in mptcp_destroy_common()
    - do not queue data on closed subflows
 
  - virtio_net: fix memory leak inside XDP_TX with mergeable
 
  - vsock: fix memory leak when multiple threads try to connect()
 
  - rework sk_user_data sharing to prevent psock leaks
 
  - geneve: fix TOS inheriting for ipv4
 
  - tunnels & drivers: do not use RT_TOS for IPv6 flowlabel
 
  - phy: c45 baset1: do not skip aneg configuration if clock role
    is not specified
 
  - rose: avoid overflow when /proc displays timer information
 
  - x25: fix call timeouts in blocking connects
 
  - can: mcp251x: fix race condition on receive interrupt
 
  - can: j1939:
    - replace user-reachable WARN_ON_ONCE() with netdev_warn_once()
    - fix memory leak of skbs in j1939_session_destroy()
 
 Misc:
 
  - docs: bpf: clarify that many things are not uAPI
 
  - seg6: initialize induction variable to first valid array index
    (to silence clang vs objtool warning)
 
  - can: ems_usb: fix clang 14's -Wunaligned-access warning
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmL1TtkACgkQMUZtbf5S
 Iruz8Q/+O5xFFsjxuyZD0Mw9d3Jeo3ZI9PeeDvcYl5dZXVegpxqorujTFntxv1Ad
 JC8o5qqms3kO51d+W/yai6iDacEHX2YcJrupZve+vGvpOEVmBRY5O0E1AckJ18+u
 ItmjSVESkybUP5P08/An7Y0dMmj9Xb2z84dGkLe+n8lg6/fimo6Ki6yZjcOBOALu
 AYquMXUcnwztRMbTFjscbJjBd4xFMKZEtthljYtPdIReIN976wmMNYYx+jcPK7ha
 g39Kv6maklp4euerkGIJ/AMnOWHaOGCFjIaz7rr4444NDfrKdt/jeirUXJaz77Jo
 TJM2UOwgOeg6WZkSa3cmdq6UdjdkJ6LTe2CJFf1wJ1qfhAi+s8yWoszsM2Enp+66
 c/mo9jTCMAjmgEJF11idZuz2S697/5j0hvbfM3ZPgNyNBgn8qxz/Z56fNOisx95u
 TkoKKFnGH+mcm/et+omBcyLBtBVK2+/6B6mpl6btf4DOkPn5KFYWHV67uV3ksHzQ
 ye+pnzidoIG0yKbRM2EQKXk7ELKROpl52xUHko93ZinMJt0Q7jBm7tZhJozNFEzi
 hWgUvpmNXgawzLYQcJ9jJmKw3PmYZnRhvYZB/1r91YamM28Hd58k9WfpWtUtjYJN
 N0X58L6JSnKPqzR70pcFppz6iBlh0tHdcEQGWhhKU5ScS3FDxGc=
 =C5Ck
 -----END PGP SIGNATURE-----

Merge tag 'net-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bluetooth, bpf, can and netfilter.

  A little larger than usual but it's all fixes, no late features. It's
  large partially because of timing, and partially because of follow ups
  to stuff that got merged a week or so before the merge window and
  wasn't as widely tested. Maybe the Bluetooth fixes are a little
  alarming so we'll address that, but the rest seems okay and not scary.

  Notably we're including a fix for the netfilter Kconfig [1], your WiFi
  warning [2] and a bluetooth fix which should unblock syzbot [3].

  Current release - regressions:

   - Bluetooth:
      - don't try to cancel uninitialized works [3]
      - L2CAP: fix use-after-free caused by l2cap_chan_put

   - tls: rx: fix device offload after recent rework

   - devlink: fix UAF on failed reload and leftover locks in mlxsw

  Current release - new code bugs:

   - netfilter:
      - flowtable: fix incorrect Kconfig dependencies [1]
      - nf_tables: fix crash when nf_trace is enabled

   - bpf:
      - use proper target btf when exporting attach_btf_obj_id
      - arm64: fixes for bpf trampoline support

   - Bluetooth:
      - ISO: unlock on error path in iso_sock_setsockopt()
      - ISO: fix info leak in iso_sock_getsockopt()
      - ISO: fix iso_sock_getsockopt for BT_DEFER_SETUP
      - ISO: fix memory corruption on iso_pinfo.base
      - ISO: fix not using the correct QoS
      - hci_conn: fix updating ISO QoS PHY

   - phy: dp83867: fix get nvmem cell fail

  Previous releases - regressions:

   - wifi: cfg80211: fix validating BSS pointers in
     __cfg80211_connect_result [2]

   - atm: bring back zatm uAPI after ATM had been removed

   - properly fix old bug making bonding ARP monitor mode not being able
     to work with software devices with lockless Tx

   - tap: fix null-deref on skb->dev in dev_parse_header_protocol

   - revert "net: usb: ax88179_178a needs FLAG_SEND_ZLP" it helps some
     devices and breaks others

   - netfilter:
      - nf_tables: many fixes rejecting cross-object linking which may
        lead to UAFs
      - nf_tables: fix null deref due to zeroed list head
      - nf_tables: validate variable length element extension

   - bgmac: fix a BUG triggered by wrong bytes_compl

   - bcmgenet: indicate MAC is in charge of PHY PM

  Previous releases - always broken:

   - bpf:
      - fix bad pointer deref in bpf_sys_bpf() injected via test infra
      - disallow non-builtin bpf programs calling the prog_run command
      - don't reinit map value in prealloc_lru_pop
      - fix UAFs during the read of map iterator fd
      - fix invalidity check for values in sk local storage map
      - reject sleepable program for non-resched map iterator

   - mptcp:
      - move subflow cleanup in mptcp_destroy_common()
      - do not queue data on closed subflows

   - virtio_net: fix memory leak inside XDP_TX with mergeable

   - vsock: fix memory leak when multiple threads try to connect()

   - rework sk_user_data sharing to prevent psock leaks

   - geneve: fix TOS inheriting for ipv4

   - tunnels & drivers: do not use RT_TOS for IPv6 flowlabel

   - phy: c45 baset1: do not skip aneg configuration if clock role is
     not specified

   - rose: avoid overflow when /proc displays timer information

   - x25: fix call timeouts in blocking connects

   - can: mcp251x: fix race condition on receive interrupt

   - can: j1939:
      - replace user-reachable WARN_ON_ONCE() with netdev_warn_once()
      - fix memory leak of skbs in j1939_session_destroy()

  Misc:

   - docs: bpf: clarify that many things are not uAPI

   - seg6: initialize induction variable to first valid array index (to
     silence clang vs objtool warning)

   - can: ems_usb: fix clang 14's -Wunaligned-access warning"

* tag 'net-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (117 commits)
  net: atm: bring back zatm uAPI
  dpaa2-eth: trace the allocated address instead of page struct
  net: add missing kdoc for struct genl_multicast_group::flags
  nfp: fix use-after-free in area_cache_get()
  MAINTAINERS: use my korg address for mt7601u
  mlxsw: minimal: Fix deadlock in ports creation
  bonding: fix reference count leak in balance-alb mode
  net: usb: qmi_wwan: Add support for Cinterion MV32
  bpf: Shut up kern_sys_bpf warning.
  net/tls: Use RCU API to access tls_ctx->netdev
  tls: rx: device: don't try to copy too much on detach
  tls: rx: device: bound the frag walk
  net_sched: cls_route: remove from list when handle is 0
  selftests: forwarding: Fix failing tests with old libnet
  net: refactor bpf_sk_reuseport_detach()
  net: fix refcount bug in sk_psock_get (2)
  selftests/bpf: Ensure sleepable program is rejected by hash map iter
  selftests/bpf: Add write tests for sk local storage map iterator
  selftests/bpf: Add tests for reading a dangling map iter fd
  bpf: Only allow sleepable program for resched-able iterator
  ...
2022-08-11 13:45:37 -07:00
Linus Torvalds
e091ba5cf8 More ACPI updates for 5.20-rc1
- Replace direct references to the fwnode field in struct device with
    dev_fwnode() and device_match_fwnode() (Andy Shevchenko).
 
  - Make the ACPI code handling device properties support properties
    with buffer values (Sakari Ailus).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmL1PkQSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxFWsQAIlWhHFfeDboSPK9EUngbyrJFbEC2cpn
 3RSwTfJkpJNdN1YzCZGiTug//ImAncS9z0Lip4A5in50sfGbWtkQS54fGn5t1e+l
 HL+PGDIM07ye+aUXgnjSmg/wXoAYSCg+vURQeB/8GphRbWwXwyYBV8iqrFnMEfr+
 EX5lQTtKtFvjdMbO4wmpdydsCueZcgLKkc57T1WoiWvoggKGVcVs0OkeDAOH4/Kw
 Gwt9cAl69jbFE3R4fFtdzySZlTnTAA8J6Mfer8yOnqcnL6JneI+GpOVsJQ06n7d3
 JgPLeQNlYR6TGGmZIC0u5rsIDXaiT4//Yjnca3vHmUy1hqvsyyUVHW1t0JiPaj4h
 EGdpl4XP9+4CJyXeo9u2RM/dJF94nFtqHgJKdUuFS2wYXytrUny+jvccfaeEXCUa
 N7Z3IV9JIJeuQKDTy5EyngXwXa+FfpjrrrZr17j7goJ+xQ/zf4oAFQejiuv70aqQ
 5v9iwwZ2Dco/n/oDMKP8tUOplzPi4V42HPAmffvc8bheEE5IOiumkDB6KgVciZa5
 /M91nTnjaTMtH+ObGfwe2hG8HIskI+O/5IO/bWv9IKikBFezpBaRDXXHpfBbOefF
 64SVGmeoDkMCvmPQEGqSfx/PcltIY13wqtcJ651xKkUee8tbx76giPWpeAAguqWl
 bS3Q7ndP+gC3
 =iaSF
 -----END PGP SIGNATURE-----

Merge tag 'acpi-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more ACPI updates from Rafael Wysocki:
 "These fix up direct references to the fwnode field in struct device
  and extend ACPI device properties support.

  Specifics:

   - Replace direct references to the fwnode field in struct device with
     dev_fwnode() and device_match_fwnode() (Andy Shevchenko)

   - Make the ACPI code handling device properties support properties
     with buffer values (Sakari Ailus)"

* tag 'acpi-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: property: Fix error handling in acpi_init_properties()
  ACPI: VIOT: Do not dereference fwnode in struct device
  ACPI: property: Read buffer properties as integers
  ACPI: property: Add support for parsing buffer property UUID
  ACPI: property: Unify integer value reading functions
  ACPI: property: Switch node property referencing from ifs to a switch
  ACPI: property: Move property ref argument parsing into a new function
  ACPI: property: Use acpi_object_type consistently in property ref parsing
  ACPI: property: Tie data nodes to acpi handles
  ACPI: property: Return type of acpi_add_nondev_subnodes() should be bool
2022-08-11 13:26:09 -07:00
Linus Torvalds
8745889a7f New code for 6.0:
- Remove iomap_writepage and all callers, since the mm apparently never
    called the zonefs or gfs2 writepage functions.
 
 Signed-off-by: Darrick J. Wong <djwong@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAmL1H7kACgkQ+H93GTRK
 tOvzfw/+JJQM3WjwCUg+11O9E+oKS3wbczr0yAd2m8j+EqapdndXzIVevcZKXoTx
 K4zOK9oDecPtRKgQkvrDt7HrMB7oYv8tuSzyfcsNVHbMA6U3twkLdr5c19/lm9uj
 rnP2Xrs0RkiiFpImmTHsviPEyzniJ+BjtRDF7FxSFELxREae4EQW3YX2MjffvqQA
 dT+xXptWiOSa3ygwfoGqVeOLOMt0DqXICiV0GLrGxD6S7TLRRIPo7ojYS4703vUL
 VFTAUvhC4CD9/vsEwPnl91Jq2s06tO3LE4V6vJDPI7/uQFPcubLmcK8GpaYB6+OQ
 q9Fhpc9cU/3JTKt6Sw9uNOqA5hfUKBdJmhWE3FqZ2arql2C9tY2o+cHvRBKZWMZ9
 FdLKSwsuDpL+pYsWOPn7wU8BHZVTDDl7CtDNTCurNkkNgaAbK8C0X7QcT16RRyDF
 SAPHlg0XFewLgJ+9HNyDv70VT1VLYiJNq/h0d/EMO1+FuT4ArBOTOSe4zNNXqD3w
 vVFtbBhjGMf1ffqiMM5GdOPh0vxacL8jfxM7xyQ4yooSkecZCEvtNnuCysNTFDbl
 53b9bjk+OSuWCb7efE6p82wU+gr617Zp2/YxALl4E0FlozeRHuRimWBtABZqi/g6
 aKJL42ASY+PLJPACDjo0LhDFuCRbd75OATUGtBva7mkYWUANlMc=
 =FuyV
 -----END PGP SIGNATURE-----

Merge tag 'iomap-6.0-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull more iomap updates from Darrick Wong:
 "In the past 10 days or so I've not heard any ZOMG STOP style
  complaints about removing ->writepage support from gfs2 or zonefs, so
  here's the pull request removing them (and the underlying fs iomap
  support) from the kernel:

   - Remove iomap_writepage and all callers, since the mm apparently
     never called the zonefs or gfs2 writepage functions"

* tag 'iomap-6.0-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  iomap: remove iomap_writepage
  zonefs: remove ->writepage
  gfs2: remove ->writepage
  gfs2: stop using generic_writepages in gfs2_ail1_start_one
2022-08-11 13:11:49 -07:00
Linus Torvalds
786da5da56 We have a good pile of various fixes and cleanups from Xiubo, Jeff,
Luis and others, almost exclusively in the filesystem.  Several patches
 touch files outside of our normal purview to set the stage for bringing
 in Jeff's long awaited ceph+fscrypt series in the near future.  All of
 them have appropriate acks and sat in linux-next for a while.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAmL1HF8THGlkcnlvbW92
 QGdtYWlsLmNvbQAKCRBKf944AhHziwOuB/97JKHFuOlP1HrD6fYe5a0ul9zC9VG4
 57XPDNqG2PSmfXCjvZhyVU4n53sUlJTqzKDSTXydoPCMQjtyHvysA6gEvcgUJFPd
 PHaZDCd9TmqX8my67NiTK70RVpNR9BujJMVMbOfM+aaisl0K6WQbitO+BfhEiJcK
 QStdKm5lPyf02ESH9jF+Ga0DpokARaLbtDFH7975owxske6gWuoPBCJNrkMooKiX
 LjgEmNgH1F/sJSZXftmKdlw9DtGBFaLQBdfbfSB5oVPRb7chI7xBeraNr6Od3rls
 o4davbFkcsOr+s6LJPDH2BJobmOg+HoMoma7ezspF7ZqBF4Uipv5j3VC
 =1427
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-5.20-rc1' of https://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
 "We have a good pile of various fixes and cleanups from Xiubo, Jeff,
  Luis and others, almost exclusively in the filesystem.

  Several patches touch files outside of our normal purview to set the
  stage for bringing in Jeff's long awaited ceph+fscrypt series in the
  near future. All of them have appropriate acks and sat in linux-next
  for a while"

* tag 'ceph-for-5.20-rc1' of https://github.com/ceph/ceph-client: (27 commits)
  libceph: clean up ceph_osdc_start_request prototype
  libceph: fix ceph_pagelist_reserve() comment typo
  ceph: remove useless check for the folio
  ceph: don't truncate file in atomic_open
  ceph: make f_bsize always equal to f_frsize
  ceph: flush the dirty caps immediatelly when quota is approaching
  libceph: print fsid and epoch with osd id
  libceph: check pointer before assigned to "c->rules[]"
  ceph: don't get the inline data for new creating files
  ceph: update the auth cap when the async create req is forwarded
  ceph: make change_auth_cap_ses a global symbol
  ceph: fix incorrect old_size length in ceph_mds_request_args
  ceph: switch back to testing for NULL folio->private in ceph_dirty_folio
  ceph: call netfs_subreq_terminated with was_async == false
  ceph: convert to generic_file_llseek
  ceph: fix the incorrect comment for the ceph_mds_caps struct
  ceph: don't leak snap_rwsem in handle_cap_grant
  ceph: prevent a client from exceeding the MDS maximum xattr size
  ceph: choose auth MDS for getxattr with the Xs caps
  ceph: add session already open notify support
  ...
2022-08-11 12:41:07 -07:00
Linus Torvalds
e18a90427c * Xen timer fixes
* Documentation formatting fixes
 
 * Make rseq selftest compatible with glibc-2.35
 
 * Fix handling of illegal LEA reg, reg
 
 * Cleanup creation of debugfs entries
 
 * Fix steal time cache handling bug
 
 * Fixes for MMIO caching
 
 * Optimize computation of number of LBRs
 
 * Fix uninitialized field in guest_maxphyaddr < host_maxphyaddr path
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmL0qwIUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroML1gf/SK6by+Gi0r7WSkrDjU94PKZ8D6Y3
 fErMhratccc9IfL3p90IjCVhEngfdQf5UVHExA5TswgHHAJTpECzuHya9TweQZc5
 2rrTvufup0MNALfzkSijrcI80CBvrJc6JyOCkv0BLp7yqXUrnrm0OOMV2XniS7y0
 YNn2ZCy44tLqkNiQrLhJQg3EsXu9l7okGpHSVO6iZwC7KKHvYkbscVFa/AOlaAwK
 WOZBB+1Ee+/pWhxsngM1GwwM3ZNU/jXOSVjew5plnrD4U7NYXIDATszbZAuNyxqV
 5gi+wvTF1x9dC6Tgd3qF7ouAqtT51BdRYaI9aYHOYgvzqdNFHWJu3XauDQ==
 =vI6Q
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more kvm updates from Paolo Bonzini:

 - Xen timer fixes

 - Documentation formatting fixes

 - Make rseq selftest compatible with glibc-2.35

 - Fix handling of illegal LEA reg, reg

 - Cleanup creation of debugfs entries

 - Fix steal time cache handling bug

 - Fixes for MMIO caching

 - Optimize computation of number of LBRs

 - Fix uninitialized field in guest_maxphyaddr < host_maxphyaddr path

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (26 commits)
  KVM: x86/MMU: properly format KVM_CAP_VM_DISABLE_NX_HUGE_PAGES capability table
  Documentation: KVM: extend KVM_CAP_VM_DISABLE_NX_HUGE_PAGES heading underline
  KVM: VMX: Adjust number of LBR records for PERF_CAPABILITIES at refresh
  KVM: VMX: Use proper type-safe functions for vCPU => LBRs helpers
  KVM: x86: Refresh PMU after writes to MSR_IA32_PERF_CAPABILITIES
  KVM: selftests: Test all possible "invalid" PERF_CAPABILITIES.LBR_FMT vals
  KVM: selftests: Use getcpu() instead of sched_getcpu() in rseq_test
  KVM: selftests: Make rseq compatible with glibc-2.35
  KVM: Actually create debugfs in kvm_create_vm()
  KVM: Pass the name of the VM fd to kvm_create_vm_debugfs()
  KVM: Get an fd before creating the VM
  KVM: Shove vcpu stats_id init into kvm_vcpu_init()
  KVM: Shove vm stats_id init into kvm_create_vm()
  KVM: x86/mmu: Add sanity check that MMIO SPTE mask doesn't overlap gen
  KVM: x86/mmu: rename trace function name for asynchronous page fault
  KVM: x86/xen: Stop Xen timer before changing IRQ
  KVM: x86/xen: Initialize Xen timer only once
  KVM: SVM: Disable SEV-ES support if MMIO caching is disable
  KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change
  KVM: x86: Tag kvm_mmu_x86_module_init() with __init
  ...
2022-08-11 12:10:08 -07:00
Jakub Kicinski
c2e75634cb net: atm: bring back zatm uAPI
Jiri reports that linux-atm does not build without this header.
Bring it back. It's completely dead code but we can't break
the build for user space :(

Reported-by: Jiri Slaby <jirislaby@kernel.org>
Fixes: 052e1f01bf ("net: atm: remove support for ZeitNet ZN122x ATM devices")
Link: https://lore.kernel.org/all/8576aef3-37e4-8bae-bab5-08f82a78efd3@kernel.org/
Link: https://lore.kernel.org/r/20220810164547.484378-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-11 10:31:19 -07:00
Chen Lin
e34f49348f dpaa2-eth: trace the allocated address instead of page struct
We should trace the allocated address instead of page struct.

Fixes: 27c874867c ("dpaa2-eth: Use a single page per Rx buffer")
Signed-off-by: Chen Lin <chen.lin5@zte.com.cn>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://lore.kernel.org/r/20220811151651.3327-1-chen45464546@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-11 10:30:04 -07:00
Rafael J. Wysocki
da2679f26b Merge branch 'acpi-properties'
Merge changes adding support for device properties with buffer values
to the ACPI device properties handling code.

* acpi-properties:
  ACPI: property: Fix error handling in acpi_init_properties()
  ACPI: property: Read buffer properties as integers
  ACPI: property: Add support for parsing buffer property UUID
  ACPI: property: Unify integer value reading functions
  ACPI: property: Switch node property referencing from ifs to a switch
  ACPI: property: Move property ref argument parsing into a new function
  ACPI: property: Use acpi_object_type consistently in property ref parsing
  ACPI: property: Tie data nodes to acpi handles
  ACPI: property: Return type of acpi_add_nondev_subnodes() should be bool
2022-08-11 19:21:03 +02:00
Jakub Kicinski
5c221f0af6 net: add missing kdoc for struct genl_multicast_group::flags
Multicast group flags were added in commit 4d54cc3211 ("mptcp: avoid
lock_fast usage in accept path"), but it missed adding the kdoc.

Mention which flags go into that field, and do the same for
op structs.

Link: https://lore.kernel.org/r/20220809232012.403730-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-11 09:26:04 -07:00
Linus Torvalds
2ae08b36c0 Input updates for 5.20 (6.0) merge window:
- changes to input core to properly queue synthetic events (such as
   autorepeat) and to release multitouch contacts when an input device is
   inhibited or suspended
 
 - reworked quirk handling in i8042 driver that consolidates multiple
   DMI tables into one and adds several quirks for TUXEDO line of
   laptops
 
 - update to mt6779 keypad to better reflect organization of the hardware
 
 - changes to mtk-pmic-keys driver preparing it to handle more variants
 
 - facelift of adp5588-keys driver
 
 - improvements to iqs7222 driver
 
 - adjustments to various DT binding documents for input devices
 
 - other assorted driver fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQST2eWILY88ieB2DOtAj56VGEWXnAUCYvFsGAAKCRBAj56VGEWX
 nMI6AQDQUwQpKtmCoDmxLTi/8Oy7qq0j9Sn8FNyWOaFykK3iTAD/eJNsY+PSn+M2
 bPzg1bduYK/eqZkomMaL2dAytHpr9AM=
 =XmKz
 -----END PGP SIGNATURE-----

Merge tag 'input-for-v5.20-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input updates from Dmitry Torokhov:

 - changes to input core to properly queue synthetic events (such as
   autorepeat) and to release multitouch contacts when an input device
   is inhibited or suspended

 - reworked quirk handling in i8042 driver that consolidates multiple
   DMI tables into one and adds several quirks for TUXEDO line of
   laptops

 - update to mt6779 keypad to better reflect organization of the
   hardware

 - changes to mtk-pmic-keys driver preparing it to handle more variants

 - facelift of adp5588-keys driver

 - improvements to iqs7222 driver

 - adjustments to various DT binding documents for input devices

 - other assorted driver fixes.

* tag 'input-for-v5.20-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (54 commits)
  Input: adc-joystick - fix ordering in adc_joystick_probe()
  dt-bindings: input: ariel-pwrbutton: use spi-peripheral-props.yaml
  Input: deactivate MT slots when inhibiting or suspending devices
  Input: properly queue synthetic events
  dt-bindings: input: iqs7222: Use central 'linux,code' definition
  Input: i8042 - add dritek quirk for Acer Aspire One AO532
  dt-bindings: input: gpio-keys: accept also interrupt-extended
  dt-bindings: input: gpio-keys: reference input.yaml and document properties
  dt-bindings: input: gpio-keys: enforce node names to match all properties
  dt-bindings: input: Convert adc-keys to DT schema
  dt-bindings: input: Centralize 'linux,input-type' definition
  dt-bindings: input: Use common 'linux,keycodes' definition
  dt-bindings: input: Centralize 'linux,code' definition
  dt-bindings: input: Increase maximum keycode value to 0x2ff
  Input: mt6779-keypad - implement row/column selection
  Input: mt6779-keypad - match hardware matrix organization
  Input: i8042 - add additional TUXEDO devices to i8042 quirk tables
  Input: goodix - switch use of acpi_gpio_get_*_resource() APIs
  Input: i8042 - add TUXEDO devices to i8042 quirk tables
  Input: i8042 - add debug output for quirks
  ...
2022-08-11 09:23:08 -07:00
Jialiang Wang
02e1a114fd nfp: fix use-after-free in area_cache_get()
area_cache_get() is used to distribute cache->area and set cache->id,
 and if cache->id is not 0 and cache->area->kref refcount is 0, it will
 release the cache->area by nfp_cpp_area_release(). area_cache_get()
 set cache->id before cpp->op->area_init() and nfp_cpp_area_acquire().

But if area_init() or nfp_cpp_area_acquire() fails, the cache->id is
 is already set but the refcount is not increased as expected. At this
 time, calling the nfp_cpp_area_release() will cause use-after-free.

To avoid the use-after-free, set cache->id after area_init() and
 nfp_cpp_area_acquire() complete successfully.

Note: This vulnerability is triggerable by providing emulated device
 equipped with specified configuration.

 BUG: KASAN: use-after-free in nfp6000_area_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:760)
  Write of size 4 at addr ffff888005b7f4a0 by task swapper/0/1

 Call Trace:
  <TASK>
 nfp6000_area_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:760)
 area_cache_get.constprop.8 (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:884)

 Allocated by task 1:
 nfp_cpp_area_alloc_with_name (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:303)
 nfp_cpp_area_cache_add (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:802)
 nfp6000_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:1230)
 nfp_cpp_from_operations (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:1215)
 nfp_pci_probe (drivers/net/ethernet/netronome/nfp/nfp_main.c:744)

 Freed by task 1:
 kfree (mm/slub.c:4562)
 area_cache_get.constprop.8 (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:873)
 nfp_cpp_read (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:924 drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:973)
 nfp_cpp_readl (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cpplib.c:48)

Signed-off-by: Jialiang Wang <wangjialiang0806@163.com>
Reviewed-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Acked-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220810073057.4032-1-wangjialiang0806@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-11 09:02:26 -07:00
Jakub Kicinski
cef8e3261b MAINTAINERS: use my korg address for mt7601u
Change my address for mt7601u to the main one.

Link: https://lore.kernel.org/r/20220809233843.408004-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-11 09:01:46 -07:00
Vadim Pasternak
4f98cb0408 mlxsw: minimal: Fix deadlock in ports creation
Drop devl_lock() / devl_unlock() from ports creation and removal flows
since the devlink instance lock is now taken by mlxsw_core.

Fixes: 72a4c8c94e ("mlxsw: convert driver to use unlocked devlink API during init/fini")
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/f4afce5ab0318617f3866b85274be52542d59b32.1660211614.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-11 08:59:33 -07:00
Jay Vosburgh
4f5d33f4f7 bonding: fix reference count leak in balance-alb mode
Commit d5410ac7b0 ("net:bonding:support balance-alb interface
with vlan to bridge") introduced a reference count leak by not releasing
the reference acquired by ip_dev_find().  Remedy this by insuring the
reference is released.

Fixes: d5410ac7b0 ("net:bonding:support balance-alb interface with vlan to bridge")
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/26758.1660194413@famine
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-11 08:51:05 -07:00
Linus Torvalds
21f9c8a13b Revert "Makefile.extrawarn: re-enable -Wformat for clang"
This reverts commit 258fafcd06.

The clang -Wformat warning is terminally broken, and the clang people
can't seem to get their act together.

This test program causes a warning with clang:

	#include <stdio.h>

	int main(int argc, char **argv)
	{
		printf("%hhu\n", 'a');
	}

resulting in

  t.c:5:19: warning: format specifies type 'unsigned char' but the argument has type 'int' [-Wformat]
          printf("%hhu\n", 'a');
                  ~~~~     ^~~
                  %d

and apparently clang people consider that a feature, because they don't
want to face the reality of how either C character constants, C
arithmetic, and C varargs functions work.

The rest of the world just shakes their head at that kind of
incompetence, and turns off -Wformat for clang again.

And no, the "you should use a pointless cast to shut this up" is not a
valid answer.  That warning should not exist in the first place, or at
least be optinal with some "-Wformat-me-harder" kind of option.

[ Admittedly, there's also very little reason to *ever* use '%hh[ud]' in
  C, but what little reason there is is entirely about 'I want to see
  only the low 8 bits of the argument'. So I would suggest nobody ever
  use that format in the first place, but if they do, the clang
  behavious is simply always wrong. Because '%hhu' takes an 'int'. It's
  that simple. ]

Reported-by: Sudip Mukherjee (Codethink) <sudipm.mukherjee@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-11 08:40:01 -07:00
Slark Xiao
ae7107baa5 net: usb: qmi_wwan: Add support for Cinterion MV32
There are 2 models for MV32 serials. MV32-W-A is designed
based on Qualcomm SDX62 chip, and MV32-W-B is designed based
on Qualcomm SDX65 chip. So we use 2 different PID to separate it.

Test evidence as below:
T:  Bus=03 Lev=01 Prnt=01 Port=02 Cnt=03 Dev#=  3 Spd=480 MxCh= 0
D:  Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=1e2d ProdID=00f3 Rev=05.04
S:  Manufacturer=Cinterion
S:  Product=Cinterion PID 0x00F3 USB Mobile Broadband
S:  SerialNumber=d7b4be8d
C:  #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=500mA
I:  If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan
I:  If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
I:  If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option

T:  Bus=03 Lev=01 Prnt=01 Port=02 Cnt=03 Dev#= 10 Spd=480 MxCh= 0
D:  Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=1e2d ProdID=00f4 Rev=05.04
S:  Manufacturer=Cinterion
S:  Product=Cinterion PID 0x00F4 USB Mobile Broadband
S:  SerialNumber=d095087d
C:  #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=500mA
I:  If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan
I:  If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
I:  If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option

Signed-off-by: Slark Xiao <slark_xiao@163.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Link: https://lore.kernel.org/r/20220810014521.9383-1-slark_xiao@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-11 07:58:42 -07:00
Jakub Kicinski
84ba289016 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Pull in bpf to silence a false positive warning.

* git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Shut up kern_sys_bpf warning.

Link: https://lore.kernel.org/r/CAADnVQK589CZN1Q9w8huJqkEyEed+ZMTWqcpA1Rm2CjN3a4XoQ@mail.gmail.com/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-11 07:50:29 -07:00
Alexei Starovoitov
4e4588f1c4 bpf: Shut up kern_sys_bpf warning.
Shut up this warning:
kernel/bpf/syscall.c:5089:5: warning: no previous prototype for function 'kern_sys_bpf' [-Wmissing-prototypes]
int kern_sys_bpf(int cmd, union bpf_attr *attr, unsigned int size)

Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-08-10 23:58:13 -07:00
Bagas Sanjaya
19a7cc817a KVM: x86/MMU: properly format KVM_CAP_VM_DISABLE_NX_HUGE_PAGES capability table
There is unexpected warning on KVM_CAP_VM_DISABLE_NX_HUGE_PAGES capability
table, which cause the table to be rendered as paragraph text instead.

The warning is due to missing colon at capability name and returns keyword,
as well as improper alignment on multi-line returns field.

Fix the warning by adding missing colons and aligning the field.

Link: https://lore.kernel.org/lkml/20220627181937.3be67263@canb.auug.org.au/
Fixes: 084cc29f8b ("KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: David Matlack <dmatlack@google.com>
Cc: Ben Gardon <bgardon@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: kvm@vger.kernel.org
Cc: linux-next@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Message-Id: <20220627095151.19339-3-bagasdotme@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-08-11 02:35:37 -04:00
Bagas Sanjaya
b4aed4d85f Documentation: KVM: extend KVM_CAP_VM_DISABLE_NX_HUGE_PAGES heading underline
Extend heading underline for KVM_CAP_VM_DISABLE_NX_HUGE_PAGE to match
the heading text length.

Link: https://lore.kernel.org/lkml/20220627181937.3be67263@canb.auug.org.au/
Fixes: 084cc29f8b ("KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: David Matlack <dmatlack@google.com>
Cc: Ben Gardon <bgardon@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: kvm@vger.kernel.org
Cc: linux-next@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Message-Id: <20220627095151.19339-2-bagasdotme@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-08-11 02:35:36 -04:00
Maxim Mikityanskiy
94ce3b64c6 net/tls: Use RCU API to access tls_ctx->netdev
Currently, tls_device_down synchronizes with tls_device_resync_rx using
RCU, however, the pointer to netdev is stored using WRITE_ONCE and
loaded using READ_ONCE.

Although such approach is technically correct (rcu_dereference is
essentially a READ_ONCE, and rcu_assign_pointer uses WRITE_ONCE to store
NULL), using special RCU helpers for pointers is more valid, as it
includes additional checks and might change the implementation
transparently to the callers.

Mark the netdev pointer as __rcu and use the correct RCU helpers to
access it. For non-concurrent access pass the right conditions that
guarantee safe access (locks taken, refcount value). Also use the
correct helper in mlx5e, where even READ_ONCE was missing.

The transition to RCU exposes existing issues, fixed by this commit:

1. bond_tls_device_xmit could read netdev twice, and it could become
NULL the second time, after the NULL check passed.

2. Drivers shouldn't stop processing the last packet if tls_device_down
just set netdev to NULL, before tls_dev_del was called. This prevents a
possible packet drop when transitioning to the fallback software mode.

Fixes: 89df6a8104 ("net/bonding: Implement TLS TX device offload")
Fixes: c55dcdd435 ("net/tls: Fix use-after-free after the TLS device goes down and up")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Link: https://lore.kernel.org/r/20220810081602.1435800-1-maximmi@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-10 22:58:43 -07:00
Jakub Kicinski
d800a7b357 tls: rx: device: don't try to copy too much on detach
Another device offload bug, we use the length of the output
skb as an indication of how much data to copy. But that skb
is sized to offset + record length, and we start from offset.
So we end up double-counting the offset which leads to
skb_copy_bits() returning -EFAULT.

Reported-by: Tariq Toukan <tariqt@nvidia.com>
Fixes: 84c61fe1a7 ("tls: rx: do not use the standard strparser")
Tested-by: Ran Rozenstein <ranro@nvidia.com>
Link: https://lore.kernel.org/r/20220809175544.354343-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-10 22:53:25 -07:00
Jakub Kicinski
86b259f6f8 tls: rx: device: bound the frag walk
We can't do skb_walk_frags() on the input skbs, because
the input skbs is really just a pointer to the tcp read
queue. We need to bound the "is decrypted" check by the
amount of data in the message.

Note that the walk in tls_device_reencrypt() is after a
CoW so the skb there is safe to walk. Actually in the
current implementation it can't have frags at all, but
whatever, maybe one day it will.

Reported-by: Tariq Toukan <tariqt@nvidia.com>
Fixes: 84c61fe1a7 ("tls: rx: do not use the standard strparser")
Tested-by: Ran Rozenstein <ranro@nvidia.com>
Link: https://lore.kernel.org/r/20220809175544.354343-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-10 22:53:25 -07:00
Thadeu Lima de Souza Cascardo
9ad36309e2 net_sched: cls_route: remove from list when handle is 0
When a route filter is replaced and the old filter has a 0 handle, the old
one won't be removed from the hashtable, while it will still be freed.

The test was there since before commit 1109c00547 ("net: sched: RCU
cls_route"), when a new filter was not allocated when there was an old one.
The old filter was reused and the reinserting would only be necessary if an
old filter was replaced. That was still wrong for the same case where the
old handle was 0.

Remove the old filter from the list independently from its handle value.

This fixes CVE-2022-2588, also reported as ZDI-CAN-17440.

Reported-by: Zhenpeng Lin <zplin@u.northwestern.edu>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Reviewed-by: Kamal Mostafa <kamal@canonical.com>
Cc: <stable@vger.kernel.org>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20220809170518.164662-1-cascardo@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-10 22:53:11 -07:00
Ido Schimmel
8bcfb4ae4d selftests: forwarding: Fix failing tests with old libnet
The custom multipath hash tests use mausezahn in order to test how
changes in various packet fields affect the packet distribution across
the available nexthops.

The tool uses the libnet library for various low-level packet
construction and injection. The library started using the
"SO_BINDTODEVICE" socket option for IPv6 sockets in version 1.1.6 and
for IPv4 sockets in version 1.2.

When the option is not set, packets are not routed according to the
table associated with the VRF master device and tests fail.

Fix this by prefixing the command with "ip vrf exec", which will cause
the route lookup to occur in the VRF routing table. This makes the tests
pass regardless of the libnet library version.

Fixes: 511e8db540 ("selftests: forwarding: Add test for custom multipath hash")
Fixes: 185b0c190b ("selftests: forwarding: Add test for custom multipath hash with IPv4 GRE")
Fixes: b7715acba4 ("selftests: forwarding: Add test for custom multipath hash with IPv6 GRE")
Reported-by: Ivan Vecera <ivecera@redhat.com>
Tested-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Link: https://lore.kernel.org/r/20220809113320.751413-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-10 22:36:14 -07:00
Jakub Kicinski
fbe8870f72 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
bpf 2022-08-10

We've added 23 non-merge commits during the last 7 day(s) which contain
a total of 19 files changed, 424 insertions(+), 35 deletions(-).

The main changes are:

1) Several fixes for BPF map iterator such as UAFs along with selftests, from Hou Tao.

2) Fix BPF syscall program's {copy,strncpy}_from_bpfptr() to not fault, from Jinghao Jia.

3) Reject BPF syscall programs calling BPF_PROG_RUN, from Alexei Starovoitov and YiFei Zhu.

4) Fix attach_btf_obj_id info to pick proper target BTF, from Stanislav Fomichev.

5) BPF design Q/A doc update to clarify what is not stable ABI, from Paul E. McKenney.

6) Fix BPF map's prealloc_lru_pop to not reinitialize, from Kumar Kartikeya Dwivedi.

7) Fix bpf_trampoline_put to avoid leaking ftrace hash, from Jiri Olsa.

8) Fix arm64 JIT to address sparse errors around BPF trampoline, from Xu Kuohai.

9) Fix arm64 JIT to use kvcalloc instead of kcalloc for internal program address
   offset buffer, from Aijun Sun.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: (23 commits)
  selftests/bpf: Ensure sleepable program is rejected by hash map iter
  selftests/bpf: Add write tests for sk local storage map iterator
  selftests/bpf: Add tests for reading a dangling map iter fd
  bpf: Only allow sleepable program for resched-able iterator
  bpf: Check the validity of max_rdwr_access for sock local storage map iterator
  bpf: Acquire map uref in .init_seq_private for sock{map,hash} iterator
  bpf: Acquire map uref in .init_seq_private for sock local storage map iterator
  bpf: Acquire map uref in .init_seq_private for hash map iterator
  bpf: Acquire map uref in .init_seq_private for array map iterator
  bpf: Disallow bpf programs call prog_run command.
  bpf, arm64: Fix bpf trampoline instruction endianness
  selftests/bpf: Add test for prealloc_lru_pop bug
  bpf: Don't reinit map value in prealloc_lru_pop
  bpf: Allow calling bpf_prog_test kfuncs in tracing programs
  bpf, arm64: Allocate program buffer using kvcalloc instead of kcalloc
  selftests/bpf: Excercise bpf_obj_get_info_by_fd for bpf2bpf
  bpf: Use proper target btf when exporting attach_btf_obj_id
  mptcp, btf: Add struct mptcp_sock definition when CONFIG_MPTCP is disabled
  bpf: Cleanup ftrace hash in bpf_trampoline_put
  BPF: Fix potential bad pointer dereference in bpf_sys_bpf()
  ...
====================

Link: https://lore.kernel.org/r/20220810190624.10748-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-10 21:48:15 -07:00
Jakub Kicinski
dd48f3832d Merge branch 'net-enhancements-to-sk_user_data-field'
Hawkins Jiawei says:

====================
net: enhancements to sk_user_data field

This patchset fixes refcount bug by adding SK_USER_DATA_PSOCK flag bit in
sk_user_data field. The bug cause following info:

WARNING: CPU: 1 PID: 3605 at lib/refcount.c:19 refcount_warn_saturate+0xf4/0x1e0 lib/refcount.c:19
Modules linked in:
CPU: 1 PID: 3605 Comm: syz-executor208 Not tainted 5.18.0-syzkaller-03023-g7e062cda7d90 #0
 <TASK>
 __refcount_add_not_zero include/linux/refcount.h:163 [inline]
 __refcount_inc_not_zero include/linux/refcount.h:227 [inline]
 refcount_inc_not_zero include/linux/refcount.h:245 [inline]
 sk_psock_get+0x3bc/0x410 include/linux/skmsg.h:439
 tls_data_ready+0x6d/0x1b0 net/tls/tls_sw.c:2091
 tcp_data_ready+0x106/0x520 net/ipv4/tcp_input.c:4983
 tcp_data_queue+0x25f2/0x4c90 net/ipv4/tcp_input.c:5057
 tcp_rcv_state_process+0x1774/0x4e80 net/ipv4/tcp_input.c:6659
 tcp_v4_do_rcv+0x339/0x980 net/ipv4/tcp_ipv4.c:1682
 sk_backlog_rcv include/net/sock.h:1061 [inline]
 __release_sock+0x134/0x3b0 net/core/sock.c:2849
 release_sock+0x54/0x1b0 net/core/sock.c:3404
 inet_shutdown+0x1e0/0x430 net/ipv4/af_inet.c:909
 __sys_shutdown_sock net/socket.c:2331 [inline]
 __sys_shutdown_sock net/socket.c:2325 [inline]
 __sys_shutdown+0xf1/0x1b0 net/socket.c:2343
 __do_sys_shutdown net/socket.c:2351 [inline]
 __se_sys_shutdown net/socket.c:2349 [inline]
 __x64_sys_shutdown+0x50/0x70 net/socket.c:2349
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
 </TASK>

To improve code maintainability, this patchset refactors sk_user_data
flags code to be more generic.
====================

Link: https://lore.kernel.org/r/cover.1659676823.git.yin31149@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-10 21:48:09 -07:00
Hawkins Jiawei
cf8c1e9672 net: refactor bpf_sk_reuseport_detach()
Refactor sk_user_data dereference using more generic function
__rcu_dereference_sk_user_data_with_flags(), which improve its
maintainability

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-10 21:48:04 -07:00
Hawkins Jiawei
2a0133723f net: fix refcount bug in sk_psock_get (2)
Syzkaller reports refcount bug as follows:
------------[ cut here ]------------
refcount_t: saturated; leaking memory.
WARNING: CPU: 1 PID: 3605 at lib/refcount.c:19 refcount_warn_saturate+0xf4/0x1e0 lib/refcount.c:19
Modules linked in:
CPU: 1 PID: 3605 Comm: syz-executor208 Not tainted 5.18.0-syzkaller-03023-g7e062cda7d90 #0
 <TASK>
 __refcount_add_not_zero include/linux/refcount.h:163 [inline]
 __refcount_inc_not_zero include/linux/refcount.h:227 [inline]
 refcount_inc_not_zero include/linux/refcount.h:245 [inline]
 sk_psock_get+0x3bc/0x410 include/linux/skmsg.h:439
 tls_data_ready+0x6d/0x1b0 net/tls/tls_sw.c:2091
 tcp_data_ready+0x106/0x520 net/ipv4/tcp_input.c:4983
 tcp_data_queue+0x25f2/0x4c90 net/ipv4/tcp_input.c:5057
 tcp_rcv_state_process+0x1774/0x4e80 net/ipv4/tcp_input.c:6659
 tcp_v4_do_rcv+0x339/0x980 net/ipv4/tcp_ipv4.c:1682
 sk_backlog_rcv include/net/sock.h:1061 [inline]
 __release_sock+0x134/0x3b0 net/core/sock.c:2849
 release_sock+0x54/0x1b0 net/core/sock.c:3404
 inet_shutdown+0x1e0/0x430 net/ipv4/af_inet.c:909
 __sys_shutdown_sock net/socket.c:2331 [inline]
 __sys_shutdown_sock net/socket.c:2325 [inline]
 __sys_shutdown+0xf1/0x1b0 net/socket.c:2343
 __do_sys_shutdown net/socket.c:2351 [inline]
 __se_sys_shutdown net/socket.c:2349 [inline]
 __x64_sys_shutdown+0x50/0x70 net/socket.c:2349
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
 </TASK>

During SMC fallback process in connect syscall, kernel will
replaces TCP with SMC. In order to forward wakeup
smc socket waitqueue after fallback, kernel will sets
clcsk->sk_user_data to origin smc socket in
smc_fback_replace_callbacks().

Later, in shutdown syscall, kernel will calls
sk_psock_get(), which treats the clcsk->sk_user_data
as psock type, triggering the refcnt warning.

So, the root cause is that smc and psock, both will use
sk_user_data field. So they will mismatch this field
easily.

This patch solves it by using another bit(defined as
SK_USER_DATA_PSOCK) in PTRMASK, to mark whether
sk_user_data points to a psock object or not.
This patch depends on a PTRMASK introduced in commit f1ff5ce2cd
("net, sk_msg: Clear sk_user_data pointer on clone if tagged").

For there will possibly be more flags in the sk_user_data field,
this patch also refactor sk_user_data flags code to be more generic
to improve its maintainability.

Reported-and-tested-by: syzbot+5f26f85569bd179c18ce@syzkaller.appspotmail.com
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-10 21:47:58 -07:00
Nick Desaulniers
ffcf9c5700 x86: link vdso and boot with -z noexecstack --no-warn-rwx-segments
Users of GNU ld (BFD) from binutils 2.39+ will observe multiple
instances of a new warning when linking kernels in the form:

  ld: warning: arch/x86/boot/pmjump.o: missing .note.GNU-stack section implies executable stack
  ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
  ld: warning: arch/x86/boot/compressed/vmlinux has a LOAD segment with RWX permissions

Generally, we would like to avoid the stack being executable.  Because
there could be a need for the stack to be executable, assembler sources
have to opt-in to this security feature via explicit creation of the
.note.GNU-stack feature (which compilers create by default) or command
line flag --noexecstack.  Or we can simply tell the linker the
production of such sections is irrelevant and to link the stack as
--noexecstack.

LLVM's LLD linker defaults to -z noexecstack, so this flag isn't
strictly necessary when linking with LLD, only BFD, but it doesn't hurt
to be explicit here for all linkers IMO.  --no-warn-rwx-segments is
currently BFD specific and only available in the current latest release,
so it's wrapped in an ld-option check.

While the kernel makes extensive usage of ELF sections, it doesn't use
permissions from ELF segments.

Link: https://lore.kernel.org/linux-block/3af4127a-f453-4cf7-f133-a181cce06f73@kernel.dk/
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ba951afb99912da01a6e8434126b8fac7aa75107
Link: https://github.com/llvm/llvm-project/issues/57009
Reported-and-tested-by: Jens Axboe <axboe@kernel.dk>
Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-10 18:30:09 -07:00
Nick Desaulniers
0d362be5b1 Makefile: link with -z noexecstack --no-warn-rwx-segments
Users of GNU ld (BFD) from binutils 2.39+ will observe multiple
instances of a new warning when linking kernels in the form:

  ld: warning: vmlinux: missing .note.GNU-stack section implies executable stack
  ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
  ld: warning: vmlinux has a LOAD segment with RWX permissions

Generally, we would like to avoid the stack being executable.  Because
there could be a need for the stack to be executable, assembler sources
have to opt-in to this security feature via explicit creation of the
.note.GNU-stack feature (which compilers create by default) or command
line flag --noexecstack.  Or we can simply tell the linker the
production of such sections is irrelevant and to link the stack as
--noexecstack.

LLVM's LLD linker defaults to -z noexecstack, so this flag isn't
strictly necessary when linking with LLD, only BFD, but it doesn't hurt
to be explicit here for all linkers IMO.  --no-warn-rwx-segments is
currently BFD specific and only available in the current latest release,
so it's wrapped in an ld-option check.

While the kernel makes extensive usage of ELF sections, it doesn't use
permissions from ELF segments.

Link: https://lore.kernel.org/linux-block/3af4127a-f453-4cf7-f133-a181cce06f73@kernel.dk/
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ba951afb99912da01a6e8434126b8fac7aa75107
Link: https://github.com/llvm/llvm-project/issues/57009
Reported-and-tested-by: Jens Axboe <axboe@kernel.dk>
Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-10 18:29:34 -07:00