Commit Graph

1185979 Commits

Author SHA1 Message Date
Vladislav Efanov
448a5ce112 udp6: Fix race condition in udp6_sendmsg & connect
Syzkaller got the following report:
BUG: KASAN: use-after-free in sk_setup_caps+0x621/0x690 net/core/sock.c:2018
Read of size 8 at addr ffff888027f82780 by task syz-executor276/3255

The function sk_setup_caps (called by ip6_sk_dst_store_flow->
ip6_dst_store) referenced already freed memory as this memory was
freed by parallel task in udpv6_sendmsg->ip6_sk_dst_lookup_flow->
sk_dst_check.

          task1 (connect)              task2 (udp6_sendmsg)
        sk_setup_caps->sk_dst_set |
                                  |  sk_dst_check->
                                  |      sk_dst_set
                                  |      dst_release
        sk_setup_caps references  |
        to already freed dst_entry|

The reason for this race condition is: sk_setup_caps() keeps using
the dst after transferring the ownership to the dst cache.

Found by Linux Verification Center (linuxtesting.org) with syzkaller.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Vladislav Efanov <VEfanov@ispras.ru>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-31 10:35:10 +01:00
Pedro Tammela
f4e4534850 net/netlink: fix NETLINK_LIST_MEMBERSHIPS length report
The current code for the length calculation wrongly truncates the reported
length of the groups array, causing an under report of the subscribed
groups. To fix this, use 'BITS_TO_BYTES()' which rounds up the
division by 8.

Fixes: b42be38b27 ("netlink: add API to retrieve all group memberships")
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230529153335.389815-1-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-31 00:02:24 -07:00
Zhengchao Shao
36eec020fa net: sched: fix NULL pointer dereference in mq_attach
When use the following command to test:
1)ip link add bond0 type bond
2)ip link set bond0 up
3)tc qdisc add dev bond0 root handle ffff: mq
4)tc qdisc replace dev bond0 parent ffff:fff1 handle ffff: mq

The kernel reports NULL pointer dereference issue. The stack information
is as follows:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
Internal error: Oops: 0000000096000006 [#1] SMP
Modules linked in:
pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : mq_attach+0x44/0xa0
lr : qdisc_graft+0x20c/0x5cc
sp : ffff80000e2236a0
x29: ffff80000e2236a0 x28: ffff0000c0e59d80 x27: ffff0000c0be19c0
x26: ffff0000cae3e800 x25: 0000000000000010 x24: 00000000fffffff1
x23: 0000000000000000 x22: ffff0000cae3e800 x21: ffff0000c9df4000
x20: ffff0000c9df4000 x19: 0000000000000000 x18: ffff80000a934000
x17: ffff8000f5b56000 x16: ffff80000bb08000 x15: 0000000000000000
x14: 0000000000000000 x13: 6b6b6b6b6b6b6b6b x12: 6b6b6b6b00000001
x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
x8 : ffff0000c0be0730 x7 : bbbbbbbbbbbbbbbb x6 : 0000000000000008
x5 : ffff0000cae3e864 x4 : 0000000000000000 x3 : 0000000000000001
x2 : 0000000000000001 x1 : ffff8000090bc23c x0 : 0000000000000000
Call trace:
mq_attach+0x44/0xa0
qdisc_graft+0x20c/0x5cc
tc_modify_qdisc+0x1c4/0x664
rtnetlink_rcv_msg+0x354/0x440
netlink_rcv_skb+0x64/0x144
rtnetlink_rcv+0x28/0x34
netlink_unicast+0x1e8/0x2a4
netlink_sendmsg+0x308/0x4a0
sock_sendmsg+0x64/0xac
____sys_sendmsg+0x29c/0x358
___sys_sendmsg+0x90/0xd0
__sys_sendmsg+0x7c/0xd0
__arm64_sys_sendmsg+0x2c/0x38
invoke_syscall+0x54/0x114
el0_svc_common.constprop.1+0x90/0x174
do_el0_svc+0x3c/0xb0
el0_svc+0x24/0xec
el0t_64_sync_handler+0x90/0xb4
el0t_64_sync+0x174/0x178

This is because when mq is added for the first time, qdiscs in mq is set
to NULL in mq_attach(). Therefore, when replacing mq after adding mq, we
need to initialize qdiscs in the mq before continuing to graft. Otherwise,
it will couse NULL pointer dereference issue in mq_attach(). And the same
issue will occur in the attach functions of mqprio, taprio and htb.
ffff:fff1 means that the repalce qdisc is ingress. Ingress does not allow
any qdisc to be attached. Therefore, ffff:fff1 is incorrectly used, and
the command should be dropped.

Fixes: 6ec1c69a8f ("net_sched: add classful multiqueue dummy scheduler")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Tested-by: Peilin Ye <peilin.ye@bytedance.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20230527093747.3583502-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-30 23:31:40 -07:00
Jakub Kicinski
bb50f12c69 Merge branch 'net-sched-fixes-for-sch_ingress-and-sch_clsact'
Peilin Ye says:

====================
net/sched: Fixes for sch_ingress and sch_clsact

These are v6 fixes for ingress and clsact Qdiscs, including only first 4
patches (already tested and reviewed) from v5.  Patch 5 and 6 from
previous versions are still under discussion and will be sent separately.

[a] https://syzkaller.appspot.com/bug?extid=b53a9c0d1ea4ad62da8b

Link to v5: https://lore.kernel.org/r/cover.1684887977.git.peilin.ye@bytedance.com/
Link to v4: https://lore.kernel.org/r/cover.1684825171.git.peilin.ye@bytedance.com/
Link to v3 (incomplete): https://lore.kernel.org/r/cover.1684821877.git.peilin.ye@bytedance.com/
Link to v2: https://lore.kernel.org/r/cover.1684796705.git.peilin.ye@bytedance.com/
Link to v1: https://lore.kernel.org/r/cover.1683326865.git.peilin.ye@bytedance.com/
====================

Link: https://lore.kernel.org/r/cover.1685388545.git.peilin.ye@bytedance.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-30 23:31:06 -07:00
Peilin Ye
9de95df5d1 net/sched: Prohibit regrafting ingress or clsact Qdiscs
Currently, after creating an ingress (or clsact) Qdisc and grafting it
under TC_H_INGRESS (TC_H_CLSACT), it is possible to graft it again under
e.g. a TBF Qdisc:

  $ ip link add ifb0 type ifb
  $ tc qdisc add dev ifb0 handle 1: root tbf rate 20kbit buffer 1600 limit 3000
  $ tc qdisc add dev ifb0 clsact
  $ tc qdisc link dev ifb0 handle ffff: parent 1:1
  $ tc qdisc show dev ifb0
  qdisc tbf 1: root refcnt 2 rate 20Kbit burst 1600b lat 560.0ms
  qdisc clsact ffff: parent ffff:fff1 refcnt 2
                                      ^^^^^^^^

clsact's refcount has increased: it is now grafted under both
TC_H_CLSACT and 1:1.

ingress and clsact Qdiscs should only be used under TC_H_INGRESS
(TC_H_CLSACT).  Prohibit regrafting them.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Fixes: 1f211a1b92 ("net, sched: add clsact qdisc")
Tested-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-30 23:31:05 -07:00
Peilin Ye
f85fa45d4a net/sched: Reserve TC_H_INGRESS (TC_H_CLSACT) for ingress (clsact) Qdiscs
Currently it is possible to add e.g. an HTB Qdisc under ffff:fff1
(TC_H_INGRESS, TC_H_CLSACT):

  $ ip link add name ifb0 type ifb
  $ tc qdisc add dev ifb0 parent ffff:fff1 htb
  $ tc qdisc add dev ifb0 clsact
  Error: Exclusivity flag on, cannot modify.
  $ drgn
  ...
  >>> ifb0 = netdev_get_by_name(prog, "ifb0")
  >>> qdisc = ifb0.ingress_queue.qdisc_sleeping
  >>> print(qdisc.ops.id.string_().decode())
  htb
  >>> qdisc.flags.value_() # TCQ_F_INGRESS
  2

Only allow ingress and clsact Qdiscs under ffff:fff1.  Return -EINVAL
for everything else.  Make TCQ_F_INGRESS a static flag of ingress and
clsact Qdiscs.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Fixes: 1f211a1b92 ("net, sched: add clsact qdisc")
Tested-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-30 23:31:05 -07:00
Peilin Ye
5eeebfe6c4 net/sched: sch_clsact: Only create under TC_H_CLSACT
clsact Qdiscs are only supposed to be created under TC_H_CLSACT (which
equals TC_H_INGRESS).  Return -EOPNOTSUPP if 'parent' is not
TC_H_CLSACT.

Fixes: 1f211a1b92 ("net, sched: add clsact qdisc")
Tested-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-30 23:31:05 -07:00
Peilin Ye
c7cfbd1150 net/sched: sch_ingress: Only create under TC_H_INGRESS
ingress Qdiscs are only supposed to be created under TC_H_INGRESS.
Return -EOPNOTSUPP if 'parent' is not TC_H_INGRESS, similar to
mq_init().

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzbot+b53a9c0d1ea4ad62da8b@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/0000000000006cf87705f79acf1a@google.com/
Tested-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-30 23:31:04 -07:00
Paolo Abeni
7ba0732c80 Merge branch 'selftests-mptcp-skip-tests-not-supported-by-old-kernels-part-1'
Matthieu Baerts says:

====================
selftests: mptcp: skip tests not supported by old kernels (part 1)

After a few years of increasing test coverage in the MPTCP selftests, we
realised [1] the last version of the selftests is supposed to run on old
kernels without issues.

Supporting older versions is not that easy for this MPTCP case: these
selftests are often validating the internals by checking packets that
are exchanged, when some MIB counters are incremented after some
actions, how connections are getting opened and closed in some cases,
etc. In other words, it is not limited to the socket interface between
the userspace and the kernelspace. In addition, the current selftests
run a lot of different sub-tests but the TAP13 protocol used in the
selftests don't support sub-tests: in other words, one failure in
sub-tests implies that the whole selftest is seen as failed at the end
because sub-tests are not tracked. It is then important to skip
sub-tests not supported by old kernels.

To minimise the modifications and reduce the complexity to support old
versions, the idea is to look at external signs and skip the whole
selftests or just some sub-tests before starting them.

This first part focuses on marking the different selftests as skipped
if MPTCP is not even supported. That's what is done in patches 2 to 8.
Patch 2/8 introduces a new file (mptcp_lib.sh) to be able to re-use some
helpers in the different selftests. The first MPTCP selftest has been
introduced in v5.6.

Patch 1/8 is a bit different but still linked: it modifies mptcp_join.sh
selftest not to use 'cmp --bytes' which is not supported by the BusyBox
implementation. It is apparently quite common to use BusyBox in CI
environments. This tool is needed for a subtest introduced in v6.1.

Link: https://lore.kernel.org/stable/CA+G9fYtDGpgT4dckXD-y-N92nqUxuvue_7AtDdBcHrbOMsDZLg@mail.gmail.com/ [1]
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
====================

Link: https://lore.kernel.org/r/20230528-upstream-net-20230528-mptcp-selftests-support-old-kernels-part-1-v1-0-a32d85577fc6@tessares.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30 13:21:05 +02:00
Matthieu Baerts
63212608a9 selftests: mptcp: userspace pm: skip if MPTCP is not supported
Selftests are supposed to run on any kernels, including the old ones not
supporting MPTCP.

A new check is then added to make sure MPTCP is supported. If not, the
test stops and is marked as "skipped".

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: 259a834fad ("selftests: mptcp: functional tests for the userspace PM type")
Cc: stable@vger.kernel.org
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30 13:21:03 +02:00
Matthieu Baerts
cf6f0fda7a selftests: mptcp: sockopt: skip if MPTCP is not supported
Selftests are supposed to run on any kernels, including the old ones not
supporting MPTCP.

A new check is then added to make sure MPTCP is supported. If not, the
test stops and is marked as "skipped".

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: dc65fe82fb ("selftests: mptcp: add packet mark test case")
Cc: stable@vger.kernel.org
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30 13:21:03 +02:00
Matthieu Baerts
9161f21c74 selftests: mptcp: simult flows: skip if MPTCP is not supported
Selftests are supposed to run on any kernels, including the old ones not
supporting MPTCP.

A new check is then added to make sure MPTCP is supported. If not, the
test stops and is marked as "skipped".

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: 1a418cb8e8 ("mptcp: simult flow self-tests")
Cc: stable@vger.kernel.org
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30 13:21:03 +02:00
Matthieu Baerts
46565acdd2 selftests: mptcp: diag: skip if MPTCP is not supported
Selftests are supposed to run on any kernels, including the old ones not
supporting MPTCP.

A new check is then added to make sure MPTCP is supported. If not, the
test stops and is marked as "skipped".

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: df62f2ec3d ("selftests/mptcp: add diag interface tests")
Cc: stable@vger.kernel.org
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30 13:21:02 +02:00
Matthieu Baerts
715c78a82e selftests: mptcp: join: skip if MPTCP is not supported
Selftests are supposed to run on any kernels, including the old ones not
supporting MPTCP.

A new check is then added to make sure MPTCP is supported. If not, the
test stops and is marked as "skipped".

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: b08fbf2410 ("selftests: add test-cases for MPTCP MP_JOIN")
Cc: stable@vger.kernel.org
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30 13:21:02 +02:00
Matthieu Baerts
0f4955a40d selftests: mptcp: pm nl: skip if MPTCP is not supported
Selftests are supposed to run on any kernels, including the old ones not
supporting MPTCP.

A new check is then added to make sure MPTCP is supported. If not, the
test stops and is marked as "skipped".

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: eedbc68532 ("selftests: add PM netlink functional tests")
Cc: stable@vger.kernel.org
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30 13:21:02 +02:00
Matthieu Baerts
d83013bdf9 selftests: mptcp: connect: skip if MPTCP is not supported
Selftests are supposed to run on any kernels, including the old ones not
supporting MPTCP.

A new check is then added to make sure MPTCP is supported. If not, the
test stops and is marked as "skipped". Note that this check can also
mark the test as failed if 'SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES' env
var is set to 1: by doing that, we can make sure a test is not being
skipped by mistake.

A new shared file is added here to be able to re-used the same check in
the different selftests we have.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368
Fixes: 048d19d444 ("mptcp: add basic kselftest for mptcp")
Cc: stable@vger.kernel.org
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30 13:21:02 +02:00
Matthieu Baerts
d328fe8706 selftests: mptcp: join: avoid using 'cmp --bytes'
BusyBox's 'cmp' command doesn't support the '--bytes' parameter.

Some CIs -- i.e. LKFT -- use BusyBox and have the mptcp_join.sh test
failing [1] because their 'cmp' command doesn't support this '--bytes'
option:

    cmp: unrecognized option '--bytes=1024'
    BusyBox v1.35.0 () multi-call binary.

    Usage: cmp [-ls] [-n NUM] FILE1 [FILE2]

Instead, 'head --bytes' can be used as this option is supported by
BusyBox. A temporary file is needed for this operation.

Because it is apparently quite common to use BusyBox, it is certainly
better to backport this fix to impacted kernels.

Fixes: 6bf41020b7 ("selftests: mptcp: update and extend fastclose test-cases")
Cc: stable@vger.kernel.org
Link: https://qa-reports.linaro.org/lkft/linux-mainline-master/build/v6.3-rc5-5-g148341f0a2f5/testrun/16088933/suite/kselftest-net-mptcp/test/net_mptcp_userspace_pm_sh/log [1]
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30 13:21:02 +02:00
Haiyang Zhang
1919b39fc6 net: mana: Fix perf regression: remove rx_cqes, tx_cqes counters
The apc->eth_stats.rx_cqes is one per NIC (vport), and it's on the
frequent and parallel code path of all queues. So, r/w into this
single shared variable by many threads on different CPUs creates a
lot caching and memory overhead, hence perf regression. And, it's
not accurate due to the high volume concurrent r/w.

For example, a workload is iperf with 128 threads, and with RPS
enabled. We saw perf regression of 25% with the previous patch
adding the counters. And this patch eliminates the regression.

Since the error path of mana_poll_rx_cq() already has warnings, so
keeping the counter and convert it to a per-queue variable is not
necessary. So, just remove this counter from this high frequency
code path.

Also, remove the tx_cqes counter for the same reason. We have
warnings & other counters for errors on that path, and don't need
to count every normal cqe processing.

Cc: stable@vger.kernel.org
Fixes: bd7fc6e195 ("net: mana: Add new MANA VF performance counters for easier troubleshooting")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/1685115537-31675-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30 12:05:22 +02:00
Paolo Abeni
111d467485 Merge branch 'two-fixes-for-smcrv2'
Wen Gu says:

====================
Two fixes for SMCRv2

This patch set includes two bugfix for SMCRv2.
====================

Link: https://lore.kernel.org/r/1685101741-74826-1-git-send-email-guwen@linux.alibaba.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30 11:26:34 +02:00
Wen Gu
71c6aa0305 net/smc: Don't use RMBs not mapped to new link in SMCRv2 ADD LINK
We encountered a crash when using SMCRv2. It is caused by a logical
error in smc_llc_fill_ext_v2().

 BUG: kernel NULL pointer dereference, address: 0000000000000014
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] PREEMPT SMP PTI
 CPU: 7 PID: 453 Comm: kworker/7:4 Kdump: loaded Tainted: G        W   E      6.4.0-rc3+ #44
 Workqueue: events smc_llc_add_link_work [smc]
 RIP: 0010:smc_llc_fill_ext_v2+0x117/0x280 [smc]
 RSP: 0018:ffffacb5c064bd88 EFLAGS: 00010282
 RAX: ffff9a6bc1c3c02c RBX: ffff9a6be3558000 RCX: 0000000000000000
 RDX: 0000000000000002 RSI: 0000000000000002 RDI: 000000000000000a
 RBP: ffffacb5c064bdb8 R08: 0000000000000040 R09: 000000000000000c
 R10: ffff9a6bc0910300 R11: 0000000000000002 R12: 0000000000000000
 R13: 0000000000000002 R14: ffff9a6bc1c3c02c R15: ffff9a6be3558250
 FS:  0000000000000000(0000) GS:ffff9a6eefdc0000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000014 CR3: 000000010b078003 CR4: 00000000003706e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  <TASK>
  smc_llc_send_add_link+0x1ae/0x2f0 [smc]
  smc_llc_srv_add_link+0x2c9/0x5a0 [smc]
  ? cc_mkenc+0x40/0x60
  smc_llc_add_link_work+0xb8/0x140 [smc]
  process_one_work+0x1e5/0x3f0
  worker_thread+0x4d/0x2f0
  ? __pfx_worker_thread+0x10/0x10
  kthread+0xe5/0x120
  ? __pfx_kthread+0x10/0x10
  ret_from_fork+0x2c/0x50
  </TASK>

When an alernate RNIC is available in system, SMC will try to add a new
link based on the RNIC for resilience. All the RMBs in use will be mapped
to the new link. Then the RMBs' MRs corresponding to the new link will be
filled into SMCRv2 LLC ADD LINK messages.

However, smc_llc_fill_ext_v2() mistakenly accesses to unused RMBs which
haven't been mapped to the new link and have no valid MRs, thus causing
a crash. So this patch fixes the logic.

Fixes: b4ba4652b3 ("net/smc: extend LLC layer for SMC-Rv2")
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30 11:26:32 +02:00
Wen Gu
b24aa141c2 net/smc: Scan from current RMB list when no position specified
When finding the first RMB of link group, it should start from the
current RMB list whose index is 0. So fix it.

Fixes: b4ba4652b3 ("net/smc: extend LLC layer for SMC-Rv2")
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30 11:26:32 +02:00
David Howells
020c69c1a7 rxrpc: Truncate UTS_RELEASE for rxrpc version
UTS_RELEASE has a maximum length of 64 which can cause rxrpc_version to
exceed the 65 byte message limit.

Per the rx spec[1]: "If a server receives a packet with a type value of 13,
and the client-initiated flag set, it should respond with a 65-byte payload
containing a string that identifies the version of AFS software it is
running."

The current implementation causes a compile error when WERROR is turned on
and/or UTS_RELEASE exceeds the length of 49 (making the version string more
than 64 characters).

Fix this by generating the string during module initialisation and limiting
the UTS_RELEASE segment of the string does not exceed 49 chars.  We need to
make sure that the 64 bytes includes "linux-" at the front and " AF_RXRPC"
at the back as this may be used in pattern matching.

Fixes: 44ba06987c ("RxRPC: Handle VERSION Rx protocol packets")
Reported-by: Kenny Ho <Kenny.Ho@amd.com>
Link: https://lore.kernel.org/r/20230523223944.691076-1-Kenny.Ho@amd.com/
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Kenny Ho <Kenny.Ho@amd.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Andrew Lunn <andrew@lunn.ch>
cc: David Laight <David.Laight@ACULAB.COM>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: linux-afs@lists.infradead.org
cc: netdev@vger.kernel.org
Link: https://web.mit.edu/kolya/afs/rx/rx-spec [1]
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
Link: https://lore.kernel.org/r/654974.1685100894@warthog.procyon.org.uk
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30 10:01:06 +02:00
Cambda Zhu
34dfde4ad8 tcp: Return user_mss for TCP_MAXSEG in CLOSE/LISTEN state if user_mss set
This patch replaces the tp->mss_cache check in getting TCP_MAXSEG
with tp->rx_opt.user_mss check for CLOSE/LISTEN sock. Since
tp->mss_cache is initialized with TCP_MSS_DEFAULT, checking if
it's zero is probably a bug.

With this change, getting TCP_MAXSEG before connecting will return
default MSS normally, and return user_mss if user_mss is set.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: Jack Yang <mingliang@linux.alibaba.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/netdev/CANn89i+3kL9pYtkxkwxwNMzvC_w3LNUum_2=3u+UyLBmGmifHA@mail.gmail.com/#t
Signed-off-by: Cambda Zhu <cambda@linux.alibaba.com>
Link: https://lore.kernel.org/netdev/14D45862-36EA-4076-974C-EA67513C92F6@linux.alibaba.com/
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230527040317.68247-1-cambda@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-29 22:03:48 -07:00
Eric Dumazet
4faeee0cf8 tcp: deny tcp_disconnect() when threads are waiting
Historically connect(AF_UNSPEC) has been abused by syzkaller
and other fuzzers to trigger various bugs.

A recent one triggers a divide-by-zero [1], and Paolo Abeni
was able to diagnose the issue.

tcp_recvmsg_locked() has tests about sk_state being not TCP_LISTEN
and TCP REPAIR mode being not used.

Then later if socket lock is released in sk_wait_data(),
another thread can call connect(AF_UNSPEC), then make this
socket a TCP listener.

When recvmsg() is resumed, it can eventually call tcp_cleanup_rbuf()
and attempt a divide by 0 in tcp_rcv_space_adjust() [1]

This patch adds a new socket field, counting number of threads
blocked in sk_wait_event() and inet_wait_for_connect().

If this counter is not zero, tcp_disconnect() returns an error.

This patch adds code in blocking socket system calls, thus should
not hurt performance of non blocking ones.

Note that we probably could revert commit 499350a5a6 ("tcp:
initialize rcv_mss to TCP_MIN_MSS instead of 0") to restore
original tcpi_rcv_mss meaning (was 0 if no payload was ever
received on a socket)

[1]
divide error: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 13832 Comm: syz-executor.5 Not tainted 6.3.0-rc4-syzkaller-00224-g00c7b5f4ddc5 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023
RIP: 0010:tcp_rcv_space_adjust+0x36e/0x9d0 net/ipv4/tcp_input.c:740
Code: 00 00 00 00 fc ff df 4c 89 64 24 48 8b 44 24 04 44 89 f9 41 81 c7 80 03 00 00 c1 e1 04 44 29 f0 48 63 c9 48 01 e9 48 0f af c1 <49> f7 f6 48 8d 04 41 48 89 44 24 40 48 8b 44 24 30 48 c1 e8 03 48
RSP: 0018:ffffc900033af660 EFLAGS: 00010206
RAX: 4a66b76cbade2c48 RBX: ffff888076640cc0 RCX: 00000000c334e4ac
RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000001
RBP: 00000000c324e86c R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8880766417f8
R13: ffff888028fbb980 R14: 0000000000000000 R15: 0000000000010344
FS: 00007f5bffbfe700(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b32f25000 CR3: 000000007ced0000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
tcp_recvmsg_locked+0x100e/0x22e0 net/ipv4/tcp.c:2616
tcp_recvmsg+0x117/0x620 net/ipv4/tcp.c:2681
inet6_recvmsg+0x114/0x640 net/ipv6/af_inet6.c:670
sock_recvmsg_nosec net/socket.c:1017 [inline]
sock_recvmsg+0xe2/0x160 net/socket.c:1038
____sys_recvmsg+0x210/0x5a0 net/socket.c:2720
___sys_recvmsg+0xf2/0x180 net/socket.c:2762
do_recvmmsg+0x25e/0x6e0 net/socket.c:2856
__sys_recvmmsg net/socket.c:2935 [inline]
__do_sys_recvmmsg net/socket.c:2958 [inline]
__se_sys_recvmmsg net/socket.c:2951 [inline]
__x64_sys_recvmmsg+0x20f/0x260 net/socket.c:2951
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f5c0108c0f9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f5bffbfe168 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
RAX: ffffffffffffffda RBX: 00007f5c011ac050 RCX: 00007f5c0108c0f9
RDX: 0000000000000001 RSI: 0000000020000bc0 RDI: 0000000000000003
RBP: 00007f5c010e7b39 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000122 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f5c012cfb1f R14: 00007f5bffbfe300 R15: 0000000000022000
</TASK>

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-by: Paolo Abeni <pabeni@redhat.com>
Diagnosed-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/20230526163458.2880232-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-29 22:03:48 -07:00
Eric Dumazet
6ffc57ea00 af_packet: do not use READ_ONCE() in packet_bind()
A recent patch added READ_ONCE() in packet_bind() and packet_bind_spkt()

This is better handled by reading pkt_sk(sk)->num later
in packet_do_bind() while appropriate lock is held.

READ_ONCE() in writers are often an evidence of something being wrong.

Fixes: 822b5a1c17 ("af_packet: Fix data-races of pkt_sk(sk)->num.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230526154342.2533026-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-29 22:03:48 -07:00
Jakub Kicinski
0684f29a89 netlink: specs: correct types of legacy arrays
ethtool has some attrs which dump multiple scalars into
an attribute. The spec currently expects one attr per entry.

Fixes: a353318ebf ("tools: ynl: populate most of the ethtool spec")
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20230526220653.65538-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-29 22:03:48 -07:00
Sebastian Krzyszkowiak
36936a56e1 net: usb: qmi_wwan: Set DTR quirk for BroadMobi BM818
BM818 is based on Qualcomm MDM9607 chipset.

Fixes: 9a07406b00 ("net: usb: qmi_wwan: Add the BroadMobi BM818 card")
Cc: stable@vger.kernel.org
Signed-off-by: Sebastian Krzyszkowiak <sebastian.krzyszkowiak@puri.sm>
Acked-by: Bjørn Mork <bjorn@mork.no>
Link: https://lore.kernel.org/r/20230526-bm818-dtr-v1-1-64bbfa6ba8af@puri.sm
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-29 22:03:48 -07:00
Osama Muhammad
9b9e46aa07 nfcsim.c: Fix error checking for debugfs_create_dir
This patch fixes the error checking in nfcsim.c.
The DebugFS kernel API is developed in
a way that the caller can safely ignore the errors that
occur during the creation of DebugFS nodes.

Signed-off-by: Osama Muhammad <osmtendev@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-26 12:18:35 +01:00
Raju Rangoju
dc362e20cd amd-xgbe: fix the false linkup in xgbe_phy_status
In the event of a change in XGBE mode, the current auto-negotiation
needs to be reset and the AN cycle needs to be re-triggerred. However,
the current code ignores the return value of xgbe_set_mode(), leading to
false information as the link is declared without checking the status
register.

Fix this by propagating the mode switch status information to
xgbe_phy_status().

Fixes: e57f7a3fea ("amd-xgbe: Prepare for working with more than one type of phy")
Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-26 10:42:06 +01:00
Jakub Kicinski
8a0d57df89 tls: improve lockless access safety of tls_err_abort()
Most protos' poll() methods insert a memory barrier between
writes to sk_err and sk_error_report(). This dates back to
commit a4d258036e ("tcp: Fix race in tcp_poll").

I guess we should do the same thing in TLS, tcp_poll() does
not hold the socket lock.

Fixes: 3c4d755915 ("tls: kernel TLS support")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-26 10:35:58 +01:00
Jakub Kicinski
aa866ee4b1 mlx5-fixes-2023-05-24
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmRu2ZQACgkQSD+KveBX
 +j4sQAgAtCLJfYADpU7yWPZ3SMAnTelc65PfawdjEyBlRjE2CxsQ/bnRiWUXuR5i
 twsGvAJxQfpDBV0z/TxcdYznpLxHSIhosg0pRH+Tc4Lk6PjdFK7Sviqyw9JTrCJX
 rVaLJ5N877Pu5gH++OqFs8C+2gkrppQKWJtrmUaAPPdXixkr2q2pwlJiCOLpGYTN
 9yGKBO29wtvD0AJsssS8VJgy6uHMaOsTTACR4fYuV/huTKa5QfYBdJY1HqEs7Fe2
 LhGyLjVx6nHD8kt+fL9J2LJ6CLnG9a50reKuZnCOBa42obahUBCAQ5mr1mQlzNr5
 cTOmAZM+OWOa7bXJAY9AbRorlr1A4w==
 =DRJZ
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2023-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2023-05-24

This series includes bug fixes for the mlx5 driver.

* tag 'mlx5-fixes-2023-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  Documentation: net/mlx5: Wrap notes in admonition blocks
  Documentation: net/mlx5: Add blank line separator before numbered lists
  Documentation: net/mlx5: Use bullet and definition lists for vnic counters description
  Documentation: net/mlx5: Wrap vnic reporter devlink commands in code blocks
  net/mlx5: Fix check for allocation failure in comp_irqs_request_pci()
  net/mlx5: DR, Add missing mutex init/destroy in pattern manager
  net/mlx5e: Move Ethernet driver debugfs to profile init callback
  net/mlx5e: Don't attach netdev profile while handling internal error
  net/mlx5: Fix post parse infra to only parse every action once
  net/mlx5e: Use query_special_contexts cmd only once per mdev
  net/mlx5: fw_tracer, Fix event handling
  net/mlx5: SF, Drain health before removing device
  net/mlx5: Drain health before unregistering devlink
  net/mlx5e: Do not update SBCM when prio2buffer command is invalid
  net/mlx5e: Consider internal buffers size in port buffer calculations
  net/mlx5e: Prevent encap offload when neigh update is running
  net/mlx5e: Extract remaining tunnel encap code to dedicated file
====================

Link: https://lore.kernel.org/r/20230525034847.99268-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-25 21:09:41 -07:00
Kuniyuki Iwashima
822b5a1c17 af_packet: Fix data-races of pkt_sk(sk)->num.
syzkaller found a data race of pkt_sk(sk)->num.

The value is changed under lock_sock() and po->bind_lock, so we
need READ_ONCE() to access pkt_sk(sk)->num without these locks in
packet_bind_spkt(), packet_bind(), and sk_diag_fill().

Note that WRITE_ONCE() is already added by commit c7d2ef5dd4
("net/packet: annotate accesses to po->bind").

BUG: KCSAN: data-race in packet_bind / packet_do_bind

write (marked) to 0xffff88802ffd1cee of 2 bytes by task 7322 on cpu 0:
 packet_do_bind+0x446/0x640 net/packet/af_packet.c:3236
 packet_bind+0x99/0xe0 net/packet/af_packet.c:3321
 __sys_bind+0x19b/0x1e0 net/socket.c:1803
 __do_sys_bind net/socket.c:1814 [inline]
 __se_sys_bind net/socket.c:1812 [inline]
 __x64_sys_bind+0x40/0x50 net/socket.c:1812
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

read to 0xffff88802ffd1cee of 2 bytes by task 7318 on cpu 1:
 packet_bind+0xbf/0xe0 net/packet/af_packet.c:3322
 __sys_bind+0x19b/0x1e0 net/socket.c:1803
 __do_sys_bind net/socket.c:1814 [inline]
 __se_sys_bind net/socket.c:1812 [inline]
 __x64_sys_bind+0x40/0x50 net/socket.c:1812
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

value changed: 0x0300 -> 0x0000

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 7318 Comm: syz-executor.4 Not tainted 6.3.0-13380-g7fddb5b5300c #4
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014

Fixes: 96ec632714 ("packet: Diag core and basic socket info dumping")
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230524232934.50950-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-25 21:03:23 -07:00
Jakub Kicinski
081e8df681 tools: ynl: avoid dict errors on older Python versions
Python 3.9.0 or newer supports combining dicts() with |,
but older versions of Python are still used in the wild
(e.g. on CentOS 8, which goes EoL May 31, 2024).
With Python 3.6.8 we get:

  TypeError: unsupported operand type(s) for |: 'dict' and 'dict'

Use older syntax. Tested with non-legacy families only.

Fixes: f036d936ca ("tools: ynl: Add fixed-header support to ynl")
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Tested-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230524170712.2036128-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-25 21:02:43 -07:00
Eric Dumazet
31642e7089 netrom: fix info-leak in nr_write_internal()
Simon Kapadia reported the following issue:

<quote>

The Online Amateur Radio Community (OARC) has recently been experimenting
with building a nationwide packet network in the UK.
As part of our experimentation, we have been testing out packet on 300bps HF,
and playing with net/rom.  For HF packet at this baud rate you really need
to make sure that your MTU is relatively low; AX.25 suggests a PACLEN of 60,
and a net/rom PACLEN of 40 to go with that.
However the Linux net/rom support didn't work with a low PACLEN;
the mkiss module would truncate packets if you set the PACLEN below about 200 or so, e.g.:

Apr 19 14:00:51 radio kernel: [12985.747310] mkiss: ax1: truncating oversized transmit packet!

This didn't make any sense to me (if the packets are smaller why would they
be truncated?) so I started investigating.
I looked at the packets using ethereal, and found that many were just huge
compared to what I would expect.
A simple net/rom connection request packet had the request and then a bunch
of what appeared to be random data following it:

</quote>

Simon provided a patch that I slightly revised:
Not only we must not use skb_tailroom(), we also do
not want to count NR_NETWORK_LEN twice.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Co-Developed-by: Simon Kapadia <szymon@kapadia.pl>
Signed-off-by: Simon Kapadia <szymon@kapadia.pl>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Simon Kapadia <szymon@kapadia.pl>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230524141456.1045467-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-25 21:02:29 -07:00
Wei Fang
ffb3322181 net: stmmac: fix call trace when stmmac_xdp_xmit() is invoked
We encountered a kernel call trace issue which was related to
ndo_xdp_xmit callback on our i.MX8MP platform. The reproduce
steps show as follows.
1. The FEC port (eth0) connects to a PC port, and the PC uses
pktgen_sample03_burst_single_flow.sh to generate packets and
send these packets to the FEC port. Notice that the script must
be executed before step 2.
2. Run the "./xdp_redirect eth0 eth1" command on i.MX8MP, the
eth1 interface is the dwmac. Then there will be a call trace
issue soon. Please see the log for more details.
The root cause is that the NETDEV_XDP_ACT_NDO_XMIT feature is
enabled by default, so when the step 2 command is exexcuted
and packets have already been sent to eth0, the stmmac_xdp_xmit()
starts running before the stmmac_xdp_set_prog() finishes. To
resolve this issue, we disable the NETDEV_XDP_ACT_NDO_XMIT
feature by default and turn on/off this feature when the bpf
program is installed/uninstalled which just like the other
ethernet drivers.

Call Trace log:
[  306.311271] ------------[ cut here ]------------
[  306.315910] WARNING: CPU: 0 PID: 15 at lib/timerqueue.c:55 timerqueue_del+0x68/0x70
[  306.323590] Modules linked in:
[  306.326654] CPU: 0 PID: 15 Comm: ksoftirqd/0 Not tainted 6.4.0-rc1+ #37
[  306.333277] Hardware name: NXP i.MX8MPlus EVK board (DT)
[  306.338591] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  306.345561] pc : timerqueue_del+0x68/0x70
[  306.349577] lr : __remove_hrtimer+0x5c/0xa0
[  306.353777] sp : ffff80000b7c3920
[  306.357094] x29: ffff80000b7c3920 x28: 0000000000000000 x27: 0000000000000001
[  306.364244] x26: ffff80000a763a40 x25: ffff0000d0285a00 x24: 0000000000000001
[  306.371390] x23: 0000000000000001 x22: ffff000179389a40 x21: 0000000000000000
[  306.378537] x20: ffff000179389aa0 x19: ffff0000d2951308 x18: 0000000000001000
[  306.385686] x17: f1d3000000000000 x16: 00000000c39c1000 x15: 55e99bbe00001a00
[  306.392835] x14: 09000900120aa8c0 x13: e49af1d300000000 x12: 000000000000c39c
[  306.399987] x11: 100055e99bbe0000 x10: ffff8000090b1048 x9 : ffff8000081603fc
[  306.407133] x8 : 000000000000003c x7 : 000000000000003c x6 : 0000000000000001
[  306.414284] x5 : ffff0000d2950980 x4 : 0000000000000000 x3 : 0000000000000000
[  306.421432] x2 : 0000000000000001 x1 : ffff0000d2951308 x0 : ffff0000d2951308
[  306.428585] Call trace:
[  306.431035]  timerqueue_del+0x68/0x70
[  306.434706]  __remove_hrtimer+0x5c/0xa0
[  306.438549]  hrtimer_start_range_ns+0x2bc/0x370
[  306.443089]  stmmac_xdp_xmit+0x174/0x1b0
[  306.447021]  bq_xmit_all+0x194/0x4b0
[  306.450612]  __dev_flush+0x4c/0x98
[  306.454024]  xdp_do_flush+0x18/0x38
[  306.457522]  fec_enet_rx_napi+0x6c8/0xc68
[  306.461539]  __napi_poll+0x40/0x220
[  306.465038]  net_rx_action+0xf8/0x240
[  306.468707]  __do_softirq+0x128/0x3a8
[  306.472378]  run_ksoftirqd+0x40/0x58
[  306.475961]  smpboot_thread_fn+0x1c4/0x288
[  306.480068]  kthread+0x124/0x138
[  306.483305]  ret_from_fork+0x10/0x20
[  306.486889] ---[ end trace 0000000000000000 ]---

Fixes: 66c0e13ad2 ("drivers: net: turn on XDP features")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230524125714.357337-1-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-25 21:01:35 -07:00
Thomas Bogendoerfer
d68cb7cf1f net: mellanox: mlxbf_gige: Fix skb_panic splat under memory pressure
Do skb_put() after a new skb has been successfully allocated otherwise
the reused skb leads to skb_panics or incorrect packet sizes.

Fixes: f92e1869d7 ("Add Mellanox BlueField Gigabit Ethernet driver")
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230524194908.147145-1-tbogendoerfer@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-25 20:57:46 -07:00
Linus Torvalds
50fb587e6a Networking fixes for 6.4-rc4, including fixes from bluetooth and bpf
Current release - regressions:
 
   - net: fix skb leak in __skb_tstamp_tx()
 
   - eth: mtk_eth_soc: fix QoS on DSA MAC on non MTK_NETSYS_V2 SoCs
 
 Current release - new code bugs:
 
   - handshake:
     - fix sock->file allocation
     - fix handshake_dup() ref counting
 
   - bluetooth:
     - fix potential double free caused by hci_conn_unlink
     - fix UAF in hci_conn_hash_flush
 
 Previous releases - regressions:
 
   - core: fix stack overflow when LRO is disabled for virtual interfaces
 
   - tls: fix strparser rx issues
 
   - bpf:
     - fix many sockmap/TCP related issues
     - fix a memory leak in the LRU and LRU_PERCPU hash maps
     - init the offload table earlier
 
   - eth: mlx5e:
     - do as little as possible in napi poll when budget is 0
     - fix using eswitch mapping in nic mode
     - fix deadlock in tc route query code
 
 Previous releases - always broken:
 
   - udplite: fix NULL pointer dereference in __sk_mem_raise_allocated()
 
   - raw: fix output xfrm lookup wrt protocol
 
   - smc: reset connection when trying to use SMCRv2 fails
 
   - phy: mscc: enable VSC8501/2 RGMII RX clock
 
   - eth: octeontx2-pf: fix TSOv6 offload
 
   - eth: cdc_ncm: deal with too low values of dwNtbOutMaxSize
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmRvOisSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkMW8P/3rZy4Yy2bIWFCkxKD/aPvqG60ZZfvV/
 sB7Qu3X0OLiDNAmdDsXjCFeMYnV4cxDvwxjFUVQX0ZZEilEbGQ2XlOaFTpXS3jeW
 UQup55DW7VG6BkuNJipwtLkLSQ498Z+qinRPsmNPVADkItHHbyrSnKNjh34ruhly
 P5edWJ/3PuzoK2hN/izgBpk0i1UC1+tSKKANV5dlIWb6CXY9C8pvr0CScuGb5rKv
 xAs40Rp1eaFmkYkhbAn3H2fvSOoCr2aSDeS2SvRAxca9OUcrUAjnnsLTVq5WI22/
 PxSESy6wfE2e5+q1AwskwBdFO3LLKheVYJF2KzSlRk4FuWk50GbwbpueRSOYEU7b
 2w0MveYggr4m3B06/2esrsr6bEPsb4QFKE+hubX5FmIPECOz+dOA0RW4mOysvzqM
 q+xEuR9uWFsrMO7WVU7/4oF02HqAfAtaEn/87aniGz5o7bzPbmyyyBKfmb4s2c13
 TU828rEBNGkmqxSwsZHUOt21IJoOa646W99zsmGpRo/m47pFx093HVR22Hr1dH0B
 BllhsmtvJZ2XsWkR2Q9aAyyluc3/b3yI24OM125y7bIBWte2MF908xaStx/al+AF
 jPL/ioEQKNsOJKHan9EzhbyH98RCfEotLb+ha/qNQ9GGjKROHsTn9EgP7h7367oo
 yS8QLmvng01f
 =hz3D
 -----END PGP SIGNATURE-----

Merge tag 'net-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bluetooth and bpf.

  Current release - regressions:

   - net: fix skb leak in __skb_tstamp_tx()

   - eth: mtk_eth_soc: fix QoS on DSA MAC on non MTK_NETSYS_V2 SoCs

  Current release - new code bugs:

   - handshake:
      - fix sock->file allocation
      - fix handshake_dup() ref counting

   - bluetooth:
      - fix potential double free caused by hci_conn_unlink
      - fix UAF in hci_conn_hash_flush

  Previous releases - regressions:

   - core: fix stack overflow when LRO is disabled for virtual
     interfaces

   - tls: fix strparser rx issues

   - bpf:
      - fix many sockmap/TCP related issues
      - fix a memory leak in the LRU and LRU_PERCPU hash maps
      - init the offload table earlier

   - eth: mlx5e:
      - do as little as possible in napi poll when budget is 0
      - fix using eswitch mapping in nic mode
      - fix deadlock in tc route query code

  Previous releases - always broken:

   - udplite: fix NULL pointer dereference in __sk_mem_raise_allocated()

   - raw: fix output xfrm lookup wrt protocol

   - smc: reset connection when trying to use SMCRv2 fails

   - phy: mscc: enable VSC8501/2 RGMII RX clock

   - eth: octeontx2-pf: fix TSOv6 offload

   - eth: cdc_ncm: deal with too low values of dwNtbOutMaxSize"

* tag 'net-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (79 commits)
  udplite: Fix NULL pointer dereference in __sk_mem_raise_allocated().
  net: phy: mscc: enable VSC8501/2 RGMII RX clock
  net: phy: mscc: remove unnecessary phydev locking
  net: phy: mscc: add support for VSC8501
  net: phy: mscc: add VSC8502 to MODULE_DEVICE_TABLE
  net/handshake: Enable the SNI extension to work properly
  net/handshake: Unpin sock->file if a handshake is cancelled
  net/handshake: handshake_genl_notify() shouldn't ignore @flags
  net/handshake: Fix uninitialized local variable
  net/handshake: Fix handshake_dup() ref counting
  net/handshake: Remove unneeded check from handshake_dup()
  ipv6: Fix out-of-bounds access in ipv6_find_tlv()
  net: ethernet: mtk_eth_soc: fix QoS on DSA MAC on non MTK_NETSYS_V2 SoCs
  docs: netdev: document the existence of the mail bot
  net: fix skb leak in __skb_tstamp_tx()
  r8169: Use a raw_spinlock_t for the register locks.
  page_pool: fix inconsistency for page_pool_ring_[un]lock()
  bpf, sockmap: Test progs verifier error with latest clang
  bpf, sockmap: Test FIONREAD returns correct bytes in rx buffer with drops
  bpf, sockmap: Test FIONREAD returns correct bytes in rx buffer
  ...
2023-05-25 10:55:26 -07:00
Linus Torvalds
eb03e31813 Power Supply Fixes for 6.4 cycle
Fixes for the 6.4 cycle:
  * Fix power_supply_get_battery_info for devices without parent
    devices resulting in NULL pointer dereference
  * Fix desktop systems reporting to run on battery once a power-supply
    device with device scope appears (e.g. a HID keyboard with a battery)
  * Ratelimit debug print about driver not providing data
  * Fix race condition related to external_power_changed in multiple
    drivers (ab8500, axp288, bq25890, sc27xx, bq27xxx)
  * Fix LED trigger switching from blinking to solid-on when charging
    finishes
  * Fix multiple races in bq27xxx battery driver
  * mt6360: handle potential ENOMEM from devm_work_autocancel
  * sbs-charger: Fix SBS_CHARGER_STATUS_CHARGE_INHIBITED bit
  * rt9467: avoid passing 0 to dev_err_probe
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE72YNB0Y/i3JqeVQT2O7X88g7+poFAmRvknYACgkQ2O7X88g7
 +pqM9w/8DrpP3Te4mNoxvdi4teVXkBVmEEaK4aK+KJ8ag+v5hLCDEO7BDEMHYob3
 ihfINsIbHRdo3PEAy049fXic847h8NsIUr79HiLZKVPbqFFVxyy17IBElt+FPiLA
 nx0Gqi2lkuF3hWsgHXwsOS+ltQzcGMDPUM3xpQbrXDi8zdjDsNFSDHX8Luqqjmo7
 +hIaWt1QmzzF4IH9cdAdO0/B5zbEKcfiU6XM/IDgftUdp3gXRxlwawIj67z3hGDC
 ObumINevoYir/ZA3J/Q0KKiUS5F6tr3XxMgAi3s2wrIrRRF7VgrWMAf98Laf64Yl
 jJwWwDzktYhJ9fMY6qksMdPfP+VT0g/6/mWKH3emmT18w1kGqSbHEvqvThmZvV6E
 i+iPol5657dvXI+2Os75yyjJ9lSncXuQEUsi+i6isa+ac249t5XINzo+Gz+rOUrZ
 qkZ/Jkkv8xnlnvRZXSNs4FDTE4yBiiqQbg/H5y3anu8Vb4Q99IlVruwjj1ukS3gZ
 eIzM72UbWmdYbATXIwhwgiRqVWWkJDJL9BsaLN55th231jRJCzwLoYRG5elOuXQC
 JTjgoz31hAoCtc5f16FZS+PIuNEFOsTE32aX3SlnhHhA7NWewSgwsw9tZCmqlaJ2
 rA5rdrkEtWooL2JzgAqfgjbKCkWfS2e1oCGBb1lgSiio+YOdRkg=
 =jWEO
 -----END PGP SIGNATURE-----

Merge tag 'for-v6.4-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply

Pull power supply fixes from Sebastian Reichel:

 - Fix power_supply_get_battery_info for devices without parent devices
   resulting in NULL pointer dereference

 - Fix desktop systems reporting to run on battery once a power-supply
   device with device scope appears (e.g. a HID keyboard with a battery)

 - Ratelimit debug print about driver not providing data

 - Fix race condition related to external_power_changed in multiple
   drivers (ab8500, axp288, bq25890, sc27xx, bq27xxx)

 - Fix LED trigger switching from blinking to solid-on when charging
   finishes

 - Fix multiple races in bq27xxx battery driver

 - mt6360: handle potential ENOMEM from devm_work_autocancel

 - sbs-charger: Fix SBS_CHARGER_STATUS_CHARGE_INHIBITED bit

 - rt9467: avoid passing 0 to dev_err_probe

* tag 'for-v6.4-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: (21 commits)
  power: supply: Fix logic checking if system is running from battery
  power: supply: mt6360: add a check of devm_work_autocancel in mt6360_charger_probe
  power: supply: sbs-charger: Fix INHIBITED bit for Status reg
  power: supply: rt9467: Fix passing zero to 'dev_err_probe'
  power: supply: Ratelimit no data debug output
  power: supply: Fix power_supply_get_battery_info() if parent is NULL
  power: supply: bq24190: Call power_supply_changed() after updating input current
  power: supply: bq25890: Call power_supply_changed() after updating input current or voltage
  power: supply: bq27xxx: Use mod_delayed_work() instead of cancel() + schedule()
  power: supply: bq27xxx: After charger plug in/out wait 0.5s for things to stabilize
  power: supply: bq27xxx: Ensure power_supply_changed() is called on current sign changes
  power: supply: bq27xxx: Move bq27xxx_battery_update() down
  power: supply: bq27xxx: Add cache parameter to bq27xxx_battery_current_and_status()
  power: supply: bq27xxx: Fix poll_interval handling and races on remove
  power: supply: bq27xxx: Fix I2C IRQ race on remove
  power: supply: bq27xxx: Fix bq27xxx_battery_update() race condition
  power: supply: leds: Fix blink to LED on transition
  power: supply: sc27xx: Fix external_power_changed race
  power: supply: bq25890: Fix external_power_changed race
  power: supply: axp288_fuel_gauge: Fix external_power_changed race
  ...
2023-05-25 10:26:36 -07:00
Linus Torvalds
029c77f89a A collection of small fixes since the previous PR.
- HD-audio runtime PM bug fix
 - A couple of HD-audio quirks
 - Fix series of ASoC Intel AVS drivers
 - ASoC DPCM fix for a bug found on new Intel systems
 - A few other ASoC device-specific small fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmRuMRIOHHRpd2FpQHN1
 c2UuZGUACgkQLtJE4w1nLE8hww/8D9PklvrAKGSpyEnjdNM/Uin5wxq647nIwANJ
 rgypSu9oyi9oeyCRFH52HTjKY9UzyWV350jJA50tyW41ri345RYzrr04x3zUtffH
 MRO3Hw6bfZou2H0lvz+uTJhDH/G17Tvu/ThrOssKCyHzZguTp1obgLhXQCm9iCAr
 2oh7Z+fkzX2+312Y5jhZnnzNjThk76pe/z1Poxh+3MbRNkaCLYhu1k/dHs2u/TnV
 jv+4SUAHUtaK55+ZYBNZhw+Qb02wgx4SrZfHXH6O4TNCPCsE7JH3afb9QuZpu/8k
 HJONc7ASKyCqACY+tFW7KSONZIxyG1jd0vaK/4uMPdi+xFWEefP4NM5hEKgUfLxv
 uEWg7ojgQ65Xka6Zn68Va8wqLieU8xXm3mkRkBLUJKHfDjAsTgK9Qldas1yD97P2
 5OmiHhgGTdBfxiJ+A92va54L4HZBnW8cvn7FFCd/JvlKDrPswVdM2gwagpdEJL9c
 Nw/ocDMk8hlXuuSTOavq++PYBtA72c3R2a6X8Z7TVcRs1+N2iOymE2/azcj6PCmD
 +AC872GBR45NkHMKzfMmsYfJF3wbNYGsTMEIBkMm5b0tSxWa7oXI4oD/ddCZoQdL
 KgvEhtdBKxtv9IOkPAhMGc9hB/BTHORjU1Kj7kuz/hXmZ7HnbMtxgvctYCxz6H74
 +yg5eLo=
 =/I2b
 -----END PGP SIGNATURE-----

Merge tag 'sound-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A collection of small fixes:

   - HD-audio runtime PM bug fix

   - A couple of HD-audio quirks

   - Fix series of ASoC Intel AVS drivers

   - ASoC DPCM fix for a bug found on new Intel systems

   - A few other ASoC device-specific small fixes"

* tag 'sound-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek: Enable headset onLenovo M70/M90
  ASoC: dwc: move DMA init to snd_soc_dai_driver probe()
  ASoC: cs35l41: Fix default regmap values for some registers
  ALSA: hda: Fix unhandled register update during auto-suspend period
  ASoC: dt-bindings: tlv320aic32x4: Fix supply names
  ASoC: Intel: avs: Add missing checks on FE startup
  ASoC: Intel: avs: Fix avs_path_module::instance_id size
  ASoC: Intel: avs: Account for UID of ACPI device
  ASoC: Intel: avs: Fix declaration of enum avs_channel_config
  ASoC: Intel: Skylake: Fix declaration of enum skl_ch_cfg
  ASoC: Intel: avs: Access path components under lock
  ASoC: Intel: avs: Fix module lookup
  ALSA: hda/ca0132: add quirk for EVGA X299 DARK
  ASoC: soc-pcm: test if a BE can be prepared
  ASoC: rt5682: Disable jack detection interrupt during suspend
  ASoC: lpass: Fix for KASAN use_after_free out of bounds
2023-05-25 09:48:23 -07:00
Linus Torvalds
ecea3ba2fb platform-drivers-x86 for v6.4-3
A small set of assorted bug fixes for 6.4.
 
 The following is an automated git shortlog grouped by driver:
 
 ISST:
  -  Remove 8 socket limit
 
 asus-wmi:
  -  Ignore WMI events with codes 0x7B, 0xC0
 
 platform/mellanox:
  -  mlxbf-pmc: fix sscanf() error checking
 
 platform/x86/amd/pmf:
  -  Fix CnQF and auto-mode after resume
 
 platform/x86/intel/ifs:
  -  Annotate work queue on stack so object debug does not complain
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmRvMOcUHGhkZWdvZWRl
 QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9z7Mwf+Jy5XGnRucPcK4AWqraXfOU5CMQsy
 gGKR4hUZdIJUR7kmLJ5p7KAucUGE2lYzJ0/V39ieVQi47e3VcR13P6ED3oh3guB5
 8cjKz9LfxsQcf3nHZml4HcnC8ZRBtzlSnnq7BUTt2DskqOz37gp3mwx3jCKTSI7x
 8bIG7b4sTIsiKvtNKvpUsypPT0eR+wfIHbmBUlD+yO3r4xqpar6ipu3Gn08ggY/2
 H5DBaHIVApyqvoAI0BFsX2v0sy8UMbO2KxW8CrQKYanCtay/SMYweMhmCI7pYPh3
 rD/pKjdifd7xIabEuePjFQfS5Tmsrgvv7o2Ir+k0X6gQGD8oQeiAoJ5Uyw==
 =EQe5
 -----END PGP SIGNATURE-----

Merge tag 'platform-drivers-x86-v6.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fixes from Hans de Goede:
 "Nothing special to report just a few small fixes"

* tag 'platform-drivers-x86-v6.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86/intel/ifs: Annotate work queue on stack so object debug does not complain
  platform/x86: ISST: Remove 8 socket limit
  platform/mellanox: mlxbf-pmc: fix sscanf() error checking
  platform/x86/amd/pmf: Fix CnQF and auto-mode after resume
  platform/x86: asus-wmi: Ignore WMI events with codes 0x7B, 0xC0
2023-05-25 09:42:15 -07:00
Linus Torvalds
5566051fa6 m68k fixes for v6.4
- Fix user-space crashes on 68020/68030.
 -----BEGIN PGP SIGNATURE-----
 
 iIsEABYIADMWIQQ9qaHoIs/1I4cXmEiKwlD9ZEnxcAUCZG8O3xUcZ2VlcnRAbGlu
 dXgtbTY4ay5vcmcACgkQisJQ/WRJ8XCvywD/TDu1jmuiKKZeIOdGjRQOIW8jL94Y
 FlZ/3Ot58VR0A60BAK6My+4pkBoeAO4/cIWQNNf7+vzymV8bVP36BxZVx/IH
 =1Qsv
 -----END PGP SIGNATURE-----

Merge tag 'm68k-for-v6.4-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k

Pull m68k fix from Geert Uytterhoeven:

 - Fix signal frame issue causing user-space crashes on 68020/68030

* tag 'm68k-for-v6.4-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k: Move signal frame following exception on 68020/030
2023-05-25 09:33:03 -07:00
Kuniyuki Iwashima
ad42a35bdf udplite: Fix NULL pointer dereference in __sk_mem_raise_allocated().
syzbot reported [0] a null-ptr-deref in sk_get_rmem0() while using
IPPROTO_UDPLITE (0x88):

  14:25:52 executing program 1:
  r0 = socket$inet6(0xa, 0x80002, 0x88)

We had a similar report [1] for probably sk_memory_allocated_add()
in __sk_mem_raise_allocated(), and commit c915fe13cb ("udplite: fix
NULL pointer dereference") fixed it by setting .memory_allocated for
udplite_prot and udplitev6_prot.

To fix the variant, we need to set either .sysctl_wmem_offset or
.sysctl_rmem.

Now UDP and UDPLITE share the same value for .memory_allocated, so we
use the same .sysctl_wmem_offset for UDP and UDPLITE.

[0]:
general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 0 PID: 6829 Comm: syz-executor.1 Not tainted 6.4.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/28/2023
RIP: 0010:sk_get_rmem0 include/net/sock.h:2907 [inline]
RIP: 0010:__sk_mem_raise_allocated+0x806/0x17a0 net/core/sock.c:3006
Code: c1 ea 03 80 3c 02 00 0f 85 23 0f 00 00 48 8b 44 24 08 48 8b 98 38 01 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 da 48 c1 ea 03 <0f> b6 14 02 48 89 d8 83 e0 07 83 c0 03 38 d0 0f 8d 6f 0a 00 00 8b
RSP: 0018:ffffc90005d7f450 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffc90004d92000
RDX: 0000000000000000 RSI: ffffffff88066482 RDI: ffffffff8e2ccbb8
RBP: ffff8880173f7000 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000030000
R13: 0000000000000001 R14: 0000000000000340 R15: 0000000000000001
FS:  0000000000000000(0000) GS:ffff8880b9800000(0063) knlGS:00000000f7f1cb40
CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 000000002e82f000 CR3: 0000000034ff0000 CR4: 00000000003506f0
Call Trace:
 <TASK>
 __sk_mem_schedule+0x6c/0xe0 net/core/sock.c:3077
 udp_rmem_schedule net/ipv4/udp.c:1539 [inline]
 __udp_enqueue_schedule_skb+0x776/0xb30 net/ipv4/udp.c:1581
 __udpv6_queue_rcv_skb net/ipv6/udp.c:666 [inline]
 udpv6_queue_rcv_one_skb+0xc39/0x16c0 net/ipv6/udp.c:775
 udpv6_queue_rcv_skb+0x194/0xa10 net/ipv6/udp.c:793
 __udp6_lib_mcast_deliver net/ipv6/udp.c:906 [inline]
 __udp6_lib_rcv+0x1bda/0x2bd0 net/ipv6/udp.c:1013
 ip6_protocol_deliver_rcu+0x2e7/0x1250 net/ipv6/ip6_input.c:437
 ip6_input_finish+0x150/0x2f0 net/ipv6/ip6_input.c:482
 NF_HOOK include/linux/netfilter.h:303 [inline]
 NF_HOOK include/linux/netfilter.h:297 [inline]
 ip6_input+0xa0/0xd0 net/ipv6/ip6_input.c:491
 ip6_mc_input+0x40b/0xf50 net/ipv6/ip6_input.c:585
 dst_input include/net/dst.h:468 [inline]
 ip6_rcv_finish net/ipv6/ip6_input.c:79 [inline]
 NF_HOOK include/linux/netfilter.h:303 [inline]
 NF_HOOK include/linux/netfilter.h:297 [inline]
 ipv6_rcv+0x250/0x380 net/ipv6/ip6_input.c:309
 __netif_receive_skb_one_core+0x114/0x180 net/core/dev.c:5491
 __netif_receive_skb+0x1f/0x1c0 net/core/dev.c:5605
 netif_receive_skb_internal net/core/dev.c:5691 [inline]
 netif_receive_skb+0x133/0x7a0 net/core/dev.c:5750
 tun_rx_batched+0x4b3/0x7a0 drivers/net/tun.c:1553
 tun_get_user+0x2452/0x39c0 drivers/net/tun.c:1989
 tun_chr_write_iter+0xdf/0x200 drivers/net/tun.c:2035
 call_write_iter include/linux/fs.h:1868 [inline]
 new_sync_write fs/read_write.c:491 [inline]
 vfs_write+0x945/0xd50 fs/read_write.c:584
 ksys_write+0x12b/0x250 fs/read_write.c:637
 do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
 __do_fast_syscall_32+0x65/0xf0 arch/x86/entry/common.c:178
 do_fast_syscall_32+0x33/0x70 arch/x86/entry/common.c:203
 entry_SYSENTER_compat_after_hwframe+0x70/0x82
RIP: 0023:0xf7f21579
Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00
RSP: 002b:00000000f7f1c590 EFLAGS: 00000282 ORIG_RAX: 0000000000000004
RAX: ffffffffffffffda RBX: 00000000000000c8 RCX: 0000000020000040
RDX: 0000000000000083 RSI: 00000000f734e000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000296 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 </TASK>
Modules linked in:

Link: https://lore.kernel.org/netdev/CANaxB-yCk8hhP68L4Q2nFOJht8sqgXGGQO2AftpHs0u1xyGG5A@mail.gmail.com/ [1]
Fixes: 850cbaddb5 ("udp: use it's own memory accounting schema")
Reported-by: syzbot+444ca0907e96f7c5e48b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=444ca0907e96f7c5e48b
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230523163305.66466-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-25 10:51:58 +02:00
Jakub Kicinski
aa015a204b Merge branch 'net-phy-mscc-support-vsc8501'
David Epping says:

====================
net: phy: mscc: support VSC8501

this updated series of patches adds support for the VSC8501 Ethernet
PHY and fixes support for the VSC8502 PHY in cases where no other
software (like U-Boot) has initialized the PHY after power up.

The first patch simply adds the VSC8502 to the MODULE_DEVICE_TABLE,
where I guess it was unintentionally missing. I have no hardware to
test my change.

The second patch adds the VSC8501 PHY with exactly the same driver
implementation as the existing VSC8502.

The (new) third patch removes phydev locking from
vsc85xx_rgmii_set_skews(), as discussed for v2 of the patch set.

The (now) fourth patch fixes the initialization for VSC8501 and VSC8502.
I have tested this patch with VSC8501 on hardware in RGMII mode only.
https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/DataSheets/VSC8501-03_Datasheet_60001741A.PDF
https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/DataSheets/VSC8502-03_Datasheet_60001742B.pdf
Table 4-42 "RGMII CONTROL, ADDRESS 20E2 (0X14)" Bit 11 for each of
them.
By default the RX_CLK is disabled for these PHYs. In cases where no
other software, like U-Boot, enabled the clock, this results in no
received packets being handed to the MAC.
The patch enables this clock output.
According to Microchip support (case number 01268776) this applies
to all modes (RGMII, GMII, and MII).

Other PHYs sharing the same register map and code, like
VSC8530/31/40/41 have the clock enabled and the relevant bit 11 is
reserved and read-only for them. As per previous discussion the
patch still clears the bit on these PHYs, too, possibly more easily
supporting other future PHYs implementing this functionality.

For the VSC8572 family of PHYs, having a different register map,
no such changes are applied.
====================

Link: https://lore.kernel.org/r/20230523153108.18548-1-david.epping@missinglinkelectronics.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-24 22:14:29 -07:00
David Epping
71460c9ec5 net: phy: mscc: enable VSC8501/2 RGMII RX clock
By default the VSC8501 and VSC8502 RGMII/GMII/MII RX_CLK output is
disabled. To allow packet forwarding towards the MAC it needs to be
enabled.

For other PHYs supported by this driver the clock output is enabled
by default.

Fixes: d316986331 ("net: phy: mscc: add support for VSC8502")
Signed-off-by: David Epping <david.epping@missinglinkelectronics.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-24 22:14:23 -07:00
David Epping
7df0b33d79 net: phy: mscc: remove unnecessary phydev locking
Holding the struct phy_device (phydev) lock is unnecessary when
accessing phydev->interface in the PHY driver .config_init method,
which is the only place that vsc85xx_rgmii_set_skews() is called from.

The phy_modify_paged() function implements required MDIO bus level
locking, which can not be achieved by a phydev lock.

Signed-off-by: David Epping <david.epping@missinglinkelectronics.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-24 22:14:10 -07:00
David Epping
fb055ce4a9 net: phy: mscc: add support for VSC8501
The VSC8501 PHY can use the same driver implementation as the VSC8502.
Adding the PHY ID and copying the handler functions of VSC8502 is
sufficient to operate it.

Signed-off-by: David Epping <david.epping@missinglinkelectronics.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-24 22:14:10 -07:00
David Epping
57fb54ab9f net: phy: mscc: add VSC8502 to MODULE_DEVICE_TABLE
The mscc driver implements support for VSC8502, so its ID should be in
the MODULE_DEVICE_TABLE for automatic loading.

Signed-off-by: David Epping <david.epping@missinglinkelectronics.com>
Fixes: d316986331 ("net: phy: mscc: add support for VSC8502")
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-24 22:14:10 -07:00
Jakub Kicinski
1de5900c81 Merge branch 'bug-fixes-for-net-handshake'
Chuck Lever says:

====================
Bug fixes for net/handshake

Paolo observed that there is a possible leak of sock->file. I
haven't looked into that yet, but it seems to be separate from
the fixes in this series, so no need to hold these up.
====================

The submissions mentions net-next but it means netdev (perhaps
merge window left over when trees are converged). In any case,
it should have gone into net, but was instead applied to net-next
as commit deb2e484ba ("Merge branch 'net-handshake-fixes'").
These are fixes tho, and Chuck needs them to make progress with
the client so double-merging them into net... it is what it is :(

Link: https://lore.kernel.org/r/168381978252.84244.1933636428135211300.stgit@91.116.238.104.host.secureserver.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-24 22:05:26 -07:00
Chuck Lever
26fb5480a2 net/handshake: Enable the SNI extension to work properly
Enable the upper layer protocol to specify the SNI peername. This
avoids the need for tlshd to use a DNS lookup, which can return a
hostname that doesn't match the incoming certificate's SubjectName.

Fixes: 2fd5532044 ("net/handshake: Add a kernel API for requesting a TLSv1.3 handshake")
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-24 22:05:24 -07:00
Chuck Lever
1ce77c998f net/handshake: Unpin sock->file if a handshake is cancelled
If user space never calls DONE, sock->file's reference count remains
elevated. Enable sock->file to be freed eventually in this case.

Reported-by: Jakub Kacinski <kuba@kernel.org>
Fixes: 3b3009ea8a ("net/handshake: Create a NETLINK service for handling handshake requests")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-24 22:05:24 -07:00