linux/net/core
Geliang Tang f0c1802569 skmsg: Skip zero length skb in sk_msg_recvmsg
When running BPF selftests (./test_progs -t sockmap_basic) on a Loongarch
platform, the following kernel panic occurs:

  [...]
  Oops[#1]:
  CPU: 22 PID: 2824 Comm: test_progs Tainted: G           OE  6.10.0-rc2+ #18
  Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018
     ... ...
     ra: 90000000048bf6c0 sk_msg_recvmsg+0x120/0x560
    ERA: 9000000004162774 copy_page_to_iter+0x74/0x1c0
   CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
   PRMD: 0000000c (PPLV0 +PIE +PWE)
   EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
   ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
  ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
   BADV: 0000000000000040
   PRID: 0014c011 (Loongson-64bit, Loongson-3C5000)
  Modules linked in: bpf_testmod(OE) xt_CHECKSUM xt_MASQUERADE xt_conntrack
  Process test_progs (pid: 2824, threadinfo=0000000000863a31, task=...)
  Stack : ...
  Call Trace:
  [<9000000004162774>] copy_page_to_iter+0x74/0x1c0
  [<90000000048bf6c0>] sk_msg_recvmsg+0x120/0x560
  [<90000000049f2b90>] tcp_bpf_recvmsg_parser+0x170/0x4e0
  [<90000000049aae34>] inet_recvmsg+0x54/0x100
  [<900000000481ad5c>] sock_recvmsg+0x7c/0xe0
  [<900000000481e1a8>] __sys_recvfrom+0x108/0x1c0
  [<900000000481e27c>] sys_recvfrom+0x1c/0x40
  [<9000000004c076ec>] do_syscall+0x8c/0xc0
  [<9000000003731da4>] handle_syscall+0xc4/0x160
  Code: ...
  ---[ end trace 0000000000000000 ]---
  Kernel panic - not syncing: Fatal exception
  Kernel relocated by 0x3510000
   .text @ 0x9000000003710000
   .data @ 0x9000000004d70000
   .bss  @ 0x9000000006469400
  ---[ end Kernel panic - not syncing: Fatal exception ]---
  [...]

This crash happens every time when running sockmap_skb_verdict_shutdown
subtest in sockmap_basic.

This crash is because a NULL pointer is passed to page_address() in the
sk_msg_recvmsg(). Due to the different implementations depending on the
architecture, page_address(NULL) will trigger a panic on Loongarch
platform but not on x86 platform. So this bug was hidden on x86 platform
for a while, but now it is exposed on Loongarch platform. The root cause
is that a zero length skb (skb->len == 0) was put on the queue.

This zero length skb is a TCP FIN packet, which was sent by shutdown(),
invoked in test_sockmap_skb_verdict_shutdown():

	shutdown(p1, SHUT_WR);

In this case, in sk_psock_skb_ingress_enqueue(), num_sge is zero, and no
page is put to this sge (see sg_set_page in sg_set_page), but this empty
sge is queued into ingress_msg list.

And in sk_msg_recvmsg(), this empty sge is used, and a NULL page is got by
sg_page(sge). Pass this NULL page to copy_page_to_iter(), which passes it
to kmap_local_page() and to page_address(), then kernel panics.

To solve this, we should skip this zero length skb. So in sk_msg_recvmsg(),
if copy is zero, that means it's a zero length skb, skip invoking
copy_page_to_iter(). We are using the EFAULT return triggered by
copy_page_to_iter to check for is_fin in tcp_bpf.c.

Fixes: 604326b41a ("bpf, sockmap: convert to generic sk_msg interface")
Suggested-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/e3a16eacdc6740658ee02a33489b1b9d4912f378.1719992715.git.tanggeliang@kylinos.cn
2024-07-09 10:24:50 +02:00
..
bpf_sk_storage.c netlink: introduce type-checking attribute iteration 2024-03-29 15:06:02 -07:00
datagram.c net: allow skb_datagram_iter to be called from any context 2024-07-02 14:55:15 +02:00
dev_addr_lists_test.c net: dev_addr_lists: move locking out of init/exit in kunit 2024-04-15 10:26:35 +01:00
dev_addr_lists.c
dev_ioctl.c net: partial revert of the "Make timestamping selectable: series 2023-11-18 18:42:37 -08:00
dev.c net: add softirq safety to netdev_rename_lock 2024-06-21 12:18:34 +01:00
dev.h net: move sysctl_skb_defer_max to net_hotdata 2024-04-30 18:46:52 -07:00
drop_monitor.c drop_monitor: replace spin_lock by raw_spin_lock 2024-04-15 09:54:15 +01:00
dst_cache.c net: dst_cache: add two DEBUG_NET warnings 2024-06-03 18:50:09 -07:00
dst.c net: dst: Make dst_destroy() static and return void. 2024-02-06 11:45:53 +01:00
failover.c net: failover: use IFF_NO_ADDRCONF flag to prevent ipv6 addrconf 2022-12-12 15:18:25 -08:00
fib_notifier.c
fib_rules.c fib: rules: no longer hold RTNL in fib_nl_dumprule() 2024-04-12 19:09:31 -07:00
filter.c bpf: Avoid splat in pskb_pull_reason 2024-06-14 17:20:21 +02:00
flow_dissector.c ip_tunnel: convert __be16 tunnel flags to bitmaps 2024-04-01 10:49:28 +01:00
flow_offload.c tc: flower: Enable offload support IPSEC SPI field. 2023-08-02 10:09:32 +01:00
gen_estimator.c treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
gen_stats.c net: Remove the obsolte u64_stats_fetch_*_irq() users (net). 2022-10-28 20:13:54 -07:00
gro_cells.c net: move netdev_max_backlog to net_hotdata 2024-03-07 21:12:42 -08:00
gro.c net: gro: move L3 flush checks to tcp_gro_receive and udp_gro_receive_segment 2024-05-13 14:44:06 -07:00
gso.c net: introduce struct net_hotdata 2024-03-07 21:12:41 -08:00
hotdata.c net: move sysctl_mem_pcpu_rsv to net_hotdata 2024-04-30 18:46:52 -07:00
hwbm.c
ieee8021q_helpers.c net: add IEEE 802.1q specific helpers 2024-05-08 10:35:09 +01:00
link_watch.c net: add netdev_set_operstate() helper 2024-02-14 11:20:13 +00:00
lwt_bpf.c lwt: Fix return values of BPF xmit ops 2023-08-18 16:05:26 +02:00
lwtunnel.c xfrm: lwtunnel: squelch kernel warning in case XFRM encap type is not available 2022-10-12 10:45:51 +02:00
Makefile net: add IEEE 802.1q specific helpers 2024-05-08 10:35:09 +01:00
neighbour.c net: Remove the now superfluous sentinel elements from ctl_table array 2024-05-03 13:29:41 +01:00
net_namespace.c netns: Make get_net_ns() handle zero refcount net 2024-06-18 10:59:52 +02:00
net_test.c pfcp: always set pfcp metadata 2024-04-01 10:49:28 +01:00
net-procfs.c net: make softnet_data.dropped an atomic_t 2024-04-01 11:28:32 +01:00
net-sysfs.c net: no longer acquire RTNL in threaded_show() 2024-05-03 15:14:01 -07:00
net-sysfs.h
net-traces.c udp6: add a missing call into udp_fail_queue_rcv_skb tracepoint 2023-07-07 09:16:52 +01:00
netclassid_cgroup.c cgroup, netclassid: on modifying netclassid in cgroup, only consider the main process. 2023-10-16 16:36:53 -07:00
netdev-genl-gen.c netdev: support dumping a single netdev in qstats 2024-04-23 10:09:49 -07:00
netdev-genl-gen.h netdev: add per-queue statistics 2024-03-07 21:13:25 -08:00
netdev-genl.c netdev-genl: fix error codes when outputting XDP features 2024-06-14 18:04:29 -07:00
netevent.c
netpoll.c netpoll: Fix race condition in netpoll_owner_active 2024-04-30 19:03:47 -07:00
netprio_cgroup.c
of_net.c net: Explicitly include correct DT includes 2023-07-27 20:33:16 -07:00
page_pool_priv.h net: page_pool: report when page pool was destroyed 2023-11-28 15:48:39 +01:00
page_pool_user.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-03-07 10:29:36 -08:00
page_pool.c dma-mapping updates for Linux 6.10 2024-05-20 10:23:39 -07:00
pktgen.c net: pktgen: Use wait_event_freezable_timeout() for freezable kthread 2023-12-27 14:34:52 +00:00
ptp_classifier.c
request_sock.c tcp: make sure init the accept_queue's spinlocks once 2024-01-19 21:13:25 -08:00
rtnetlink.c rtnetlink: make the "split" NLM_DONE handling generic 2024-06-05 12:34:54 +01:00
scm.c af_unix: Add dead flag to struct scm_fp_list. 2024-05-10 18:52:45 -07:00
secure_seq.c
selftests.c net: fill in MODULE_DESCRIPTION()s under net/core 2023-10-28 11:29:27 +01:00
skbuff.c Revert "net: mirror skb frag ref/unref helpers" 2024-05-03 16:05:53 -07:00
skmsg.c skmsg: Skip zero length skb in sk_msg_recvmsg 2024-07-09 10:24:50 +02:00
sock_destructor.h
sock_diag.c sock_diag: remove sock_diag_mutex 2024-01-23 15:13:55 +01:00
sock_map.c sock_map: avoid race between sock_map_close and sk_psock_put 2024-05-28 12:05:19 +02:00
sock_reuseport.c soreuseport: Fix socket selection for SO_INCOMING_CPU. 2022-10-25 11:35:16 +02:00
sock.c net: do not leave a dangling sk pointer, when socket creation fails 2024-06-20 10:43:14 +02:00
stream.c net: Return error from sk_stream_wait_connect() if sk_wait_event() fails 2023-12-15 10:48:51 +00:00
sysctl_net_core.c net: Remove the now superfluous sentinel elements from ctl_table array 2024-05-03 13:29:41 +01:00
timestamping.c net: partial revert of the "Make timestamping selectable: series 2023-11-18 18:42:37 -08:00
tso.c net: tso: inline tso_count_descs() 2022-12-12 15:04:39 -08:00
utils.c net: core: inet[46]_pton strlen len types 2022-11-01 21:14:39 -07:00
xdp.c xdp: Remove WARN() from __xdp_reg_mem_model() 2024-06-24 13:44:02 +02:00