linux/net/core
Jiri Benc 9e4b7a99a0 net: gso: fix panic on frag_list with mixed head alloc types
Since commit 3dcbdb134f ("net: gso: Fix skb_segment splat when
splitting gso_size mangled skb having linear-headed frag_list"), it is
allowed to change gso_size of a GRO packet. However, that commit assumes
that "checking the first list_skb member suffices; i.e if either of the
list_skb members have non head_frag head, then the first one has too".

It turns out this assumption does not hold. We've seen BUG_ON being hit
in skb_segment when skbs on the frag_list had differing head_frag with
the vmxnet3 driver. This happens because __netdev_alloc_skb and
__napi_alloc_skb can return a skb that is page backed or kmalloced
depending on the requested size. As the result, the last small skb in
the GRO packet can be kmalloced.

There are three different locations where this can be fixed:

(1) We could check head_frag in GRO and not allow GROing skbs with
    different head_frag. However, that would lead to performance
    regression on normal forward paths with unmodified gso_size, where
    !head_frag in the last packet is not a problem.

(2) Set a flag in bpf_skb_net_grow and bpf_skb_net_shrink indicating
    that NETIF_F_SG is undesirable. That would need to eat a bit in
    sk_buff. Furthermore, that flag can be unset when all skbs on the
    frag_list are page backed. To retain good performance,
    bpf_skb_net_grow/shrink would have to walk the frag_list.

(3) Walk the frag_list in skb_segment when determining whether
    NETIF_F_SG should be cleared. This of course slows things down.

This patch implements (3). To limit the performance impact in
skb_segment, the list is walked only for skbs with SKB_GSO_DODGY set
that have gso_size changed. Normal paths thus will not hit it.

We could check only the last skb but since we need to walk the whole
list anyway, let's stay on the safe side.

Fixes: 3dcbdb134f ("net: gso: Fix skb_segment splat when splitting gso_size mangled skb having linear-headed frag_list")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/e04426a6a91baf4d1081e1b478c82b5de25fdf21.1667407944.git.jbenc@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03 20:58:09 -07:00
..
bpf_sk_storage.c net: Fix data-races around sysctl_optmem_max. 2022-08-24 13:46:57 +01:00
datagram.c tcp: TX zerocopy should not sense pfmemalloc status 2022-09-02 12:29:02 +01:00
dev_addr_lists_test.c net: kunit: add a test for dev_addr_lists 2021-11-20 12:25:57 +00:00
dev_addr_lists.c net: extract a few internals from netdevice.h 2022-04-07 20:32:09 -07:00
dev_ioctl.c net: rename reference+tracking helpers 2022-06-09 21:52:55 -07:00
dev.c net: Fix return value of qdisc ingress handling on success 2022-10-19 14:04:36 +01:00
dev.h net: add skb_defer_max sysctl 2022-05-16 11:33:59 +01:00
devlink.c net: devlink: add port_init/fini() helpers to allow pre-register/post-unregister functions 2022-09-30 18:17:16 -07:00
drop_monitor.c genetlink: start to validate reserved header bytes 2022-08-29 12:47:15 +01:00
dst_cache.c wireguard: device: reset peer src endpoint when netns exits 2021-11-29 19:50:45 -08:00
dst.c net: rename reference+tracking helpers 2022-06-09 21:52:55 -07:00
failover.c net: rename reference+tracking helpers 2022-06-09 21:52:55 -07:00
fib_notifier.c
fib_rules.c fib: expand fib_rule_policy 2021-12-16 07:18:35 -08:00
filter.c Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2022-10-03 13:02:49 -07:00
flow_dissector.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-09-22 13:02:10 -07:00
flow_offload.c flow_offload: Introduce flow_match_l2tpv3 2022-09-20 09:13:38 +02:00
gen_estimator.c
gen_stats.c net: sched: fix misuse of qcpu->backlog in gnet_stats_add_queue_cpu 2022-08-16 19:38:20 -07:00
gro_cells.c net: drop the weight argument from netif_napi_add 2022-09-28 18:57:14 -07:00
gro.c gro: add support of (hw)gro packets to gro stack 2022-10-03 12:38:34 +01:00
hwbm.c
link_watch.c net: rename reference+tracking helpers 2022-06-09 21:52:55 -07:00
lwt_bpf.c bpf, lwt: Fix crash when using bpf_skb_set_tunnel_key() from bpf_xmit lwt hook 2022-04-22 17:45:25 +02:00
lwtunnel.c xfrm: lwtunnel: add lwtunnel support for xfrm interfaces in collect_md mode 2022-08-29 10:44:08 +02:00
Makefile net: skb: export skb drop reaons to user by TRACE_DEFINE_ENUM 2022-09-07 15:28:08 +01:00
neighbour.c net, neigh: Fix null-ptr-deref in neigh_table_clear() 2022-11-02 20:44:27 -07:00
net_namespace.c net: fix UAF issue in nfqnl_nf_hook_drop() when ops_init() failed 2022-10-24 12:40:06 +01:00
net-procfs.c net: extract a few internals from netdevice.h 2022-04-07 20:32:09 -07:00
net-sysfs.c net-sysfs: Convert to use sysfs_emit() APIs 2022-09-30 12:27:44 +01:00
net-sysfs.h
net-traces.c
netclassid_cgroup.c core: Variable type completion 2022-08-31 09:40:34 +01:00
netevent.c
netpoll.c net: move from strlcpy with unused retval to strscpy 2022-08-22 18:06:18 -07:00
netprio_cgroup.c
of_net.c Revert "of: net: support NVMEM cells with MAC in text format" 2022-01-12 14:14:36 +00:00
page_pool.c - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe 2022-08-05 16:32:45 -07:00
pktgen.c treewide: use get_random_u32() when possible 2022-10-11 17:42:58 -06:00
ptp_classifier.c ptp: Add generic PTP is_sync() function 2022-03-07 11:31:34 +00:00
request_sock.c
rtnetlink.c net: rtnetlink: Enslave device before bringing it up 2022-09-20 08:37:44 -07:00
scm.c
secure_seq.c tcp: Fix data-races around sysctl knobs related to SYN option. 2022-07-20 10:14:49 +01:00
selftests.c
skbuff.c net: gso: fix panic on frag_list with mixed head alloc types 2022-11-03 20:58:09 -07:00
skmsg.c bpf, sock_map: Move cancel_work_sync() out of sock lock 2022-11-03 13:51:06 +01:00
sock_destructor.h
sock_diag.c net: Don't include filter.h from net/sock.h 2021-12-29 08:48:14 -08:00
sock_map.c bpf, sock_map: Move cancel_work_sync() out of sock lock 2022-11-03 13:51:06 +01:00
sock_reuseport.c udp: Update reuse->has_conns under reuseport_lock. 2022-10-18 10:17:18 +02:00
sock.c ipv6: Fix data races around sk->sk_prot. 2022-10-12 17:50:37 -07:00
stream.c treewide: use prandom_u32_max() when possible, part 1 2022-10-11 17:42:55 -06:00
sysctl_net_core.c net: sysctl: remove unused variable long_max 2022-09-07 15:31:19 +01:00
timestamping.c
tso.c
utils.c net: core: Use csum_replace_by_diff() and csum_sub() instead of opencoding 2022-02-21 11:40:44 +00:00
xdp.c xdp: improve page_pool xdp_return performance 2022-09-26 11:28:19 -07:00