linux/include/net
SeongJae Park 0b9b241406 inet: frags: batch fqdir destroy works
On a few of our systems, I found frequent 'unshare(CLONE_NEWNET)' calls
make the number of active slab objects including 'sock_inode_cache' type
rapidly and continuously increase.  As a result, memory pressure occurs.

In more detail, I made an artificial reproducer that resembles the
workload that we found the problem and reproduce the problem faster.  It
merely repeats 'unshare(CLONE_NEWNET)' 50,000 times in a loop.  It takes
about 2 minutes.  On 40 CPU cores / 70GB DRAM machine, the available
memory continuously reduced in a fast speed (about 120MB per second,
15GB in total within the 2 minutes).  Note that the issue don't
reproduce on every machine.  On my 6 CPU cores machine, the problem
didn't reproduce.

'cleanup_net()' and 'fqdir_work_fn()' are functions that deallocate the
relevant memory objects.  They are asynchronously invoked by the work
queues and internally use 'rcu_barrier()' to ensure safe destructions.
'cleanup_net()' works in a batched maneer in a single thread worker,
while 'fqdir_work_fn()' works for each 'fqdir_exit()' call in the
'system_wq'.  Therefore, 'fqdir_work_fn()' called frequently under the
workload and made the contention for 'rcu_barrier()' high.  In more
detail, the global mutex, 'rcu_state.barrier_mutex' became the
bottleneck.

This commit avoids such contention by doing the 'rcu_barrier()' and
subsequent lightweight works in a batched manner, as similar to that of
'cleanup_net()'.  The fqdir hashtable destruction, which is done before
the 'rcu_barrier()', is still allowed to run in parallel for fast
processing, but this commit makes it to use a dedicated work queue
instead of the 'system_wq', to make sure that the number of threads is
bounded.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20201211112405.31158-1-sjpark@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-12 15:08:54 -08:00
..
9p net: 9p: drop duplicate word in comment 2020-07-15 20:34:11 -07:00
bluetooth Bluetooth: Change MGMT security info CMD to be more generic 2020-12-07 17:01:42 +02:00
caif net: caif: Remove unused caif SPI driver 2020-09-29 14:02:53 -07:00
iucv net/af_iucv: clean up function prototypes 2020-05-19 12:50:14 -07:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-12-11 22:29:38 -08:00
netns sctp: add encap_port for netns sock asoc and transport 2020-10-30 15:24:06 -07:00
nfc net/nfc/nci: Support NCI 2.x initial sequence 2020-12-04 17:47:35 -08:00
phonet treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 336 2019-06-05 17:37:07 +02:00
sctp sctp: bring inet(6)_skb_parm back to sctp_input_cb 2020-11-05 14:27:30 -08:00
tc_act Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-10-15 12:43:21 -07:00
6lowpan.h 6lowpan: Replace zero-length array with flexible-array member 2020-02-28 14:51:30 +01:00
act_api.h net/sched: sch_frag: add generic packet fragment support. 2020-11-27 14:36:02 -08:00
addrconf.h ipv6: some fixes for ipv6_dev_find() 2020-08-18 15:58:53 -07:00
af_ieee802154.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 174 2019-05-30 11:26:41 -07:00
af_rxrpc.h rxrpc: Make rxrpc_kernel_get_srtt() indicate validity 2020-08-20 18:21:28 +01:00
af_unix.h unix: uses an atomic type for scm files accounting 2020-02-28 12:12:53 -08:00
af_vsock.h vsock: add local transport support in the vsock core 2019-12-11 15:01:23 -08:00
ah.h
arp.h net: avoid potential false sharing in neighbor related code 2019-11-06 16:14:48 -08:00
atmclip.h
ax25.h ax25: fix possible use-after-free 2019-01-23 11:18:00 -08:00
ax88796.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
bareudp.h bareudp: Reverted support to enable & disable rx metadata collection 2020-07-21 18:30:47 -07:00
bond_3ad.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 90 2019-05-24 17:37:53 +02:00
bond_alb.h bonding/alb: Add helper functions to get the xmit slave 2020-05-01 12:15:37 -07:00
bond_options.h bonding: add an option to specify a delay between peer notifications 2019-07-04 12:30:48 -07:00
bonding.h bonding: fix feature flag setting at init time 2020-12-08 11:26:08 -08:00
bpf_sk_storage.h bpf: Allow using bpf_sk_storage in FENTRY/FEXIT/RAW_TP 2020-11-12 18:39:28 -08:00
busy_poll.h net, xdp, xsk: fix __sk_mark_napi_id_once napi_id error 2020-12-01 15:51:19 +01:00
calipso.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 13 2019-05-21 11:28:45 +02:00
cfg80211-wext.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
cfg80211.h nl80211: add common API to configure SAR power limitations 2020-12-11 13:38:54 +01:00
cfg802154.h cfg802154: Replace zero-length array with flexible-array member 2020-02-29 14:39:08 +01:00
checksum.h saner calling conventions for csum_and_copy_..._user() 2020-08-20 15:45:15 -04:00
cipso_ipv4.h cipso: Remove unused inline functions 2020-07-15 07:45:24 -07:00
cls_cgroup.h bpf: Allow to retrieve cgroup v1 classid from v2 hooks 2020-03-27 19:40:38 -07:00
codel_impl.h
codel_qdisc.h
codel.h
compat.h compat: always include linux/compat.h from net/compat.h 2020-11-23 13:31:54 -08:00
datalink.h
dcbevent.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 201 2019-05-30 11:29:52 -07:00
dcbnl.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 201 2019-05-30 11:29:52 -07:00
devlink.h devlink: Add blackhole_nexthop trap 2020-11-24 12:14:56 -08:00
dn_dev.h
dn_fib.h net: dn_fib: Replace zero-length array with flexible-array member 2020-02-29 21:52:20 -08:00
dn_neigh.h
dn_nsp.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 24 2019-05-21 11:52:39 +02:00
dn_route.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 24 2019-05-21 11:52:39 +02:00
dn.h
dsa.h net: dsa: Give drivers the chance to veto certain upper devices 2020-11-05 14:04:49 -08:00
dsfield.h ipv6: Annotate bitwise IPv6 dsfield pointer cast 2019-12-16 16:09:44 -08:00
dst_cache.h
dst_metadata.h
dst_ops.h net/dst: use a smaller percpu_counter batch for dst entries accounting 2020-05-08 21:33:33 -07:00
dst.h mpls: drop skb's dst in mpls_forward() 2020-11-03 12:55:53 -08:00
erspan.h erspan: Add type I version 0 support. 2020-05-05 13:23:29 -07:00
esp.h ESP: Export esp_output_fill_trailer function 2020-02-19 13:52:32 +01:00
espintcp.h xfrm: espintcp: save and call old ->sk_destruct 2020-04-20 07:34:16 +02:00
ethoc.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
failover.h
fib_notifier.h ipv6: Remove old route notifications and convert listeners 2019-12-24 22:37:30 -08:00
fib_rules.h fib: use indirect call wrappers in the most common fib_rules_ops 2020-07-28 17:42:31 -07:00
firewire.h
flow_dissector.h net/flow_dissector: add packet hash dissection 2020-07-24 15:23:31 -07:00
flow_offload.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-08-05 20:13:21 -07:00
flow.h ipv4: Initialize flowi4_multipath_hash in data path 2020-09-14 14:54:56 -07:00
fou.h
fq_impl.h net/fq_impl: use skb_get_hash instead of skb_get_hash_perturb 2020-07-31 09:24:24 +02:00
fq.h net/fq_impl: use skb_get_hash instead of skb_get_hash_perturb 2020-07-31 09:24:24 +02:00
garp.h treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
gen_stats.h net_sched: extend packet counter to 64bit 2019-11-05 18:20:55 -08:00
genetlink.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-10-05 18:40:01 -07:00
geneve.h net: Move the definition of the default Geneve udp port to public header file 2019-03-22 12:09:31 -07:00
gre.h
gro_cells.h
gtp.h
gue.h GUE: Fix a typo 2020-06-22 21:12:44 -07:00
hwbm.h net: hwbm: if CONFIG_NET_HWBM unset, make stub functions static 2019-10-25 16:24:32 -07:00
icmp.h icmp: introduce helper for nat'd source address in network device context 2020-02-13 14:19:00 -08:00
ieee80211_radiotap.h mac80211: add radiotap flag to assure frames are not reordered 2020-11-06 11:01:01 +01:00
ieee802154_netdev.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 174 2019-05-30 11:26:41 -07:00
if_inet6.h ipv6: Replace zero-length array with flexible-array 2020-05-11 13:18:54 -07:00
ife.h net: ife: drop include of module.h from net/ife.h 2019-04-22 21:50:53 -07:00
ila.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
inet6_connection_sock.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
inet6_hashtables.h net: Track socket refcounts in skb_steal_sock() 2020-03-30 13:45:04 -07:00
inet_common.h bpf: Allow any port in bpf_bind helper 2020-05-09 00:48:20 +02:00
inet_connection_sock.h tcp: add exponential backoff in __tcp_send_ack() 2020-09-30 14:21:30 -07:00
inet_ecn.h inet_ecn: Fix endianness of checksum update when setting ECT(1) 2020-12-01 17:16:54 -08:00
inet_frag.h inet: frags: batch fqdir destroy works 2020-12-12 15:08:54 -08:00
inet_hashtables.h tcp: fix race condition when creating child sockets from syncookies 2020-11-23 16:32:33 -08:00
inet_sock.h inet: remove inet_sk_copy_descendant() 2020-08-26 07:33:19 -07:00
inet_timewait_sock.h tcp: honor SO_PRIORITY in TIME_WAIT state 2019-09-27 12:05:02 +02:00
inetpeer.h net: ipv4: use a dedicated counter for icmp_v4 redirect packets 2019-02-08 21:50:15 -08:00
ip6_checksum.h tcp: remove indirect calls for icsk->icsk_af_ops->send_check 2020-06-20 17:47:53 -07:00
ip6_fib.h net: ip6_fib.h: drop duplicate word in comment 2020-07-15 20:34:11 -07:00
ip6_route.h ipv6: lift copy_from_user out of ipv6_route_ioctl 2020-05-18 17:35:02 -07:00
ip6_tunnel.h ip6_tunnel: allow not to count pkts on tstats by passing dev as NULL 2019-06-18 20:48:45 -04:00
ip_fib.h ipv4: nexthop version of fib_info_nh_uses_dev 2020-05-26 16:06:07 -07:00
ip_tunnels.h Merge https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-11-19 19:08:46 -08:00
ip_vs.h ipvs: remove dependency on ip6_tables 2020-08-31 23:06:51 +02:00
ip.h inet: constify inet_sdif() argument 2020-11-10 17:56:54 -08:00
ipcomp.h
ipconfig.h
ipv6_frag.h ipv6: Remove dependency of ipv6_frag_thdr_truncated on ipv6 module 2020-11-19 10:49:50 -08:00
ipv6_stubs.h ipv6: add ipv6_fragment hook in ipv6_stub 2020-08-31 12:26:39 -07:00
ipv6.h ipv6: Remove dependency of ipv6_frag_thdr_truncated on ipv6 module 2020-11-19 10:49:50 -08:00
ipx.h bonding/alb: properly access headers in bond_alb_xmit() 2020-02-05 14:28:09 +01:00
iw_handler.h
kcm.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
l3mdev.h l3mdev: add infrastructure for table to VRF mapping 2020-06-20 17:22:22 -07:00
lag.h
lapb.h
lib80211.h
llc_c_ac.h
llc_c_ev.h
llc_c_st.h
llc_conn.h llc: fix sk_buff leak in llc_conn_service() 2019-10-08 13:23:05 -07:00
llc_if.h
llc_pdu.h
llc_s_ac.h
llc_s_ev.h
llc_s_st.h
llc_sap.h
llc.h
lwtunnel.h net: add net available in build_state 2020-03-29 22:30:57 -07:00
mac80211.h mac80211: add ieee80211_set_sar_specs 2020-12-11 13:39:59 +01:00
mac802154.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 174 2019-05-30 11:26:41 -07:00
macsec.h net: macsec: add support for getting offloaded stats 2020-03-26 20:17:36 -07:00
mip6.h net: mip6: Replace zero-length array with flexible-array member 2020-03-02 11:16:27 -08:00
mld.h net: ipv6: mld: Replace zero-length array with flexible-array member 2020-02-29 21:52:20 -08:00
mpls_iptunnel.h net: mpls: Replace zero-length array with flexible-array member 2020-02-28 12:08:37 -08:00
mpls.h net: Make mpls_entry_encode() available for generic users 2020-05-29 21:20:20 -07:00
mptcp.h mptcp: add port support for ADD_ADDR suboption writing 2020-12-09 19:02:15 -08:00
mrp.h treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
ncsi.h
ndisc.h ipv6: ndisc: adjust ndisc_ifinfo_sysctl_change prototype 2020-08-24 06:40:07 -07:00
neighbour.h net: Exempt multicast addresses from five-second neighbor lifetime 2020-11-13 14:24:39 -08:00
net_failover.h
net_namespace.h bpf, net: Rework cookie generator as per-cpu one 2020-09-30 11:50:35 -07:00
net_ratelimit.h
netevent.h
netlabel.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 13 2019-05-21 11:28:45 +02:00
netlink.h treewide: rename nla_strlcpy to nla_strscpy. 2020-11-16 08:08:54 -08:00
netprio_cgroup.h netprio: use css ID instead of cgroup ID 2019-11-12 08:18:03 -08:00
netrom.h net: netrom: Fix error cleanup path of nr_proto_init 2019-04-11 13:59:49 -07:00
nexthop.h nexthop: Pass extack to register_nexthop_notifier() 2020-11-06 11:28:49 -08:00
nl802154.h
nsh.h
p8022.h
page_pool.h net: page_pool: Add bulk support for ptr_ring 2020-11-14 02:29:00 +01:00
pie.h pie: realign comment 2020-03-04 13:25:55 -08:00
ping.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
pkt_cls.h net: sched: remove redundant 'rtnl_held' argument 2020-12-01 11:24:56 -08:00
pkt_sched.h net: sched: convert tasklets to use new tasklet_setup() API 2020-11-07 10:41:15 -08:00
pptp.h
protocol.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
psample.h net: sched: take reference to psample group in flow_action infra 2019-09-16 09:18:03 +02:00
psnap.h
raw.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
rawv6.h
red.h net: sched: RED: Introduce an ECN nodrop mode 2020-03-14 21:03:46 -07:00
regulatory.h net/wireless: regulatory.h: drop duplicate word in comment 2020-07-31 09:24:23 +02:00
request_sock.h tcp: bpf: Optionally store mac header in TCP_SAVE_SYN 2020-08-24 14:35:00 -07:00
rose.h
route.h Remove DST_HOST 2020-03-23 21:57:44 -07:00
rpl.h net: ipv6: Use struct_size() helper and kcalloc() 2020-06-23 20:27:09 -07:00
rsi_91x.h
rtnetlink.h
rtnh.h net: Rename net/nexthop.h net/rtnh.h 2019-04-22 21:47:25 -07:00
sch_generic.h net/sched: sch_frag: add generic packet fragment support. 2020-11-27 14:36:02 -08:00
scm.h fs: Move __scm_install_fd() to __receive_fd() 2020-07-13 11:03:44 -07:00
secure_seq.h
seg6_hmac.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
seg6_local.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
seg6.h seg6: fix seg6_validate_srh() to avoid slab-out-of-bounds 2020-06-04 15:39:32 -07:00
slhc_vj.h
smc.h net/smc: introduce CHID callback for ISM devices 2020-09-28 15:19:03 -07:00
snmp.h net/tls: add skeleton of MIB statistics 2019-10-05 16:29:00 -07:00
sock_reuseport.h net: sock_reuseport: Replace zero-length array with flexible-array member 2020-02-29 21:52:19 -08:00
sock.h Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2020-12-04 07:48:12 -08:00
Space.h
stp.h
strparser.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
switchdev.h bridge: switchdev: Notify about VLAN protocol changes 2020-12-01 15:21:13 -08:00
tcp_states.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
tcp.h Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2020-12-04 07:48:12 -08:00
timewait_sock.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
tipc.h
tls_toe.h net/tls: rename tls_hw_* functions tls_toe_* 2019-10-04 14:07:07 -07:00
tls.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-11-27 18:25:27 -08:00
transp_v6.h tcp: move ipv4_specific to tcp include file 2020-06-23 20:10:15 -07:00
tso.h net: tso: cache transport header length 2020-06-18 20:46:23 -07:00
tun_proto.h
udp_tunnel.h udp_tunnel: add the ability to share port tables 2020-09-28 12:50:12 -07:00
udp.h inet: udp{4|6}_lib_lookup_skb() skb argument is const 2020-11-10 17:57:14 -08:00
udplite.h
vsock_addr.h vsock: remove include/linux/vm_sockets.h file 2019-11-14 18:12:17 -08:00
vxlan.h net: sched: only keep the available bits when setting vxlan md->gbp 2020-09-14 16:49:39 -07:00
wext.h
x25.h net/x25: add new state X25_STATE_5 2019-12-09 10:28:43 -08:00
x25device.h
xdp_priv.h page_pool: do not release pool until inflight == 0. 2019-11-16 12:39:10 -08:00
xdp_sock_drv.h xsk: Introduce batched Tx descriptor interfaces 2020-11-17 22:07:40 +01:00
xdp_sock.h xsk: Fix umem cleanup bug at socket destruct 2020-11-20 15:52:39 +01:00
xdp.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-12-11 22:29:38 -08:00
xfrm.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-10-05 18:40:01 -07:00
xsk_buff_pool.h xsk: Fix possible memory leak at socket close 2020-10-29 15:19:56 +01:00