Mirror of https://github.com/torvalds/linux.git, synced 2024-11-22 20:22:09 +00:00
bc9d3a9f2a
Under high contention dst_entry::__refcnt becomes a significant bottleneck.

atomic_inc_not_zero() is implemented with a cmpxchg() loop, which goes into high retry rates on contention.

Switch the reference count to rcuref_t which results in a significant performance gain. Rename the reference count member to __rcuref to reflect the change.

The gain depends on the micro-architecture and the number of concurrent operations and has been measured in the range of +25% to +130% with a localhost memtier/memcached benchmark which amplifies the problem massively.

Running the memtier/memcached benchmark over a real (1Gb) network connection the conversion on top of the false sharing fix for struct dst_entry::__refcnt results in a total gain in the 2%-5% range over the upstream baseline.

Reported-by: Wangyang Guo <wangyang.guo@intel.com>
Reported-by: Arjan Van De Ven <arjan.van.de.ven@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230307125538.989175656@linutronix.de
Link: https://lore.kernel.org/r/20230323102800.215027837@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

||
---|---|---|
.. | ||
netfilter | ||
br_arp_nd_proxy.c | ||
br_cfm_netlink.c | ||
br_cfm.c | ||
br_device.c | ||
br_fdb.c | ||
br_forward.c | ||
br_if.c | ||
br_input.c | ||
br_ioctl.c | ||
br_mdb.c | ||
br_mrp_netlink.c | ||
br_mrp_switchdev.c | ||
br_mrp.c | ||
br_mst.c | ||
br_multicast_eht.c | ||
br_multicast.c | ||
br_netfilter_hooks.c | ||
br_netfilter_ipv6.c | ||
br_netlink_tunnel.c | ||
br_netlink.c | ||
br_nf_core.c | ||
br_private_cfm.h | ||
br_private_mcast_eht.h | ||
br_private_mrp.h | ||
br_private_stp.h | ||
br_private_tunnel.h | ||
br_private.h | ||
br_stp_bpdu.c | ||
br_stp_if.c | ||
br_stp_timer.c | ||
br_stp.c | ||
br_switchdev.c | ||
br_sysfs_br.c | ||
br_sysfs_if.c | ||
br_vlan_options.c | ||
br_vlan_tunnel.c | ||
br_vlan.c | ||
br.c | ||
Kconfig | ||
Makefile |
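
The commit message above attributes the bottleneck to the cmpxchg() retry loop behind atomic_inc_not_zero(). As a rough illustration only, the following is a minimal user-space sketch of that pattern using C11 atomics. The function name `ref_inc_not_zero` is hypothetical and this is not the kernel's implementation; it only shows why contended callers end up retrying.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical user-space analogue of an "increment unless zero"
 * reference acquire. Not the kernel's atomic_inc_not_zero().
 */
static bool ref_inc_not_zero(atomic_int *ref)
{
    int old = atomic_load_explicit(ref, memory_order_relaxed);

    /* Retry until the exchange succeeds or the count is seen as zero. */
    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak_explicit(ref, &old, old + 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return true;   /* reference taken */
    }
    return false;          /* object already released */
}
```

Every concurrent caller that loses the compare-exchange must go around the loop again, so the retry rate grows with the number of CPUs touching the same counter. That is the cost the commit message says is reduced by moving to rcuref_t.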