tcp: defer skb freeing after socket lock is released

tcp recvmsg() (or rx zerocopy) spends a fair amount of time
freeing skbs after their payload has been consumed.

A typical ~64KB GRO packet has to release ~45 page
references, eventually going to the page allocator
for each of them.

Currently, this freeing is performed while the socket lock
is held, meaning that there is a high chance that the
BH handler has to queue incoming packets to the tcp socket backlog.

This can cause additional latencies, because the user
thread has to process the backlog at release_sock() time,
and while doing so, additional frames can be added
by the BH handler.

This patch adds logic to defer these frees until after the socket
lock is released, or to perform them directly from the BH handler
if possible.

Being able to free these skbs from the BH handler helps a lot,
because this avoids the usual alloc/free asymmetry
when the BH handler and the user thread do not run on the same cpu or
NUMA node.

One cpu can now be fully utilized for the kernel->user copy,
while another cpu handles BH processing and skb/page
allocs/frees (assuming RFS is not forcing use of a single CPU).

Tested:
 100Gbit NIC
 Max throughput for one TCP_STREAM flow, over 10 runs

MTU : 1500
Before: 55 Gbit
After:  66 Gbit

MTU : 4096+(headers)
Before: 82 Gbit
After:  95 Gbit

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Committed by: David S. Miller
Parent: 3df684c1a3
Commit: f35f821935

@@ -63,6 +63,7 @@
 #include <linux/indirect_call_wrapper.h>
 #include <linux/atomic.h>
 #include <linux/refcount.h>
+#include <linux/llist.h>
 #include <net/dst.h>
 #include <net/checksum.h>
 #include <net/tcp_states.h>
@@ -408,6 +409,8 @@ struct sock {
 		struct sk_buff	*head;
 		struct sk_buff	*tail;
 	} sk_backlog;
+	struct llist_head defer_list;
+
 #define sk_rmem_alloc sk_backlog.rmem_alloc
 
 	int			sk_forward_alloc;
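
The hunks shown here only add the defer_list field and its header; the code that fills and drains the list is not part of this excerpt. As a rough illustration of the pattern the commit message describes, here is a user-space sketch using C11 atomics and a pthread mutex. It is not the kernel implementation: struct buf, struct fake_sock, defer_free(), flush_deferred() and consume() are hypothetical names invented for this example. The push and drain steps follow the same idea as the kernel's llist_add() and llist_del_all() from <linux/llist.h>.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an skb whose payload has been consumed. */
struct buf {
	struct buf *next;	/* link in the deferred-free list */
	void *payload;
};

/* Hypothetical stand-in for struct sock: a lock plus a lock-free list. */
struct fake_sock {
	pthread_mutex_t lock;
	_Atomic(struct buf *) defer_list;
};

static void fake_sock_init(struct fake_sock *sk)
{
	pthread_mutex_init(&sk->lock, NULL);
	atomic_init(&sk->defer_list, NULL);
}

/* Push one buffer onto the list with a CAS loop; no lock is required. */
static void defer_free(struct fake_sock *sk, struct buf *b)
{
	struct buf *old = atomic_load(&sk->defer_list);

	do {
		b->next = old;
	} while (!atomic_compare_exchange_weak(&sk->defer_list, &old, b));
}

/* Detach the whole list in one atomic exchange, then free every entry.
 * Returns the number of buffers freed.
 */
static size_t flush_deferred(struct fake_sock *sk)
{
	struct buf *b = atomic_exchange(&sk->defer_list, (struct buf *)NULL);
	size_t freed = 0;

	while (b) {
		struct buf *next = b->next;

		free(b->payload);
		free(b);
		b = next;
		freed++;
	}
	return freed;
}

/* Consume `count` buffers under the lock, but free them only after the
 * lock has been dropped, so the lock hold time excludes the free cost.
 */
static size_t consume(struct fake_sock *sk, size_t count)
{
	pthread_mutex_lock(&sk->lock);
	for (size_t i = 0; i < count; i++) {
		struct buf *b = malloc(sizeof(*b));

		assert(b != NULL);
		b->payload = malloc(64);
		/* ... the payload would be copied to the reader here ... */
		defer_free(sk, b);
	}
	pthread_mutex_unlock(&sk->lock);

	return flush_deferred(sk);
}
```

Because the push is lock-free, another thread (playing the role of the BH handler) could call flush_deferred() concurrently and take over the frees, which is the property the commit message relies on to move free work off the reader's cpu.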