linux/include
Tariq Toukan 461017cb00 net/mlx5e: Support RX multi-packet WQE (Striding RQ)
Introduce the multi-packet WQE feature (RX Work Queue Element),
referred to as MPWQE or Striding RQ, in which WQEs are larger
and serve multiple packets each.

Every WQE consists of many strides of the same size. Every received
packet is aligned to the beginning of a stride and is written to
consecutive strides within a WQE.

In the regular approach, each WQE must be big enough to serve one
received packet of any size up to the MTU, or up to 64K when device
LRO is enabled. This is very wasteful when dealing with small packets
or when device LRO is enabled.

Thanks to its flexibility, MPWQE allows better memory utilization
(implying improvements in CPU utilization and packet rate), as packets
consume strides according to their size, leaving the rest of
the WQE available for other packets.
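
As a rough illustration of the stride accounting described above, the
sketch below rounds a packet length up to whole strides and estimates
how many equally sized packets fit in one WQE. The function names are
hypothetical and are not taken from the driver, and the model assumes
that a packet occupies exactly ceil(len / stride_size) strides with no
additional per-packet overhead.

```c
#include <stddef.h>

/* Hypothetical helper (not a driver function): number of strides a packet
 * of pkt_len bytes occupies when every packet starts on a stride boundary.
 * Assumes stride_size > 0. */
size_t mpwqe_strides_for_packet(size_t pkt_len, size_t stride_size)
{
	return (pkt_len + stride_size - 1) / stride_size;
}

/* Hypothetical helper: how many packets of the same length fit in one WQE,
 * assuming no per-packet overhead beyond rounding up to whole strides.
 * Assumes pkt_len > 0. */
size_t mpwqe_packets_per_wqe(size_t strides_per_wqe, size_t pkt_len,
			     size_t stride_size)
{
	return strides_per_wqe / mpwqe_strides_for_packet(pkt_len, stride_size);
}
```

Under these assumptions and the default configuration listed below, a
64-byte packet takes one stride and a 1500-byte packet takes 24, so a
2048-stride WQE holds 2048 of the former or 85 of the latter.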

MPWQE default configuration:
	Num of WQEs	= 16
	Strides Per WQE = 2048
	Stride Size	= 64 byte

The default WQE memory footprint per ring goes from 1024 * MTU (~1.5MB)
to 16 * 2048 * 64 = 2MB.
However, HW LRO can now be supported at no additional cost in memory
footprint, and hence we turn it on by default and get even better
performance.
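
The footprint arithmetic above can be checked with a small sketch. The
function names are hypothetical, and the MTU value of 1500 bytes used
in the note below is an assumption (the text only states 1024*mtu,
~1.5MB).

```c
#include <stddef.h>

/* Hypothetical helper: per-ring buffer footprint of the regular approach,
 * one MTU-sized buffer per WQE. */
size_t legacy_ring_bytes(size_t num_wqes, size_t mtu)
{
	return num_wqes * mtu;
}

/* Hypothetical helper: per-ring buffer footprint with MPWQE. */
size_t mpwqe_ring_bytes(size_t num_wqes, size_t strides_per_wqe,
			size_t stride_size)
{
	return num_wqes * strides_per_wqe * stride_size;
}
```

With the defaults, mpwqe_ring_bytes(16, 2048, 64) is 2097152 bytes
(2MB), while legacy_ring_bytes(1024, 1500) is 1536000 bytes (~1.5MB).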

Performance tested on ConnectX4-Lx 50G.
To isolate the feature under test, the numbers below were measured with
HW LRO turned off. We verified that performance only improves further
when LRO is turned back on.

* Netperf single TCP stream:
- BW raised by 10-15% for representative packet sizes:
  default, 64B, 1024B, 1478B, 65536B.

* Netperf multi TCP stream:
- No degradation, line rate reached.

* Pktgen: packet rate raised by 2-10% for traffic of different message
sizes: 64B, 128B, 256B, 1024B, and 1500B.

* Pktgen: packet loss in bursts of small messages (64 byte),
single stream:
  | num packets | packet loss before | packet loss after |
  |     2K      |       ~1K          |       0           |
  |     8K      |       ~6K          |       0           |
  |     16K     |       ~13K         |       0           |
  |     32K     |       ~28K         |       0           |
  |     64K     |       ~57K         |     ~24K          |

This is expected, as the driver can now receive as many small packets
(<=64B) as the total number of strides in the ring (default = 2048 * 16),
vs. 1024 (default ring size, regardless of packet size) before this feature.
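
To make the capacity argument concrete, the sketch below computes the
number of small packets a ring can absorb and a worst-case loss bound
for a burst. The names are hypothetical, and the bound assumes the
driver drains nothing while the burst arrives, so measured losses can
be lower than the bound.

```c
#include <stddef.h>

/* Hypothetical helper: packets of at most one stride each that an MPWQE
 * ring can hold before it runs out of strides. */
size_t mpwqe_small_pkt_capacity(size_t num_wqes, size_t strides_per_wqe)
{
	return num_wqes * strides_per_wqe;
}

/* Hypothetical helper: worst-case packets dropped from a burst, assuming
 * no buffers are recycled while the burst is arriving. */
size_t burst_loss_upper_bound(size_t burst_pkts, size_t capacity)
{
	return burst_pkts > capacity ? burst_pkts - capacity : 0;
}
```

With the defaults the capacity is 16 * 2048 = 32768 packets, so under
this model a 32K burst fits entirely, while a 64K burst can lose at
most 32768 packets (~24K was measured). Before this feature the
capacity was 1024, which bounds the loss for a 2K burst at 1024
packets (~1K was measured).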

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21 15:09:05 -04:00
acpi Merge branches 'acpi-processor' and 'acpi-cppc' 2016-03-14 14:20:33 +01:00
asm-generic arch, ftrace: for KASAN put hard/soft IRQ entries into separate sections 2016-03-25 16:37:42 -07:00
clocksource
crypto Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2016-03-17 11:33:45 -07:00
drm drm/ttm: use phys_addr_t for ttm_bus_placement 2016-04-04 17:00:01 -04:00
dt-bindings The clk changes for this release cycle are mostly dominated by 2016-03-23 06:06:45 -07:00
keys
kvm
linux net/mlx5e: Support RX multi-packet WQE (Striding RQ) 2016-04-21 15:09:05 -04:00
math-emu
media Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm 2016-03-19 16:31:54 -07:00
memory
misc
net libnl: add more helpers to align attributes on 64-bit 2016-04-21 14:22:12 -04:00
pcmcia
ras
rdma Round two of 4.6 merge window patches 2016-03-22 15:48:44 -07:00
rxrpc rxrpc: Static arrays of strings should be const char *const[] 2016-04-11 15:34:40 -04:00
scsi Merge branch 'fixes-base' into fixes 2016-04-05 06:56:47 -04:00
soc IOMMU Updates for Linux v4.6 2016-03-22 11:57:43 -07:00
sound ASoC: Updates for v4.6 2016-03-14 14:03:29 +01:00
target target: add a new add_wwn_groups fabrics method 2016-03-30 20:06:44 -07:00
trace perf, bpf: minimize the size of perf_trace_() tracepoint handler 2016-04-21 13:48:20 -04:00
uapi ipmr: align RTA_MFC_STATS on 64-bit 2016-04-21 14:22:13 -04:00
video gpu: ipu-v3: ipu-dmfc: Rename ipu_dmfc_init_channel to ipu_dmfc_config_wait4eot 2016-03-31 11:24:33 +02:00
xen xen-netback: re-import canonical netif header 2016-03-13 22:08:01 -04:00
Kbuild