linux/net
Huy Nguyen e549f6f9c0 net/dcb: Add dcbnl buffer attribute
In this patch, we add dcbnl buffer attribute to allow user
change the NIC's buffer configuration such as priority
to buffer mapping and buffer size of individual buffer.

This attribute combined with pfc attribute allows advanced user to
fine tune the qos setting for specific priority queue. For example,
user can give dedicated buffer for one or more priorities or user
can give large buffer to certain priorities.

The dcb buffer configuration will be controlled by lldptool.
lldptool -T -i eth2 -V BUFFER prio 0,2,5,7,1,2,3,6
  maps priorities 0,1,2,3,4,5,6,7 to receive buffer 0,2,5,7,1,2,3,6
lldptool -T -i eth2 -V BUFFER size 87296,87296,0,87296,0,0,0,0
  sets receive buffer size for buffer 0,1,2,3,4,5,6,7 respectively

After discussion on mailing list with Jakub, Jiri, Ido and John, we agreed to
choose dcbnl over devlink interface since this feature is intended to set
port attributes which are governed by the netdev instance of that port, where
devlink API is more suitable for global ASIC configurations.

We present an use case scenario where dcbnl buffer attribute configured
by advance user helps reduce the latency of messages of different sizes.

Scenarios description:
On ConnectX-5, we run latency sensitive traffic with
small/medium message sizes ranging from 64B to 256KB and bandwidth sensitive
traffic with large messages sizes 512KB and 1MB. We group small, medium,
and large message sizes to their own pfc enables priorities as follow.
  Priorities 1 & 2 (64B, 256B and 1KB)
  Priorities 3 & 4 (4KB, 8KB, 16KB, 64KB, 128KB and 256KB)
  Priorities 5 & 6 (512KB and 1MB)

By default, ConnectX-5 maps all pfc enabled priorities to a single
lossless fixed buffer size of 50% of total available buffer space. The
other 50% is assigned to lossy buffer. Using dcbnl buffer attribute,
we create three equal size lossless buffers. Each buffer has 25% of total
available buffer space. Thus, the lossy buffer size reduces to 25%. Priority
to lossless  buffer mappings are set as follow.
  Priorities 1 & 2 on lossless buffer #1
  Priorities 3 & 4 on lossless buffer #2
  Priorities 5 & 6 on lossless buffer #3

We observe improvements in latency for small and medium message sizes
as follows. Please note that the large message sizes bandwidth performance is
reduced but the total bandwidth remains the same.
  256B message size (42 % latency reduction)
  4K message size (21% latency reduction)
  64K message size (16% latency reduction)

CC: Ido Schimmel <idosch@idosch.org>
CC: Jakub Kicinski <jakub.kicinski@netronome.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Or Gerlitz <gerlitz.or@gmail.com>
CC: Parav Pandit <parav@mellanox.com>
CC: Aron Silverton <aron.silverton@oracle.com>
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-24 14:22:59 -07:00
..
6lowpan
9p Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-05-11 20:53:22 -04:00
802
8021q vlan: Add extack messages for link create 2018-05-17 17:08:55 -04:00
appletalk net: Use octal not symbolic permissions 2018-03-26 12:07:48 -04:00
atm net: atm: Fix potential Spectre v1 2018-05-04 12:52:47 -04:00
ax25 net: Use octal not symbolic permissions 2018-03-26 12:07:48 -04:00
batman-adv batman-adv: enable B.A.T.M.A.N. V compilation by default 2018-05-14 09:31:17 +02:00
bluetooth Bluetooth: Add __hci_cmd_send function 2018-05-18 06:37:52 +02:00
bpf bpf: making bpf_prog_test run aware of possible data_end ptr change 2018-04-18 23:34:16 +02:00
bpfilter bpfilter: don't pass O_CREAT when opening console for debug 2018-05-24 09:36:49 -04:00
bridge Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-05-21 16:01:54 -04:00
caif net: caif: fix spelling mistake "UKNOWN" -> "UNKNOWN" 2018-04-19 13:37:10 -04:00
can net: Drop pernet_operations::async 2018-03-27 13:18:09 -04:00
ceph libceph: add osd_req_op_extent_osd_data_bvecs() 2018-05-10 10:15:05 +02:00
core devlink: don't take instance lock around eswitch mode set 2018-05-23 14:26:19 -04:00
dcb net/dcb: Add dcbnl buffer attribute 2018-05-24 14:22:59 -07:00
dccp dccp: fix tasklet usage 2018-05-03 15:14:57 -04:00
decnet net: fib_rules: add extack support 2018-04-23 10:21:24 -04:00
dns_resolver KEYS: DNS: limit the length of option strings 2018-04-17 15:17:41 -04:00
dsa Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-05-21 16:01:54 -04:00
ethernet net: core: rework basic flow dissection helper 2018-05-08 00:02:36 -04:00
hsr
ieee802154 net: ieee802154: 6lowpan: fix frag reassembly 2018-04-23 20:56:24 +02:00
ife net: sched: ife: check on metadata length 2018-04-22 21:12:00 -04:00
ipv4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2018-05-23 16:37:11 -04:00
ipv6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2018-05-23 16:37:11 -04:00
iucv Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-03-23 11:31:58 -04:00
kcm net: Drop pernet_operations::async 2018-03-27 13:18:09 -04:00
key af_key: Always verify length of provided sadb_key 2018-04-09 07:06:38 +02:00
l2tp l2tp: consistent reference counting in procfs and debufs 2018-04-27 11:06:35 -04:00
l3mdev
lapb
llc llc: better deal with too small mtu 2018-05-08 00:11:40 -04:00
mac80211 mac80211: Support adding duration for prepare_tx() callback 2018-05-23 11:06:10 +02:00
mac802154 net/mac802154: disambiguate mac80215 vs mac802154 trace events 2018-03-28 22:55:18 +02:00
mpls net: Drop pernet_operations::async 2018-03-27 13:18:09 -04:00
ncsi net/ncsi: prevent a couple array underflows 2018-05-17 16:27:39 -04:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2018-05-23 16:37:11 -04:00
netlabel
netlink net/netlink: make sure the headers line up actual value output 2018-05-04 13:00:57 -04:00
netrom net: Use octal not symbolic permissions 2018-03-26 12:07:48 -04:00
nfc
nsh nsh: fix infinite loop 2018-05-04 12:54:38 -04:00
openvswitch Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-05-11 20:53:22 -04:00
packet Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-05-21 16:01:54 -04:00
phonet net: Drop pernet_operations::async 2018-03-27 13:18:09 -04:00
psample
qrtr net: qrtr: Expose tunneling endpoint to user space 2018-04-27 15:06:10 -04:00
rds rds: do not leak kernel memory to user land 2018-05-03 11:26:14 -04:00
rfkill rfkill: Create rfkill-none LED trigger 2018-05-23 11:26:45 +02:00
rose net: Use octal not symbolic permissions 2018-03-26 12:07:48 -04:00
rxrpc rxrpc: Trace UDP transmission failure 2018-05-10 23:26:01 +01:00
sched net: sched: don't disable bh when accessing action idr 2018-05-22 15:34:34 -04:00
sctp sctp: checkpatch fixups 2018-05-14 23:15:27 -04:00
smc net/smc: longer delay when freeing client link groups 2018-05-23 16:02:35 -04:00
strparser strparser: Do not call mod_delayed_work with a timeout of LONG_MAX 2018-04-22 21:09:16 -04:00
sunrpc NFS client fixes for Linux 4.17-rc4 2018-05-11 13:56:43 -07:00
switchdev
tipc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-05-11 20:53:22 -04:00
tls Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-05-21 16:01:54 -04:00
unix af_unix: remove redundant lockdep class 2018-04-04 11:13:40 -04:00
vmw_vsock VSOCK: make af_vsock.ko removable again 2018-04-17 09:44:30 -04:00
wimax
wireless nl80211: Reject disconnect commands except from conn_owner 2018-05-23 11:56:26 +02:00
x25 net: Use octal not symbolic permissions 2018-03-26 12:07:48 -04:00
xdp xsk: fix 64-bit division 2018-05-09 18:12:21 +02:00
xfrm Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-05-11 20:53:22 -04:00
compat.c net: support compat 64-bit time in {s,g}etsockopt 2018-04-27 19:46:06 -04:00
Kconfig net: add skeleton of bpfilter kernel module 2018-05-23 13:23:40 -04:00
Makefile net: add skeleton of bpfilter kernel module 2018-05-23 13:23:40 -04:00
socket.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2018-04-05 11:56:35 -07:00
sysctl_net.c net: Drop pernet_operations::async 2018-03-27 13:18:09 -04:00