linux/drivers/net/ethernet
Steve Wise 05eb23893c cxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug Fixes
The current logic is slow to disable user DB usage, and also fails to
avoid DB FIFO drops under heavy load.  This commit fixes these
deficiencies and improves the avoidance logic.  It does so by notifying
the ULDs of potential DB problems more efficiently, and by implementing
a smoother flow control algorithm in iw_cxgb4, which is the ULD that
puts the most load on the DB FIFO.

Design:

cxgb4:

Direct ULD callback from the DB FULL/DROP interrupt handler.  This allows
the ULD to stop doing user DB writes as quickly as possible.

While user DB usage is disabled, the LLD accumulates DB write events
for its queues.  Once DB usage is reenabled, a single DB write is
done for each queue with its accumulated write count.  This reduces the
load put on the DB FIFO when reenabling.
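
The accumulate-then-flush behaviour described above can be sketched as
follows.  This is a minimal user-space model, not the cxgb4 code: every
demo_* name, the fixed queue count of 4, and the counters that stand in
for the hardware register are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NQUEUES 4

/* Hypothetical stand-in for one LLD queue; not a cxgb4 structure. */
struct demo_queue {
	uint32_t pending_db;	/* DB increments deferred while disabled */
	uint32_t hw_writes;	/* simulated DB register writes */
	uint32_t hw_pidx;	/* producer index as the simulated HW sees it */
};

struct demo_lld {
	bool db_enabled;
	struct demo_queue q[DEMO_NQUEUES];
};

/* Simulated doorbell register write carrying an increment of n. */
static void demo_hw_db_write(struct demo_queue *q, uint32_t n)
{
	q->hw_writes++;
	q->hw_pidx += n;
}

/* Ring the doorbell for n new entries, or defer it if DBs are disabled. */
static void demo_ring_db(struct demo_lld *lld, unsigned int qid, uint32_t n)
{
	assert(qid < DEMO_NQUEUES);
	struct demo_queue *q = &lld->q[qid];

	if (lld->db_enabled)
		demo_hw_db_write(q, n);
	else
		q->pending_db += n;	/* accumulate instead of writing */
}

static void demo_disable_db(struct demo_lld *lld)
{
	lld->db_enabled = false;
}

/* Reenable DBs: one write per queue that has an accumulated count. */
static void demo_enable_db(struct demo_lld *lld)
{
	lld->db_enabled = true;
	for (unsigned int i = 0; i < DEMO_NQUEUES; i++) {
		struct demo_queue *q = &lld->q[i];

		if (q->pending_db) {
			demo_hw_db_write(q, q->pending_db);
			q->pending_db = 0;
		}
	}
}
```

In this model, three deferred calls on one queue collapse into a single
write on reenable, and idle queues generate no write at all.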

iw_cxgb4:

Instead of marking each QP to indicate DB writes are disabled, we create
a device-global status page that each user process maps.  This allows
iw_cxgb4 to set a single bit to disable all DB writes for all user QPs,
instead of traversing the idr of all the active QPs.  If libcxgb4
doesn't support this, then we fall back to the old approach of marking
each QP.  Thus we allow the new driver to work with an older libcxgb4.
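
The difference between the two approaches can be sketched as below.
This is an illustrative model only: the demo_* types, the db_off field
names, and the use of a NULL pointer to represent an older libcxgb4 are
assumptions, not the driver's actual layout or negotiation mechanism.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_QPS 8

/* Hypothetical page shared with every user process. */
struct demo_status_page {
	volatile unsigned char db_off;	/* nonzero: user DB writes disabled */
};

struct demo_qp {
	bool db_off;	/* per-QP flag, used only without a status page */
};

struct demo_dev {
	struct demo_status_page *status;	/* NULL models an older library */
	struct demo_qp qps[DEMO_MAX_QPS];
	size_t nqps;
};

/* Disable user DB writes; returns how many flags had to be stored. */
static size_t demo_disable_user_db(struct demo_dev *dev)
{
	assert(dev->nqps <= DEMO_MAX_QPS);
	if (dev->status) {
		dev->status->db_off = 1;	/* one store covers every QP */
		return 1;
	}
	for (size_t i = 0; i < dev->nqps; i++)	/* fallback: mark each QP */
		dev->qps[i].db_off = true;
	return dev->nqps;
}

/* What a user-space fast path would check before ringing the DB. */
static bool demo_user_db_allowed(const struct demo_dev *dev, size_t qp)
{
	assert(qp < dev->nqps);
	if (dev->status)
		return dev->status->db_off == 0;
	return !dev->qps[qp].db_off;
}
```

With the status page the cost of disabling is constant; in the fallback
path it grows with the number of active QPs.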

When the LLD upcalls iw_cxgb4 indicating DB FULL, we disable all DB writes
via the status page and transition the DB state to STOPPED.  As user
processes see that DB writes are disabled, they call into iw_cxgb4
to submit their DB write events.  Since the DB state is STOPPED,
the QP trying to write gets enqueued on a new DB "flow control" list.
As subsequent DB writes are submitted for this flow controlled QP, the
number of writes is accumulated for each QP on the flow control list.
So all the user QPs that are actively ringing the DB get put on this
list, and the number of writes they request is accumulated.
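
A minimal sketch of the enqueue-and-accumulate step, assuming a simple
singly linked list: the demo_* names and fields are hypothetical, and
locking, which a real driver would need here, is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum demo_db_state { DEMO_NORMAL, DEMO_STOPPED, DEMO_FLOW_CONTROL };

struct demo_qp {
	uint32_t pending;	/* accumulated DB write count */
	bool on_fc_list;
	struct demo_qp *next;
};

struct demo_dev {
	enum demo_db_state state;
	bool status_db_off;	/* models the bit in the status page */
	struct demo_qp *fc_head, *fc_tail;
	size_t fc_len;
	uint32_t hw_writes;	/* simulated DB register writes */
};

/* DB FULL upcall: stop user DB writes and enter STOPPED. */
static void demo_db_full(struct demo_dev *dev)
{
	dev->status_db_off = true;
	dev->state = DEMO_STOPPED;
}

/* A user process asks the kernel to ring the DB on its behalf. */
static void demo_submit_db(struct demo_dev *dev, struct demo_qp *qp,
			   uint32_t inc)
{
	assert(inc > 0);
	if (dev->state == DEMO_NORMAL) {
		dev->hw_writes++;	/* no flow control: write through */
		return;
	}
	if (!qp->on_fc_list) {		/* first deferred write: enqueue */
		qp->next = NULL;
		if (dev->fc_tail)
			dev->fc_tail->next = qp;
		else
			dev->fc_head = qp;
		dev->fc_tail = qp;
		dev->fc_len++;
		qp->on_fc_list = true;
	}
	qp->pending += inc;		/* later writes only accumulate */
}
```

Each QP appears on the list at most once, so the list length tracks the
number of active QPs rather than the number of DB requests.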

When the LLD upcalls iw_cxgb4 indicating DB EMPTY, which is in a workq
context, we change the DB state to FLOW_CONTROL, and begin resuming all
the QPs that are on the flow control list.  This logic runs until
the flow control list is empty or we exit FLOW_CONTROL mode (due to
a DB DROP upcall, for example).  QPs are removed from this list, and
their accumulated DB write counts are written to the DB FIFO.  Sets of
QPs, called chunks in the code, are removed at one time.  The chunk size
is 64.  So 64 QPs are resumed at a time, and before the next chunk is
resumed, the logic waits (blocks) for the DB FIFO to drain.  This prevents
resuming too quickly and overflowing the FIFO.  Once the flow control list
is empty, the DB state transitions back to NORMAL and user QPs are again
allowed to write directly to the user DB register.
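
The chunked resume loop can be modelled as below.  This is a sketch
under stated assumptions, not the iw_cxgb4 implementation: the flow
control list is flattened into an array, the blocking wait is replaced
by a counter, and drop_on_wait is a hypothetical hook that simulates a
DB DROP upcall arriving during a wait.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CHUNK_SIZE 64
#define DEMO_MAX_QPS 256

enum demo_db_state { DEMO_NORMAL, DEMO_STOPPED, DEMO_FLOW_CONTROL };

struct demo_dev {
	enum demo_db_state state;
	uint32_t pending[DEMO_MAX_QPS];	/* per-QP accumulated counts */
	size_t head, len;		/* pending[head..len) is the list */
	uint32_t hw_writes;		/* simulated DB FIFO writes */
	uint32_t hw_total;		/* sum of counts written */
	unsigned int drain_waits;
	unsigned int drop_on_wait;	/* 0: never simulate a DB DROP */
};

/* Stand-in for blocking until the DB FIFO drains. */
static void demo_wait_for_drain(struct demo_dev *dev)
{
	dev->drain_waits++;
	if (dev->drop_on_wait && dev->drain_waits == dev->drop_on_wait)
		dev->state = DEMO_STOPPED;	/* simulated DB DROP */
}

/* DB EMPTY upcall: resume queued QPs one chunk at a time. */
static void demo_db_empty(struct demo_dev *dev)
{
	assert(dev->len <= DEMO_MAX_QPS && dev->head <= dev->len);
	dev->state = DEMO_FLOW_CONTROL;
	while (dev->head < dev->len && dev->state == DEMO_FLOW_CONTROL) {
		size_t end = dev->head + DEMO_CHUNK_SIZE;

		if (end > dev->len)
			end = dev->len;
		for (; dev->head < end; dev->head++) {
			dev->hw_total += dev->pending[dev->head];
			dev->pending[dev->head] = 0;
			dev->hw_writes++;	/* one write per QP */
		}
		if (dev->head < dev->len)	/* more chunks remain */
			demo_wait_for_drain(dev);
	}
	if (dev->head == dev->len && dev->state == DEMO_FLOW_CONTROL)
		dev->state = DEMO_NORMAL;	/* list empty: back to NORMAL */
}
```

With 150 queued QPs the loop issues chunks of 64, 64, and 22, waiting
twice in between.  If the state leaves FLOW_CONTROL during a wait, the
remaining QPs stay queued for the next DB EMPTY upcall.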

The algorithm is designed such that if the DB write load is high enough,
then all the DB writes get submitted by the kernel using this flow
controlled approach to avoid DB drops.  As the load lightens, though, we
return to normal DB writes made directly by user applications.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-14 22:44:11 -04:00
3com Drivers: net: ethernet: 3com: 3c589_cs fixed coding style issues 2014-02-18 16:59:46 -05:00
8390 net/apne: Remove unused variable ei_local 2014-01-26 22:40:43 -08:00
adaptec net: starfire: remove unnecessary pci_set_drvdata() 2013-10-18 00:03:28 -04:00
adi net: bfin_mac: do not reset PHY after phy_start() 2013-12-09 20:38:59 -05:00
aeroflex drivers/net: delete non-required instances of include <linux/init.h> 2014-01-16 11:53:26 -08:00
allwinner net: ethernet: sunxi: Add new compatibles 2014-02-06 19:46:54 -08:00
alteon drivers/net: delete non-required instances of include <linux/init.h> 2014-01-16 11:53:26 -08:00
amd pcnet32: add missing check for pci_dma_mapping_error 2014-02-19 14:58:27 -05:00
apple macmace: add missing platform_set_drvdata() in mace_probe() 2013-11-11 14:02:08 -05:00
arc Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-01-06 17:37:45 -05:00
atheros alx: add missing stats_lock spinlock init 2014-02-10 17:50:35 -08:00
broadcom net: Replace u64_stats_fetch_begin_bh to u64_stats_fetch_begin_irq 2014-03-14 22:41:36 -04:00
brocade Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-03-14 22:31:55 -04:00
cadence net: macb: DMA-unmap full rx-buffer 2014-03-05 20:40:25 -05:00
calxeda drivers/net: delete non-required instances of include <linux/init.h> 2014-01-16 11:53:26 -08:00
chelsio cxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug Fixes 2014-03-14 22:44:11 -04:00
cirrus drivers/net: delete non-required instances of include <linux/init.h> 2014-01-16 11:53:26 -08:00
cisco enic: Use pci_enable_msix_range() instead of pci_enable_msix() 2014-02-18 15:33:30 -05:00
davicom dm9000: fix a lot of checkpatch issues 2014-01-16 16:22:53 -08:00
dec drivers/net: tulip_remove_one needs to call pci_disable_device() 2014-02-17 00:19:24 -05:00
dlink drivers/net: delete non-required instances of include <linux/init.h> 2014-01-16 11:53:26 -08:00
emulex net: Replace u64_stats_fetch_begin_bh to u64_stats_fetch_begin_irq 2014-03-14 22:41:36 -04:00
faraday net: ftgmac100: use kfree_skb() where appropriate 2014-01-17 18:54:13 -08:00
freescale Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-03-14 22:31:55 -04:00
fujitsu drivers/net: delete non-required instances of include <linux/init.h> 2014-01-16 11:53:26 -08:00
hp hp100: replace hardcoded name in /proc/interrupts with interface name 2013-09-27 17:38:32 -04:00
i825xx drivers/net: delete non-required instances of include <linux/init.h> 2014-01-16 11:53:26 -08:00
ibm ibmveth: Fix endian issues with MAC addresses 2014-03-06 16:26:41 -05:00
icplus drivers/net: delete non-required instances of include <linux/init.h> 2014-01-16 11:53:26 -08:00
intel net: Replace u64_stats_fetch_begin_bh to u64_stats_fetch_begin_irq 2014-03-14 22:41:36 -04:00
marvell net: Replace u64_stats_fetch_begin_bh to u64_stats_fetch_begin_irq 2014-03-14 22:41:36 -04:00
mellanox Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-03-14 22:31:55 -04:00
micrel drivers/net: delete non-required instances of include <linux/init.h> 2014-01-16 11:53:26 -08:00
microchip
moxa drivers/net: delete non-required instances of include <linux/init.h> 2014-01-16 11:53:26 -08:00
myricom myri10ge: Use pci_enable_msix_range() instead of pci_enable_msix() 2014-02-18 15:33:32 -05:00
natsemi Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-01-25 11:17:34 -08:00
neterion net: Replace u64_stats_fetch_begin_bh to u64_stats_fetch_begin_irq 2014-03-14 22:41:36 -04:00
nuvoton drivers:net: delete premature free_irq 2013-09-04 13:18:19 -04:00
nvidia net: Replace u64_stats_fetch_begin_bh to u64_stats_fetch_begin_irq 2014-03-14 22:41:36 -04:00
nxp drivers/net: delete non-required instances of include <linux/init.h> 2014-01-16 11:53:26 -08:00
octeon Merge branch 'for-linus-dma-masks' of git://git.linaro.org/people/rmk/linux-arm 2013-11-14 07:55:21 +09:00
oki-semi ethernet: Fix FSF address in file headers 2013-12-06 12:37:55 -05:00
packetengines net: packetengines: slight optimization of addr 2013-12-31 16:48:32 -05:00
pasemi drivers/net: delete non-required instances of include <linux/init.h> 2014-01-16 11:53:26 -08:00
qlogic Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-03-05 20:32:02 -05:00
rdc r6040: use ETH_ZLEN instead of MISR for SKB length checking 2014-01-16 16:22:54 -08:00
realtek net: Replace u64_stats_fetch_begin_bh to u64_stats_fetch_begin_irq 2014-03-14 22:41:36 -04:00
renesas sh_eth: update OF PHY registeration 2014-03-13 15:47:37 -04:00
seeq drivers/net: delete non-required instances of include <linux/init.h> 2014-01-16 11:53:26 -08:00
sfc sfc: Use ether_addr_copy and eth_broadcast_addr 2014-03-10 13:53:37 -04:00
sgi drivers/net: delete non-required instances of include <linux/init.h> 2014-01-16 11:53:26 -08:00
silan
sis net: sis900: remove unnecessary pci_set_drvdata() 2013-12-09 18:09:28 -05:00
smsc drivers/net: delete non-required instances of include <linux/init.h> 2014-01-16 11:53:26 -08:00
stmicro stmmac: dwmac-sti: fix broken STiD127 compatibility 2014-03-11 16:14:31 -04:00
sun niu: Use pci_enable_msix_range() instead of pci_enable_msix() 2014-02-18 15:33:34 -05:00
tehuti net: Spelling s/transmition/transmission/ 2014-01-14 17:11:26 -08:00
ti net: eth: cpsw: Use net_device_stats from struct net_device 2014-03-10 21:53:01 -04:00
tile net: Replace u64_stats_fetch_begin_bh to u64_stats_fetch_begin_irq 2014-03-14 22:41:36 -04:00
toshiba drivers/net: delete non-required instances of include <linux/init.h> 2014-01-16 11:53:26 -08:00
tundra drivers/net: delete non-required instances of include <linux/init.h> 2014-01-16 11:53:26 -08:00
via net: Replace u64_stats_fetch_begin_bh to u64_stats_fetch_begin_irq 2014-03-14 22:41:36 -04:00
wiznet net: w5100: Use devm_ioremap_resource() 2014-02-28 16:57:24 -05:00
xilinx xilinx: Convert uses of __constant_<foo> to <foo> 2014-03-12 15:28:06 -04:00
xircom ethernet: Fix FSF address in file headers 2013-12-06 12:37:55 -05:00
xscale ixp4xx_eth: Implement the SIOCGHWTSTAMP ioctl 2013-11-21 17:17:48 +00:00
dnet.c drivers/net: delete non-required instances of include <linux/init.h> 2014-01-16 11:53:26 -08:00
dnet.h
ethoc.c net: ethoc: set up MII management bus clock 2014-02-04 20:19:51 -08:00
fealnx.c net: fealnx: remove unnecessary pci_set_drvdata() 2013-10-21 17:21:01 -04:00
jme.c net: jme: remove unnecessary pci_set_drvdata() 2013-10-21 17:21:01 -04:00
jme.h jme: Remove unused #define PFX 2013-11-07 02:14:32 -05:00
Kconfig net: Add MOXA ART SoCs ethernet driver 2013-08-11 21:38:12 -07:00
korina.c drivers/net: delete non-required instances of include <linux/init.h> 2014-01-16 11:53:26 -08:00
lantiq_etop.c netdevice: add queue selection fallback handler for ndo_select_queue 2014-02-17 00:36:34 -05:00
Makefile net: Add MOXA ART SoCs ethernet driver 2013-08-11 21:38:12 -07:00
netx-eth.c ethernet: Fix FSF address in file headers 2013-12-06 12:37:55 -05:00
s6gmac.c drivers/net: delete non-required instances of include <linux/init.h> 2014-01-16 11:53:26 -08:00