linux

Author	SHA1	Message	Date
Michał Mirosław	3d96c74d89	net: infiniband/ulp/ipoib: convert to hw_features Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-04-20 01:30:42 -07:00
Joe Perches	948579cd8c	RDMA: Use vzalloc() to replace vmalloc()+memset(0) Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2011-01-12 11:11:58 -08:00
Or Gerlitz	8ae31e5b1f	IPoIB: Add GRO support Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2011-01-10 17:41:55 -08:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Or Gerlitz	a48f509b26	IPoIB: Include return code in trace message for ib_post_send() failures Print the return code of ib_post_send() if it fails to make these debugging messages more useful. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-03-11 13:43:11 -08:00
Eli Cohen	f0dc117abd	IPoIB: Fix TX queue lockup with mixed UD/CM traffic The IPoIB UD QP reports send completions to priv->send_cq, which is usually left unarmed; it only gets armed when the number of outstanding send requests reaches the size of the TX queue. This arming is done only in the send path for the UD QP. However, when sending CM packets, the net queue may be stopped for the same reasons but no measures are taken to recover the UD path from a lockup. Consider this scenario: a host sends high rate of both CM and UD packets, with a TX queue length of N. If at some time the number of outstanding UD packets is more than N/2 and the overall outstanding packets is N-1, and CM sends a packet (making the number of outstanding sends equal N), the TX queue will be stopped. When all the CM packets complete, the number of outstanding packets will still be higher than N/2 so the TX queue will not be restarted. Fix this by calling ib_req_notify_cq() when the queue is stopped in the CM path. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-03-11 13:37:11 -08:00
Alexey Dobriyan	3ffe533c87	ipv6: drop unused "dev" arg of icmpv6_send() Dunno, what was the idea, it wasn't used for a long time. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-18 14:30:17 -08:00
Linus Torvalds	d7e9660ad9	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1623 commits) netxen: update copyright netxen: fix tx timeout recovery netxen: fix file firmware leak netxen: improve pci memory access netxen: change firmware write size tg3: Fix return ring size breakage netxen: build fix for INET=n cdc-phonet: autoconfigure Phonet address Phonet: back-end for autoconfigured addresses Phonet: fix netlink address dump error handling ipv6: Add IFA_F_DADFAILED flag net: Add DEVTYPE support for Ethernet based devices mv643xx_eth.c: remove unused txq_set_wrr() ucc_geth: Fix hangs after switching from full to half duplex ucc_geth: Rearrange some code to avoid forward declarations phy/marvell: Make non-aneg speed/duplex forcing work for 88E1111 PHYs drivers/net/phy: introduce missing kfree drivers/net/wan: introduce missing kfree net: force bridge module(s) to be GPL Subject: [PATCH] appletalk: Fix skb leak when ipddp interface is not loaded ... Fixed up trivial conflicts: - arch/x86/include/asm/socket.h converted to <asm-generic/socket.h> in the x86 tree. The generic header has the same new #define's, so that works out fine. - drivers/net/tun.c fix conflict between `89f56d1e9` ("tun: reuse struct sock fields") that switched over to using 'tun->socket.sk' instead of the redundantly available (and thus removed) 'tun->sk', and `2b980dbd` ("lsm: Add hooks to the TUN driver") which added a new 'tun->sk' use. Noted in 'next' by Stephen Rothwell.	2009-09-14 10:37:28 -07:00
Roland Dreier	cd0bcf4cb9	IPoIB: Remove unused <rdma/ib_cache.h> includes Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-09-05 20:23:38 -07:00
Eric Dumazet	451f144398	drivers: Kill now superfluous ->last_rx stores The generic packet receive code takes care of setting netdev->last_rx when necessary, for the sake of the bonding ARP monitor. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Neil Horman <nhorman@txudriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-09-02 23:07:36 -07:00
Eric Dumazet	adf30907d6	net: skb->dst accessors Define three accessors to get/set dst attached to a skb struct dst_entry skb_dst(const struct sk_buff skb) void skb_dst_set(struct sk_buff skb, struct dst_entry dst) void skb_dst_drop(struct sk_buff *skb) This one should replace occurrences of : dst_release(skb->dst) skb->dst = NULL; Delete skb->dst field Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-03 02:51:04 -07:00
Eric W. Biederman	26574401fe	net: Fix ipoib rtnl_lock sysfs deadlock. Network device sysfs files that grab the rtnl_lock unconditionally will deadlock if accessed when the network device is being unregistered. So use trylock and syscall_restart to avoid this deadlock. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-18 22:15:59 -07:00
Harvey Harrison	5b095d9892	net: replace %p6 with %pI6 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-29 12:52:50 -07:00
Harvey Harrison	fcace2fe7a	infiniband: ipoib replace IPOIB_GID_FMT with %p6 Replace all uses of IPOIB_GID_FMT, IPOIB_GID_RAW_ARG() and IPOIB_GID_ARG() Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-28 23:02:36 -07:00
Roland Dreier	943c246e9b	IPoIB: Use netif_tx_lock() and get rid of private tx_lock, LLTX Currently, IPoIB is an LLTX driver that uses its own IRQ-disabling tx_lock. Not only do we want to get rid of LLTX, this actually causes problems because of the skb_orphan() done with this tx_lock held: some skb destructors expect to be run with interrupts enabled. The simplest fix for this is to get rid of the driver-private tx_lock and stop using LLTX. We kill off priv->tx_lock and use netif_tx_lock[_bh]() instead; the patch to do this is a tiny bit tricky because we need to update places that take priv->lock inside the tx_lock to disable IRQs, rather than relying on tx_lock having already disabled IRQs. Also, there are a couple of places where we need to disable BHs to make sure we have a consistent context to call netif_tx_lock() (since we no longer can use _irqsave() variants), and we also have to change ipoib_send_comp_handler() to call drain_tx_cq() through a timer rather than directly, because ipoib_send_comp_handler() runs in interrupt context and drain_tx_cq() must run in BH context so it can call netif_tx_lock(). Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-09-30 10:36:21 -07:00
David J. Wilder	b1404069f6	IPoIB/cm: Use vmalloc() to allocate rx_rings There are users that are running UDP applications that require a large receive queue size in order to get good performance. To prevent allocation failures for rx_rings when using non-SRQ mode and large recv_queue_size (1K or larger), use vmalloc() instead of kcalloc() to alocate rx_rings. Signed-off-by: David Wilder <dwilder@us.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-08-08 15:51:29 -07:00
Roland Dreier	e08198169e	IPoIB/cm: Set correct SG list in ipoib_cm_init_rx_wr() wr->sg_list should be set to the sge pointer passed in, not priv->cm.rx_sge. Reported-by: Hoang-Nam Nguyen <HNGUYEN@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-30 07:21:46 -07:00
Eli Cohen	e112373fd6	IPoIB/cm: Reduce connected mode TX object size Since IPoIB connected mode does not NETIF_F_SG, we only have one DMA mapping per send, so we don't need a mapping[] array. Define a new struct with a single u64 mapping member and use it for the CM tx_ring. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:52 -07:00
Eli Cohen	bd3606715e	IPoIB: Use dev_set_mtu() to change mtu When the driver sets the MTU of the net device outside of its change_mtu method, it should make use of dev_set_mtu() instead of directly setting the mtu field of struct netdevice. Otherwise functions registered to be called upon MTU change will not get called (this is done through call_netdevice_notifiers() in dev_set_mtu()). Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:51 -07:00
Eli Cohen	c8c2afe360	IPoIB: Use rtnl lock/unlock when changing device flags Use of this lock is required to synchronize changes to the netdvice's data structs. Also move the call to ipoib_flush_paths() after the modification of the netdevice flags in set_mode(). Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:51 -07:00
Roland Dreier	a7d834c4bc	IPoIB/cm: Fix racy use of receive WR/SGL in ipoib_cm_post_receive_nonsrq() For devices that don't support SRQs, ipoib_cm_post_receive_nonsrq() is called from both ipoib_cm_handle_rx_wc() and ipoib_cm_nonsrq_init_rx(), and these two callers are not synchronized against each other. However, ipoib_cm_post_receive_nonsrq() always reuses the same receive work request and scatter list structures, so multiple callers can end up stepping on each other, which leads to posting garbled work requests. Fix this by having the caller pass in the ib_recv_wr and ib_sge structures to use, and allocating new local structures in ipoib_cm_nonsrq_init_rx(). Based on a patch by Pradeep Satyanarayana <pradeep@us.ibm.com> and David Wilder <dwilder@us.ibm.com>, with debugging help from Hoang-Nam Nguyen <hnguyen@de.ibm.com>. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:47 -07:00
Eli Cohen	f89271da32	IPoIB: Copy small received SKBs in connected mode The connected mode implementation in the IPoIB driver has a large overhead in the way SKBs are handled in the receive flow. It usually allocates an SKB with as big as was used in the currently received SKB and moves unused fragments from the old SKB to the new one. This involves a loop on all the remaining fragments and incurs overhead on the CPU. This patch, for small SKBs, allocates an SKB just large enough to contain the received data and copies to it the data from the received SKB. The newly allocated SKB is passed to the stack and the old SKB is reposted. When running netperf, UDP small messages, without this pach I get: UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 14.4.3.178 (14.4.3.178) port 0 AF_INET Socket Message Elapsed Messages Size Size Time Okay Errors Throughput bytes bytes secs # # 10^6bits/sec 114688 128 10.00 5142034 0 526.31 114688 10.00 1130489 115.71 With this patch I get both send and receive at ~315 mbps. The reason that send performance actually slows down is as follows: When using this patch, the overhead of the CPU for handling RX packets is dramatically reduced. As a result, we do not experience RNR NAK messages from the receiver which cause the connection to be closed and reopened again; when the patch is not used, the receiver cannot handle the packets fast enough so there is less time to post new buffers and hence the mentioned RNR NACKs. So what happens is that the application thinks it posted a certain number of packets for transmission but these packets are flushed and do not really get transmitted. Since the connection gets opened and closed many times, each time netperf gets the CPU time that otherwise would have been given to IPoIB to actually transmit the packets. This can be verified when looking at the port counters -- the output of ifconfig and the oputput of netperf (this is for the case without the patch): tx packets ========== port counter: 1,543,996 ifconfig: 1,581,426 netperf: 5,142,034 rx packets ========== netperf 1,1304,089 Signed-off-by: Eli Cohen <eli@mellanox.co.il>	2008-07-14 23:48:44 -07:00
Roland Dreier	f3781d2e89	RDMA: Remove subversion $Id tags They don't get updated by git and so they're worse than useless. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:44 -07:00
Eli Cohen	f56bcd8013	IPoIB: Use separate CQ for UD send completions Use a dedicated CQ for UD send completions. Also, do not arm the UD send CQ, which reduces the number of interrupts generated. This patch farther reduces overhead by not calling poll CQ for every posted send WR -- it does polls only when there 16 or more outstanding work requests. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-29 13:46:53 -07:00
Roland Dreier	9fdd5e5bf6	IPoIB: Handle case when P_Key is deleted and re-added at same index If a P_Key is deleted and then re-added at the same index, then IPoIB gets confused because __ipoib_ib_dev_flush() only checks whether the index is the same without checking whether the P_Key was present, so the interface is stopped when the P_Key is deleted, but the event when the P_Key is re-added gets ignored and the interface never gets restarted. Also, switch to using ib_find_pkey() instead of ib_find_cached_pkey() everywhere in IPoIB, since none of the places that look for P_Keys are in a fast path or in non-sleeping context, and in general we want to kill off the whole caching infrastructure eventually. This also fixes consistency problems caused because some IPoIB queries were cached and some were uncached during the window where the cache was not updated. Thanks to Venkata Subramonyam <vsubramo@cisco.com> for debugging this problem and testing this fix. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:09:35 -07:00
Eli Cohen	40ca1988e0	IPoIB: Add LSO support For HCAs that support TCP segmentation offload (IB_DEVICE_UD_TSO), set NETIF_F_TSO and use HW LSO to offload TCP segmentation. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:09:27 -07:00
Eli Cohen	6046136c74	IPoIB: Use checksum offload support if available For HCAs that support checksum offload (ie that set IB_DEVICE_UD_IP_CSUM in the device capabilities flags), have IPoIB set NETIF_F_IP_CSUM and use the HCA to generate and verify IP checksums. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:01:10 -07:00
Roland Dreier	10313cbb92	IPoIB: Allocate priv->tx_ring with vmalloc() Commit `7143740d` ("IPoIB: Add send gather support") made struct ipoib_tx_buf significantly larger, since the mapping member changed from a single u64 to an array with MAX_SKB_FRAGS + 1 entries. This means that allocating tx_rings with kzalloc() may fail because there is not enough contiguous memory for the new, much bigger size. Fix this regression by allocating the rings with vmalloc() instead. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-03-12 07:51:03 -07:00
Roland Dreier	4200406b8f	IPoIB/cm: Set tx_wr.num_sge in connected mode post_send() Commit `7143740d` ("IPoIB: Add send gather support") made it possible for tx_wr.num_sge to be != 1 -- this happens if send gather support is enabled. However, the code in the connected mode post_send() function assumes the old invariant, namely that tx_wr.num_sge is always 1. Fix this by explicitly setting tx_wr.num_sge to 1 in the CM post_send(). Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-03-11 18:35:20 -07:00
Pradeep Satyanarayana	ec229e5e81	IPoIB/cm: Fix ipoib_cm_dev_stop() cleanup when drain times out Commit `efcd9971` ("IPoIB/cm: Factor out ipoib_cm_free_rx_reap_list()") introduced a bug in ipoib_cm_dev_stop() when the receive drain times out. In that case, the function moves all the pending rx stuff into a private list but then calls ipoib_cm_free_rx_reap_list(), which handles a different list. Fix this by moving everything to the rx_reap_list that will actually get freed up. This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=906>. Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-19 10:25:11 -08:00
Eli Cohen	7143740d26	IPoIB: Add send gather support This patch acts as a preparation for using checksum offload for IB devices capable of inserting/verifying checksum in IP packets. The patch does not actaully turn on NETIF_F_SG - we defer that to the patches adding checksum offload capabilities. We only add support for send gathers for datagram mode, since existing HW does not support checksum offload on connected QPs. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-08 14:32:37 -08:00
Pradeep Satyanarayana	586a693448	IPoIB/CM: Enable SRQ support on HCAs that support fewer than 16 SG entries Some HCAs (such as ehca2) support SRQ, but only support fewer than 16 SG entries for SRQs. Currently IPoIB/CM implicitly assumes all HCAs will support 16 SG entries for SRQs (to handle a 64K MTU with 4K pages). This patch removes that restriction by limiting the maximum MTU in connected mode to what the maximum number of SRQ SG entries allows. This patch addresses <https://bugs.openfabrics.org/show_bug.cgi?id=728> Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:37 -08:00
Pradeep Satyanarayana	68e995a295	IPoIB/cm: Add connected mode support for devices without SRQs Some IB adapters (notably IBM's eHCA) do not implement SRQs (shared receive queues). The current IPoIB connected mode support only works on devices that support SRQs. Fix this by adding support for using the receive queue of each connected mode receive QP. The disadvantage of this compared to using an SRQ is that it means a full queue of receives must be posted for each remote connected mode peer, which means that total memory usage is potentially much higher than when using SRQs. To manage this, add a new module parameter "max_nonsrq_conn_qp" that limits the number of connections allowed per interface. The rest of the changes are fairly straightforward: we use a table of struct ipoib_cm_rx to hold all the active connections, and put the table index of the connection in the high bits of receive WR IDs. This is needed because we cannot rely on the struct ib_wc.qp field for non-SRQ receive completions. Most of the rest of the changes just test whether or not an SRQ is available, and post receives or find received packets in the right place depending on the answer. Cleaning up dead connections actually becomes simpler, because we do not have to do the "last WQE reached" dance that is required to destroy QPs attached to an SRQ. We just move the QP to the error state and wait for all pending receives to be flushed. Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com> [ Completely rewritten and split up, based on Pradeep's work. Several bugs fixed and no doubt several bugs introduced. - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:24 -08:00
Roland Dreier	efcd99717f	IPoIB/cm: Factor out ipoib_cm_free_rx_reap_list() Factor out the code for going through the rx_reap list of struct ipoib_cm_rx and freeing each one. This consolidates the code duplicated between ipoib_cm_dev_stop() and ipoib_cm_rx_reap() and reduces the risk of error when adding additional accounting. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:24 -08:00
Roland Dreier	7b3687df66	IPoIB/cm: Factor out ipoib_cm_create_srq() Factor out the code to create an SRQ and allocate the receive ring in ipoib_cm_dev_init() into a new function ipoib_cm_create_srq(). This will make the code neater when support for devices that don't implement SRQs is added. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:24 -08:00
Roland Dreier	1efb61444c	IPoIB/cm: Factor out ipoib_cm_free_rx_ring() Factor out the code to unmap/free skbs and free the receive ring in ipoib_cm_dev_cleanup() into a new function ipoib_cm_free_rx_ring(). This function will be called from a couple of other places when support for devices that don't implement SRQs is added. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:24 -08:00
Roland Dreier	2337f80941	IPoIB: Trivial formatting cleanups Fix whitespace blunders, convert "foo* bar" to "foo *bar", etc. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:23 -08:00
Roland Dreier	09f60f8f54	IPoIB/cm: Fix receive QP cleanup Commit `1b524963` ("IPoIB/cm: Use common CQ for CM send completions") changed how the high-order bits of work request IDs were used, which had the effect that IPOIB_CM_RX_DRAIN_WRID was no longer handled as a connected mode receive completion. This leads to the messages ib1: cm send completion event with wrid 1073741823 (> 64) ib1: RX drain timing out when an interface with connected mode QPs is brought down. Fix this by making sure that both IPOIB_OP_CM and IPOIB_OP_RECV are set in IPOIB_CM_RX_DRAIN_WRID. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-26 13:44:25 -07:00
Michael S. Tsirkin	1b524963fd	IPoIB/cm: Use common CQ for CM send completions Use the same CQ for CM send completions as for all other IPoIB completions. This means all completions are processed via the same NAPI polling routine. This should help reduce the number of interrupts for bi-directional traffic (such as TCP) and fixes "driver is hogging interrupts" errors reported for IPoIB send side, e.g. <https://bugs.openfabrics.org/show_bug.cgi?id=508> To do this, keep a per-interface counter of outstanding send WRs, and stop the interface when this counter reaches the send queue size to avoid CQ overruns. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-19 21:39:34 -07:00
Roland Dreier	fd312561ad	IPoIB: Rewrite "if (!likely(...))" as "if (unlikely(!(...)))" It's too hard to figure out what "!likely(...)" really means, and who knows how compilers interpret the hint. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-17 21:54:44 -07:00
Linus Torvalds	ce9d3c9a6a	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (87 commits) mlx4_core: Fix section mismatches IPoIB: Allow setting policy to ignore multicast groups IB/mthca: Mark error paths as unlikely() in post_srq_recv functions IB/ipath: Minor fix to ordering of freeing and zeroing of tid pages. IB/ipath: Remove redundant link state checks IB/ipath: Fix IB_EVENT_PORT_ERR event IB/ipath: Better handling of unexpected GPIO interrupts IB/ipath: Maintain active time on all chips IB/ipath: Fix QHT7040 serial number check IB/ipath: Indicate a couple of chip bugs to userspace IB/ipath: iba6110 rev4 no longer needs recv header overrun workaround IB/ipath: Use counters in ipath_poll and cleanup interrupts in ipath_close IB/ipath: Remove duplicate copy of LMC IB/ipath: Add ability to set the LMC via the sysfs debugging interface IB/ipath: Optimize completion queue entry insertion and polling IB/ipath: Implement IB_EVENT_QP_LAST_WQE_REACHED IB/ipath: Generate flush CQE when QP is in error state IB/ipath: Remove redundant code IB/ipath: Future proof eeprom checksum code (contents reading) IB/ipath: UC RDMA WRITE with IMMEDIATE doesn't send the immediate ...	2007-10-11 19:43:13 -07:00
Roland Dreier	de90351219	[IPoIB]: Convert to netdevice internal stats Use the stats member of struct netdevice in IPoIB, so we can save memory by deleting the stats member of struct ipoib_dev_priv, and save code by deleting ipoib_get_stats(). Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:53:41 -07:00
Dotan Barak	ede6bc04f3	IPoIB/cm: Clean up initialization of QP attr in ipoib_cm_create_tx_qp() Make the way QP is being created in ipoib_cm_create_tx_qp() consistent with ipoib_cm_create_rx_qp(). Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:18 -07:00
Sean Hefty	1d84612649	IB/cm: Include HCA ACK delay in local ACK timeout The IB CM should include the HCA ACK delay when calculating the local ACK timeout value to use for RC QPs. If the HCA ACK delay is large enough relative to the packet life time, then if it is not taken into account, the calculated timeout value ends up being too small, which can result in "retry exceeded" errors. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-10 21:50:05 -07:00
Roland Dreier	20089ca557	IPoIB/cm: Fix warning if IPV6 is not enabled Fix drivers/infiniband/ulp/ipoib/ipoib_cm.c:1151: warning: unused variable 'dev' by getting rid of the variable dev, which is only used if CONFIG_IPV6 is enabled, and replacing the one use of it with the value it is assigned, namely priv->dev. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-10 11:18:34 -07:00
Ralph Campbell	841adfca9c	IPoIB/cm: Partial error clean up unmaps wrong address If a page can't be allocated for the frag list of a skb, the code to unmap the partially allocated list is off by one. For exaple, if 'frags' equals one, i == 0, and the alloc_page() fails, then the old loop would have unmapped mapping[1] which is uninitialized. The same would happen if the call to ib_dma_map_page() failed. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Acked-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-02 20:48:31 -07:00
Roland Dreier	13ef5f44c3	IPoIB/cm: Remove dead definition of struct ipoib_cm_id It's completely unused. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-06-21 13:39:08 -07:00
Michael S. Tsirkin	82c3aca6ad	IPoIB/cm: Fix interoperability when MTU doesn't match IPoIB connected mode currently rejects a connection request unless the supported MTU is >= the local netdevice MTU. This breaks interoperability with implementations that might have tweaked IPOIB_CM_MTU, and there's real no longer a reason to do so: this test is just a leftover from when we did not tweak MTU per-connection. Fix this by making the test as permissive as possible. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-06-21 13:38:08 -07:00
Michael S. Tsirkin	3ec7393a68	IPoIB/cm: Initialize RX before moving QP to RTR Fix a crasher bug in IPoIB CM: once a QP is in the RTR state, a receive completion (or even an asynchronous error) might be observed on this QP, so we have to initialize all of our receive data structures before moving to the RTR state. As an optimization (since modify_qp might take a long time), the jiffies update done when moving RX to the passive_ids list is also left in place to reduce the chance of the RX being misdetected as stale. This fixes bug <https://bugs.openfabrics.org/show_bug.cgi?id=662>. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-06-21 13:03:50 -07:00
Michael S. Tsirkin	ec56dc0b7f	IPoIB/cm: Fix performance regression on Mellanox commit `518b1646` ("IPoIB/cm: Fix SRQ WR leak") introduced a severe performance regression on Mellanox cards, because keeping a QP in the error state for extended periods of time moves hardware to the slow path (until the QP is destroyed). For example, MPI latency goes from ~3 usecs to ~7 usecs. Fix this by posting a send WR on one of the QPs that are being flushed, instead of using a separate drain QP that is kept in the error state. This fixes bug <https://bugs.openfabrics.org/show_bug.cgi?id=636>, reported and bisected by Scott Weitzenkamp at Cisco and debugged by Sasha Mikheev at Voltaire. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-29 16:07:09 -07:00

1 2

69 Commits