* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
RDMA/nes: Fix CX4 link problem in back-to-back configuration
RDMA/nes: Clear stall bit before destroying NIC QP
RDMA/nes: Set assume_aligned_header bit
RDMA/cxgb3: Wait at least one schedule cycle during device removal
IB/mad: Ignore iWARP devices on device removal
IPoIB: Include return code in trace message for ib_post_send() failures
IPoIB: Fix TX queue lockup with mixed UD/CM traffic
Commit 09124e19 ("RDMA/nes: Add support for KR device id 0x0110") took
out too much code and broke CX4 link detection in back-to-back
configuration. Put back the code that does the link check.
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Clear the stall bit to drop any incoming packets while destroying NIC
QP. This will prevent a chip resource leak.
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Set assume_aligned_header bit in QP context as requested by hardware group.
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
During a hot-plug LLD removal event or an EEH error event, iw_cxgb3
must ensure that any/all threads that might be in a cxgb3 exported
function must return from the function before iw_cxgb3 returns from
its event processing. Do this by calling synchronize_net().
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (48 commits)
IB/srp: Clean up error path in srp_create_target_ib()
IB/srp: Split send and recieve CQs to reduce number of interrupts
RDMA/nes: Add support for KR device id 0x0110
IB/uverbs: Use anon_inodes instead of private infinibandeventfs
IB/core: Fix and clean up ib_ud_header_init()
RDMA/cxgb3: Mark RDMA device with CXIO_ERROR_FATAL when removing
RDMA/cxgb3: Don't allocate the SW queue for user mode CQs
RDMA/cxgb3: Increase the max CQ depth
RDMA/cxgb3: Doorbell overflow avoidance and recovery
IB/core: Pack struct ib_device a little tighter
IB/ucm: Clean whitespace errors
IB/ucm: Increase maximum devices supported
IB/ucm: Use stack variable 'base' in ib_ucm_add_one
IB/ucm: Use stack variable 'devnum' in ib_ucm_add_one
IB/umad: Clean whitespace
IB/umad: Increase maximum devices supported
IB/umad: Use stack variable 'base' in ib_umad_init_port
IB/umad: Use stack variable 'devnum' in ib_umad_init_port
IB/umad: Remove port_table[]
IB/umad: Convert *cdev to cdev in struct ib_umad_port
...
Due to the loop complexicity in nes_nic.c, I'm using char* to copy mc addresses
to it.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for KR device id 0x0110. While at it, cleanup
nes_init_phy() by splitting it into nes_init_1g_phy() and
nes_init_2025_phy().
Remove support for NES_PHY_TYPE_IRIS, which was used on an XFP board
that was only manufactured in small quantities and given out for evals
in even smaller quantities.
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ib_ud_header_init() first clears header and then fills up the various
fields. Later on, it tests header->immediate_present, which it has
already cleared, so the condition is always false. Fix this by adding
an immediate_present parameter and setting header->immediate_present
as is done with grh_present. Also remove unused calculation of
header_len.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If cxgb3 calls the iw_cxgb3 t3cclient remove function due to a device
removal event, then the iwch device must be marked with CXIO_ERROR_FATAL
since the device below us is going away. Otherwise, we can get stuck in
a deadlock as RDMA ULPs try and deallocate objects (like MRs, QPs, etc).
So always mark the device with CXIO_ERROR_FATAL when removing.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Only kernel mode CQs need the SW queue memory allocated. The SW queue
for user mode CQs is allocated in userspace by libcxgb3.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
T3 hardware doorbell FIFO overflows can cause application stalls due
to lost doorbell ring events. This has been seen when running large
NP IMB alltoall MPI jobs. The T3 hardware supports an xon/xoff-type
flow control mechanism to help avoid overflowing the HW doorbell FIFO.
This patch uses these interrupts to disable RDMA QP doorbell rings
when we near an overflow condition, and then turn them back on (and
ring all the active QP doorbells) when when the doorbell FIFO empties
out. In addition if an doorbell ring is dropped by the hardware, the
code will now recover.
Design:
cxgb3:
- enable these DB interrupts
- in the interrupt handler, schedule work tasks to call the ULPs event
handlers with the new events.
- ring all the qset txqs when an overflow is detected.
iw_cxgb3:
- disable db ringing on all active qps when we get the DB_FULL event
- enable db ringing on all active qps and ring all active dbs when we get
the DB_EMPTY event
- On DB_DROP event:
- disable db rings in the event handler
- delay-schedule a work task which rings and enables the dbs on
all active qps.
- in post_send and post_recv logic, don't ring the db if it's disabled.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Change the nes driver to return -ENOMEM on SQ/RQ overflow to match the
return code of other RDMA HW drivers (e.g cxgb3, ehca, mlx4, mthca).
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Acked-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
There is a double disconnect during AE processing, causing crashes.
While fixing the crash, also simplify the AE handling code.
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When a listener is destroyed and there is an MPA response pending for
loopback connection, the active side cm_node gets destroyed twice:
once in cm_event_connect_error() and again in nes_accept()/nes_reject().
Increment the cm_node's refcount so it's not destroyed by
cm_event_connect_error().
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
After running long iterative MPI tests, sometimes ethtool reports a
"CM Destroy Listener" count more than the "CM Create Listener" count.
This inconsistency is fixed by making counter variables atomic.
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If the caller does not pass a valid in_wc to process_mad(), return MAD
failure status, as it is not possible to generate a valid MAD redirect
response (and redirects are the only MAD responses ehca generates).
Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The max_dest_rd_atomic and max_qp_rd_atomic values are properly
returned by query_qp(), so there should not be an error returned when
they are queried.
Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The irq_spinlock is only taken in tasklet context, so it is safe not to
disable hardware interrupts.
Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
struct ib_qp already holds a pointer to the ib device. No need to dive to the
hw device object to retrieve it.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch replaces dev->mc_count in all drivers (hopefully I didn't miss
anything). Used spatch and did small tweaks and conding style changes when
it was suitable.
Jirka
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure compiler won't do weird things with limits by using the
rlimit helpers added in 3e10e716 ("resource: add helpers for fetching
rlimits"). E.g. fetching them twice may return 2 different values
after writable limits are implemented.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Failure to rearm a CQ means the cxgb3 device is wedged, but we shouldn't
kill the whole system with a BUG_ON() if this happens.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
In particular, several occurances of funny versions of 'success',
'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address',
'beginning', 'desirable', 'separate' and 'necessary' are fixed.
Signed-off-by: Daniel Mack <daniel@caiaq.de>
Cc: Joe Perches <joe@perches.com>
Cc: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Use the %pM kernel extension to display the MAC address.
The only difference in the output is that the MAC address is
shown in the usual colon-separated hex notation.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In mlx4_ib_post_recv(), we should check the queue for overflow using
recv_cq instead of send_cq (current code looks like a copy-and-paste
mistake).
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
As for memfree mthca hardware, ConnectX also requires SRQ WQE scatter
entries to be initialized with the invalid L_Key at SRQ creation time.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix the "ignoring return value of '...', declared with attribute
warn_unused_result" compiler warning in several users of the new kfifo
API.
It removes the __must_check attribute from kfifo_in() and
kfifo_in_locked() which must not necessary performed.
Fix the allocation bug in the nozomi driver file, by moving out the
kfifo_alloc from the interrupt handler into the probe function.
Fix the kfifo_out() and kfifo_out_locked() users to handle a unexpected
end of fifo.
Signed-off-by: Stefani Seibold <stefani@seibold.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
rename kfifo_put... into kfifo_in... to prevent miss use of old non in
kernel-tree drivers
ditto for kfifo_get... -> kfifo_out...
Improve the prototypes of kfifo_in and kfifo_out to make the kerneldoc
annotations more readable.
Add mini "howto porting to the new API" in kfifo.h
Signed-off-by: Stefani Seibold <stefani@seibold.net>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move the pointer to the spinlock out of struct kfifo. Most users in
tree do not actually use a spinlock, so the few exceptions now have to
call kfifo_{get,put}_locked, which takes an extra argument to a
spinlock.
Signed-off-by: Stefani Seibold <stefani@seibold.net>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a new generic kernel FIFO implementation.
The current kernel fifo API is not very widely used, because it has to
many constrains. Only 17 files in the current 2.6.31-rc5 used it.
FIFO's are like list's a very basic thing and a kfifo API which handles
the most use case would save a lot of development time and memory
resources.
I think this are the reasons why kfifo is not in use:
- The API is to simple, important functions are missing
- A fifo can be only allocated dynamically
- There is a requirement of a spinlock whether you need it or not
- There is no support for data records inside a fifo
So I decided to extend the kfifo in a more generic way without blowing up
the API to much. The new API has the following benefits:
- Generic usage: For kernel internal use and/or device driver.
- Provide an API for the most use case.
- Slim API: The whole API provides 25 functions.
- Linux style habit.
- DECLARE_KFIFO, DEFINE_KFIFO and INIT_KFIFO Macros
- Direct copy_to_user from the fifo and copy_from_user into the fifo.
- The kfifo itself is an in place member of the using data structure, this save an
indirection access and does not waste the kernel allocator.
- Lockless access: if only one reader and one writer is active on the fifo,
which is the common use case, no additional locking is necessary.
- Remove spinlock - give the user the freedom of choice what kind of locking to use if
one is required.
- Ability to handle records. Three type of records are supported:
- Variable length records between 0-255 bytes, with a record size
field of 1 bytes.
- Variable length records between 0-65535 bytes, with a record size
field of 2 bytes.
- Fixed size records, which no record size field.
- Preserve memory resource.
- Performance!
- Easy to use!
This patch:
Since most users want to have the kfifo as part of another object,
reorganize the code to allow including struct kfifo in another data
structure. This requires changing the kfifo_alloc and kfifo_init
prototypes so that we pass an existing kfifo pointer into them. This
patch changes the implementation and all existing users.
[akpm@linux-foundation.org: fix warning]
Signed-off-by: Stefani Seibold <stefani@seibold.net>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Always set bad_wr when an immediate error is detected. Return ENOMEM
for queue full instead of EINVAL to match other drivers.
Signed-off-by: Frank Zago <fzago@systemfabricworks.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits)
m68k: rename global variable vmalloc_end to m68k_vmalloc_end
percpu: add missing per_cpu_ptr_to_phys() definition for UP
percpu: Fix kdump failure if booted with percpu_alloc=page
percpu: make misc percpu symbols unique
percpu: make percpu symbols in ia64 unique
percpu: make percpu symbols in powerpc unique
percpu: make percpu symbols in x86 unique
percpu: make percpu symbols in xen unique
percpu: make percpu symbols in cpufreq unique
percpu: make percpu symbols in oprofile unique
percpu: make percpu symbols in tracer unique
percpu: make percpu symbols under kernel/ and mm/ unique
percpu: remove some sparse warnings
percpu: make alloc_percpu() handle array types
vmalloc: fix use of non-existent percpu variable in put_cpu_var()
this_cpu: Use this_cpu_xx in trace_functions_graph.c
this_cpu: Use this_cpu_xx for ftrace
this_cpu: Use this_cpu_xx in nmi handling
this_cpu: Use this_cpu operations in RCU
this_cpu: Use this_cpu ops for VM statistics
...
Fix up trivial (famous last words) global per-cpu naming conflicts in
arch/x86/kvm/svm.c
mm/slab.c
When the remote node's ethernet address changes, the connection keeps
trying to connect using the old address. The connection wil continue
failing until the driver is unloaded and loaded again (eiter reboot or
rmmod). Fix this by checking that the NIC has the correct address
before starting a connection.
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
A FIN that is received during an MPA start up sequence causes a
timeout in iwcm.c. The connection has not been completely closed so
the iwcm code is waiting for resources to be cleaned up. This closes
the connection so everything cleans up correctly.
Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
We fail when creating many qps as kmap() fails for sq_vbase.
Fix this by doing kunmap() as soon as we are done with sq_vbase.
We do kunmap() in one of the locations below:
(1) nes_destroy_qp()
(2) nes_accept()
(3) nes_connect_event
We keep a flag to avoid multiple calls to kunmap().
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>