The PCI Express and Hypertransport chip-specific source files should only
be built when the kernel has the capability of actually compiling them.
This fixes the driver build on, for example, ia64.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband:
IB/mad: Fix race between cancel and receive completion
RDMA/amso1100: Fix && typo
RDMA/amso1100: Fix unitialized pseudo_netdev accessed in c2_register_device
IB/ehca: Activate scaling code by default
IB/ehca: Use named constant for max mtu
IB/ehca: Assure 4K alignment for firmware control blocks
When ib_cancel_mad() is called, it puts the canceled send on a list
and schedules a "flushed" callback from process context. However,
this leaves a window where a receive completion could be processed
before the send is fully flushed.
This is fine, except that ib_find_send_mad() will find the MAD and
return it to the receive processing, which results in the sender
getting both a successful receive and a "flushed" send completion for
the same request. Understandably, this confuses the sender, which is
expecting only one of these two callbacks, and leads to grief such as
a use-after-free in IPoIB.
Fix this by changing ib_find_send_mad() to return a send struct only
if the status is still successful (and not "flushed"). The search of
the send_list already had this check, so this patch just adds the same
check to the search of the wait_list.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix the AMSO1100 firmware version computation, which was broken
due to "&&" being used where "&" should have.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Rework some load-time error handling: c2_register_device() leaked when
it failed, and the function that called it didn't check the return code.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Change ehca's Kconfig to activates scaling code as default. After
several measurements we saw that this feature prevents dropped packets
(UD) in stress situation. Thus, enabling it helps to improve ehca's
bandwidth through IPoIB.
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Define and use a constant EHCA_MAX_MTU instead hardcoded value.
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Assure 4K alignment for firmware control blocks in 64K page mode,
because kzalloc()'s result address might not be 4K aligned if 64K
pages are enabled. Thus, we introduce wrappers called
ehca_{alloc,free}_fw_ctrlblock(), which use a slab cache for objects
with 4K length and 4K alignment in order to alloc/free firmware
control blocks in 64K page mode. In 4K page mode those wrappers just
are defines of get_zeroed_page() and free_page().
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Eric's changes to the htirq infrastructure require corresponding
modifications to the ipath HT driver code so that interrupts are still
delivered properly.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Roland Dreier <rdreier@cisco.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Require registration with ib_addr module to prevent caller from
unloading while a callback is in progress.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Several fields in an incoming MAD extended info header were passed
into the MAD_IFC firmware command at incorrect offsets (mostly off by
4 bytes). As the result, the HCA will fail to generate traps in which
this info is needed (e.g. traps which include the GRH of the incoming
packet), in violation of the IB spec.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Return the sq_draining value back to user space for query_qp instead
of the en_sqd_async notify value, which is valid only for
modify_qp. For query_qp, the draining status should returned.
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The Ammasso driver needs to use dma_alloc_coherent() for
allocating memory that will be used by the HW for dma.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The eHCA driver does not compile for a uniprocessor configuration
(CONFIG_SMP=n), due to H_SUCCESS and other symbols being undefined.
This fixes it.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Hoang-Nam Nguyen <HNGUYEN@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When a connection is started (a new connection or a recovered one),
iSER should prepare its resources for full-featured mode and only then
notify the iSCSI layer that it is ready to start queueing commands.
Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
We discovered a problem when running IPoIB applications on multiple
CPUs on an Altix system. Many messages such as:
ib_mthca 0002:01:00.0: SQ 000014 full (19941644 head, 19941707 tail, 64 max, 0 nreq)
appear in syslog, and the driver wedges up.
Apparently this is because writes to the doorbells from different CPUs
reach the device out of order. The following patch adds mmiowb() calls
after doorbell rings to ensure the doorbell writes are ordered.
Signed-off-by: Arthur Kepner <akepner@sgi.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Don't attempt to set up the diagpkt device in the module init code.
Instead, wait until a piece of hardware is initialized. Fixes a
problem when loading the ib_ipath module when no InfiniPath hardware
is present: modprobe would go into the D state and stay there.
Signed-off-by: Robert Walsh <robert.walsh@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch fixes a NULL dereference spotted by the Coverity checker.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Acked-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
pci_module_init() convertion in amso1100 driver.
Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
All HCAs (not just mem-free) need a spare SRQ entry, so bump srq->max
by 1 in all cases.
Noted by Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Since pr_debug() has changed from a macro to an inline function when
DEBUG is not defined, its arguments now need to be defined even when
debugging is off. Therefore to_event_str() and to_qp_state_str() need
to be moved out of #ifdef DEBUG. The compiler will throw the
definitions away if DEBUG is not defined, but it needs to be able to
see that the functions exist.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Currently a DREP is only sent in response to a DREQ if a connection
has been found matching the DREQ, and it is in the proper state. Once
a DREP is sent, the local connection moves into timewait. Duplicate
DREQs received while in this state result in re-sending the DREP.
However, it's likely that the local connection will enter and exit
timewait before the remote side times out a lost DREP and resends a DREQ.
To handle this, we send a DREP in response to a DREQ, even if a local
connection is not found. This avoids maintaining disconnected
id's in timewait states for excessively long times, just to handle a
lost DREP.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If the ib_cm module is unloaded while id's are still in timewait, the
CM will destroy the work queue used to process timewait. Once the
id's exit timewait, their timers will fire, leading to a crash trying
to access the destroyed work queue.
We need to track id's that are in timewait, and cancel their deferred
work on module unload.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fill in "max_vl_num" (encoded according to VLCap field in the PortInfo MAD)
and "init_type_reply" values in the ib_query_port() verb.
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Enable multiple concurrent connections to the same SRP target:
1) Use port GUID instead of node GUID in the initiator port
identifier. This allows connections to be made from multiple HCA
ports at the same time.
2) Let the user specify the identifier extention when adding the
device. This allows userspace to make multiple connections even
from the same port, if it wants too.
Without this, only one connection can be made from any given HCA, even
if it has multiple ports, because we don't use multi-channel mode, so
targets will only allow one connection from a given initiator port ID.
Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
scsi_host_alloc() already allocates with kzalloc(), so the struct Scsi_Host
is zeroed out, including the private data portion. Remove the redundant
memset that zeros this out again in the SRP initiator.
Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The AMSO driver was not thread-safe in the post WR code and had
code that would sleep if the WR post FIFO was full. Since these
functions can be called on interrupt level I changed the sleep to a
udelay.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Maintain a per-CPU global "struct pt_regs *" variable which can be used instead
of passing regs around manually through all ~1800 interrupt handlers in the
Linux kernel.
The regs pointer is used in few places, but it potentially costs both stack
space and code to pass it around. On the FRV arch, removing the regs parameter
from all the genirq function results in a 20% speed up of the IRQ exit path
(ie: from leaving timer_interrupt() to leaving do_IRQ()).
Where appropriate, an arch may override the generic storage facility and do
something different with the variable. On FRV, for instance, the address is
maintained in GR28 at all times inside the kernel as part of general exception
handling.
Having looked over the code, it appears that the parameter may be handed down
through up to twenty or so layers of functions. Consider a USB character
device attached to a USB hub, attached to a USB controller that posts its
interrupts through a cascaded auxiliary interrupt controller. A character
device driver may want to pass regs to the sysrq handler through the input
layer which adds another few layers of parameter passing.
I've build this code with allyesconfig for x86_64 and i386. I've runtested the
main part of the code on FRV and i386, though I can't test most of the drivers.
I've also done partial conversion for powerpc and MIPS - these at least compile
with minimal configurations.
This will affect all archs. Mostly the changes should be relatively easy.
Take do_IRQ(), store the regs pointer at the beginning, saving the old one:
struct pt_regs *old_regs = set_irq_regs(regs);
And put the old one back at the end:
set_irq_regs(old_regs);
Don't pass regs through to generic_handle_irq() or __do_IRQ().
In timer_interrupt(), this sort of change will be necessary:
- update_process_times(user_mode(regs));
- profile_tick(CPU_PROFILING, regs);
+ update_process_times(user_mode(get_irq_regs()));
+ profile_tick(CPU_PROFILING);
I'd like to move update_process_times()'s use of get_irq_regs() into itself,
except that i386, alone of the archs, uses something other than user_mode().
Some notes on the interrupt handling in the drivers:
(*) input_dev() is now gone entirely. The regs pointer is no longer stored in
the input_dev struct.
(*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking. It does
something different depending on whether it's been supplied with a regs
pointer or not.
(*) Various IRQ handler function pointers have been moved to type
irq_handler_t.
Signed-Off-By: David Howells <dhowells@redhat.com>
(cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)
Move the call to ib_register_device() later, since a device should not
be registered until it is completely read to be used. This fixes
crashes that occur if an upper-layer driver such as IPoIB is loaded
before the ehca module.
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The PSN used to generate the request following a RDMA read was
incorrect and some state booking wasn't maintained correctly. This
patch fixes that.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Reorganize code relating to cma_get_net_info() and rdam_create_id() to
optimize error case handling (no need to alloc memory/etc. as part of
rdma_create_id() if input parameters are wrong).
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Eliminate remove_list by using list_del_init() instead during device
removal handling.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
On reporting a route error, also include the status for the error,
rather than indicating a status of 0 when an error has occurred.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The race is as follows:
A process : cma_process_remove() calls cma_remove_id_dev(),
which sets id state to CMA_DEVICE_REMOVAL and
calls wait_event(dev_remove).
B process : cma_req_handler() had incremented dev_remove,
and calls cma_acquire_ib_dev() and on failure
calls cma_release_remove(), which does a
wake_up of cma_process_remove(). Then
cma_req_handler() calls rdma_destroy_id();
A Process : cma_remove_id_dev() gets woken and checks the
state of id, and since it is still (wrongly)
CMA_DEVICE_REMOVAL, it calls notify_user(id)
and if that fails, the caller - cma_process_remove()
calls rdma_destroy_id(id). Two processes can
call rdma_destroy_id(), resulting in one
de-referencing kfreed id_priv.
Fix is for process B to set CMA_DESTROYING in cma_req_handler()
so that process A will return instead of doing a rdma_destroy_id().
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
cma_connect_ib() and cma_connect_iw() leak cm_id's in failure cases.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
In some places, particularly drivers and __init code, the init utsns is the
appropriate one to use. This patch replaces those with a the init_utsname
helper.
Changes: Removed several uses of init_utsname(). Hope I picked all the
right ones in net/ipv4/ipconfig.c. These are now changed to
utsname() (the per-process namespace utsname) in the previous
patch (2/7)
[akpm@osdl.org: CIFS fix]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Cc: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is mostly included for parity with dec_nlink(), where we will have some
more hooks. This one should stay pretty darn straightforward for now.
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: (33 commits)
IB/ipath: Fix lockdep error upon "ifconfig ibN down"
IB/ipath: Fix races with ib_resize_cq()
IB/ipath: Support new PCIE device, QLE7142
IB/ipath: Set CPU affinity early
IB/ipath: Fix EEPROM read when driver is compiled with -Os
IB/ipath: Fix and recover TXE piobuf and PBC parity errors
IB/ipath: Change HT CRC message to indicate how to resolve problem
IB/ipath: Clean up module exit code
IB/ipath: Call mtrr_del with correct arguments
IB/ipath: Flush RWQEs if access error or invalid error seen
IB/ipath: Improved support for PowerPC
IB/ipath: Drop unnecessary "(void *)" casts
IB/ipath: Support multiple simultaneous devices of different types
IB/ipath: Fix mismatch in shifts and masks for printing debug info
IB/ipath: Fix compiler warnings and errors on non-x86_64 systems
IB/ipath: Print more informative parity error messages
IB/ipath: Ensure that PD of MR matches PD of QP checking the Rkey
IB/ipath: RC and UC should validate SLID and DLID
IB/ipath: Only allow complete writes to flash
IB/ipath: Count SRQs properly
...
inet_confirm_addr(), inet_ifa_byprefix(), ip_dev_find(), inet_make_mask() and
inet_ifa_match() annotated, along with inferred net-endian variables
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The resize CQ function changes the memory used to store the queue.
Other routines need to honor the lock before accessing the pointer
to the queue and verify that the head and tail are in range.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This change moves around port assignment so that it happens before any
memory is allocated. This allows memory to be allocated on an appropriate
CPU, which improves performance for users of /dev/ipath.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The EEPROM is read via programmable I/O pins. When the driver
is compiled -Os, the CPU can speculatively read the I/O
value before it is valid. This patch fixes the problem.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>