Return the receive queue sizes for both userspace QPs and kernel Qps
(not just kernel QPs) from mlx4_ib_query_qp(). Also zero the send
queue sizes for userspace QPs to avoid a possible information leak,
and set the max_inline_data for kernel QPs to 0 since inline sends are
not supported for kernel QPs.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Local write permission makes no sense as part of the QP access flags,
since the access flags only control what the remote end of the
connection is allowed to do. Remove the code in the RDMA CM that
initializes qp_access_flags with IB_ACCESS_LOCAL_WRITE.
Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Commit 9db48926 ("drivers/infiniband/hw/mthca/mthca_qp: kill uninit'd
var warning") added "= 0" to the declarations of f0 to shut up gcc
warnings. However, there's no point in making the code bigger by
initializing f0 to a random value just to get rid of a warning;
setting f0 to 0 is no safer than just using uninitialized_var(), which
documents the situation better and gives smaller code too. For example,
on x86_64:
add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-16 (-16)
function old new delta
mthca_tavor_post_send 1352 1344 -8
mthca_arbel_post_send 1489 1481 -8
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Make some functions that are only used in a single .c file static. In
addition to being a cleanup, this shrinks the generated code. On x86_64:
add/remove: 1/3 grow/shrink: 2/1 up/down: 4777/-4956 (-179)
function old new delta
handle_errors - 3994 +3994
__verbs_timer 42 710 +668
ipath_do_ruc_send 2131 2246 +115
ipath_no_bufs_available 136 - -136
ipath_disarm_senderrbufs 639 - -639
ipath_ib_timer 658 - -658
ipath_intr 5878 2355 -3523
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Make iser_conn_release() and iser_start_rdma_unaligned_sg() static,
since they are only used in the .c file where they are defined. In
addition to being a cleanup, this even shrinks the generated code by
allowing the single call of iser_start_rdma_unaligned_sg() to be
inlined into its callsite. On x86_64:
add/remove: 0/1 grow/shrink: 1/0 up/down: 466/-533 (-67)
function old new delta
iser_reg_rdma_mem 1518 1984 +466
iser_start_rdma_unaligned_sg 533 - -533
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When warning about out-of-date firmware, current mthca code messes up
the formatting of the version if the subminor doesn't have three
digits. It doesn't fill the field with 0s so we end up with:
ib_mthca 0000:0b:00.0: HCA FW version 1.1. 0 is old (1.2. 0 is current).
Change the format from "%3d" to "%03d" to get the right thing printed.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The mthca driver supports both MSI and MSI-X. However, MSI-X works with
all hardware that the driver handles, and provides a superset of what
MSI does, so there's no point in having code for both. Schedule MSI
support for removal in 2008 to give anyone who actually needs MSI and
who can't use MSI time to speak up.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Run the existing ehca code through checkpatch.pl and clean up the
worst of the coding style violations.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Split ehca_set_pagebuf() into three functions depending on MR type
(phys/user/fast) and remove superfluous ehca_set_pagebuf_1().
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
- Rename struct ehca_mr fields to clearly distinguish between kernel
and HW page size.
- Sort struct ehca_mr_pginfo into a common part and a union containing
specific fields for physical, user and fast MR
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Instead of one error mapping function for each potential error source
in ehca_mrmw.c, use a centralized function that handles all cases,
saving a three-figure line count.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Autodetection was missing a few HW revisions, causing certain eHCA1
revisions to be treated like eHCA2. Fixed.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When clearing the ib_ah_attr parameter in to_ib_ah_attr(), use sizeof
*ib_ah_attr instead of sizeof *path. This is the same bug as was
fixed for mthca in 99d4f22e ("IB/mthca: Use correct structure size in
call to memset()"), but the code was cut and pasted into mlx4 before the
fix was merged.
Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When a QP is in the INIT state, the sched_queue field hasn't been given
to the firmware yet, so the firmware cannot return the value when the QP
is queried. To handle this, use the port number that is saved in the
driver's QP data structure.
Found by Dotan Barak and Yaron Gepstein of Mellanox.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Correct the mask used to get the flow label, since the field is 20 bits,
not 24 bits.
Found by Dotan Barak and Yaron Gepstein of Mellanox.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
drivers/infiniband/hw/mthca/mthca_qp.c: In function
‘mthca_tavor_post_send’:
drivers/infiniband/hw/mthca/mthca_qp.c:1594: warning: ‘f0’ may be used
uninitialized in this function
drivers/infiniband/hw/mthca/mthca_qp.c: In function
‘mthca_arbel_post_send’:
drivers/infiniband/hw/mthca/mthca_qp.c:1949: warning: ‘f0’ may be used
uninitialized in this function
Initializing 'f0' is not strictly necessary in either case, AFAICS.
I was considering use of uninitialized_var(), but looking at the
complex flow of control in each function, I feel it is wiser and
safer to simply zero the var and be certain of ourselves.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (166 commits)
[SCSI] ibmvscsi: convert to use the data buffer accessors
[SCSI] dc395x: convert to use the data buffer accessors
[SCSI] ncr53c8xx: convert to use the data buffer accessors
[SCSI] sym53c8xx: convert to use the data buffer accessors
[SCSI] ppa: coding police and printk levels
[SCSI] aic7xxx_old: remove redundant GFP_ATOMIC from kmalloc
[SCSI] i2o: remove redundant GFP_ATOMIC from kmalloc from device.c
[SCSI] remove the dead CYBERSTORMIII_SCSI option
[SCSI] don't build scsi_dma_{map,unmap} for !HAS_DMA
[SCSI] Clean up scsi_add_lun a bit
[SCSI] 53c700: Remove printk, which triggers because of low scsi clock on SNI RMs
[SCSI] sni_53c710: Cleanup
[SCSI] qla4xxx: Fix underrun/overrun conditions
[SCSI] megaraid_mbox: use mutex instead of semaphore
[SCSI] aacraid: add 51245, 51645 and 52245 adapters to documentation.
[SCSI] qla2xxx: update version to 8.02.00-k1.
[SCSI] qla2xxx: add support for NPIV
[SCSI] stex: use resid for xfer len information
[SCSI] Add Brownie 1200U3P to blacklist
[SCSI] scsi.c: convert to use the data buffer accessors
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (76 commits)
IB: Update MAINTAINERS with Hal's new email address
IB/mlx4: Implement query SRQ
IB/mlx4: Implement query QP
IB/cm: Send no match if a SIDR REQ does not match a listen
IB/cm: Fix handling of duplicate SIDR REQs
IB/cm: cm_msgs.h should include ib_cm.h
IB/cm: Include HCA ACK delay in local ACK timeout
IB/cm: Use spin_lock_irq() instead of spin_lock_irqsave() when possible
IB/sa: Make sure SA queries use default P_Key
IPoIB: Recycle loopback skbs instead of freeing and reallocating
IB/mthca: Replace memset(<addr>, 0, PAGE_SIZE) with clear_page(<addr>)
IPoIB/cm: Fix warning if IPV6 is not enabled
IB/core: Take sizeof the correct pointer when calling kmalloc()
IB/ehca: Improve latency by unlocking after triggering the hardware
IB/ehca: Notify consumers of LID/PKEY/SM changes after nondisruptive events
IB/ehca: Return QP pointer in poll_cq()
IB/ehca: Change idr spinlocks into rwlocks
IB/ehca: Refactor sync between completions and destroy_cq using atomic_t
IB/ehca: Lock renaming, static initializers
IB/ehca: Report RDMA atomic attributes in query_qp()
...
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (34 commits)
PCI: Only build PCI syscalls on architectures that want them
PCI: limit pci_get_bus_and_slot to domain 0
PCI: hotplug: acpiphp: avoid acpiphp "cannot get bridge info" PCI hotplug failure
PCI: hotplug: acpiphp: remove hot plug parameter write to PCI host bridge
PCI: hotplug: acpiphp: fix slot poweroff problem on systems without _PS3
PCI: hotplug: pciehp: wait for 1 second after power off slot
PCI: pci_set_power_state(): check for PM capabilities earlier
PCI: cpci_hotplug: Convert to use the kthread API
PCI: add pci_try_set_mwi
PCI: pcie: remove SPIN_LOCK_UNLOCKED
PCI: ROUND_UP macro cleanup in drivers/pci
PCI: remove pci_dac_dma_... APIs
PCI: pci-x-pci-express-read-control-interfaces cleanups
PCI: Fix typo in include/linux/pci.h
PCI: pci_ids, remove double or more empty lines
PCI: pci_ids, add atheros and 3com_2 vendors
PCI: pci_ids, reorder some entries
PCI: i386: traps, change VENDOR to DEVICE
PCI: ATM: lanai, change VENDOR to DEVICE
PCI: Change all drivers to use pci_device->revision
...
sysfs is now completely out of driver/module lifetime game. After
deletion, a sysfs node doesn't access anything outside sysfs proper,
so there's no reason to hold onto the attribute owners. Note that
often the wrong modules were accounted for as owners leading to
accessing removed modules.
This patch kills now unnecessary attribute->owner. Note that with
this change, userland holding a sysfs node does not prevent the
backing module from being unloaded.
For more info regarding lifetime rule cleanup, please read the
following message.
http://article.gmane.org/gmane.linux.kernel/510293
(tweaked by Greg to not delete the field just yet, to make it easier to
merge things properly.)
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Instead of all drivers reading pci config space to get the revision
ID, they can now use the pci_device->revision member.
This exposes some issues where drivers where reading a word or a dword
for the revision number, and adding useless error-handling around the
read. Some drivers even just read it for no purpose of all.
In devices where the revision ID is being copied over and used in what
appears to be the equivalent of hotpath, I have left the copy code
and the cached copy as not to influence the driver's performance.
Compile tested with make all{yes,mod}config on x86_64 and i386.
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If a SIDR REQ does not match a listen, we should reply with status
value 1 (service ID not supported), rather than dropping through to
the default case of status 2 (rejected by service provider).
Doing this also fixes a bug where the cm_id_priv is removed from the
remote_sidr_table twice.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix handling to duplicate SIDR REQs to avoid sending a reject if a
duplicate is detected. Duplicates should just be silently discarded.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
cm_msgs.h uses definitions from ib_cm.h. Include it directly, rather
than depending on a specific include order.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The IB CM should include the HCA ACK delay when calculating the local
ACK timeout value to use for RC QPs. If the HCA ACK delay is large
enough relative to the packet life time, then if it is not taken into
account, the calculated timeout value ends up being too small, which
can result in "retry exceeded" errors.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The ib_cm is a little over zealous about using spin_lock_irqsave,
when spin_lock_irq would do.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
MADs sent to the SA should use the the default P_Key (0x7fff/0xffff).
There's no requirement that the default P_Key is stored at index 0 in
the local P_Key table, so add code to the sa_query module to look up
the index of the default P_Key when creating an address handle for the
SA (which is done any time the P_Key table might change), and use this
index for all SA queries.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
InfiniBand HCAs replicate multicast packets back to the QP that sent
them if that QP is attached to the destination multicast group. This
means that IPoIB multicasts are often replicated back to the receive
queue of the interface that generated them. To avoid confusing the
network stack, we drop these duplicates within the IPoIB driver.
However, there's no reason to free the skb that received the duplicate
and then immediately allocate a new skb to post to the receive queue.
We can be more efficient and just repost the same skb.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix
drivers/infiniband/ulp/ipoib/ipoib_cm.c:1151: warning: unused variable 'dev'
by getting rid of the variable dev, which is only used if CONFIG_IPV6
is enabled, and replacing the one use of it with the value it is
assigned, namely priv->dev.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When allocating out_mad in show_pma_counter(), take sizeof *out_mad
instead of sizeof *in_mad. It is true that today the type of in_mad
and out_mad are the same, but this patch will give us a cleaner code.
Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Kick the hardware before unlocking the send/receive queue to overlap
processing a little more.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When firmware reports a nondisruptive port configuration change event,
previous versions of the eHCA driver didn't forward the event to consumers
like IPoIB. Add code that determines the type of configuration change by
comparing old and new port attributes and reports it.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This eliminates lock contention among IRQs as well as the need to
disable IRQs around idr_find, because there are no IRQ writers.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
- ehca_cq.nr_events is made an atomic_t, eliminating a lot of locking.
- The CQ is removed from the CQ idr first now to make sure no more
completions are scheduled on that CQ. The "wait for all completions to
end" code becomes much simpler this way.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
- Rename all spinlock flags to "flags", matching the vast majority of kernel
code.
- Move hcall_lock into the only file it's used in.
- Replaced spin_lock_init() and friends with static initializers for
global variables.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Support SRQs on eHCA2. Since an SRQ is a QP for eHCA2, a lot of code
(structures, create, destroy, post_recv) can be shared between QP and SRQ.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
- Replace init_qp_queues() by a shorter init_qp_queue(), eliminating
duplicate code.
- hipz_h_alloc_resource_qp() doesn't need a pointer to struct ehca_qp any
longer. All input and output data is transferred through the parms
parameter.
- Change the interface to also support SRQ.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>