If a QP is attached to a shared receive queue (SRQ), then it doesn't
have a receive queue (RQ). So don't allocate an RQ doorbell (or map a
doorbell from userspace for userspace QPs) for that QP.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband:
IB/cm: Improve local id allocation
IPoIB/cm: Fix SRQ WR leak
IB/ipoib: Fix typos in error messages
IB/mlx4: Check if SRQ is full when posting receive
IB/mlx4: Pass send queue sizes from userspace to kernel
IB/mlx4: Fix check of opcode in mlx4_ib_post_send()
mlx4_core: Fix array overrun in dump_dev_cap_flags()
IB/mlx4: Fix RESET to RESET and RESET to ERROR transitions
IB/mthca: Fix RESET to ERROR transition
IB/mlx4: Set GRH:HopLimit when sending globally routed MADs
IB/mthca: Set GRH:HopLimit when building MLX headers
IB/mlx4: Fix check of max_qp_dest_rdma in modify QP
IB/mthca: Fix use-after-free on device restart
IB/ehca: Return proper error code if register_mr fails
IPoIB: Handle P_Key table reordering
IB/core: Use start_port() and end_port()
IB/core: Add helpers for uncached GID and P_Key searches
IB/ipath: Fix potential deadlock with multicast spinlocks
IB/core: Free umem when mm is already gone
The IB CM uses an idr for local id allocations, with a running counter
as start_id. This fails to generate distinct ids if
1. An id is constantly created and destroyed
2. A chunk of ids just beyond the current next_id value is occupied
This in turn leads to an increased chance of connection request being
mis-detected as a duplicate, sometimes for several retries, until
next_id gets past the block of allocated ids. This has been observed
in practice.
As a fix, remember the last id allocated and start immediately above it.
This also fixes a problem with the old code, where next_id might
overflow and become negative.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
SRQ WR leakage has been observed with IPoIB/CM: e.g. flipping ports on
and off will, with time, leak out all WRs and then all connections
will start getting RNR NAKs. Fix this in the way suggested by spec:
move the QP being destroyed to the error state, wait for "Last WQE
Reached" event and then post WR on a "drain QP" connected to the same
CQ. Once we observe a completion on the drain QP, it's safe to call
ib_destroy_qp.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
a) platorm_driver_probe(...) instead of platform_driver_register(&driver);
b) set bfin_spi_enable and bfin_spi_disable static
c) Why is the width flag a u32?
d) maybe use dev_dbg() instead of pr_debug()
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
properly setting up and respecting the read_status_mask / ignore_status_mask fields of the serial core
Signed-off-by: Mike Frysinger <michael.frysinger@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
First thing mm.h does is including sched.h solely for can_do_mlock() inline
function which has "current" dereference inside. By dealing with can_do_mlock()
mm.h can be detached from sched.h which is good. See below, why.
This patch
a) removes unconditional inclusion of sched.h from mm.h
b) makes can_do_mlock() normal function in mm/mlock.c
c) exports can_do_mlock() to not break compilation
d) adds sched.h inclusions back to files that were getting it indirectly.
e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
getting them indirectly
Net result is:
a) mm.h users would get less code to open, read, preprocess, parse, ... if
they don't need sched.h
b) sched.h stops being dependency for significant number of files:
on x86_64 allmodconfig touching sched.h results in recompile of 4083 files,
after patch it's only 3744 (-8.3%).
Cross-compile tested on
all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
alpha alpha-up
arm
i386 i386-up i386-defconfig i386-allnoconfig
ia64 ia64-up
m68k
mips
parisc parisc-up
powerpc powerpc-up
s390 s390-up
sparc sparc-up
sparc64 sparc64-up
um-x86_64
x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig
as well as my two usual configs.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pass the number of WQEs for the send queue and their size from userspace
to the kernel to avoid having to keep the QP size calculations in sync
between the kernel driver and libmlx4. This fixes a bug seen with the
current mlx4_ib driver and current libmlx4 caused by a difference in the
calculated sizes for SQ WQEs. Also, this gives more flexibility for
userspace to experiment with using multiple WQE BBs for a single SQ WQE.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
wr->opcode is invalid if it's >= ARRAY_SIZE(mlx4_ib_opcode), not just
strictly >.
This was spotted by the Coverity checker (CID 1643).
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Don't overrun fname[] array when decoding device flags.
This was spotted by the Coverity checker (CID 1642).
Signed-off-by: Roland Dreier <rolandd@cisco.com>
According to the IB spec, a QP can be moved from RESET back to RESET
or to the ERROR state, but mlx4 firmware does not support this and
returns an error if we try. Fix the RESET to RESET transition by
just returning 0 without doing anything, and fix RESET to ERROR by
moving the QP from RESET to INIT with dummy parameters and then
transitioning from INIT to ERROR.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
According to the IB spec, a QP can be moved from RESET to the ERROR
state, but mthca firmware does not support this and returns an error if
we try. Work around this FW limitation by moving the QP from RESET to
INIT with dummy parameters and then transitioning from INIT to ERROR.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Global CM packets used by rmda_cm were being sent with a GRH:hopLimit
of zero, causing them to be dropped by the router. The problem is a
missing initialization of the hop_limit field in mthca_read_ah(),
which was called by build_mlx_header() when sending a MAD on QP1.
Signed-off-by: Rolf Manderscheid <rvm@obsidianresearch.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
max_qp_dest_rdma is already in natural units - no need to shift. This
was discovered by a test that deliberately requests more outstanding
atomic operation than the device supports.
Found by Sagi Rotem at Mellanox.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Set the return code of ehca_register_mr() to ENOMEM if the corresponding
firmware call fails due to out of resources. Some other error codes
were explicitly mapped to EINVAL -- just remove those cases so they
get mapped to the default case, which already returns EINVAL anyway.
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
SM reconfiguration or failover possibly causes a shuffling of the values
in the P_Key table. Right now, IPoIB only queries for the P_Key index
once when it creates the device QP, and hence there are problems if the
index of a P_Key value changes. Fix this by using the PKEY_CHANGE event
to trigger a recheck of the P_Key index.
Signed-off-by: Yosef Etigin <yosefe@voltaire.com>
Acked-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Clean up ib_query_port() and ib_modify_port() slightly by using the
just-added start_port() and end_port() helpers.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add ib_find_gid() and ib_find_pkey() functions that use uncached device
queries. The calls might block but the returns are always up-to-date.
Cache P_Key and GID table lengths in core to avoid extra port info queries.
Signed-off-by: Yosef Etigin <yosefe@voltaire.com>
Acked-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Lockdep found the following potential deadlock between mcast_lock and
n_mcast_grps_lock: mcast_lock is taken from both interrupt context and
process context, so spin_lock_irqsave() must be used to take it.
n_mcast_grps_lock is only taken from process context, so at first it
seems safe to take it with plain spin_lock(); however, it also nests
inside mcast_lock, and hence we could deadlock:
cpu A cpu B
ipath_mcast_add():
spin_lock_irq(&mcast_lock);
ipath_mcast_detach():
spin_lock(&n_mcast_grps_lock);
<enter interrupt>
ipath_mcast_find():
spin_lock_irqsave(&mcast_lock);
spin_lock(&n_mcast_grps_lock);
Fix this by using spin_lock_irq() to take n_mcast_grps_lock.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Free umem when task's mm is already destroyed by the time
ib_umem_release gets called.
Found by Dotan Barak at Mellanox.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix an Oops in the cciss driver caused by system shutdown while a filesystem
on a cciss device is still active. The cciss_remove_one function only
properly removes the device if the device has been cleanly released by its
users, which is not the case when the pci_driver.shutdown method is called.
This patch adds a new cciss_shutdown function to better match the pattern
used by various SCSI drivers: deactivate device interrupts and flush caches.
It also alters the cciss_remove_one function to match and readds the
__devexit annotation that was removed when cciss_remove_one was serving as
the pci_driver.shutdown method.
Signed-off-by: Gerald Britton <gbritton@alum.mit.edu>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
[CRYPTO] tcrypt: Add missing error check
[CRYPTO] padlock: Make CRYPTO_DEV_PADLOCK a tristate again
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (32 commits)
[POWERPC] Remove build warnings in windfarm_core
[POWERPC] Pass per-file CFLAGs for platform specific op codes
[POWERPC] Correct #endif comment
[POWERPC] Fix ppc_rtas_progress_show()
[POWERPC] Fix sed command lines for zlib source construction
[POWERPC] Specify GNUTARGET on $(AR) invocations
[POWERPC] Make sure device node type/name is not NULL on hot-added nodes
[POWERPC] Small fixes for the Ebony device tree
[POWERPC] Fix warning on UP
[POWERPC] cell_defconfig: Disable cpufreq and pmi
[POWERPC] Fix IO space on PCI buses created from of_platform
[POWERPC] Add spinlock to request_phb_iospace()
[POWERPC] Fix make rules for treeImage.initrd
[POWERPC] Remove warning in mpic.c
[POWERPC] Update pasemi_defconfig
[POWERPC] pasemi: CONFIG_GENERIC_TBSYNC no longer needed
[POWERPC] Update iseries_defconfig
[POWERPC] Wire up some more syscalls
[POWERPC] Fix bug adding properties with flatdevtree.c's ft_set_prop()
[POWERPC] Remove fixup_bigphys_addr() for arch/powerpc to avoid link error
...
Turning it into a boolean was unnecessary and caused ALGAPI to be
pinned down as a boolean to. This patch makes it a tristate again.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
With STANDBYDOWN tracking added, libata.spindown_compat isn't
necessary anymore. If userspace shutdown(8) issues STANDBYNOW, libata
warns. If userspace shutdown(8) doesn't issue STANDBYNOW, libata does
the right thing. Userspace can tell whether kernel supports spindown
by testing whether sysfs node manage_start_stop exists as before.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
As with all other drivers, sata_nv's hpriv is allocated with
devm_kzalloc() and there's no need to free it explicitly. Kill
nv_remove_one() which incorrectly used kfree() instead of devm_kfree()
and use ata_pci_remove_one() directly.
Original fix is from Peer Chen.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Peer Chen <pchen@nvidia.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Because nvidia SATA controllers onward base on AHCI, so wildcard in sata_nv
driver is unnecessary. Also the wildcard sometimes cause sata_nv driver to
be loaded for AHCI controllers,which is not as expected.
Signed-off-by: Peer Chen <pchen@nvidia.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pci_enable_msi failure is a normal event so we should not print any error.
Going over the code I spotted a missing pci_disable_msi() leak when irq
allocation fails. The whole code also needed a cleanup, so I combined the
two different calls to pci_request_irq into a single call making this
look a lot better. All #ifdef CONFIG_PCI_MSI's have been removed.
Compile tested with both CONFIG_PCI_MSI enabled and disabled.
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pci_enable_msi calls can fail for normal operational reasons. Driver
should not print an error message in that case. Fix a leak that leaves
msi enabled if pci_request_irq fails. We can remove CONFIG_PCI_MSI
ifdefs alltogether
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
NetXen: Fix for driver on System-p
This patch will fix a ping issue on system-p
Signed-off by: Milan Bag <mbag@netxen.com>
Signed-off by: Adhiraj Joshi <adhiraj@netxen.com>
Signed-by: Mithlesh Thukral <mithlesh@netxen.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Spidernet was the driver I original did all the node-aware netdevice
allocation for, but after a year it still hasn't hit mainline.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The hardware must not see that is given ownership of a buffer until it is
completely written, and when the driver receives ownership of a buffer,
it must ensure that any other reads to the buffer reflect its final
state. Thus, I/O barriers are added where required.
Without this patch, I have observed GCC reordering the setting of
bdp->length and bdp->status in gfar_new_skb. Hardware reordering
was also theoretically possible.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Fix link speed detection change.
Thanks to Stefan Roese <sr@denx.de> for finding this bug.
CC: Stefan Roese <sr@denx.de>
Signed-off-by: Eugene Surovegin <ebs@ebshome.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Original patch is from Jeff Haran <jharan@brocade.com> with my minor style
fixes. His comments follow:
The first problem was in the function that configures the PHY for
autonegotiation, genmii_setup_aneg(). The original code does a
read/modify/write of the autonegotiation advertizement register (reg 4),
followed by a read/modify/write of the control register (reg 0). While
the original code follows the proper procedure as per reading the IEEE
specs, what I found is that on at least one PHY model (National DP83843)
the read of the control register comes back with the soft reset bit set
(bit 15). Because of the read/modify/write operation, this causes the
write to write a 1 back to the reset bit, which initiates a software
reset of the PHY. This software reset causes the PHY to return to its
power up state which advertizes all modes of operation, thus negating
the write to the autoneg advertizement register. The modification is to
spin reading the control register until the soft reset bit is clear
before doing the modify/write.
The second problem was in the function that configures the PHY for
forced operation, genmii_setup_forced(). The original code initiates a
software reset operation via a write of a 1 to bit 15 of the control
register (reg 0), but then proceeds to do a second write to that same
register without waiting until that reset bit is cleared by the PHY
itself (which according to the IEEE specs indicates that the PHY reset
is complete). This is a violation of how one is supposed to use this
software reset feature of these PHYs and I believe was the cause of
mysterious, difficult to reproduce link failures that we've observed on
some of our systems that use this driver. The fix is to modify the
function so that it spins waiting for the reset bit to clear after doing
the soft reset and before doing the subsequent write.
Signed-off-by: Jeff Haran <jharan@brocade.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Eugene Surovegin <ebs@ebshome.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Do some memory barrier changes for safety/perfomance:
Don't need read after update to index, mmiowb() followed by read at end
of irq is sufficient.
Signed-off-by: Stephn Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This workaround was added to deal with NAPI core and how
it affected dual port shared polling. It turned out not to
be necessary. Stopping device 0 only doesn't stop NAPI from
working completely after that.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Make sure that if we ever get a MIB counter overflow interrupt (normally
masked off), that the IRQ is cleared.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
When driver can't allocate receive buffer it drops incoming
packet. Keep a counter.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Align the PHY setup of the sky2 driver with the vendor sk98lin (10.0.4.3)
driver. The PHY register settings are mostly black magic, even with access
to the documentation it isn't clear what the right values are. The changes
are mostly comments, the code change only affects the Yukon FE (100 mbit only)
version.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The problems with Gigabyte motherboards are system configuration dependent.
Since it works fine for some users, it doesn't make sense to deprive
them.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
After a suspend/resume cycle, the UART may have been reset into
low-speed mode -- either because it's actually been reset, or because
the firmware pokes at the old-style divisor registers. If we detected it
as a NS16550A SuperIO chip in the first place and set baud_base to
921600, then we should do so again in the resume path.
This patch adds that code to serial8250_resume_port(), and also makes
serial8250_resume() actually call serial8250_resume_port() for each port
instead of just calling uart_resume_port() directly. And thus fixes
serial port operation after suspend/resume.
It also fixes a bogus comment where we write the EXCR2 register with a
comment saying /* EXCR1 */
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>