The purpose of this single series of commits from Srivatsa S Bhat (with
a small piece from Gautham R Shenoy) touching multiple subsystems that use
CPU hotplug notifiers is to provide a way to register them that will not
lead to deadlocks with CPU online/offline operations as described in the
changelog of commit 93ae4f978c (CPU hotplug: Provide lockless versions
of callback registration functions).
The first three commits in the series introduce the API and document it
and the rest simply goes through the users of CPU hotplug notifiers and
converts them to using the new method.
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJTQow2AAoJEILEb/54YlRxW4QQAJlYRDUzwFJzJzYhltQYuVR+
4D74XMtvXgoJfg3cwdSWvMKKpJZnA9BVN0f7Hcx9wYmgdexYUuHeZJmMNyc3S2+g
KjKBIsugvgmZhHbbLd6TJ6GBbhGT5JLt9VmSfL9zIkveInU1YHFUUqL/mxdHm4J0
BSGKjk2rN3waRJgmY+xfliFLtQjDKFwJpMuvrgtoUyfas3f4sIV43UNbqdvA/weJ
rzedxXOlKH/id4b56lj/4iIzcoL3mwvJJ7r6n0CEMsKv87z09kqR0O+69Tsq/cgs
j17CsvoJOmZGk3QTeKVMQWBsvk6aPoDu3zK83gLbQMt+qjOpSTbJLz/3HZw4/TrW
ss4nuZne1DLMGS+6hoxYbTP+6Ni//Kn+l/LrHc5jb7m1X3lMO4W2aV3IROtIE1rv
lEP1IG01NU4u9YwkVj1dyhrkSp8tLPul4SrUK8W+oNweOC5crjJV7vJbIPJgmYiM
IZN55wln0yVRtR4TX+rmvN0PixsInE8MeaVCmReApyF9pdzul/StxlBze5BKLSJD
cqo1kNPpsmdxoDucqUpQ/gSvy+IOl2qnlisB5PpV93sk7De6TFDYrGHxjYIW7jMf
StXwdCDDQhzd2Q8Kfpp895A1dbIl8rKtwA6bTU2eX+BfMVFzuMdT44cvosx1+UdQ
sWl//rg76nb13dFjvF+q
=SW7Q
-----END PGP SIGNATURE-----
Merge tag 'cpu-hotplug-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull CPU hotplug notifiers registration fixes from Rafael Wysocki:
"The purpose of this single series of commits from Srivatsa S Bhat
(with a small piece from Gautham R Shenoy) touching multiple
subsystems that use CPU hotplug notifiers is to provide a way to
register them that will not lead to deadlocks with CPU online/offline
operations as described in the changelog of commit 93ae4f978c ("CPU
hotplug: Provide lockless versions of callback registration
functions").
The first three commits in the series introduce the API and document
it and the rest simply goes through the users of CPU hotplug notifiers
and converts them to using the new method"
* tag 'cpu-hotplug-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (52 commits)
net/iucv/iucv.c: Fix CPU hotplug callback registration
net/core/flow.c: Fix CPU hotplug callback registration
mm, zswap: Fix CPU hotplug callback registration
mm, vmstat: Fix CPU hotplug callback registration
profile: Fix CPU hotplug callback registration
trace, ring-buffer: Fix CPU hotplug callback registration
xen, balloon: Fix CPU hotplug callback registration
hwmon, via-cputemp: Fix CPU hotplug callback registration
hwmon, coretemp: Fix CPU hotplug callback registration
thermal, x86-pkg-temp: Fix CPU hotplug callback registration
octeon, watchdog: Fix CPU hotplug callback registration
oprofile, nmi-timer: Fix CPU hotplug callback registration
intel-idle: Fix CPU hotplug callback registration
clocksource, dummy-timer: Fix CPU hotplug callback registration
drivers/base/topology.c: Fix CPU hotplug callback registration
acpi-cpufreq: Fix CPU hotplug callback registration
zsmalloc: Fix CPU hotplug callback registration
scsi, fcoe: Fix CPU hotplug callback registration
scsi, bnx2fc: Fix CPU hotplug callback registration
scsi, bnx2i: Fix CPU hotplug callback registration
...
This patch consists of the usual driver updates (megaraid_sas, scsi_debug,
qla2xxx, qla4xxx, lpfc, bnx2fc, be2iscsi, hpsa, ipr) plus an assortment of
minor fixes and the first precursors of SCSI-MQ (the code path
simplifications) and the bug fix for the USB oops on remove (which involves an
infrastructure change, so is sent via the main tree with a delayed backport
after a cycle in which it is shown to introduce no new bugs).
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJTOsP1AAoJEDeqqVYsXL0MraUIAMCHWIN791cSc/E4d6mw/6nC
j5CG/wwuw3VfqJcJJ8PcItfReWPuS7aLwhAx3wNGDUe7Vcz9pmcgJU9c2/ZWhIJH
D0YXnGSkkfxI9Wc5WJ/NbueS0TFt0G5B6wpIxSLpSEJ1k9I90vxe3symCwv5vS/p
3Cd2nZZCLg6ArzZJ3PJLnNG9FUp2ZBeZwfPu4CuPm+3kEq9oRATg7bS4NNtVTQLP
0zNs5rKAVWfnE5Ii8VFjA7DLduG9W1IBNnSI7EERenrLKMbHG5530Rnl71uvjjgY
0jmQ5YGpTsYcJggLdaijZdK+zuq6Jtc+0DwWJKIE3cEHx3kUrYi4UQWTTRk9ttQ=
=Bp1Y
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull first round of SCSI updates from James Bottomley:
"This patch consists of the usual driver updates (megaraid_sas,
scsi_debug, qla2xxx, qla4xxx, lpfc, bnx2fc, be2iscsi, hpsa, ipr) plus
an assortment of minor fixes and the first precursors of SCSI-MQ (the
code path simplifications) and the bug fix for the USB oops on remove
(which involves an infrastructure change, so is sent via the main tree
with a delayed backport after a cycle in which it is shown to
introduce no new bugs)"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (196 commits)
[SCSI] sd: Quiesce mode sense error messages
[SCSI] add support for per-host cmd pools
[SCSI] simplify command allocation and freeing a bit
[SCSI] megaraid: simplify internal command handling
[SCSI] ses: Use vpd information from scsi_device
[SCSI] Add EVPD page 0x83 and 0x80 to sysfs
[SCSI] Return VPD page length in scsi_vpd_inquiry()
[SCSI] scsi_sysfs: Implement 'is_visible' callback
[SCSI] hpsa: update driver version to 3.4.4-1
[SCSI] hpsa: fix bad endif placement in RAID 5 mapper code
[SCSI] qla2xxx: Fix build errors related to invalid print fields on some architectures.
[SCSI] bfa: Replace large udelay() with mdelay()
[SCSI] vmw_pvscsi: Some improvements in pvscsi driver.
[SCSI] vmw_pvscsi: Add support for I/O requests coalescing.
[SCSI] vmw_pvscsi: Fix pvscsi_abort() function.
[SCSI] remove deprecated IRQF_DISABLED from SCSI
[SCSI] bfa: Updating Maintainers email ids
[SCSI] ipr: Add new CCIN definition for Grand Canyon support
[SCSI] ipr: Format HCAM overlay ID 0x21
[SCSI] ipr: Use pci_enable_msi_range() and pci_enable_msix_range()
...
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:
get_online_cpus();
for_each_online_cpu(cpu)
init_cpu(cpu);
register_cpu_notifier(&foobar_cpu_notifier);
put_online_cpus();
This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).
Instead, the correct and race-free way of performing the callback
registration is:
cpu_notifier_register_begin();
for_each_online_cpu(cpu)
init_cpu(cpu);
/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);
cpu_notifier_register_done();
Fix the bnx2i code in scsi by using this latter form of callback registration.
Cc: Eddie Wai <eddie.wai@broadcom.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The bnx2/bnx2x rings are made up of linked pages. However there is an
upper limit on the page size as some the page size settings are 16-bit
in the hardware/firmware interface. In the current code, some parts
use BNX2_PAGE_SIZE which has a 16K upper limit and some parts use
PAGE_SIZE. On archs with >= 64K PAGE_SIZE, it generates some compile
warnings. Define a new CNIC_PAGE_SZIE which has an upper limit of
16K and use it consistently in all relevant parts.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace the session lock with two locks, a forward lock and
a backwards lock named frwd_lock and back_lock respectively.
The forward lock protects resources that change while sending a
request to the target, such as cmdsn, queued_cmdsn, and allocating
task from the commands' pool with kfifo_out.
The backward lock protects resources that change while processing
a response or in error path, such as cmdsn_exp, cmdsn_max, and
returning tasks to the commands' pool with kfifo_in.
Under a steady state fast-path situation, that is when one
or more processes/threads submit IO to an iscsi device and
a single kernel upcall (e.g softirq) is dealing with processing
of responses without errors, this patch eliminates the contention
between the queuecommand()/request response/scsi_done() flows
associated with iscsi sessions.
Between the forward and the backward locks exists a strict locking
hierarchy. The mutual exclusion zone protected by the forward lock can
enclose the mutual exclusion zone protected by the backward lock but not
vice versa.
For example, in iscsi_conn_teardown or in iscsi_xmit_data when there is
a failure and __iscsi_put_task is called, the backward lock is taken while
the forward lock is still taken. On the other hand, if in the RX path a nop
is to be sent, for example in iscsi_handle_reject or __iscsi_complete_pdu
than the forward lock is released and the backward lock is taken for the
duration of iscsi_send_nopout, later the backward lock is released and the
forward lock is retaken.
libiscsi_tcp uses two kernel fifos the r2t pool and the r2t queue.
The insertion and deletion from these queues didn't corespond to the
assumption taken by the new forward/backwards session locking paradigm.
That is, in iscsi_tcp_clenup_task which belongs to the RX (backwards)
path, r2t is taken out from r2t queue and inserted to the r2t pool.
In iscsi_tcp_get_curr_r2t which belong to the TX (forward) path, r2t
is also inserted to the r2t pool and another r2t is pulled from r2t
queue.
Only in iscsi_tcp_r2t_rsp which is called in the RX path but can requeue
to the TX path, r2t is taken from the r2t pool and inserted to the r2t
queue.
In order to cope with this situation, two spin locks were added,
pool2queue and queue2pool. The former protects extracting from the
r2t pool and inserting to the r2t queue, and the later protects the
extracing from the r2t queue and inserting to the r2t pool.
Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
[minor fix up to apply cleanly and compile fix]
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Pull trivial tree updates from Jiri Kosina:
"Usual earth-shaking, news-breaking, rocket science pile from
trivial.git"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (23 commits)
doc: usb: Fix typo in Documentation/usb/gadget_configs.txt
doc: add missing files to timers/00-INDEX
timekeeping: Fix some trivial typos in comments
mm: Fix some trivial typos in comments
irq: Fix some trivial typos in comments
NUMA: fix typos in Kconfig help text
mm: update 00-INDEX
doc: Documentation/DMA-attributes.txt fix typo
DRM: comment: `halve' -> `half'
Docs: Kconfig: `devlopers' -> `developers'
doc: typo on word accounting in kprobes.c in mutliple architectures
treewide: fix "usefull" typo
treewide: fix "distingush" typo
mm/Kconfig: Grammar s/an/a/
kexec: Typo s/the/then/
Documentation/kvm: Update cpuid documentation for steal time and pv eoi
treewide: Fix common typo in "identify"
__page_to_pfn: Fix typo in comment
Correct some typos for word frequency
clk: fixed-factor: Fix a trivial typo
...
Correct common misspelling of "identify" as "indentify" throughout
the kernel
Signed-off-by: Maxime Jayat <maxime@artisandeveloppeur.fr>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
commit b9871bcfd2
bnx2x: VF RSS support - PF side
changed the configuration of the doorbell HW and it broke iSCSI and FCoE.
We fix this by making compatible changes to the doorbell address in bnx2i
and bnx2fc. For the userspace driver, we need to pass a modified CID
so that the existing userspace driver will calculate the correct doorbell
address and continue to work.
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking changes from David Miller:
"Noteworthy changes this time around:
1) Multicast rejoin support for team driver, from Jiri Pirko.
2) Centralize and simplify TCP RTT measurement handling in order to
reduce the impact of bad RTO seeding from SYN/ACKs. Also, when
both timestamps and local RTT measurements are available prefer
the later because there are broken middleware devices which
scramble the timestamp.
From Yuchung Cheng.
3) Add TCP_NOTSENT_LOWAT socket option to limit the amount of kernel
memory consumed to queue up unsend user data. From Eric Dumazet.
4) Add a "physical port ID" abstraction for network devices, from
Jiri Pirko.
5) Add a "suppress" operation to influence fib_rules lookups, from
Stefan Tomanek.
6) Add a networking development FAQ, from Paul Gortmaker.
7) Extend the information provided by tcp_probe and add ipv6 support,
from Daniel Borkmann.
8) Use RCU locking more extensively in openvswitch data paths, from
Pravin B Shelar.
9) Add SCTP support to openvswitch, from Joe Stringer.
10) Add EF10 chip support to SFC driver, from Ben Hutchings.
11) Add new SYNPROXY netfilter target, from Patrick McHardy.
12) Compute a rate approximation for sending in TCP sockets, and use
this to more intelligently coalesce TSO frames. Furthermore, add
a new packet scheduler which takes advantage of this estimate when
available. From Eric Dumazet.
13) Allow AF_PACKET fanouts with random selection, from Daniel
Borkmann.
14) Add ipv6 support to vxlan driver, from Cong Wang"
Resolved conflicts as per discussion.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1218 commits)
openvswitch: Fix alignment of struct sw_flow_key.
netfilter: Fix build errors with xt_socket.c
tcp: Add missing braces to do_tcp_setsockopt
caif: Add missing braces to multiline if in cfctrl_linkup_request
bnx2x: Add missing braces in bnx2x:bnx2x_link_initialize
vxlan: Fix kernel panic on device delete.
net: mvneta: implement ->ndo_do_ioctl() to support PHY ioctls
net: mvneta: properly disable HW PHY polling and ensure adjust_link() works
icplus: Use netif_running to determine device state
ethernet/arc/arc_emac: Fix huge delays in large file copies
tuntap: orphan frags before trying to set tx timestamp
tuntap: purge socket error queue on detach
qlcnic: use standard NAPI weights
ipv6:introduce function to find route for redirect
bnx2x: VF RSS support - VF side
bnx2x: VF RSS support - PF side
vxlan: Notify drivers for listening UDP port changes
net: usbnet: update addr_assign_type if appropriate
driver/net: enic: update enic maintainers and driver
driver/net: enic: Exposing symbols for Cisco's low latency driver
...
On some bnx2x devices, iSCSI is determined to be unsupported only after
firmware is downloaded. We need to check max_iscsi_conn again after
NETDEV_UP and block iSCSI init operations. Without this fix, iscsiadm
can hang as the firmware will not respond to the iSCSI init message.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update TCP delayed ACK and timestamp options setup to match latest bnx2x
firmware.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix typo in printk and comments within various drivers.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The intention in bnx2i_send_fw_iscsi_init_msg was to zero out
only the lower 32bits, but instead the whole mask64 is zeroed.
This patch fixes it.
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
CONFIG_HOTPLUG is going away as an option. As a result, the __dev*
markings need to be removed.
This change removes the use of __devinit, __devexit_p, __devinitdata,
__devinitconst, and __devexit from these drivers.
Based on patches originally written by Bill Pemberton, but redone by me
in order to handle some of the coding style issues better, by hand.
Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Adam Radford <linuxraid@lsi.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Removed the individual PCI DEVICE ID checking inside bnx2i. The device
type can easily be read from the corresponding cnic->flags. This will
free bnx2i from having to get updated for every new device ID that gets
added.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch fixes the following kernel panic invoked by uninitialized fields
in the chip initialization for the 1G bnx2 iSCSI offload.
One of the bits in the chip initialization is being used by the latest
firmware to control overflow packets. When this control bit gets enabled
erroneously, it would ultimately result in a bad packet placement which would
cause the bnx2 driver to dereference a NULL ptr in the placement handler.
This can happen under certain stress I/O environment under the Linux
iSCSI offload operation.
This change only affects Broadcom's 5709 chipset.
Unable to handle kernel NULL pointer dereference at 0000000000000008 RIP:
[<ffffffff881f0e7d>] :bnx2:bnx2_poll_work+0xd0d/0x13c5
Pid: 0, comm: swapper Tainted: G ---- 2.6.18-333.el5debug #2
RIP: 0010:[<ffffffff881f0e7d>] [<ffffffff881f0e7d>] :bnx2:bnx2_poll_work+0xd0d/0x13c5
RSP: 0018:ffff8101b575bd50 EFLAGS: 00010216
RAX: 0000000000000005 RBX: ffff81007c5fb180 RCX: 0000000000000000
RDX: 0000000000000ffc RSI: 00000000817e8000 RDI: 0000000000000220
RBP: ffff81015bbd7ec0 R08: ffff8100817e9000 R09: 0000000000000000
R10: ffff81007c5fb180 R11: 00000000000000c8 R12: 000000007a25a010
R13: 0000000000000000 R14: 0000000000000005 R15: ffff810159f80558
FS: 0000000000000000(0000) GS:ffff8101afebc240(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000008 CR3: 0000000000201000 CR4: 00000000000006a0
Process swapper (pid: 0, threadinfo ffff8101b5754000, task ffff8101afebd820)
Stack: 000000000000000b ffff810159f80000 0000000000000040 ffff810159f80520
ffff810159f80500 00cf00cf8008e84b ffffc200100939e0 ffff810009035b20
0000502900000000 000000be00000001 ffff8100817e7810 00d08101b575bea8
Call Trace:
<IRQ> [<ffffffff8008e0d0>] show_schedstat+0x1c2/0x25b
[<ffffffff881f1886>] :bnx2:bnx2_poll+0xf6/0x231
[<ffffffff8000c9b9>] net_rx_action+0xac/0x1b1
[<ffffffff800125a0>] __do_softirq+0x89/0x133
[<ffffffff8005e30c>] call_softirq+0x1c/0x28
[<ffffffff8006d5de>] do_softirq+0x2c/0x7d
[<ffffffff8006d46e>] do_IRQ+0xee/0xf7
[<ffffffff8005d625>] ret_from_intr+0x0/0xa
<EOI> [<ffffffff801a5780>] acpi_processor_idle_simple+0x1c5/0x341
[<ffffffff801a573d>] acpi_processor_idle_simple+0x182/0x341
[<ffffffff801a55bb>] acpi_processor_idle_simple+0x0/0x341
[<ffffffff80049560>] cpu_idle+0x95/0xb8
[<ffffffff80078b1c>] start_secondary+0x479/0x488
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Cc: stable@vger.kernel.org
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
DRV_MODULE_VERSION here is "2.7.2.2" which is only 8 chars but we copy
12 bytes from the stack so it's a small information leak.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Acked-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The netdev->base_addr parameter has been deprecated in the L2 bnx2
driver. This is used by bnx2i for the BARn iomapping.
This patch will directly reference the pci_resource_start instead
of using the deprecated netdev->base_addr.
This patch is actually a critical bug fix as the 1G bnx2 driver no
longer supports the netdev->base_addr in the current kernel of the scsi
tree. This means that Broadcom's 1G Linux iSCSI offload solution would
not work at all without this patch.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
1. When FCoE offload driver is registered, copy its capabilities to the chip
scratchpad.
2. Copy FCoE/iSCSI MAC addresses in aligned manner to chip scratchpad.
3. Add FCoE/iSCSI statistics collection support
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Old version: 2.7.0.3
New version: 2.7.2.2
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This will set the target can_queue limit to the number of preallocated
session tasks set during creation.
"Could not send nopout" messages were observed without this when the
iSCSI connection experiences dropped frames under heavy I/O stress.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Pull networking fixes from David Miller:
1) L2TP doesn't get autoloaded when you try to open an L2TP socket due
to a missing module alias, fix from Benjamin LaHaise.
2) Netlabel and RDS should propagate gfp flags given to them by
callers, fixes from Dan Carpeneter.
3) Recursive locking fix in usbnet wasn't bulletproof and can result in
objects going away mid-flight due to races, fix from Ming Lei.
4) Fix up some confusion about a bool module parameter in netfilter's
iptable_filter and ip6table_filter, from Rusty Russell.
5) If SKB recycling is used via napi_reuse_skb() we end up with
different amounts of headroom reserved than we had at the original
SKB allocation. Fix from Eric Dumazet.
6) Fix races in TG3 driver ring refilling, from Michael Chan.
7) We have callbacks for IPSEC replay notifiers, but some call sites
were not using the ops method and instead were calling one of the
implementations directly. Oops. Fix from Steffen Klassert.
8) Fix IP address validation properly in the bonding driver, the
previous fix only works with netlink where the subnet mask and IP
address are changed in one atomic operation. When 'ifconfig' ioctls
are used the IP address and the subnet mask are changed in two
distinct operations. Fix from Andy Gospodarek.
9) Provide a sky2 module operation to work around power management
issues with some BIOSes. From Stephen Hemminger.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
usbnet: consider device busy at each recieved packet
bonding: remove entries for master_ip and vlan_ip and query devices instead
netfilter: remove forward module param confusion.
usbnet: don't clear urb->dev in tx_complete
usbnet: increase URB reference count before usb_unlink_urb
xfrm: Access the replay notify functions via the registered callbacks
xfrm: Remove unused xfrm_state from xfrm_state_check_space
RDS: use gfp flags from caller in conn_alloc()
netlabel: use GFP flags from caller instead of GFP_ATOMIC
l2tp: enable automatic module loading for l2tp_ppp
cnic: Fix parity error code conflict
tg3: Fix RSS ring refill race condition
sky2: override for PCI legacy power management
net: fix napi_reuse_skb() skb reserve
The recently added parity error handling used an error code that was
already defined for a different error. This could lead to bnx2x
firmware assert. We need to fix this with new error codes that are
defined for parity error only.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Reviewed-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The error_mask module param overrides has a bug which prevented
the new module param values to take effect.
Also changed the type attribute of the error_mask1/2 module params
from int to uint to allow the MSB to be set.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Acked-by: Anil Veerabhadrappa <anilgv@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
bnx2i_percpu_thread_create() create per cpu kthread, and should use
proper NUMA aware API.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
During session recovery, the conn_stop call will trigger a flush
to all outstanding SCSI cmds in the xmit queue. This will set
all outstanding task->sc to NULL prior to the session_teardown
call which frees the task memory.
In the bnx2i SCSI response processing path, only the task was being checked
for NULL under the session lock before the task->sc->request dereferencing.
If there are outstanding SCSI cmd responses pending for process, the
following kernel panic can be exposed where task->sc was found to be NULL.
Call Trace:
[ 69.720205] [<ffffffffa040d0d0>] bnx2i_process_new_cqes+0x290/0x3c0 [bnx2i]
[ 69.804289] [<ffffffffa040d233>] bnx2i_fastpath_notification+0x33/0xa0 [bnx2
i]
[ 69.891490] [<ffffffffa040d37b>] bnx2i_indicate_kcqe+0xdb/0x330 [bnx2i]
[ 69.971427] [<ffffffffa03eac5e>] service_kcqes+0x16e/0x1d0 [cnic]
[ 70.045132] [<ffffffffa03eacea>] cnic_service_bnx2x_kcq+0x2a/0x50 [cnic]
[ 70.126105] [<ffffffffa03ead53>] cnic_service_bnx2x_bh+0x43/0x140 [cnic]
[ 70.207081] [<ffffffff81060676>] tasklet_action+0x66/0x110
[ 70.273521] [<ffffffff8106025f>] __do_softirq+0xef/0x220
[ 70.337887] [<ffffffff81447ebc>] call_softirq+0x1c/0x30
This patch adds the !task->sc check and also protects the sc dereferencing
under the session lock.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The iscsi_nopout task's TTT is defined as __be32 while the DMA
memory to the chip is CPU specific. This creates a problem for
unsolicited NOP-In responses where the TTT is not the RESERVED
tag of 0xFFs. This patch adds a call to be32_to_cpu for the TTT
specified.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The iscsi class currently does not support writable sysfs
attrs for LLD sysfs settings. This patch converts the
iscsi class and driver's host attrs to use the attribute
container sysfs group and the sysfs group's is_visible callout
to be able to support readable or writable sysfs attrs.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The iscsi class currently does not support writable sysfs
attrs for LLD sysfs settings. This patch converts the
iscsi class and driver's session attrs to use the attribute
container sysfs group and the sysfs group's is_visible callout
to be able to support readable or writable sysfs attrs.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The iscsi class currently does not support writable sysfs
attrs for LLD sysfs settings. This patch converts the
iscsi class and drivers to use the attribute container
sysfs group and the sysfs group's is_visible callout
to be able to support readable or writable sysfs attrs.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Moves the drivers for Broadcom devices into
drivers/net/ethernet/broadcom/ and the necessary Kconfig and Makefile
changes.
CC: Eilon Greenstein <eilong@broadcom.com>
CC: Michael Chan <mchan@broadcom.com>
CC: Matt Carlson <mcarlson@broadcom.com>
CC: Gary Zambrano <zambrano@broadcom.com>
CC: "Maciej W. Rozycki" <macro@linux-mips.org>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
target: Convert to DIV_ROUND_UP_SECTOR_T usage for sectors / dev_max_sectors
kernel.h: Add DIV_ROUND_UP_ULL and DIV_ROUND_UP_SECTOR_T macro usage
iscsi-target: Add iSCSI fabric support for target v4.1
iscsi: Add Serial Number Arithmetic LT and GT into iscsi_proto.h
iscsi: Use struct scsi_lun in iscsi structs instead of u8[8]
iscsi: Resolve iscsi_proto.h naming conflicts with drivers/target/iscsi
struct scsi_lun is also just a struct with an array of 8 octets (64 bits)
but using it instead in iscsi structs lets us call scsilun_to_int
without a cast, and also lets us copy it using assignment, instead of
memcpy().
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch renames the following iscsi_proto.h structures to avoid
namespace issues with drivers/target/iscsi/iscsi_target_core.h:
*) struct iscsi_cmd -> struct iscsi_scsi_req
*) struct iscsi_cmd_rsp -> struct iscsi_scsi_rsp
*) struct iscsi_login -> struct iscsi_login_req
This patch includes useful ISCSI_FLAG_LOGIN_[CURRENT,NEXT]_STAGE*,
and ISCSI_FLAG_SNACK_TYPE_* definitions used by iscsi_target_mod, and
fixes the incorrect definition of struct iscsi_snack to following
RFC-3720 Section 10.16. SNACK Request.
Also, this patch updates libiscsi, iSER, be2iscsi, and bn2xi to
use the updated structure definitions in a handful of locations.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (77 commits)
[SCSI] fix crash in scsi_dispatch_cmd()
[SCSI] sr: check_events() ignore GET_EVENT when TUR says otherwise
[SCSI] bnx2i: Fixed kernel panic due to illegal usage of sc->request->cpu
[SCSI] bfa: Update the driver version to 3.0.2.1
[SCSI] bfa: Driver and BSG enhancements.
[SCSI] bfa: Added support to query PHY.
[SCSI] bfa: Added HBA diagnostics support.
[SCSI] bfa: Added support for flash configuration
[SCSI] bfa: Added support to obtain SFP info.
[SCSI] bfa: Added support for CEE info and stats query.
[SCSI] bfa: Extend BSG interface.
[SCSI] bfa: FCS bug fixes.
[SCSI] bfa: DMA memory allocation enhancement.
[SCSI] bfa: Brocade-1860 Fabric Adapter vHBA support.
[SCSI] bfa: Brocade-1860 Fabric Adapter PLL init fixes.
[SCSI] bfa: Added Fabric Assigned Address(FAA) support
[SCSI] bfa: IOC bug fixes.
[SCSI] bfa: Enable ASIC block configuration and query.
[SCSI] bnx2i: Updated copyright and bump version
[SCSI] bnx2i: Modified to skip CNIC registration if iSCSI is not supported
...
Fix up some trivial conflicts in:
- drivers/scsi/bnx2fc/{bnx2fc.h,bnx2fc_fcoe.c}:
Crazy broadcom version number conflicts
- drivers/target/tcm_fc/tfc_cmd.c
Just trivial cleanups done on adjacent lines
A kernel panic was observed when passing the sc->request->cpu = -1 to
retrieve the per_cpu variable pointer:
#0 [ffff880011203960] machine_kexec at ffffffff81022bc3
#1 [ffff8800112039b0] crash_kexec at ffffffff81088630
#2 [ffff880011203a80] __die at ffffffff8139ea20
#3 [ffff880011203aa0] no_context at ffffffff8102f3a7
#4 [ffff880011203ae0] __bad_area_nosemaphore at ffffffff8102f665
#5 [ffff880011203ba0] retint_signal at ffffffff8139dd1f
#6 [ffff880011203cc8] bnx2i_indicate_kcqe at ffffffffa03dc4f2
#7 [ffff880011203da8] service_kcqes at ffffffffa03cb04f
#8 [ffff880011203e68] cnic_service_bnx2x_kcq at ffffffffa03cb14a
#9 [ffff880011203e88] cnic_service_bnx2x_bh at ffffffffa03cb1b3
The problem lies in the slow path sg_io (and perhaps sg_scsi_ioctl) call to
blk_get_request->get_request/wait->blk_alloc_request->blk_rq_init which
re-initializes the request->cpu to -1. There is no assignment for cpu from
that to the request_fn call to low level drivers.
When this happens, the sc->request->cpu will be using the init value of
-1. This will create a kernel panic when it hits bnx2i because the code
refers it to get the per_cpu variables ptr.
This change is to put in a guard against that and also for cases when
bio affinity/queue completion to the same cpu is not enabled. In those
cases, the request->cpu will remain a -1 also.
This bug was created from commit: b5cf6b63f7
For the case when the blk layer did not setup the request->cpu, bnx2i
will complete the sc with the current CPU of the thread.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The bnx2fc driver needs to handle netdev events on VLAN devices.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Reviewed-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The init routine will now examine the cnic->max_iscsi_conn variable
before registering to CNIC during ulp_init.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch breaks the SCSI cmd completion into two parts:
1. The bh will allocate and queued work to the cmd specific CPU IO
completion kthread. The CPU for the cmd is from the sc->request->cpu.
2. The CPU specific IO completion kthread will call the scsi_cmd_resp
routine to do the actual cmd completion.
In the normal case, these IO completion kthreads should complete before
the blk IO times out at 60s. However, in the case when these kthreads
are blocked for whatever reason and exceeded the timeout, the call
to conn_destroy will have to iterate and exhaust all related work in the
percpu work list for all online CPUs. This will guarantee the protection
of the work->session and conn pointers before they get freed.
Also modified the event coalescing formula to have at least the
event_coal_min outstanding cmds in the pipeline so the SCSI producer
would not get underrun.
Also changed the following SCSI parameters:
- can_queue from 1024 to 2048
- cmds_per_lun from 24 to 128
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Acked-by: Benjamin Li <benli@broadcom.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
struct scsi_lun is also just a struct with an array of 8 octets (64 bits)
but using it instead in iscsi structs lets us call scsilun_to_int
without a cast, and also lets us copy it using assignment, instead of
memcpy().
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
And change iSCSI RQ doorbell size from 16B to 64B to match new firmware.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
New FW/HSI (7.0):
- Added support to 578xx chips
- Improved HSI - much less driver's direct access to the FW internal
memory needed.
New implementation of the HSI handling layer in the bnx2x (bnx2x_sp.c):
- Introduced chip dependent objects that have chip independent interfaces
for configuration of MACs, multicast addresses, Rx mode, indirection table,
fast path queues and function initialization/cleanup.
- Objects functionality is based on the private function pointers, which
allows not only a per-chip but also PF/VF differentiation while still
preserving the same interface towards the driver.
- Objects interface is not influenced by the HSI changes which do not require
providing new parameters keeping the code outside the bnx2x_sp.c invariant
with regard to such HSI chnages.
Changes in a CNIC, bnx2fc and bnx2i modules due to the new HSI.
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
Modified the event coalescing code for iSCSI offload to combat both
corner cases and optimize performance as follows:
1. Added mechanism to loop back a second time to process any leftover
CQEs that was generated by the hardware during the time the driver is
busy processing previous CQEs in the bh. This not only helps the
performance but also fixes the corner case when no more CQEs are being
generated in the pipeline; so those leftover CQEs will get a a chance
to be processed.
2. Added ARM_CQE_FP to distinguish between fast path arming versus
slow path arming. This change will guarantee that the CQEs will
always get a chance to be re-armed during fast path completions.
3. Removed the inline event coalescing division for perf optimization.
Also fixed a division-by-zero error when the event_coal_div module
param was set to 0.
4. Changed the default SQ WQEs size from 256 to 128 to match chip
default.
5. Changed the cmd_per_lun from 32 to 24.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <jbottomley@parallels.com>