We recently had a system crash in the cnic module. Vmcore analysis confirmed
that "ip link up" was executed which failed due to an allocation failure
because of memory fragmentation. Futher analysis revealed that the cnic irq
vector was still allocated after the "ip link up" that failed. When
"ip link down" was executed it called free_msi_irqs() which crashed the system
because the cnic irq was still inuse.
PANIC: "kernel BUG at drivers/pci/msi.c:411!"
The code execution was:
cnic_netdev_event()
if (event == NETDEV_UP) {
.
.
▹ if (!cnic_start_hw(dev))
cnic_start_hw()
calls cnic_cm_open() which failed with -ENOMEM
cnic_start_hw() then took the err1 path:
err1:↩
cp->free_resc(dev);↩ <---- frees resources but not irq vector
pci_dev_put(dev->pcidev);↩
return err;↩
}↩
This returns control back to cnic_netdev_event() but now the cnic irq vector
is still allocated even although cnic_cm_open() failed. The next
"ip link down" while trigger the crash.
The cnic_start_hw() routine is not handling the allocation failure correctly.
Fix this by checking whether CNIC_DRV_STATE_HANDLES_IRQ flag is set indicating
that the hardware has been started in cnic_start_hw(). If it has then call
cp->stop_hw() which frees the cnic irq vector and cnic resources. Otherwise
just maintain the previous behaviour and free cnic resources.
I reproduced this by injecting an ENOMEM error into cnic_cm_alloc_mem()s return
code.
# ip link set dev enpX down
# ip link set dev enpX up <--- hit's allocation failure
# ip link set dev enpX down <--- crashes here
With this patch I confirmed there was no crash in the reproducer.
Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adheer Chandravanshi <adheer.chandravanshi@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch provides additional changes as a part of BNX2 and CNIC driver
re-branding effort.
Signed-off-by: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1. Remove the rcu_read_lock/unlock around rcu_access_pointer
2. Replace the rcu_dereference with rcu_access_pointer
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The cnic module needs to ensure that if ipv6 support is compiled as a module,
then the cnic module cannot be compiled as built-in as it depends on ipv6.
Made this check cleaner via Kconfig
Use simpler IS_ENABLED for CONFIG_VLAN_8021Q check
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The "rcu_dereference()" calls are used directly in conditions.
Since their return values are never dereferenced it is recommended to use
"rcu_access_pointer()" instead of "rcu_dereference()".
Therefore, this patch makes the replacements.
The following Coccinelle semantic patch was used:
@@
@@
(
if(
(<+...
- rcu_dereference
+ rcu_access_pointer
(...)
...+>)) {...}
|
while(
(<+...
- rcu_dereference
+ rcu_access_pointer
(...)
...+>)) {...}
)
Signed-off-by: Andreea-Cristina Bernat <bernat.ada@gmail.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o QLogic has acquired the NetXtremeII products and drivers from Broadcom.
This patch re-brands cnic driver as a QLogic driver
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking updates from David Miller:
1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov.
2) Multiqueue support in xen-netback and xen-netfront, from Andrew J
Benniston.
3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn
Mork.
4) BPF now has a "random" opcode, from Chema Gonzalez.
5) Add more BPF documentation and improve test framework, from Daniel
Borkmann.
6) Support TCP fastopen over ipv6, from Daniel Lee.
7) Add software TSO helper functions and use them to support software
TSO in mvneta and mv643xx_eth drivers. From Ezequiel Garcia.
8) Support software TSO in fec driver too, from Nimrod Andy.
9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli.
10) Handle broadcasts more gracefully over macvlan when there are large
numbers of interfaces configured, from Herbert Xu.
11) Allow more control over fwmark used for non-socket based responses,
from Lorenzo Colitti.
12) Do TCP congestion window limiting based upon measurements, from Neal
Cardwell.
13) Support busy polling in SCTP, from Neal Horman.
14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru.
15) Bridge promisc mode handling improvements from Vlad Yasevich.
16) Don't use inetpeer entries to implement ID generation any more, it
performs poorly, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits)
rtnetlink: fix userspace API breakage for iproute2 < v3.9.0
tcp: fixing TLP's FIN recovery
net: fec: Add software TSO support
net: fec: Add Scatter/gather support
net: fec: Increase buffer descriptor entry number
net: fec: Factorize feature setting
net: fec: Enable IP header hardware checksum
net: fec: Factorize the .xmit transmit function
bridge: fix compile error when compiling without IPv6 support
bridge: fix smatch warning / potential null pointer dereference
via-rhine: fix full-duplex with autoneg disable
bnx2x: Enlarge the dorq threshold for VFs
bnx2x: Check for UNDI in uncommon branch
bnx2x: Fix 1G-baseT link
bnx2x: Fix link for KR with swapped polarity lane
sctp: Fix sk_ack_backlog wrap-around problem
net/core: Add VF link state control policy
net/fsl: xgmac_mdio is dependent on OF_MDIO
net/fsl: Make xgmac_mdio read error message useful
net_sched: drr: warn when qdisc is not work conserving
...
The iSCSI netlink message needs to be sent before the ulp_ops is cleared
as it is sent through a function pointer in the ulp_ops. This bug
causes iscsid to not get the message when the bnx2i driver is unloaded.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We are allocating memory with GFP_KERNEL under spinlock. Since this is
the only call manipulating the cnic_udev_list and it is always under
rtnl_lock, cnic_dev_lock can be safely removed.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Because the called function, such as bnx2fc_indicate_netevent(), can sleep,
we cannot take rcu_lock(). To prevent the rcu protected ulp_ops from going
away, we use the cnic_lock mutex and set the ULP_F_CALL_PENDING flag.
The code already waits for ULP_F_CALL_PENDING flag to clear in
cnic_unregister_device().
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bnx2/bnx2x rings are made up of linked pages. However there is an
upper limit on the page size as some the page size settings are 16-bit
in the hardware/firmware interface. In the current code, some parts
use BNX2_PAGE_SIZE which has a 16K upper limit and some parts use
PAGE_SIZE. On archs with >= 64K PAGE_SIZE, it generates some compile
warnings. Define a new CNIC_PAGE_SZIE which has an upper limit of
16K and use it consistently in all relevant parts.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For per device operations, cnic needs to dereference the RCU protected
cp->ulp_ops instead of the global cnic_ulp_tbl. In 2 locations,
cnic_send_nlmsg() and cnic_copy_ulp_stats(), it was referencing the
global table. If the device has been unregistered and these functions
are still being called (very unlikely scenarios), it could lead to NULL
pointer dereference.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The buffer that is used to pass doorbell offset to the userspace UIO
driver may contain nonzero value in older versions of bnx2x driver.
Userspace cannot easily tell whether it contains a valid doorbell
offset or not. With the added signature, userspace will only use
the doorbell offset if the signature is present.
Update version to 2.5.19.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert the memset/memcpy uses of 6 to ETH_ALEN
where appropriate.
Also convert some struct definitions and u8 array
declarations of [6] to ETH_ALEN.
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 104a43edb2
cnic: Use CHIP_NUM macros from bnx2x.h
changed the code to use the bnx2x macro NO_FCOE() to determine if FCoE
is supported or not. There is another place in cnic that is still using
the old method to determine if FCoE is supported or not. The 2 methods
may not yield the same result after the network interface is brought down
and up. This will cause the crash as cnic_bnx2x_service_kcq() will access
the uninitialized cp->kcq2.
The fix is to consistently use the same macro CNIC_SUPPORTS_FCOE() which
uses the bnx2x NO_FCOE() macro. As a follow-up, we can clean up the code
to remove the old method as it is no longer needed.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit b9871bcfd2
bnx2x: VF RSS support - PF side
changed the configuration of the doorbell HW and it broke iSCSI and FCoE.
We fix this by making compatible changes to the doorbell address in bnx2i
and bnx2fc. For the userspace driver, we need to pass a modified CID
so that the existing userspace driver will calculate the correct doorbell
address and continue to work.
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use bp->pfid from bnx2x instead to avoid duplication.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use BP_PORT and chip_port_mode directly from bnx2x.h to avoid duplication.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This eliminates duplication and ensures that all bnx2x chips will be
supported.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On some bnx2x devices, iSCSI is determined to be unsupported only after
firmware is downloaded. We need to check max_iscsi_conn again after
NETDEV_UP and block iSCSI init operations. Without this fix, iscsiadm
can hang as the firmware will not respond to the iSCSI init message.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Completion status field should also be checked for non-zero error
condition.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update TCP delayed ACK and timestamp options setup to match latest bnx2x
firmware.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Without resetting it, the bnx2i driver cannot use different options for
different iSCSI connections.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since unregister_netdevice_notifier() will replay the NETDEV_DOWN and
NETDEV_UNREGISTER_EVENTS, the cnic_dev_list will be cleaned up automatically.
The loop to cleanup the cnic_dev_list can be removed in cnic_release().
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After this earlier commit to simplify probing:
commit 4bd9b0fffb
cnic, bnx2x, bnx2: Simplify cnic probing.
we can now reliably receive netdev events and we can simplify the handling
of these events. We now remove the logic that tries to handle missed
NETDEV_REGISTER events.
This change will allow cleanup to be simplified in the next patch. We can
now rely on the play back of netdev events during
unregister_netdevice_notifier() to cleanup the structures.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So far, only net_device * could be passed along with netdevice notifier
event. This patch provides a possibility to pass custom structure
able to provide info that event listener needs to know.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
v2->v3: fix typo on simeth
shortened dev_getter
shortened notifier_info struct name
v1->v2: fix notifier_call parameter in call_netdevice_notifier()
Signed-off-by: David S. Miller <davem@davemloft.net>
The firmware supports a maximum of 4K FCoE exchanges. In 4-port devices,
or when working in multi-function mode, this resource needs to be distributed
between the various possible FCoE functions.
This information needs to be calculated by bnx2x and propagated into bnx2fc
via cnic. bnx2fc can then use this value to calculate corresponding xid
resources instead of using global constants.
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
alloc failures already get standardized OOM
messages and a dump_stack.
Convert kzalloc's with multiplies to kcalloc.
Convert kmalloc's with multiplies to kmalloc_array.
Fix a few whitespace defects.
Convert a constant 6 to ETH_ALEN.
Use parentheses around sizeof.
Convert vmalloc/memset to vzalloc.
Remove now unused size variables.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In INTA mode, cnic and bnx2x share the same IRQ. During chip reset,
for example, cnic will stop servicing IRQs after it has shutdown the
cnic hardware resources. However, the shared IRQ is still active as
bnx2x needs to finish the reset. There is a window when bnx2x does
not know that cnic is no longer handling IRQ and things don't always
work properly.
Add a flag to tell bnx2x that cnic is handling IRQ. The flag is set
before the first cnic IRQ is expected and cleared when no more cnic
IRQs are expected, so there should be no race conditions.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of using symbol_get(), cnic can now directly call the cnic_probe
functions in struct bnx2x and struct bnx2. symbol_get() is not reliable
as it fails when the module is still initializing.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
by removing duplicate symbols and removing some redundant code.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the initiator and target try to close the connection at about the same
time, there is a race condition in the termination sequence for bnx2x.
Fix the problem by waiting for the remote termination to complete before
deleting the Connection ID. This will prevent the firmware assert.
Update version to 2.5.15.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Without the reset, reloading the cnic driver can cause the iSCSI
Event Queue to be out of sync with the driver and cause intermittent
crash.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CONFIG_HOTPLUG is going away as an option. As result the __dev*
markings will be going away.
Remove use of __devinit, __devexit_p, __devinitdata, __devinitconst,
and __devexit.
Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch moves the bnx2x and cnic drivers into using FW 7.8.2
which was recently submitted into the linux-firmware tree.
A short summary of minor bugs fixed by this FW:
1. In switch dependent mode, fix several issues regarding inner vlan
vs. DCB priorities.
2. iSCSI - not all packets were completed on a forward channel.
3. DCB - fixed for 4-port devices.
4. Fixed false parity reported in CAM memories when operating near -5%
on the 1.0V core supply.
5. ETS default settings are set to fairness between traffic classes
(rather than strict priority), and uses the same chip receive buffer
configuration for both PFC and pause.
For a complete list of fixes made by this FW, see commit 236367db
in the linux-firmware git repository.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update version to 2.5.13.
Reviewed-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To save memory and to exit IRQ loop quicker on devices that don't support
FCoE.
Reviewed-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This will make it easier to exit IRQ loop and re-arm IRQ on devices that
don't support FCoE.
Reviewed-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This will free up unneeded memory.
Reviewed-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These functions are needed to free up memory when the rings are no longer
needed.
Reviewed-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>