A Gentoo bug report [1] showed that as of 2.6.31 lpfc only uses INTx interrupts.
This patch restores lpfc's ability to support MSI-X/MSI interrupts that the
"Addition of SLI4 Interface - Base Support" patch [2] broke.
It reestablishes MSI-X as the default interrupt method and in case MSI-X is not
supported lpfc_sli{4,}_enable_intr fallbacks to MSI and then to INTx.
[1]: http://bugs.gentoo.org/show_bug.cgi?id=296319
[2]: commit da0436e915
[James Smart:
Background:
Nothing Broke. This was intended.
We had originally enabled MSI-X by default, but in qualification within the
last 12 months, we encountered a major catch-22:
There were at least 4 platforms, from 2 major OEMs, that :
- Say they support MSI-X - platform routines work and act as if they do.
- We enable it, generate a test interrupt to check they really do deliver it,
and it works.
- But shortly after attachment, the system hangs or loses interrupts,
resulting in a bad system behavior.
Given the distro's picking up the 2.6.32 kernel, we had to stick with a
default of MSI-X off, with user-enabled MSI-X as these platforms couldn't get
fixed.
However, we're also now encountering platforms that require MSI-X and never
INTx, so we must change. It's desired also for also for performance reasons.
So - now (2.6.33) is the right time to re-enable MSI-X by default.
]
[jejb: fix up comment on default values]
Signed-off-by: George Kadianakis <desnacked@gmail.com>
Acked-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Firmware is loaded from flash for these ISP types.
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
qla2xxx: EEH added call to pci_restore_state.
qla2xxx: EEH added delay in slot reset routine.
qla2xxx: EEH moved call to pci_save_state(), see (1).
qla2xxx: EEH additional changes for RHEL5.5.
qla2xxx: EEH added function call, removed function call, see (2).
(1) In qla2xxx_probe_one the call to pci_save_state() has been
moved to after the call to qla2xxx_request_irqs().
(2) Add call to pci_disable_pcie_error_reporting() in remove_one.
Delete call to pci_cleanup_aer_uncorrect_error_status() in pci_resume.
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch enables TEXT Request / Response for the driver
Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch fixes can_queue being uninitiallized since it
was done before beiscsi_get_params was called.
Thanks to Mike Christie for identifying this
Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This fixes a situation where the sessions were being killed whenever
LinkUP is notified rather than LinkDown
Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patches enables async mode for mcc rings so that
multiple requests can be queued.
Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch removes the endianess change that was wrongly
added for data_count
Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch decides whether ack based completion is required or not
Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch enables use of opcode that is passed in
Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch moves pci_set_drvdata to inside beiscsi_hba_alloc
Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch adds opcodes in thecompletion path that were
missed out earlier
Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch will link the current allocated wrb with the next
wrb that will be allocated. This is a requirement from the chip.
Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
We need to hold on to ep resources untill invalidate and
close connection are completed
Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch enablesi be2iscsi to use the start number and number
of cids/icd provided by FW rather than hard coded values.
Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Further to the lsml thread titled:
"does scsi_io_completion need to dump sense data for ata pass through (ck_cond =
1) ?"
This is a patch to skip logging when the sense data is
associated with a SENSE_KEY of "RECOVERED_ERROR" and the
additional sense code is "ATA PASS-THROUGH INFORMATION
AVAILABLE". This only occurs with the SAT ATA PASS-THROUGH
commands when CK_COND=1 (in the cdb). It indicates that
the sense data contains ATA registers.
Smartmontools uses such commands on ATA disks connected via
SAT. Periodic checks such as those done by smartd cause
nuisance entries into logs that are:
- neither errors nor warnings
- pointless unless the cdb that caused them are also logged
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The check on MAX_SCSI_TAR should be >= instead of > or we could go past the
end of the array.
Joe Eykholt aslo correctly points out that the check on MAX_LUN should be
>= as well. That matches with how it is used in the rest of the file.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Stanse found that c3cn is poked many times around in
cxgb3i_conn_pdu_ready, there is no need to check if it is NULL.
Remove the test.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: Karen Xie <kxie@chelsio.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Use kzalloc rather than kcalloc(1,...)
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
@@
- kcalloc(1,
+ kzalloc(
...)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by:Jack Wang <jack_wang@usish.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The mac_esp PIO algorithm no longer works in 2.6.31 and crashes my Centris
660av. So here's a better one.
Also, force async with esp_set_offset() rather than esp_slave_configure().
One of the SCSI drives I tested still doesn't like the PIO mode and fails
with "esp: esp0: Reconnect IRQ2 timeout" (the same drive works fine in
PDMA mode).
This failure happens when esp_reconnect_with_tag() tries to read in two
tag bytes but the chip only provides one (0x20). I don't know what causes
this. I decided not to waste any more time trying to fix it because the
best solution is to rip out the PIO mode altogether and use the DMA
engine.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Currently dev_loss_tmo is capped by SCSI_DEVICE_BLOCK_MAX_TIMEOUT.
This causes problem with multipathing when the 'no_path_retry' setting
exceeds the dev_loss_tmo setting, as then the system might run into
a deadlock when all paths have been removed temporarily for longer
than dev_loss_tmo.
The principal reasons for the capping has been that we should
not allow a remote port to remain in status 'blocked' indefinitely,
so the capping is there to ensure that the port status is being reset
eventually.
However, the fast_io_fail_tmo will also move the remote port out of
the 'blocked' state, so for any HBA driver implementing both the
capping should really be on the fast_io_fail_tmo, and not on the
dev_loss_tmo.
This patch implements just that, ie the fast_io_fail_tmo is capped
to SCSI_DEVICE_BLOCK_TIMEOUT and the capping is removed from
dev_loss_tmo when fast_io_fail_tmo is set.
This allows us to synchronize the dev_loss_tmo setting to the
'no_path_retry' setting from multipathing thus avoiding the deadlock.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Acked-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Allows i == IM_MAX_HOSTS, which is out of range.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This allows i == MAX_INT_PARAM, which is out of range for ints[]
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Allows i == MAX_INT_PARAM, which is out of range.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Fixed a typo in libsrp.c: replaced two occurrences of 'RDAM' by 'RDMA'.
Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The best way to fix this is to eliminate the intenal kmalloc() and
make the caller allocate the required amount of storage.
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
When removing several devices aic79xx will occasionally Oops
in ahd_handle_nonpkt_busfree during rescan. Looking at the
code I found that we're indeed not checking if the scb in
question is NULL. So check for it before accessing it.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The hardware used with zfcp cannot abort a currently pending CT or ELS
request. Therefore we need the option to postpone the timeout
triggered request abort within the fc layer, since there is nothing
zfcp can do to stop the request at this point.
Cc: James Smart <James.Smart@emulex.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The driver did not account for non-tape devices needing to employ
proper FCP2 recovery. Driver now checks the FCP2-capable flag
only, rather than using a midlayer-determined flag (TYPE_TAPE).
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Because of the terrible structuring of scsi-bidi-commands
it breaks some of the life time rules of a scsi-command.
It is now not allowed to free up the block-request before
cleanup and partial deallocation of the scsi-command. (Which
is not so for none bidi commands)
The right fix to this problem would be to make bidi command
a first citizen by allocating a scsi_sdb pointer at scsi command
just like cmd->prot_sdb. The bidi sdb should be allocated/deallocated
as part of the get/put_command (Again like the prot_sdb) and the
current decoupling of scsi_cmnd and blk-request should be kept.
For now make sure scsi_release_buffers() is called before the
call to blk_end_request_all() which might cause the suicide of
the block requests. At best the leak of bidi buffers, at worse
a crash, as there is a race between the existence of the bidi_request
and the free of the associated bidi_sdb.
The reason this was never hit before is because only OSD has the potential
of doing asynchronous bidi commands. (So does bsg but it is never used)
And OSD clients just happen to do all their bidi commands synchronously, up
until recently.
CC: Stable Tree <stable@kernel.org>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
These particular problems were reported by Cisco and SAP and customers
as well. Cisco reported on RHEL4 U6 and SAP reported on SLES9 SP4 and
SLES10 SP2. We added these fixes on RHEL4 U6 and gave a private build
to IBM and Cisco. Cisco and IBM tested it for more than 15 days and
they reported that they did not see the issue so far. Before the fix,
Cisco used to see the issue within 5 days. We generated a patch for
SLES9 SP4 and SLES10 SP2 and submitted to Novell. Novell applied the
patch and gave a test build to SAP. SAP tested and reported that the
build is working properly.
We also tested in our lab using the tools "dishogsync", which is IO
stress tool and the tool was provided by Cisco.
Issue1: File System going into read-only mode
Root cause: The driver tends to not free the memory (FIB) when the
management request exits prematurely. The accumulation of such
un-freed memory causes the driver to fail to allocate anymore memory
(FIB) and hence return 0x70000 value to the upper layer, which puts
the file system into read only mode.
Fix details: The fix makes sure to free the memory (FIB) even if the
request exits prematurely hence ensuring the driver wouldn't run out
of memory (FIBs).
Issue2: False Raid Alert occurs
When the Physical Drives and Logical drives are reported as deleted or
added, even though there is no change done on the system
Root cause: Driver IOCTLs is signaled with EINTR while waiting on
response from the lower layers. Returning "EINTR" will never initiate
internal retry.
Fix details: The issue was fixed by replacing "EINTR" with
"ERESTARTSYS" for mid-layer retries.
Signed-off-by: Penchala Narasimha Reddy <ServeRAIDDriver@hcl.in>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
lpfc_hbadisc.c and lpfc_hw4.h accidentally got set executable.
Reported-by: Thomas Backlund <tmb@mandriva.org>
Cc: James Smart <James.Smart@Emulex.Com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
/sys/bus/pci/drivers/megaraid_sas/poll_mode_io defaults to being
world-writable, which seems bad (letting any user affect kernel driver
behavior).
This turns off group and user write permissions, so that on typical
production systems only root can write to it.
Signed-off-by: Bryn M. Reeves <bmr@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix discovery failures:
- Move all accesses to the fc_flag field inside the host lock.
- Restore link state after going through linkdown processing for FCF DEAD event.
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Fix SCSI protocol related errors:
- Avoid I/O failures during EEH and HBA/CNA reset by correcting when
we block the targets on the adapter.
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Fix hardware/SLI relates issues:
- Fix CNA uses more than one EQ when in INTx interrupt mode.
- Fix driver tries to process failed read FCF record mailbox request.
- Fix allocating single receive buffer breaks FCoE receive queue.
- Support new read FCF record mailbox error case.
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Fix NPIV operation errors:
- Fix vport not logging out of fabric when being deleted
- Fix vport fails to discover targets after devloss timeout.
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Fix FC protocol errors:
- Fix multi-frame unsolicited sequences not queued properly
- Fix frames for unsolicited sequences not being associated with sequence.
- Fix unsolicited frame buffer sizes are not set properly
- Fix Sequence count for unsolicited frame headers not byte swapped.
- Fix Multi-frame sequence response frames go to wrong DID.
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
During a manual scan, a user can send command to a nonexistent
lun, precisely at the point of max_lun. Normally it's possible
(but not required) that the firmware has the knowledge that it
is an invalid lun. In the particular case when max_lun is 256,
however, the nonexistent lun 256 will be confused with lun 0,
because the lun member in a request message is only u8, and 256
will become 0. So we need to fix the problem, at least, at the
driver level.
Signed-off-by: Ed Lin <ed.lin@promise.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
For a particular driver error condition, driver was doing double
scsi_dma_unmaps. Driver was calling scsi_dma_unmap in
pmcraid_error_handler and return 0. This pmcraid_error_handler is called
by pmcraid_io_done which will do scsi_dma_unmap again when it has
return 0 from pmcraid_error_handler.
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Added fundamental reset and pci save state.
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>