Despite the fact that PCI devices are enabled in this order:
1. pci_enable_device
2. pci_request_regions
Documentation/PCI/pci.txt specifies that they be undone
in this order
1. pci_disable_device
2. pci_release_regions
Tested by injecting error in the call to pci_enable_device
in hpsa_init_one -> hpsa_pci_init:
[ 9.095001] hpsa 0000:04:00.0: failed to enable PCI device
[ 9.095005] hpsa: probe of 0000:04:00.0 failed with error -22
(-22 is -EINVAL)
and then in the call pci_request_regions:
[ 9.178623] hpsa 0000:04:00.0: failed to obtain PCI resources
[ 9.178671] hpsa: probe of 0000:04:00.0 failed with error -16
(-16 is -EBUSY)
and then by adding
reset_devices
to the kernel command line and inject errors into the two
calls to pci_enable_device and the call to pci_request_regions
in hpsa_init_one -> hpsa_init_reset_devices.
(inject on 6th call, 1st to hpsa2)
[ 62.413750] hpsa 0000:04:00.0: Failed to enable PCI device
(inject on 7th call, 2nd to hpsa2)
[ 62.807571] hpsa 0000:04:00.0: failed to enable device.
(inject on 8th call, 3rd to hpsa2)
[ 62.697198] hpsa 0000:04:00.0: failed to obtain PCI resources
[ 62.697234] hpsa: probe of 0000:04:00.0 failed with error -16
The reset_devices path calls return -ENODEV on failure
rather than passing the result, which apparently doesn't
cause the pci driver to print anything.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Divide the loop in hpsa_scatter_gather() into two, one for the initial SG list
and a second one for the chained list, if any. This allows the conditional
check which resets the indicies for the chained list to be performed outside
the loop instead of being done on every iteration inside the loop.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Factor out the code which sends the TEST_UNIT_READY from
wait_for_device_to_become_ready() into its own function.
Move the code which waits for the TEST_UNIT_READY from
wait_for_device_to_become_ready() into its own function.
If a logical drive has failed, resetting it will ensure
outstanding commands are completed, but polling it with
TURs after the reset will not work because the TURs will
never report good status. So successful TUR should not
be a condition of success for the device reset error
handler.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Don't return from the abort request until the target command is complete.
Mark outstanding commands which have a pending abort, and do not send them
to the host if we can avoid it.
If the current command has been aborted, do not call the SCSI command
completion routine from the I/O path: when the abort returns successfully,
the SCSI mid-layer will handle the completion implicitly.
The following race was possible in theory.
1. LLD is requested to abort a scsi command
2. scsi command completes
3. The struct CommandList associated with 2 is made available.
4. new io request to LLD to another LUN re-uses struct CommandList
5. abort handler follows scsi_cmnd->host_scribble and
finds struct CommandList and tries to aborts it.
Now we have aborted the wrong command.
Fix by resetting the scsi_cmd field of struct CommandList
upon completion and making the abort handler check that
the scsi_cmd pointer in the CommadList struct matches the
scsi_cmnd that it has been asked to abort.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
cleanup command completions
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
add support for tmf when in ioaccel2 mode
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Joe Handzik <joseph.t.handzik@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
The SCSI midlayer already prints more detail about completions,
and has logging level options to filter them if not wanted.
These just slow down the system if a lot of errors occur,
stressing error handling even more.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
report more useful information on aborts
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Improve initialization error handling in hpsa_init_one
Clean up style and indent issues
Rename functions for consistency
Improve error messaging on allocations
Fix return status from hpsa_put_ctlr_into_performant_mode
Correct free order in hpsa_init_one using new function
hpsa_free_performant_mode
Prevent inadvertent use of null pointers by nulling out the parent structures
and zeroing out associated size variables.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
correct return codes for error conditions
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
cmd_alloc can no longer return NULL, so don't check for NULL any more
(which is unreachable code).
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
improve ioaccel2 error handling, including better handling of
underrun statuses
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Joe Handzik <joseph.t.handzik@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Increase the request size for ioaccel2 path.
The error, if any, returned by hpsa_allocate_ioaccel2_sg_chain_blocks
to hpsa_alloc_ioaccel2_cmd_and_bft should be returned upstream rather
than assumed to be -ENOMEM.
This differs slightly from hpsa_alloc_ioaccel1_cmd_and_bft,
which does not call another hpsa_allocate function and only
has -ENOMEM to return from some kmalloc calls.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
refactor freeing of resources into more logical functions
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
refactor error cleanup and shutdown
disable interrupts and pci_disable_device on critical failures
add hpsa_free_cfgtables function
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
replace calls to hpsa_free_irqs_and_disable_msix with
hpsa_free_irqs and hpsa_disable_interrupt_mode
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
get drive queue depth to help avoid task set full conditions.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Joe Handzik <joseph.t.handzik@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
use ioaccel2 path to submit I/O to physical drives in HBA mode
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Joe Handzik <joseph.t.handzik@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
offload_enabled changes are deferred until after the
added/updated prints occur, so the values are incorrect.
defer printing SSD Smart Path Enabled status information until the
information is correct
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
clean up command submission
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
allow the controller firmware to queue up commands when the ioaccel device
queue is full.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
add error handling for failure when registering with SCSI subsystem.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Factor out hpsa_cmd_init from cmd_alloc(). We also need
this for resubmitting commands down the default RAID path
when they have returned from the ioaccel paths with errors.
In particular, reinitialize the cmd_type and busaddr fields as these
will not be correct for submitting down the RAID stack path
after ioaccel command completion.
This saves time when submitting commands.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
make function names more consistent and meaningful
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
expose a detected lockup via sysfs
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
In hba mode, we could get sense data in descriptor format so
we need to handle that.
It's possible for CommandStatus to have value 0x0D
"TMF Function Status", which we should handle. We will get
this from a P1224 when aborting a non-existent tag, for
example. The "ScsiStatus" field of the errinfo field
will contain the TMF function status value.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
make tracking of outstanding commands more robust
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Do not send aborts to logical devices that do not support aborts
Instead of relying on what the Smart Array claims for supporting logical
drives, simply try an abort and see how it responds at device discovery
time. This way devices that do support aborts (e.g. MSA2000) can work
and we do not waste time trying to send aborts to logical drives that do
not support them (important for high IOPS devices.)
While rescanning devices only test whether devices support aborts
the first time we encounter a device rather than every time.
Some Smart Arrays required aborts to be sent with tags in
the wrong endian byte order. To avoid having to know about
this, we would send two aborts with tags with each endian order.
On high IOPS devices, this turns out to be not such a hot idea.
So we now have a list of the devices that got the tag backwards,
and we only send it one way.
If all available commands are outstanding and the abort handler
is invoked, the abort handler may not be able to allocate a command
and may busy-wait excessivly. Reserve a small number of commands
for the abort handler and limit the number of concurrent abort
requests to the number of reserved commands.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Allow driver initiated commands to have a timeout. It does not
yet try to do anything with timeouts on such commands.
We are sending a reset in order to get rid of a command we want to abort.
If we make it return on the same reply queue as the command we want to abort,
the completion of the aborted command will not race with the completion of
the reset command.
Rename hpsa_scsi_do_simple_cmd_core() to hpsa_scsi_do_simple_cmd(), since
this function is the interface for issuing commands to the controller and
not the "core" of that implementation. Add a parameter to it which allows
the caller to specify the reply queue to be used. Modify existing callers
to specify the default reply queue.
Rename __hpsa_scsi_do_simple_cmd_core() to hpsa_scsi_do_simple_cmd_core(),
since this routine is the "core" implementation of the "do simple command"
function and there is no longer any other function with a similar name.
Modify the existing callers of this routine (other than
hpsa_scsi_do_simple_cmd()) to instead call hpsa_scsi_do_simple_cmd(), since
it will now accept the reply_queue paramenter, and it provides a controller
lock-up check. (Also, tweak two related message strings to make them
distinct from each other.)
Submitting a command to a locked up controller always results in a timeout,
so check for controller lock-up before submitting.
This is to enable fixing a race between command completions and
abort completions on different reply queues in a subsequent patch.
We want to be able to specify which reply queue an abort completion
should occur on so that it cannot race the completion of the command
it is trying to abort.
The following race was possible in theory:
1. Abort command is sent to hardware.
2. Command to be aborted simultaneously completes on another
reply queue.
3. Hardware receives abort command, decides command has already
completed and indicates this to the driver via another different
reply queue.
4. driver processes abort completion finds that the hardware does not know
about the command, concludes that therefore the command cannot complete,
returns SUCCESS indicating to the mid-layer that the scsi_cmnd may be
re-used.
5. Command from step 2 is processed and completed back to scsi mid
layer (after we already promised that would never happen.)
Fix by forcing aborts to complete on the same reply queue as the command
they are aborting.
Piggybacking device rescanning functionality onto the lockup
detection thread is not a good idea because if the controller
locks up during device rescanning, then the thread could get
stuck, then the lockup isn't detected. Use separate work
queues for device rescanning and lockup detection.
Detect controller lockup in abort handler.
After a lockup is detected, return DO_NO_CONNECT which results in immediate
termination of commands rather than DID_ERR which results in retries.
Modify detect_controller_lockup() to return the result, to remove the need for
a separate check.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
We had a mix of formats used for specifying controller, bus, target,
and lun address of devices.
change to the format used by the scsi midlayer and upper layer (2:3:0:0)
so you can easily follow the information from hpsa to scsi midlayer
to sd upper layer.
Also add this information:
- product ID
- vendor ID
- RAID level
- SSD Smath Path capable and enabled
- exposure level (sg-only)
Example:
hpsa 0000:04:00.0: added scsi 2:0:0:0: Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap+ En+ Exp=4
scsi 2:0:0:0: Direct-Access HP LOGICAL VOLUME 10.0 PQ: 0 ANSI: 5
sd 2:0:0:0: [sdr] 12501713072 512-byte logical blocks: (6.40 TB/5.82 TiB)
sd 2:0:0:0: [sdr] 4096-byte physical blocks
sd 2:0:0:0: [sdr] Attached SCSI disk
sd 2:0:0:0: Attached scsi generic sg20 type 0
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Cache the ioaccel handle so that when we need to abort commands sent
down the ioaccel2 path, we can look up the LUN ID in h->dev[] instead of
having to do I/O to the controller.
Add a field to elements in h->dev[] to keep track of how the device is exposed
to the SCSI mid layer: Not at all, without an upper level driver
(no_uld_attach) or normally exposed.
Since masked physical devices are now present in h->dev[] array
it would be perfectly possible to do
echo scsi add-single-device 2 2 0 0 > /proc/scsi/scsi
and bring them online. This was previously not allowed for masked
physical devices.
Ensure that the mapping of physical disks to logical drives gets updated in a
consistent way when a RAID migration occurs and is not touched until updates
to it are complete.
now instead of doing CISS_REPORT_PHYSICAL to get the LUNID for
the physical disk in hpsa_get_pdisk_of_ioaccel2(), just get
it out of h->dev[] where we already have it cached.
do not touch phys_disk[] for ioaccel enabled logical drives during rescan
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@Suse.de>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
The hpsa driver touches the hardware before checking the pci-id table.
This way, especially in kdump, it may confuse the proper driver (cciss).
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Don Brace <Don.Brace@pmcs.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
This may be OK in archs with contiguous CPU numbers and without
hotplug CPUs, but it sets a terrible example.
And open-coding it like drivers/scsi/hpsa.c is just weird.
BTRFS has a weird comparison with num_online_cpus() too, but since
BTRFS just screwed up my test machines' root partition, I'm not
touching it :)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reported-by: Oleg Drokin <green@linuxhacker.ru>
Correct compiler warning introduced by hpsa-add-local-workqueue patch
6636e7f455 hpsa: Use local workqueues
instead of system workqueues
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Suggested-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Webb Scales <webbnh@hp.com>
Reviewed-by: Kevin Barnett <Kevin.Barnett@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Add in P840ar model name for gen9
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Add in gen9 controller model names
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Detect failues when attempting to change controller to use simple
or performant transport modes (mode change ack) rather than just
proceeding ahead after timeouts.
Return values are added to:
hpsa_put_ctlr_into_performant_mode
hpsa_wait_for_mode_change_ack
and all their callers check/propagate the result.
More consistency in printing errors and whether
dev_err is used.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Shorten the wait for the CISS configuration table doorbell mode
change acknowledgment from 300-600 s to 20 s, which is the value
specified in the CISS specification that should be honored by
all controllers.
Wait using interruptible msleep() rather than uninterruptible
usleep_range(), which triggers rt_sched timeout errors if the
wait is long.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Hoist the conditional out of do_not_scan_if_controller_locked_up() and
place it in the caller (this improves the code structure, making it
more consistent with other uses and enabling tail-call optimization);
rename the function to hpsa_scan_complete(), and use it at the end of
hpsa_scan_start() as well.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Move the code which sets up the SG descriptor out of hpsa_scatter_gather()
and into a subroutine where it can be reused (in the next patch). The Ext
field is now assigned unconditionally: this makes the refactor much simpler,
but more importantly it removes a conditional operation from inside the
loop. The case for which the conditional formerly tested is now executed
(unconditionally) after the loop is exited.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Performance tweak, avoid unnecessary function calls.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Printing the address of the command pointer is of little value, change
to print the CDB.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
There's no reason for it to be a void *, it should be a struct scsi_cmnd *
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Returning failed from the device reset handler will get the device
kicked offline, which is fine if the controller is locked up anyhow.
Cannot abort a command from a failed controller.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Justin Lindley <justin.lindley@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Command allocation is the thing that takes the longest in the main i/o
path, so check for controller lockup immediately after this to prevent
submitting commands to locked up controller as much as possible.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
In the code that translates logical drive LBAs to physical
drive LBAs if we overflow the raid map disk data array we
will get the wrong answers. We do not expect that to happen,
but best to be on the safe side and guard against it anyway.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acking controller events on controllers that do not support
it can cause such controllers to lock up.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Joe Handzik <joseph.t.handzik@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
In set_encrypt_ioaccel2() and in hpsa_scsi_ioaccel_raid_map
there were BUG_ONs that looked like this:
BUG_ON(!(dev->offload_config && dev->offload_enabled));
But, In hpsa_ack_ctlr_events() we have this,
/* Stop sending new RAID offload reqs via the IO accelerator */
scsi_block_requests(h->scsi_host);
for (i = 0; i < h->ndevices; i++)
h->dev[i]->offload_enabled = 0;
hpsa_drain_accel_commands(h);
So, we set offload_enabled = 0 for all drives, then do this
drain_accel_commands, so that means accel commands could still
be in flight, ie. perhaps having just been submitted into
hpsa_scsi_ioaccel_raid_map concurrent with ->offload_enabled
having just been set to zero.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Performance enhancement. Remove spin_locks from the driver.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Empirically, this improves performance slightly (~2% max IOPS) by
allowing cmd_alloc to remember where it left off searching for
free commands between calls instead of always starting its search
at command 0.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
This means changing the allocator to reference count commands.
The reference count is now the authoritative indicator of whether a
command is allocated or not. The h->cmd_pool_bits bitmap is now
only a heuristic hint to speed up the allocation process, it is no
longer the authoritative record of allocated commands.
Since we changed the command allocator to use reference counting
as the authoritative indicator of whether a command is allocated,
fail_all_outstanding_cmds needs to use the reference count not
h->cmd_pool_bits for this purpose.
Fix hpsa_drain_accel_commands to use the reference count as the
authoritative indicator of whether a command is allocated instead of
the h->cmd_pool_bits bitmap.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When using the ioaccel submission methods, requests destined for RAID volumes
are sometimes diverted to physical devices. The OS has no or limited
knowledge of these physical devices, so it is up to the driver to avoid
pushing the device too hard. It is better to honor the physical device queue
limit rather than making the device spew zillions of TASK SET FULL responses.
This is so that hpsa based devices support /sys/block/sdNN/device/queue_type
of simple, which lets the SCSI midlayer automatically adjust the queue_depth
based on TASK SET FULL and GOOD status.
Adjust the queue depth for a new device after it is created based on the
maximum queue depths of the physical devices that constitute the
device. This drops the maximum queue depth from .can_queue of 1024 to
something like 174 for single-drive RAID-0, 348 for two-drive RAID-1, etc.
It also adjusts for the ratio of data to parity drives.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Instead of kicking the commands all the way back to the mid
layer, use a work queue. This enables having a mechanism for
the driver to be able to resubmit the commands down the "normal"
raid path without turning off the ioaccel feature entirely
whenever an error is encountered on the ioaccel path, and
prevent excessive rescanning of devices.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Factor out the bottom part of the queuecommand function
which is the part that builds commands for submitting down
the "normal' RAID stack path of a Smart Array.
Need to factor this out to improve how commands that
were initially sent down one of the "ioaccellerated"
paths but which have some sort of error condition are
retried down the "normal" path.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The original reasoning behind doing this was faulty. An error
of some sort would be encountered, accelerated i/o would be
disabled for that logical drive, the command would be kicked
back out to the SCSI midlayer for a retry, and since i/o accelerator
mode was disabled, it would get retried down the RAID path.
However, something needs to turn ioaccellerator mode back on,
and this rescan request was what did that. However, it was racy,
and extremely bad for performance to rescan all devices, so,
don't do that.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
By not doing maintaining a list of queued commands, we can eliminate some spin
locking in the main i/o path and gain significant improvement in IOPS. Remove
the queuing code and the code that calls it; remove now-unused interrupt code;
remove DIRECT_LOOKUP_BIT.
Now that the passthru commands share the same command pool as
the main i/o path, and the total size of the pool is less than
or equal to the number of commands that will fit in the hardware
fifo, there is no need to check to see if we are exceeding the
hardware fifo's depth.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
We have commands reserved for internal use.
This is laying the groundwork for removing the internal
queue of commands from the driver so that the locks that
protect that queue may be removed.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
We need to reserve some commands for device rescans,
aborts, and the pass through ioctls, etc. so we cannot
give them all to the scsi mid layer.
This is in preparation for removing cmd_special_alloc and
cmd_special_free so that we can stop queuing commands internally
in the driver so that we can remove the locks thta protect the
queue that we will no longer have.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
If hpsa_allocate_cmd_pool failed, we were calling two functions unnecessarily:
hpsa_free_sg_chain_blocks(h);
hpsa_free_cmd_pool(h);
This didn't cause any problem, as those functions can tolerate being called
when what they free hasn't been allocated (relevant pointers would be NULL)
but it is potentially confusing.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Partial allocation failure wasn't handled correctly
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Return the actual error code instead of a generic error code.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Make the function name more descriptive. We use more than
one interrupt.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Enhance error reporting.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cleanup comments to be more specific. Make messages more
informational.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Encapsulate the conditional predicate which tests for legacy controllers
in a separate function and rework the code comments.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
There is nothing worrisome about the "Waiting for controller to
respond to no-op" print, so use dev_info rather than dev_warn.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
If the board ID lookup function fails, return the return
code rather than return -ENODEV.
The only board ID failure reason right now is -ENODEV,
so this just provides more informative prints in kdump
and adapts to future changes.
Tested with error injection while booting with
reset_devices
on the kernel command line:
[ 62.804324] injecting error in inj_hpsa_lookup_board_id: 1 11
[ 62.804423] hpsa 0000:04:00.0: Board ID not found
(the pci probe layer does not print an additional
message if -ENODEV is the reason)
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Return the real reason for kdump_hard_reset failure rather
than change them all to -ENODEV.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The queue depth printed at startup is in decimal, so
shouldn't have a 0x prefix.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
In MSI and MSI-X mode, where hpsa asks for more than one interrupt,
hpsa_request_irqs forgets if the first request_irq call failed
if later ones succeed.
It needs to exit the loop on any failure rather than continue,
freeing all irqs that were requested until that point.
Also, it needs to clear out the q numbers up to MAX_REPLY_QUEUES.
The same is true for the general hpsa_free_irqs function.
Tested with error injection of -ENOSYS on the 4th call:
[ 9.277691] injecting error in inj_request_irq: 1 4
[ 9.277780] hpsa 0000:02:00.0: failed to get irq 35 for hpsa1
[ 10.711623] scsi host1: Error handler scsi_eh_1 exiting
[ 10.739170] hpsa: probe of 0000:02:00.0 failed with error -38
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Remove unused variable in hpsa_free_cmd_pool.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Acked-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Change the function names to have hpsa prefix.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
HP now uses RAID-6 rather than RAID-ADG (Advanced Data Guarding)
as the marketing name for our implementation of RAID-6.
The driver considers RAID-1 and RAID-1+0 to be the same level, and
considers RAID-1ADM and RAID-1+0ADM to be the same level. Parenthesis
can be used to reflect the optional +0 portion of both those RAID levels.
Rename: RAID-ADG to RAID-6
RAID-1(1+0) to RAID-1(+0)
RAID-1(ADM) to RAID-1(+0)ADM
Also, add another const after the pointer type as suggested
by checkpatch.pl so the array is:
static const char * const raid_label[]
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
We change drive queue depths to match drive reported queue depths.
The name of the SML function was changed from scsi_adjust_queue_depth
changed to scsi_change_queue_depth.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Change how SA controllers are reset by changing PCI power levels.
The hpsa driver was finding the PCI_PM_CTRL_STATE_MASK offset
then reading/writing a bitmask to change the power state. There
are kernel functions that do the same operations. Better to use
the kernel functions.
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Sometimes when the card is restarted it may cause -
"irq 16: nobody cared (try booting with the "irqpoll" option)"
that is likely caused so, that the card, after the hard reset
finishes, pulls on the irq. Disabling the ints before or after
the hpsa_kdump_hard_reset_controller fixes it.
At this point we can't know in which state the card is,
so using SA5_INTR_OFF + SA5_REPLY_INTR_MASK_OFFSET defines directly,
instead of the function the drivers provides, seems to be apropriate.
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
There is a potential memory leak in hpsa_kdump_hard_reset_controller.
Reviewed-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Correct endiness issues reported by sparse. SA controllers are
little endian. This patch ensures endiness correctness.
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Drop the now unused reason argument from the ->change_queue_depth method.
Also add a return value to scsi_adjust_queue_depth, and rename it to
scsi_change_queue_depth now that it can be used as the default
->change_queue_depth implementation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Hannes Reinecke <hare@suse.de>
We won't ever queue more commands than the host allows. Instead of
letting drivers either reject or ignore this case handle it in
common code. Note that various driver use internal constant or
variables that are assigned to both shost->can_queue and checked
in ->change_queue_depth - I did remove those checks as well.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Hannes Reinecke <hare@suse.de>
It is already using atomic test_and_set_bit to do the
allocation.
There is some microscopic chance of starvation, but it is
so microscopic that it should never happen in reality.
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
If the kernel is booted with the reset_device parameter, which
is done for kdump, then the driver needs to call pci_set_master
after pci_enable_device to reenable bus mastering (since
the preceding pci_disable_device call disables bus mastering).
Also, place that after pci_request_regions both in the
kdump code and the normal pci_init code.
Remove the comment summarizing what pci_set_master
does, with the incomplete commentary on the impact of
pci_disable_device.
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
There isn't anything in hpsa that requires the host lock to be held
during queuecommand.
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Reviewed-by: Stephen M. Cameron <stephenmcameron@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
We were printing a lot of useless information before ultimately
just passing things up to the SCSI mid layer. Just let the
midlayer handle it without LLD chatter.
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Stephen M. Cameron <stephenmcameron@gmail.com>
Reviewed-by: Joe Handzik <joseph.t.handzik@hp.com>
Reviewed-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Use atomics for commands_outstanding instead of protecting with spin locks.
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Stephen M. Cameron <stephenmcameron@gmail.com>
Reviewed-by: Joe Handzik <joseph.t.handzik@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Using bit fields for hardware command fields isn't portable and
relies on assumptions about how the compiler lays out the bits.
We can fix this in the driver's internal command structure, but the
ioctl interface we can't change because it is part of the
userland ABI.
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Webb Scales <webb.scales@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The hardware needs little endian scatter gather addresses and
lengths but we were not bothering to convert from cpu byte
order as we should have been. On Intel, this is all just
a bunch of no-ops macros, but it makes the code endian-clean(er).
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Reviewed-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
We were allocating roughly double the amount of memory
we should be due to ReportLUNdata and ExtendedReportLUNdata
containing a non-zero sized array but adding extra memory
to allocate as if the array were zero sized.
Track the logical and physical sizes separately.
Allocate the memory based on the specific data
structure sizes.
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Webb Scales <webb.scales@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
In the case of LUN data changing, the driver will
auto rescan and so it's not even true that "action" is
"required".
Remove "action required" phrases from warning messages and
replace with description phrases.
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Stephen M. Cameron <stephenmcameron@gmail.com>
Reviewed-by: Joe Handzik <joseph.t.handzik@hp.com>
Reviewed-by: Webb Scales <webb.scales@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Correct the size calculation of the chained SG block
Signed-off-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Webb Scales <webbnh@hp.com>
Reviewed-by: Stephen M. Cameron <stephenmcameron@gmail.com>
Reviewed-by: Don Brace <don.brace@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Fix a couple of pci id table mistakes:
Subdevice ID 0x3323 missing from product[] table
(another name for HP Smart Storage 1210m)
Bogus 0x1925 subdevice id removed from hpsa_pci_device_id[] (no such thing.)
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
RAID-1ADM is unusable with dev_warn called on every command.
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Stephen M. Cameron <stephenmcameron@gmail.com>
Reviewed-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Clean up issues reported when running sparse.
Signed-off-by: Don Brace <don.brace@pmcs.com>
Reviewed-by: Webb Scales <webb.scales@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Remove the tagged argument from scsi_adjust_queue_depth, and just let it
handle the queue depth. For most drivers those two are fairly separate,
given that most modern drivers don't care about the SCSI "tagged" status
of a command at all, and many old drivers allow queuing of multiple
untagged commands in the driver.
Instead we start out with the ->simple_tags flag set before calling
->slave_configure, which is how all drivers actually looking at
->simple_tags except for one worke anyway. The one other case looks
broken, but I've kept the behavior as-is for now.
Except for that we only change ->simple_tags from the ->change_queue_type,
and when rejecting a tag message in a single driver, so keeping this
churn out of scsi_adjust_queue_depth is a clear win.
Now that the usage of scsi_adjust_queue_depth is more obvious we can
also remove all the trivial instances in ->slave_alloc or ->slave_configure
that just set it to the cmd_per_lun default.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Add a call to pci_set_master(...) missing in the previous
patch "hpsa: refine the pci enable/disable handling".
Found thanks to Rob Elliot.
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Robert Elliott <elliott@hp.com>
Tested-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When a second(kdump) kernel starts and the hard reset method is used
the driver calls pci_disable_device without previously enabling it,
so the kernel shows a warning -
[ 16.876248] WARNING: at drivers/pci/pci.c:1431 pci_disable_device+0x84/0x90()
[ 16.882686] Device hpsa
disabling already-disabled device
...
This patch fixes it, in addition to this I tried to balance also some other pairs
of enable/disable device in the driver.
Unfortunately I wasn't able to verify the functionality for the case of a sw reset,
because of a lack of proper hw.
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
As result of deprecation of MSI-X/MSI enablement functions
pci_enable_msix() and pci_enable_msi_block() all drivers
using these two interfaces need to be updated to use the
new pci_enable_msi_range() or pci_enable_msi_exact()
and pci_enable_msix_range() or pci_enable_msix_exact()
interfaces.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: "Stephen M. Cameron" <scameron@beardog.cce.hp.com>
Cc: iss_storagedev@hp.com
Cc: linux-scsi@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Signed-off-by: Christoph Hellwig <hch@lst.de>
Currently the driver falls back to INTx mode when MSI-X
initialization failed. This is a suboptimal behaviour
for chips that also support MSI. This update changes that
behaviour and falls back to MSI mode in case MSI-X mode
initialization failed.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: "Stephen M. Cameron" <scameron@beardog.cce.hp.com>
Cc: iss_storagedev@hp.com
Cc: linux-scsi@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Signed-off-by: Christoph Hellwig <hch@lst.de>
When copy_from_user fails, return -EFAULT, not -ENOMEM
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reported-by: Robert Elliott <elliott@hp.com>
Reviewed-by: Joe Handzik <joseph.t.handzik@hp.com>
Reviewed-by: Scott Teel <scott.teel@hp.com>
Reviewed by: Mike MIller <michael.miller@canonical.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When devices come on line, they should be removed from the list of
offline devices that are monitored.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reviewed-by: Scott Teel <scott.teel@hp.com>
Reviewed-by: Joe Handzik <joseph.t.handzik@hp.com>
Reviewed by: Mike MIller <michael.miller@canonical.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
commit 28e1344647 "[SCSI] hpsa: enable unit attention reporting"
turns on unit attention notifications, but got the change wrong for
all architectures other than x86, which now store an uninitialized
value into the device register.
Gcc helpfully warns about this:
../drivers/scsi/hpsa.c: In function 'hpsa_set_driver_support_bits':
../drivers/scsi/hpsa.c:6373:17: warning: 'driver_support' is used uninitialized in this function [-Wuninitialized]
driver_support |= ENABLE_UNIT_ATTN;
^
This moves the #ifdef so only the prefetch-enable is conditional
on x86, not also reading the initial register contents.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 28e1344647 "[SCSI] hpsa: enable unit attention reporting"
Cc: stable@vger.kernel.org # v3.14+
Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reviewed-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
a 6-byte READ/WRITE CDB with a 0 block data transfer really
means a 256 block data transfer. The RAID mapping code failed
to handle this case. For 10/12/16 byte READ/WRITEs, 0 just means
no data should be transferred, and should not trigger BUG_ON.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reported-by: Robert Elliott <elliott@hp.com>
Reviewed-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reviewed-by: Joe Handzik <joseph.t.handzik@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The SCSI standard defines 64-bit values for LUNs, and large arrays
employing large or hierarchical LUN numbers become more and more
common.
So update the linux SCSI stack to use 64-bit LUN numbers.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Make return value an int instead of an unsigned char so that
we do not lose negative error return values.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reviewed-by: Webb Scales <webb.scales@hp.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mike Miller <michael.miller@canonical.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
They are annoying and do not help anyone.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reviewed-by: Justin Lindley <justin.lindley@hp.com>
Reviewed-by: Mike Miller <michael.miller@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
It shouldn't happen that we get a check condition with no sense data, but if it
does, we shouldn't just drop the check condition on the floor.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reviewed-by: Justin Lindley <justin.lindley@hp.com>
Reviewed-by: Mike Miller <michael.miller@canonical.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reviewed-by: Justin Lindley <justin.lindley@hp.com>
Reviewed-by: Mike Miller <michael.miller@canonical.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
There's nothing the user can or should do about these messages,
the commands are retried down the normal RAID path, and the
messages just flood the logs and sap performance.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reviewed-by: Joe Handzik <joseph.t.handzik@hp.com>
Reviewed-by: Mike Miller <michael.miller@canonical.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reviewed-by: Joe Handzik <joseph.t.handzik@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reviewed-by: Joe Handzik <joseph.t.handzik@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
for controllers which support either of the ioaccel transport methods.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reviewed-by: Mike Miller <michael.miller@canonical.com>
Reviewed-by: Joe Handzik <joseph.t.handzik@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Avoid excessive locking by using per-cpu variable for lockup_detected
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reviewed-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reviewed-by: Mike Miller <michael.miller@canonical.com>
Reviewed-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Now that we can allocate more than 4 reply queues (up to 64)
we shouldn't try to make them share the same allocation but
should allocate them separately.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reviewed-by: Mike Miller <michael.miller@canonical.com>
Reviewed-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
No sense having 8 or 16 reply queues if you only have 4 cpus,
and likewise no sense limiting to 8 reply queues if you have
many more cpus.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reviewed-by: Mike Miller <michael.miller@canonical.com>
Reviewed-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reviewed-by: Webb Scales <webb.scales@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
After 3.22 firmware, PMC firmware guys tell us the
previous 5 second delay after a reset now needs to
be 10 secs to avoid a PCIe error due to the driver
looking at the controller too soon after the reset.
Signed-off-by: Justin Lindley <justin.lindley@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reviewed-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Treat the the data direction bits as a bit mask allowing both
READ and WRITE at the same time instead of testing for equality
to see if it's a exclusively a READ or a WRITE.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reviewed-by: Mike Miller <michael.miller@canonical.com>
Reviewed-by: Webb Scales <webb.scales@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The fields "major", "max_outstanding", and "usage_count"
of struct ctlr_info were not used for anything.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reviewed-by: Mike Miller <michael.miller@canonical.com>
Reviewed-by: Webb Scales <webb.scales@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
rescan_hba_mode was defined as a u8 so could never be less than zero:
rescan_hba_mode = hpsa_hba_mode_enabled(h);
if (rescan_hba_mode < 0)
goto out;
Signed-off-by: Joe Handzik <joseph.t.handzik@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Joe Handzik <joseph.t.handzik@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
And while we're at it fix a magic number
Signed-off-by: Joe Handzik <joseph.t.handzik@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Joe Handzik <joseph.t.handzik@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Christoph Hellwig <hch@lst.de>
Checking for a NULL return from a kzalloc call in hpsa_get_pdisk_of_ioaccel2.
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Joe Handzik <joseph.t.handzik@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Initialize local variable trans_support before it is used rather
than after. It is supposed to contain the value of a register on the
controller containing bits that describe which transport modes the
controller supports (e.g. "performant", "ioaccel1", "ioaccel2"). A
NULL pointer dereference will almost certainly occur if trans_support
is not initialized at the right point. If for example the uninitialized
trans_support value does not have the bit set for ioaccel2 support when it
should be, then ioaccel2_alloc_cmds_and_bft() will not get called as it
should be and the h->ioaccel2_blockFetchTable array will remain NULL
instead of being allocated. Too late, trans_support finally gets
initialized with the correct value with ioaccel2 mode bit set,
which later causes calc_bucket_map() to be called to fill in
h->ioaccel2_blockFetchTable[]. However h->ioaccel2_blockFetchTable
is NULL because it didn't get allocated because earlier trans_support
wasn't initialized at the right point.
Fixes: e1f7de0cdd
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reported-by: Baoquan He <bhe@redhat.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
It caused the i/o request to always be counted as ineligible for
the accelerated i/o path on 32 bit systems and negatively affected
performance.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Structure was already memset to zero at the top
of hpsa_scsi_ioaccel2_queue_command
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This allows exposing physical disks behind Smart
Array controllers to the OS (if the controller
has the right firmware and is in "hba" mode)
Signed-off-by: Joe Handzik <joseph.t.handzik@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
rc is set in the loop, and it isn't set back to zero anywhere
this patch fixes it
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Do not expose drives that are undergoing a format immediately
to the OS, instead wait until they are ready before bringing
them online. This is so that logical drives created with
"rapid parity initialization" do not get immediately kicked
off the system for being unresponsive.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
On encountering unexpected error conditions from driver initiated
commands, print something useful like CDB and sense data rather than
something useless like the kernel virtual address of the command buffer.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Do no rescan on every events -- way too many rescans are
triggered if we don't filter the events. Limit rescans
to be triggered by the following set of events:
* controller state change
* enclosure hot plug
* physical drive state change
* logical drive state change
* redundant controller state change
* accelerated io enabled/disabled
* accelerated io configuration change
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Don't wait for *all* commands to complete, only for accelerated mode
commands.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Add controller-based data-at-rest encryption compatibility
to ioaccel2 path (HP SSD Smart Path).
Encryption feature requires driver to supply additional fields
for encryption enable, tweak index, and data encryption key index
in the ioaccel2 request structure.
Encryption enable flag and data encryption key index come from
raid_map data structure from raid offload command.
During ioaccel2 submission, check device structure's raid map to see if
encryption is enabled for the device. If so, call new function below.
Add function set_encrypt_ioaccel2 to set encryption flag, data encryption key
index, and calculate tweak value from request's logical block address.
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Change the handling of HP SSD Smart Path errors with status:
0x02 CHECK CONDITION
0x08 BUSY
0x18 RESERVATION CONFLICT
0x40 TASK ABORTED
So that they get retried on the RAID Path.
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Code was confused and assumed that page zero was not
VPD page and all non-zero pages were VPD pages.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Allow driver to schedule a rescan whenever a request fails on the ioaccel2 path.
This eliminates the possibility of driver getting stuck in non-ioaccel mode.
IOaccel mode (HP SSD Smart Path) is disabled by driver upon error detection.
Driver relied on idea that request would be retried through normal path, and a
subsequent error would occur on that path, and be processed by controller
firmware. As part of that process, controller disables ioaccel mode and later
reinstates it, signalling driver to change modes.
In some error cases, the error will not duplicate on the standard path,
so the driver could get stuck in non-ioaccel mode.
To avoid that, we allow driver to request a rescan during the next run of the
rescan thread.
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Allow SSD Smart Path for a controller to be disabled by
the user, regardless of settings in controller firmware
or array configuration.
To disable: echo 0 > /sys/class/scsi_host/host<id>/acciopath_status
To re-enable: echo 1 > /sys/class/scsi_host/host<id>/acciopath_status
To check state: cat /sys/class/scsi_host/host<id>/acciopath_status
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Load balance across members of a N-way mirror set, and
handle the meta-RAID levels: R10, R50, R60.
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Otherwise we could wind up using incorrect raid map data, and
then very bad things would likely happen.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Underlying firmware cannot handle task abort on accelerated path (SSD Smart Path).
Change abort requests for accelerated path commands to physical target reset.
Send reset request on normal IO path.
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Mike Miller <michael.miller@canonical.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Joe Handzik <Joseph.T.Handzik@hp.com>
Signed-off-by: Mike Miller <michael.miller@canonical.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Mike MIller <michael.miller@canonical.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Mike Miller <michael.miller@canonical.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Mike Miller <michael.miller@canonical.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* Do not check event bits on locked up controllers to
see if they need to be rescanned.
* Do not initiate any device rescans on controllers
which are known to be locked up.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
For shared SAS configurations, hosts need to poll Smart Arrays
periodically in order to be able to detect configuration changes
such as logical drives being added or removed from remote hosts.
A register on the controller indicates when such events have
occurred, and the driver polls the register via a workqueue
and kicks off a rescan of devices if such an event is detected.
Additionally, changes to logical drive raid offload eligibility
are autodetected in this way.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
When rescanning for logical drives, store information about whather
raid offload is enabled for each logical drive, and update the driver's
internal record of this.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This enables sending i/o's destined for RAID logical drives
which can be serviced by a single physical disk down a different,
faster i/o path directly to physical drives for certain logical
volumes on SSDs bypassing the Smart Array RAID stack for a
performance improvement.
Signed-off-by: Matt Gates <matthew.gates@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Mike Miller <michael.miller@canonical.com>
Signed-off-by: Don Brace <brace@beardog.cce.hp.com>
Signed-off-by: Joe Handzik <joseph.t.handzik@hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
For "mode 1" io accelerated commands, the command tag is in
a different location than for commands that go down the normal
RAID path, so the abort handler needs to take this into account.
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Mike Miller <michael.miller@canonical.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
When commands sent down the "fast path" fail, they must be re-tried down the
normal RAID path. We do this by kicking i/o's back to the scsi mid layer with
a DID_SOFT_ERROR status, which causes them to be retried. This won't work for
SG_IO's and other non REQ_TYPE_FS i/o's which could get kicked all the way back
to the application, which may have no idea that the command needs resubmitting
and likely no way to resubmit it in such a way the that driver can recognize it
as a resubmit and send it down the normal RAID path. So we just always send
non REQ_TYPE_FS i/o's down the normal RAID path, never down the "fast path".
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
For certain i/o's to certain devices (unmasked physical disks) we
can bypass the RAID stack firmware and do the i/o to the device
directly and it will be faster.
Signed-off-by: Matt Gates <matthew.gates@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This is normally optional, but for SSD Smart Path support (in
subsequent patches) it is required.
Signed-off-by: Matt Gates <matthew.gates@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
There is an extended report luns command which contains
additional information about physical devices. In particular
we need to get the physical device handle so we can use an
alternate i/o path for fast physical devices like SSDs so
we can speed up certain i/o's by bypassing the RAID stack
code in the controller firmware.
Signed-off-by: Matt Gates <matthew.gates@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Commit 254f796b9f updated
the driver to use 16 MSI-X vectors, despite the fact that
older controllers would provide only 4.
This was causing MSI-X registration to drop down to INTx
mode. But as the controller support performant mode, the
initialisation will become confused and cause the machine
to stall during boot.
This patch fixes up the MSI-X registration to re-issue
the pci_enable_msix() call with the correct number of
MSI-X vectors. With that the hpsa driver continues to
works on older controllers like the P200.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
We were clobbering the SCSI status and setting
cmd->result = DID_SOFT_ERROR << 16; to get a retry,
but better to let the mid layer handle the unit
attention.
Signed-off-by: Matt Gates <matthew.gates@hp.com>
Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Immediately following a hard board reset, There are some
mandatory delays during which we must not access the board
and during which we might miss the "not ready" status,
therefore it is a mistake to look for and expect to see
the "not ready" status.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This used to be the default, but at some point the firmware guys
changed the default and I failed to notice. Now to get unit
attention notifications, you must twiddle a bit indicating you
want them.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The field contains more bits than just the one
to indicate whether scsi prefetch should be turned on.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Much simpler and avoids races starting/stopping the thread.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
If a fifo full condition is encountered, i/o requests will stack
up in the h->reqQ queue. The only thing which empties this queue
is start_io, which only gets called when new i/o requests come in.
If none are forthcoming, i/o in h->reqQ will be stalled.
To fix this, whenever fifo full condition is encountered, this
is recorded, and the interrupt handler examines this to see
if a fifo full condition was recently encountered when a
command completes and will call start_io to prevent i/o's in
h->reqQ from getting stuck.
I've only ever seen this problem occur when running specialized
test programs that pound on the the CCISS_PASSTHRU ioctl.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Cap CCISS_BIG_PASSTHRU as well. If an attempt is made
to exceed this, ioctl() will return -1 with errno == EAGAIN.
This is to prevent a userland program from exhausting all of
pci_alloc_consistent memory. I've only seen this problem when
running a special test program designed to provoke it. 20
concurrent commands via the passthru ioctls (not counting SG_IO)
should be more than enough.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Scott Teel <scott.teel@hp.com>
Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
We were leaking a command buffer if a DMA mapping error was
encountered in the CCISS_BIG_PASSTHRU ioctl.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Scott Teel <scott.teel@hp.com>
Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The hardware guys tell us that after initiating a software
reset via the doorbell register we need to wait 5 seconds before
attempting to talk to the board *at all*. This means that we
cannot watch the board to verify it transitions from "ready" to
to "not ready" then back "ready", since this transition will
most likely happen during those 5 seconds (though we can still
verify the reset happens by watching the "driver version" field
get cleared.)
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
There's no point in trying since it can't work, and if you do
try, it will just hang the system on shutdown.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
A return value of 1 is interpreted as an error. See pci_driver.
in local_pci_probe(). If you're wondering how this ever could
have worked, it's because it used to be the case that only return
values less than zero were interpreted as failure. But even in
the current kernel if the driver registers its various entry
points with the kernel, and then returns a value which is
interpreted as failure, those registrations aren't undone, so
the driver still mostly works. However, the driver's remove
function wouldn't be called on rmmod, and pci power management
functions wouldn't work. In the case of Smart Array, since it
has a battery backed cache (or else no cache) even if the driver
is not shut down properly as long as there is no outstanding
i/o, nothing too bad happens, which is why it took so long to
notice.
Requesting backport to stable because the change to pci-driver.c
which requires driver probe functions to return 0 occurred between
2.6.35 and 2.6.36 (the pci power management breakage) and again
between 3.7 and 3.8 (pci_dev->driver getting set to NULL in
local_pci_probe() preventing driver remove function from being
called on rmmod.)
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
We inadvertantly discarded the scsi status for aborted commands.
For some commands (e.g. reads from tape drives) these can't be retried,
and if we discarded the scsi status, the scsi mid layer couldn't notice
anything was wrong and the error was not reported.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Some host adapters do not pass commands through to the target disk
directly. Instead they provide an emulated target which may or may not
accurately report its capabilities. In some cases the physical device
characteristics are reported even when the host adapter is processing
commands on the device's behalf. This can lead to adapter firmware hangs
or excessive I/O errors.
This patch disables WRITE SAME for devices connected to host adapters
that provide an emulated target. Driver writers can disable WRITE SAME
by setting the no_write_same flag in the host adapter template.
[jejb: fix up rejections due to eh_deadline patch]
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Pull trivial tree updates from Jiri Kosina:
"Usual earth-shaking, news-breaking, rocket science pile from
trivial.git"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (23 commits)
doc: usb: Fix typo in Documentation/usb/gadget_configs.txt
doc: add missing files to timers/00-INDEX
timekeeping: Fix some trivial typos in comments
mm: Fix some trivial typos in comments
irq: Fix some trivial typos in comments
NUMA: fix typos in Kconfig help text
mm: update 00-INDEX
doc: Documentation/DMA-attributes.txt fix typo
DRM: comment: `halve' -> `half'
Docs: Kconfig: `devlopers' -> `developers'
doc: typo on word accounting in kprobes.c in mutliple architectures
treewide: fix "usefull" typo
treewide: fix "distingush" typo
mm/Kconfig: Grammar s/an/a/
kexec: Typo s/the/then/
Documentation/kvm: Update cpuid documentation for steal time and pv eoi
treewide: Fix common typo in "identify"
__page_to_pfn: Fix typo in comment
Correct some typos for word frequency
clk: fixed-factor: Fix a trivial typo
...
Since commit 0998d06310
(device-core: Ensure drvdata = NULL when no driver is bound),
the driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Cc: James Bottomley <JBottomley@parallels.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This patch set is a set of driver updates (megaraid_sas, fnic, lpfc, ufs,
hpsa) we also have a couple of bug fixes (sd out of bounds and ibmvfc error
handling) and the first round of esas2r checker fixes and finally the much
anticipated big endian additions for megaraid_sas.
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQEcBAABAgAGBQJSNheiAAoJEDeqqVYsXL0MueMIAKD1kaB0oooRawE1+0vpKmyV
eE2M6trA8ofTeq0z1eNfRsVMkRsUuG9exW0CKS2z6mHiWwQ/zGbqT7ukveW+dMi3
mjKD0yO5ODk6bohWX/LiwZ6NGZSwC0dbIacXNy5ZsXKEizqwo1Jcc7qC/0AWn+o7
WpIL48XLPH0HqjQZ3dvgC6TWeFZOn9cKOWvQQq0S3ENALOx/eLZ+C7VrJLx5Magv
myNOUkTLzdlYglQfjaNO6et98k2oHTrzKwH7U2X6U75q7L8Pkj4RbNzce/Ge301V
u+R1w+BlbeTPdHopTBoTJupsvqDYBZxVwS7rr8nhSvfKduQppHnN6jX8yR4XNeM=
=RG3j
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull misc SCSI driver updates from James Bottomley:
"This patch set is a set of driver updates (megaraid_sas, fnic, lpfc,
ufs, hpsa) we also have a couple of bug fixes (sd out of bounds and
ibmvfc error handling) and the first round of esas2r checker fixes and
finally the much anticipated big endian additions for megaraid_sas"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (47 commits)
[SCSI] fnic: fnic Driver Tuneables Exposed through CLI
[SCSI] fnic: Kernel panic while running sh/nosh with max lun cfg
[SCSI] fnic: Hitting BUG_ON(io_req->abts_done) in fnic_rport_exch_reset
[SCSI] fnic: Remove QUEUE_FULL handling code
[SCSI] fnic: On system with >1.1TB RAM, VIC fails multipath after boot up
[SCSI] fnic: FC stat param seconds_since_last_reset not getting updated
[SCSI] sd: Fix potential out-of-bounds access
[SCSI] lpfc 8.3.42: Update lpfc version to driver version 8.3.42
[SCSI] lpfc 8.3.42: Fixed issue of task management commands having a fixed timeout
[SCSI] lpfc 8.3.42: Fixed inconsistent spin lock usage.
[SCSI] lpfc 8.3.42: Fix driver's abort loop functionality to skip IOs already getting aborted
[SCSI] lpfc 8.3.42: Fixed failure to allocate SCSI buffer on PPC64 platform for SLI4 devices
[SCSI] lpfc 8.3.42: Fix WARN_ON when driver unloads
[SCSI] lpfc 8.3.42: Avoided making pci bar ioremap call during dual-chute WQ/RQ pci bar selection
[SCSI] lpfc 8.3.42: Fixed driver iocbq structure's iocb_flag field running out of space
[SCSI] lpfc 8.3.42: Fix crash on driver load due to cpu affinity logic
[SCSI] lpfc 8.3.42: Fixed logging format of setting driver sysfs attributes hard to interpret
[SCSI] lpfc 8.3.42: Fixed back to back RSCNs discovery failure.
[SCSI] lpfc 8.3.42: Fixed race condition between BSG I/O dispatch and timeout handling
[SCSI] lpfc 8.3.42: Fixed function mode field defined too small for not recognizing dual-chute mode
...
Changes the version of hpsa so we know something has changed. Please consider
this for inclusion.
Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch does a bit of housekeeping for hpsa. Change lowercase alpha hex
digits to uppercase for consistency within the driver. Also moves the P822se
in the tables to keep controllers of each family grouped together.
Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Add the marketing names for HP Smart Array Gen8 controllers. Also removes an
unused ID. Please consider this for inclusion.
Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch adds the PCI ID's for HP Smart Array Gen9 controllers. Please
consider this patch for inclusion.
Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Pull trivial tree from Jiri Kosina:
"The usual trivial updates all over the tree -- mostly typo fixes and
documentation updates"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (52 commits)
doc: Documentation/cputopology.txt fix typo
treewide: Convert retrun typos to return
Fix comment typo for init_cma_reserved_pageblock
Documentation/trace: Correcting and extending tracepoint documentation
mm/hotplug: fix a typo in Documentation/memory-hotplug.txt
power: Documentation: Update s2ram link
doc: fix a typo in Documentation/00-INDEX
Documentation/printk-formats.txt: No casts needed for u64/s64
doc: Fix typo "is is" in Documentations
treewide: Fix printks with 0x%#
zram: doc fixes
Documentation/kmemcheck: update kmemcheck documentation
doc: documentation/hwspinlock.txt fix typo
PM / Hibernate: add section for resume options
doc: filesystems : Fix typo in Documentations/filesystems
scsi/megaraid fixed several typos in comments
ppc: init_32: Fix error typo "CONFIG_START_KERNEL"
treewide: Add __GFP_NOWARN to k.alloc calls with v.alloc fallbacks
page_isolation: Fix a comment typo in test_pages_isolated()
doc: fix a typo about irq affinity
...
section Signed-off-by: John Kacur <jkacur@redhat.com>
On a 3.6-rt (real-time patch) kernel we are seeing the following BUG
However, it appears to be relevant for non-realtime (mainline) as well.
[ 49.688847] hpsa 0000:03:00.0: hpsa0: <0x323a> at IRQ 67 using DAC
[ 49.749928] scsi0 : hpsa
[ 49.784437] BUG: using smp_processor_id() in preemptible [00000000
00000000] code: kworker/u:0/6
[ 49.784465] caller is enqueue_cmd_and_start_io+0x5a/0x100 [hpsa]
[ 49.784468] Pid: 6, comm: kworker/u:0 Not tainted
3.6.11.5-rt37.52.el6rt.x86_64.debug #1
[ 49.784471] Call Trace:
[ 49.784512] [<ffffffff812abe83>] debug_smp_processor_id+0x123/0x150
[ 49.784520] [<ffffffffa009043a>] enqueue_cmd_and_start_io+0x5a/0x100
[hpsa]
[ 49.784529] [<ffffffffa00905cb>]
hpsa_scsi_do_simple_cmd_core+0xeb/0x110 [hpsa]
[ 49.784537] [<ffffffff812b09c8>] ? swiotlb_dma_mapping_error+0x18/0x30
[ 49.784544] [<ffffffff812b09c8>] ? swiotlb_dma_mapping_error+0x18/0x30
[ 49.784553] [<ffffffffa0090701>]
hpsa_scsi_do_simple_cmd_with_retry+0x91/0x280 [hpsa]
[ 49.784562] [<ffffffffa0093558>]
hpsa_scsi_do_report_luns.clone.2+0xd8/0x130 [hpsa]
[ 49.784571] [<ffffffffa00935ea>]
hpsa_gather_lun_info.clone.3+0x3a/0x1a0 [hpsa]
[ 49.784580] [<ffffffffa00963df>] hpsa_update_scsi_devices+0x11f/0x4f0
[hpsa]
[ 49.784592] [<ffffffff81592019>] ? sub_preempt_count+0xa9/0xe0
[ 49.784601] [<ffffffffa00968ad>] hpsa_scan_start+0xfd/0x150 [hpsa]
[ 49.784613] [<ffffffff8158cba8>] ? rt_spin_lock_slowunlock+0x78/0x90
[ 49.784626] [<ffffffff813b04d7>] do_scsi_scan_host+0x37/0xa0
[ 49.784632] [<ffffffff813b05da>] do_scan_async+0x1a/0x30
[ 49.784643] [<ffffffff8107c4ab>] async_run_entry_fn+0x9b/0x1d0
[ 49.784655] [<ffffffff8106ae92>] process_one_work+0x1f2/0x620
[ 49.784661] [<ffffffff8106ae20>] ? process_one_work+0x180/0x620
[ 49.784668] [<ffffffff8106d4fe>] ? worker_thread+0x5e/0x3a0
[ 49.784674] [<ffffffff8107c410>] ? async_schedule+0x20/0x20
[ 49.784681] [<ffffffff8106d5d3>] worker_thread+0x133/0x3a0
[ 49.784688] [<ffffffff8106d4a0>] ? manage_workers+0x190/0x190
[ 49.784696] [<ffffffff81073236>] kthread+0xa6/0xb0
[ 49.784707] [<ffffffff815970a4>] kernel_thread_helper+0x4/0x10
[ 49.784715] [<ffffffff81082a7c>] ? finish_task_switch+0x8c/0x110
[ 49.784721] [<ffffffff8158e44b>] ? _raw_spin_unlock_irq+0x3b/0x70
[ 49.784727] [<ffffffff8158e85d>] ? retint_restore_args+0xe/0xe
[ 49.784734] [<ffffffff81073190>] ? kthreadd+0x1e0/0x1e0
[ 49.784739] [<ffffffff815970a0>] ? gs_change+0xb/0xb
-------------------
This is caused by
enqueue_cmd_and_start_io()->
set_performant_mode()->
smp_processor_id()
Which if you have debugging enabled calls debug_processor_id() and triggers the warning.
The code here is
c->Header.ReplyQueue = smp_processor_id() % h->nreply_queues;
Since it is not critical that the code complete on the same processor,
but the cpu is a hint used in generating the ReplyQueue and will still work if
the cpu migrates or is preempted, it is safe to use the raw_smp_processor_id()
to surpress the false positve warning.
Signed-off-by: John Kacur <jkacur@redhat.com>
Acked-by: Stephen Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>