Commit Graph

16844 Commits

Author SHA1 Message Date
Christoph Hellwig
6600593cbd block: rename BLK_EH_NOT_HANDLED to BLK_EH_DONE
The BLK_EH_NOT_HANDLED implies nothing happen, but very often that
is not what is happening - instead the driver already completed the
command.  Fix the symbolic name to reflect that a little better.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-29 08:59:21 -06:00
James Smart
1b5c2cb196 scsi: lpfc: update driver version to 12.0.0.4
Update the driver version to 12.0.0.4

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:33 -04:00
James Smart
b92dc72df3 scsi: lpfc: Fix port initialization failure.
The driver exits port setup after failing the lpfc_sli4_get_parameters
command (messages 0356, 2541, & 1412).

The older CNA adapters do not support the MBX command. In the past
the code was allowed to fail and continue on with initialization.
However a nvme change moved a closing bracket and now makes all
failures terminal.

Revise the logic so that terminal failure only occurs if the command
failed on the newer adapters. Additionally, if parameters are set
that require information from the command and the command failed,
the parameters are erroneous and port set up should fail even on
the older adapters.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:33 -04:00
James Smart
c221768bd4 scsi: lpfc: Fix 16gb hbas failing cq create.
The lancer G5 chip family fails the CQ create with 16k page size.  The
hardware incorrectly reports it supports large page sizes when it is
actually limited to 4k pages.

A prior patch resolved this for the A0 chip revision only.  This patch
excludes all revisions of the G5 asic from using large page sizes. As
knowing the actual chip revision is unnecessary, the now unused definitions
are removed

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:33 -04:00
James Smart
7438273fa2 scsi: lpfc: Fix crash in blk_mq layer when executing modprobe -r lpfc
modprobe -r lpfc produces the following:

Call Trace:
 __blk_mq_run_hw_queue+0xa2/0xb0
 __blk_mq_delay_run_hw_queue+0x9d/0xb0
 ? blk_mq_hctx_has_pending+0x32/0x80
 blk_mq_run_hw_queue+0x50/0xd0
 blk_mq_sched_insert_request+0x110/0x1b0
 blk_execute_rq_nowait+0x76/0x180
 nvme_keep_alive_work+0x8a/0xd0 [nvme_core]
 process_one_work+0x17f/0x440
 worker_thread+0x126/0x3c0
 ? manage_workers.isra.24+0x2a0/0x2a0
 kthread+0xd1/0xe0
 ? insert_kthread_work+0x40/0x40
 ret_from_fork_nospec_begin+0x21/0x21
 ? insert_kthread_work+0x40/0x40

However, rmmod lpfc would run correctly.

When an nvme remoteport is unregistered with the host nvme transport, it
needs to set the remoteport->dev_loss_tmo value 0 to indicate an immediate
termination of device loss and prevent any further keep alives to that
rport.  The driver was never setting dev_loss_tmo causing the nvme
transport to continue to send the keep alive.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:33 -04:00
James Smart
4d5e789a2e scsi: lpfc: correct oversubscription of nvme io requests for an adapter
Under large configurations, the driver would start to log message 6065 -
NVME out of buffers (exchanges).

The driver is using the ndlp cmd_qdepth value when determining the max
outstanding ios for an adapter. This value, by default, is set to 65536,
which exceeds the maximum exchange counts supported on an adapter. The ndlp
cmd_qdepth has no relevance and outstanding io count should be capped at
the max exchange count with IO requests beyond that level getting bounced
back with an EBUSY status so that they are retried by the block layer.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:33 -04:00
James Smart
dc19e3b4a8 scsi: lpfc: Fix MDS diagnostics failure (Rx < Tx)
MDS diagnostics fail because of frame count mismatch.

Unavailability of SGL is the trigger for this issue. If ELS SGL is not
available to process MDS frame, IOCB is put in FCP txq but not attempted to
post afterwards. So, driver stops processing incoming frames as it runs out
of IOCB.  lpfc_drain_txq attempts to submit IOCBS that are queued in ELS
txq but MDS frames are posted to FCP WQ.

Attempt to submit IOCBs that are present in FCP txq when MDS loopback is
running.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiang Chen
3e1fb1b8ab scsi: hisi_sas: Mark PHY as in reset for nexus reset
When issuing a nexus reset for directly attached device, we want to ignore
the PHY down events so libsas will not deform and reform the port.

In the case that the attached SAS changes for the reset, libsas will deform
and form a port.

For scenario that the PHY does not come up after a timeout period, then
report the PHY down to libsas.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiaofei Tan
d87e72fb4f scsi: hisi_sas: Fix return value when get_free_slot() failed
It is an step of executing task to get free slot. If the step fails, we
will cleanup LLDD resources and should return failure to upper layer or
internal caller to abort task execution of this time.

But in the current code, the caller of get_free_slot() doesn't return
failure when get_free_slot() failed. This patch is to fix it.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiaofei Tan
31709548d2 scsi: hisi_sas: Terminate STP reject quickly for v2 hw
For v2 hw, STP link from target is rejected after host reset because of a
SoC bug. The STP reject will be terminated after we have sent IO from each
PHY of a port.

This is not an problem before, as we don't need to setup STP link from
target immediately after host reset. But now, it is.  Because we want to
send soft-reset immediately after host reset.

In order to terminate STP reject quickly, this patch send ATA reset command
through each PHY of a port. Notes: ATA reset command don't need target's
response.

Besides, we do abort dev for each device before terminating STP reject.
This is a quirk of v2 hw.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiaofei Tan
b09fcd09e9 scsi: hisi_sas: Add v2 hw force PHY function for internal ATA command
This patch adds a force PHY function for internal ATA command for v2 hw.

Because there is an SoC bug in v2 hw, and need send an IO through each PHY
of a port to work around a bug which occurs after a controller reset.

This force PHY function will be used in the later patch.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiaofei Tan
78bd2b4f6e scsi: hisi_sas: Include TMF elements in struct hisi_sas_slot
In future scenarios we will want to use the TMF struct for more task types
than SSP.

As such, we can add struct hisi_sas_tmf_task directly into struct
hisi_sas_slot, and this will mean we can remove the TMF parameters from the
task prep functions.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiaofei Tan
a865ae14ff scsi: hisi_sas: Try wait commands before before controller reset
We may reset the controller in many scenarios, such as SCSI EH and HW
errors. There should be no IO which returns from target when SCSI EH is
active. But for other scenarios, there may be.  It is not necessary to make
such IOs fail.

This patch adds an function of trying to wait for any commands, or IO, to
complete before host reset. If no more CQ returned from host controller in
100ms, we assume no more IO can return, and then stop waiting. We wait 5s
at most.

The HW has a register CQE_SEND_CNT to indicate the total number of CQs that
has been reported to driver. We can use this register and it is reliable to
resd this register in such scenarios that require host reset.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiaofei Tan
6175abdeae scsi: hisi_sas: Init disks after controller reset
After the controller is reset, it is possible that the disks attached still
have outstanding IO to complete.

Thus, when the PHYs come back up after controller reset, it is possible
that these IOs complete at some unknown point later.

We want to ensure that all IOs are complete after the controller reset so
that all associated IPTT and other resources can be recycled safely.

To achieve this, re-init the disks by TMF or softreset (in case of ATA
devices).

If the init fails - maybe because the device was removed or link has not
come up - then do not release the device resources, but rather rely on SCSI
EH to handle the timeout for these resources later on.

This patch also does some cleanup to hisi_sas_init_disk(), including
removing superfluous cases in the switch statement.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiang Chen
235bfc7ff6 scsi: hisi_sas: Create a scsi_host_template per HW module
When a SCSI host is registered, the SCSI mid-layer takes a reference to a
module in Scsi_host.hostt.module. In doing this, we are prevented from
removing the driver module for the host in dangerous scenario, like when a
disk is mounted.

Currently there is only one scsi_host_template (sht) for all HW versions,
and this is the main.c module. So this means that we can possibly remove
the HW module in this dangerous scenario, as SCSI mid-layer is only
referencing the main.c module.

To fix this, create a sht per module, referencing that same module to
create the Scsi host.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiang Chen
d5a60dfdb3 scsi: hisi_sas: Reset disks when discovered
When a disk is discovered, it may be in an error state, or there may be
residual commands remaining in the disk.

To ensure any disk is in good state after discovery, reset via TMF (for SAS
disk) or softreset (for a SATA disk).

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiaofei Tan
428f1b3424 scsi: hisi_sas: Add LED feature for v3 hw
This patch implements LED feature of directly attached disk for v3 hw.

In fact, this hw has created an SGPIO component for LED feature, and we can
control LEDs just by internal registers.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiang Chen
1b86518581 scsi: hisi_sas: Change common allocation mode of device id
To reduce possibility of hitting unknown SoC bugs and aid debugging and
test, change allocation mode of device id from last used device id instead
of lowest available index.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:31 -04:00
Xiang Chen
fa3be0f231 scsi: hisi_sas: change slot index allocation mode
Currently we find the lowest available empty bit in the IPTT bitmap to
allocate the IPTT for a command.

To reduce possibility of hitting unknown SoC bugs and also aid in the
debugging of those same bugs, change the allocation mode.

The next allocation method is to use the next free slot adjacent to the
most recently allocated slot, in a round-robin fashion.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:31 -04:00
John Garry
757db2dae2 scsi: hisi_sas: Introduce hisi_sas_phy_set_linkrate()
There is much common code and functionality between the HW versions to set
the PHY linkrate.

As such, this patch factors out the common code into a generic function
hisi_sas_phy_set_linkrate().

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:31 -04:00
Wei Yongjun
eb217359eb scsi: hisi_sas: fix a typo in hisi_sas_task_prep()
Fix a typo in hisi_sas_task_prep().

Fixes: 7eee4b9218 ("scsi: hisi_sas: relocate smp sg map")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 21:36:51 -04:00
Colin Ian King
9af3c47924 scsi: pm80xx: fix spelling mistake "UNSORPORTED" -> "SUPPORTED"
Trivial fix to spelling mistake in pm8001_printk message text; also I
believe NOT_UNSUPPORTED should probably be NOT_SUPPORTED. Also fix the
indent of the pm8001_printk statement.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Jack Wang <jinpu.wang@profitbricks.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 21:31:07 -04:00
Douglas Gilbert
e37c7d9a03 scsi: core: sanitize++ in progress
Commit 505aa4b6a8 ("scsi: sd: Defer spinning up drive while SANITIZE is
in progress") may not be sufficient, especially if the SCSI SANITIZE
command is sent via the bsg or sg pass-throughs, since they don't use the
sd driver.

Add "Sanitize in progress" plus some other recent "... in progress"
additional sense codes into the scsi mid-level so they are treated in a
similar fashion to "Format in progress".

[mkp: checkpatch]

Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 21:29:46 -04:00
Bart Van Assche
c9ddf73476 scsi: scsi_transport_srp: Fix shost to rport translation
Since an SRP remote port is attached as a child to shost->shost_gendev
and as the only child, the translation from the shost pointer into an
rport pointer must happen by looking up the shost child that is an
rport. This patch fixes the following KASAN complaint:

BUG: KASAN: slab-out-of-bounds in srp_timed_out+0x57/0x110 [scsi_transport_srp]
Read of size 4 at addr ffff880035d3fcc0 by task kworker/1:0H/19

CPU: 1 PID: 19 Comm: kworker/1:0H Not tainted 4.16.0-rc3-dbg+ #1
Workqueue: kblockd blk_mq_timeout_work
Call Trace:
dump_stack+0x85/0xc7
print_address_description+0x65/0x270
kasan_report+0x231/0x350
srp_timed_out+0x57/0x110 [scsi_transport_srp]
scsi_times_out+0xc7/0x3f0 [scsi_mod]
blk_mq_terminate_expired+0xc2/0x140
bt_iter+0xbc/0xd0
blk_mq_queue_tag_busy_iter+0x1c7/0x350
blk_mq_timeout_work+0x325/0x3f0
process_one_work+0x441/0xa50
worker_thread+0x76/0x6c0
kthread+0x1b2/0x1d0
ret_from_fork+0x24/0x30

Fixes: e68ca75200 ("scsi_transport_srp: Reduce failover time")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 21:23:38 -04:00
David S. Miller
5b79c2af66 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Lots of easy overlapping changes in the confict
resolutions here.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-26 19:46:15 -04:00
Linus Torvalds
b68ea0ee03 for-linus-20180524
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJbBramAAoJEPfTWPspceCm2gsP/R5p/Vo8Ml303cdgCOrw+Q+T
 W/Wp6hGmNUZ7HjMlfF85rF1bqpgGoj226qIibHvKi7eYwUScDNgJwcrrWnNazbGx
 Rnl/+NQ/H/38pBvvDJKjth9n0LY/O1geemhnWzYZmqmZJe3jFNTDpw5pGy4RkTpJ
 4DCuXg2n81x+kDt7Nslb0dYONkrc1yUjelHS/sJKlUoPs1MvsICGsWNS+Lw0WMIj
 Ls282QNs1Z6Y1gcM18UXsTNJSXQFTBmsbH3CIBukHckP2wjZrMEvgM+uxIORqwID
 0DJWBSVNJMxrsyZtkMMkkz2wMjPjSvx3HjD/ULpglJD6nGSwRAiQr6XIxARSEEDL
 3prQyhEeVGSioOjt5IR2Llt9QDYVhb1GJqqHYmDZux0SMCVg1Pv7mclrudpXGpzq
 v2mSJqnwLofOuTFrSh6afgxClD/yh1UNKByf4Ni78PagAgNS4SjEIz8VDG4m1iNu
 UgWMbM+vqpZhgIlSDsmnqFEVyGwCWwUOHGCeshQXIy9xaWbCtmaGQpNh4Tho273O
 wH6F/jwCnw1BcUeiwOQi+qcwEf1z6nRTGGb3hZt7u2Tj6r8HZzNHtkRaFXsIrnvA
 Vc3Y9+YzRWBVV6wzqPC5K+alvGs5ZXLj7BIxQorEkoMkNCYWiPsiBS/rtc0OYiST
 b2rC+5eYzquEZ42aqIIn
 =QBaF
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20180524' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Two fixes that should go into this release:

   - a loop writeback error clearing fix from Jeff

   - the sr sense fix from myself"

* tag 'for-linus-20180524' of git://git.kernel.dk/linux-block:
  loop: clear wb_err in bd_inode when detaching backing file
  sr: pass down correctly sized SCSI sense buffer
2018-05-24 08:53:20 -07:00
Ganesh Goudar
1d19023fa6 cxgb4: change the port capability bits definition
MDI Port Capabilities bit definitions were inconsistent with
regard to the MDI enum values. 2 bits used to define MDI in
the port capabilities are not really separable, it's a 2-bit
field with 4 different values. Change the port capability bit
definitions to be "AUTO" and "STRAIGHT" in order to get them
to line up with the enum's.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 16:00:56 -04:00
Manish Rangankar
269afb3603 qedi: Add get_generic_tlv_data handler.
Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 23:29:54 -04:00
Manish Rangankar
534bbdf883 qedi: Add support for populating ethernet TLVs.
This patch adds callbacks for providing the ethernet protocol driver TLVs.

Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 23:29:54 -04:00
Chad Dupuis
8673daf4f5 qedf: Add get_generic_tlv_data handler.
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 23:29:54 -04:00
Chad Dupuis
642a0b37e6 qedf: Add support for populating ethernet TLVs.
This patch adds callbacks for providing the ethernet protocol driver TLVs.

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 23:29:54 -04:00
David S. Miller
6f6e434aa2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
S390 bpf_jit.S is removed in net-next and had changes in 'net',
since that code isn't used any more take the removal.

TLS data structures split the TX and RX components in 'net-next',
put the new struct members from the bug fix in 'net' into the RX
part.

The 'net-next' tree had some reworking of how the ERSPAN code works in
the GRE tunneling code, overlapping with a one-line headroom
calculation fix in 'net'.

Overlapping changes in __sock_map_ctx_update_elem(), keep the bits
that read the prog members via READ_ONCE() into local variables
before using them.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-21 16:01:54 -04:00
Jens Axboe
f7068114d4 sr: pass down correctly sized SCSI sense buffer
We're casting the CDROM layer request_sense to the SCSI sense
buffer, but the former is 64 bytes and the latter is 96 bytes.
As we generally allocate these on the stack, we end up blowing
up the stack.

Fix this by wrapping the scsi_execute() call with a properly
sized sense buffer, and copying back the bits for the CDROM
layer.

Cc: stable@vger.kernel.org
Reported-by: Piotr Gabriel Kosinski <pg.kosinski@gmail.com>
Reported-by: Daniel Shapira <daniel@twistlock.com>
Tested-by: Kees Cook <keescook@chromium.org>
Fixes: 82ed4db499 ("block: split scsi_request out of struct request")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-21 12:21:14 -06:00
Colin Ian King
017b3f8a10 scsi: snic: fix a couple of spelling mistakes: "COMPLETE"
Trivial fix to spelling mistakes/typos:
"SNIC_IOREQ_ABTS_COMPELTE" -> "SNIC_IOREQ_ABTS_COMPLETE"
"SNIC_IOREQ_LR_COMPELTE"   -> "SNIC_IOREQ_LR_COMPLETE"
"SNIC_IOREQ_CMD_COMPELTE"  -> "SNIC_IOREQ_CMD_COMPLETE"

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 12:22:49 -04:00
Christophe Jaillet
51b910c3c7 scsi: qlogicpti: Fix an error handling path in 'qpti_sbus_probe()'
The 'free_irq()' call is not at the right place in the error handling
path.  The changed order has been introduced in commit 3d4253d9af
("[SCSI] qlogicpti: Convert to new SBUS device framework.")

Fixes: 3d4253d9af ("[SCSI] qlogicpti: Convert to new SBUS device framework.")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 12:22:48 -04:00
Vijay Viswanath
10e5e37581 scsi: ufs: Add clock ungating to a separate workqueue
UFS driver can receive a request during memory reclaim by kswapd.  So
when ufs driver puts the ungate work in queue, and if there are no idle
workers, kthreadd is invoked to create a new kworker. Since kswapd task
holds a mutex which kthreadd also needs, this can cause a deadlock
situation. So ungate work must be done in a separate work queue with
WQ_MEM_RECLAIM flag enabled.  Such a workqueue will have a rescue thread
which will be called when the above deadlock condition is possible.

Signed-off-by: Vijay Viswanath <vviswana@codeaurora.org>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 12:22:47 -04:00
Venkat Gopalakrishnan
7f6ba4f12e scsi: ufs: make sure all interrupts are processed
As multiple requests are submitted to the ufs host controller in
parallel there could be instances where the command completion interrupt
arrives later for a request that is already processed earlier as the
corresponding doorbell was cleared when handling the previous
interrupt. Read the interrupt status in a loop after processing the
received interrupt to catch such interrupts and handle it.

Signed-off-by: Venkat Gopalakrishnan <venkatg@codeaurora.org>
Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 12:22:47 -04:00
Subhash Jadavani
69a6fff068 scsi: ufs: ufs-qcom: remove broken hci version quirk
UFSHCD_QUIRK_BROKEN_UFS_HCI_VERSION is only applicable for QCOM UFS host
controller version 2.x.y and this has been fixed from version 3.x.y
onwards, hence this change removes this quirk for version 3.x.y onwards.

[mkp: applied by hand]

Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 12:22:35 -04:00
Subhash Jadavani
38135535dc scsi: ufs: add reference counting for scsi block requests
Currently we call the scsi_block_requests()/scsi_unblock_requests()
whenever we want to block/unblock scsi requests but as there is no
reference counting, nesting of these calls could leave us in undesired
state sometime. Consider following call flow sequence:

1. func1() calls scsi_block_requests() but calls func2() before
   calling scsi_unblock_requests()
2. func2() calls scsi_block_requests()
3. func2() calls scsi_unblock_requests()
4. func1() calls scsi_unblock_requests()

As there is no reference counting, we will have scsi requests unblocked
after #3 instead of it to be unblocked only after #4. Though we may not
have failures seen with this, we might run into some failures in future.
Better solution would be to fix this by adding reference counting.

Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 12:20:48 -04:00
Subhash Jadavani
b334456ec2 scsi: ufs: ufshcd: fix possible unclocked register access
Vendor specific setup_clocks ops may depend on clocks managed by ufshcd
driver so if the vendor specific setup_clocks callback is called when
the required clocks are turned off, it results into unclocked register
access.

This change make sure that required clocks are enabled before vendor
specific setup_clocks callback is called.

Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Venkat Gopalakrishnan <venkatg@codeaurora.org>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 12:20:48 -04:00
Maya Erez
2e3611e954 scsi: ufs: fix exception event handling
The device can set the exception event bit in one of the response UPIU,
for example to notify the need for urgent BKOPs operation.  In such a
case, the host driver calls ufshcd_exception_event_handler to handle
this notification.  When trying to check the exception event status (for
finding the cause for the exception event), the device may be busy with
additional SCSI commands handling and may not respond within the 100ms
timeout.

To prevent that, we need to block SCSI commands during handling of
exception events and allow retransmissions of the query requests, in
case of timeout.

Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Maya Erez <merez@codeaurora.org>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 12:20:48 -04:00
Kees Cook
60d6d22d85 scsi: dpt_i2o: Remove VLA usage
On the quest to remove all VLAs from the kernel[1] this moves the
sg_list variable off the stack, as already done for other allocated
buffers in adpt_i2o_passthru(). Additionally consolidates the error path
for kfree().

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 12:03:51 -04:00
Bjorn Andersson
092b45583c scsi: ufs: Use freq table with devfreq
devfreq requires that the client operates on actual frequencies, not only 0
and UMAX_INT and as such UFS brok with the introduction of f1d981eaec ("PM
/ devfreq: Use the available min/max frequency").

This patch registers the frequencies of the first clock with devfreq and use
these to determine if we're trying to step up or down.

Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com> [for devfreq & OPP part]
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:28:18 -04:00
Bjorn Andersson
deac444f4e scsi: ufs: Extract devfreq registration
Failing to register with devfreq leaves hba->devfreq assigned, which causes
the error path to dereference the ERR_PTR(). Rather than bolting on more
conditionals, move the call of devm_devfreq_add_device() into it's own
function and only update hba->devfreq once it's successfully registered.

The subsequent patch builds upon this to make UFS actually work again, as
it's been broken since f1d981eaec ("PM / devfreq: Use the available
min/max frequency")

Also switch to use DEVFREQ_GOV_SIMPLE_ONDEMAND constant.

Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:28:18 -04:00
Michael Kelley
1b25a8c4d2 scsi: storvsc: Avoid allocating memory for temp cpumasks
Current code allocates 240 Kbytes (in typical configs) for each synthetic
SCSI controller to use as temp cpumask variables.  Recode to avoid needing
the temp cpumask variables and remove the memory allocation.

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:28:17 -04:00
Uma Krishnan
cd43c221bb scsi: cxlflash: Isolate external module dependencies
Depending on the underlying transport, cxlflash has a dependency on either
the CXL or OCXL drivers, which are enabled via their Kconfig option.
Instead of having a module wide dependency on these config options, it is
better to isolate the object modules that are dependent on the CXL and OCXL
drivers and adjust the module dependencies accordingly.

This commit isolates the object files that are dependent on CXL and/or
OCXL. The cxl/ocxl fops used in the core driver are tucked under an ifdef to
avoid compilation errors.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:10 -04:00
Uma Krishnan
de5d35aff5 scsi: cxlflash: Abstract hardware dependent assignments
As a staging cleanup to support transport specific builds of the cxlflash
module, relocate device dependent assignments to header files. This will
avoid littering the core driver with conditional compilation logic.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:10 -04:00
Uma Krishnan
5e12397a97 scsi: cxlflash: Add include guards to backend.h
The new header file, backend.h, that was recently added is missing the
include guards. This commit adds the guards.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:10 -04:00
Matthew R. Ochs
e63a8d886d scsi: cxlflash: Use local mutex for AFU serialization
AFUs can only process a single AFU command at a time. This is enforced with
a global mutex situated within the AFU send routine. As this mutex has a
global scope, it has the potential to unnecessarily block commands destined
for other AFUs.

Instead of using a global mutex, transition the mutex to be per-AFU. This
will allow commands to only be blocked by siblings of the same AFU.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:10 -04:00
Uma Krishnan
32a9ae415b scsi: cxlflash: Acquire semaphore before invoking ioctl services
When a superpipe process that makes use of virtual LUNs is terminated or
killed abruptly, there is a possibility that the cxlflash driver could hang
and deprive other operations on the adapter.

The release fop registered to be invoked on a context close, detaches every
LUN associated with the context. The underlying service to detach the LUN
assumes it has been called with the read semaphore held, and releases the
semaphore before any operation that could be time consuming.

When invoked without holding the read semaphore, an opportunity is created
for the semaphore's count to become negative when it is temporarily released
during one of these potential lengthy operations. This negative count
results in subsequent acquisition attempts taking forever, leading to the
hang.

To support the current design point of holding the semaphore on the ioctl()
paths, the release fop should acquire it before invoking any ioctl services.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:10 -04:00
Uma Krishnan
d58188c306 scsi: cxlflash: Limit the debug logs in the IO path
The kernel log can get filled with debug messages from send_cmd_ioarrin()
when dynamic debug is enabled for the cxlflash module and there is a lot of
legacy I/O traffic.

While these messages are necessary to debug issues that involve command
tracking, the abundance of data can overwrite other useful data in the
log. The best option available is to limit the messages that should serve
most of the common use cases.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:10 -04:00
Uma Krishnan
e0f76ad130 scsi: cxlflash: Yield to active send threads
The following Oops may be encountered if the device is reset, i.e. EEH
recovery, while there is heavy I/O traffic:

59:mon> t
[c000200db64bb680] c008000009264c40 cxlflash_queuecommand+0x3b8/0x500
					[cxlflash]
[c000200db64bb770] c00000000090d3b0 scsi_dispatch_cmd+0x130/0x2f0
[c000200db64bb7f0] c00000000090fdd8 scsi_request_fn+0x3c8/0x8d0
[c000200db64bb900] c00000000067f528 __blk_run_queue+0x68/0xb0
[c000200db64bb930] c00000000067ab80 __elv_add_request+0x140/0x3c0
[c000200db64bb9b0] c00000000068daac blk_execute_rq_nowait+0xec/0x1a0
[c000200db64bba00] c00000000068dbb0 blk_execute_rq+0x50/0xe0
[c000200db64bba50] c0000000006b2040 sg_io+0x1f0/0x520
[c000200db64bbaf0] c0000000006b2e94 scsi_cmd_ioctl+0x534/0x610
[c000200db64bbc20] c000000000926208 sd_ioctl+0x118/0x280
[c000200db64bbcc0] c00000000069f7ac blkdev_ioctl+0x7fc/0xe30
[c000200db64bbd20] c000000000439204 block_ioctl+0x84/0xa0
[c000200db64bbd40] c0000000003f8514 do_vfs_ioctl+0xd4/0xa00
[c000200db64bbde0] c0000000003f8f04 SyS_ioctl+0xc4/0x130
[c000200db64bbe30] c00000000000b184 system_call+0x58/0x6c

When there is no room to send the I/O request, the cached room is refreshed
by reading the memory mapped command room value from the AFU. The AFU
register mapping is refreshed during a reset, creating a race condition that
can lead to the Oops above.

During a device reset, the AFU should not be unmapped until all the active
send threads quiesce. An atomic counter, cmds_active, is currently used to
track internal AFU commands and quiesce during reset. This same counter can
also be used for the active send threads.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:09 -04:00
Xiaofei Tan
2f6bca202b scsi: hisi_sas: add check of device in hisi_sas_task_exec()
Currently we don't check that device is not gone before dereferencing
its elements in the function hisi_sas_task_exec() (specifically, the DQ
pointer).

This patch fixes this issue by filling in the DQ pointer in
hisi_sas_task_prep() after we check that the device pointer is still
safe to reference.

[mkp: typo]

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:09 -04:00
Xiang Chen
e85d93b212 scsi: hisi_sas: Use device lock to protect slot alloc/free
The IPTT of a slot is unique, and we currently use hisi_hba lock to
protect it.

Now slot is managed on hisi_sas_device.list, so use DQ lock to protect
for allocating and freeing the slot.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:09 -04:00
Xiang Chen
fa222db0b0 scsi: hisi_sas: Don't lock DQ for complete task sending
Currently we lock the DQ to protect whole delivery process.  So this
stops us building slots for the same queue in parallel, and can affect
performance.

To optimise it, only lock the DQ during special periods, specifically
when allocating a slot from the DQ and when delivering a slot to the HW.

This approach is now safe, thanks to the previous patches to ensure that
we always deliver a slot to the HW once allocated.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:09 -04:00
Xiang Chen
3de0026dad scsi: hisi_sas: allocate slot buffer earlier
Currently we allocate the slot's memory buffer after allocating the DQ
slot.

To aid DQ lockout reduction, and allow slots to be built in parallel,
move this step (which can fail) prior to allocating the slot.

Also a stray spin_unlock_irqrestore() is removed from internal task exec
function.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:09 -04:00
Xiang Chen
a2b3820bdd scsi: hisi_sas: make return type of prep functions void
Since the task prep functions now should not fail, adjust the return
types to void.

In addition, some checks in the task prep functions are relocated to the
main module; this is specifically the check for the number of elements
in an sg list exceeded the HW SGE limit.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:09 -04:00
Xiang Chen
7eee4b9218 scsi: hisi_sas: relocate smp sg map
Currently we use DQ lock to protect delivery of DQ entry one by one.

To optimise to allow more than one slot to be built for a single DQ in
parallel, we need to remove the DQ lock when preparing slots, prior to
delivery.

To achieve this, we rearrange the slot build order to ensure that once
we allocate a slot for a task, we do cannot fail to deliver the task.

In this patch, we rearrange the slot building for SMP tasks to ensure
that sg mapping part (which can fail) happens before we allocate the
slot in the DQ.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:09 -04:00
Alim Akhtar
0d846e703d scsi: ufs: make ufshcd_config_pwr_mode of non-static func
This makes ufshcd_config_pwr_mode non-static so that other vendors like
exynos can use it.

Signed-off-by: Seungwon Jeon <essuuj@gmail.com>
Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 10:50:59 -04:00
Alim Akhtar
4404c5de77 scsi: ufs: add quirk to enable host controller without hce
Some host controllers don't support host controller enable via HCE.

Signed-off-by: Seungwon Jeon <essuuj@gmail.com>
Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 10:47:10 -04:00
Alim Akhtar
5ac6abc941 scsi: ufs: add quirk to disallow reset of interrupt aggregation
Some host controllers support interrupt aggregation but don't allow
resetting counter and timer in software.

Signed-off-by: Seungwon Jeon <essuuj@gmail.com>
Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 10:44:52 -04:00
Alim Akhtar
1399c5b02c scsi: ufs: add quirk to fix mishandling utrlclr/utmrlclr
In the right behavior, setting the bit to '0' indicates clear and '1'
indicates no change. If host controller handles this the other way,
UFSHCI_QUIRK_BROKEN_REQ_LIST_CLR can be used.

[mkp: typo]

Signed-off-by: Seungwon Jeon <essuuj@gmail.com>
Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Reviewed-by: "Asutosh Das (asd)" <asutoshd@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 10:43:19 -04:00
Kees Cook
bbe21d7a97 scsi: ufs: ufshcd: Remove VLA usage
On the quest to remove all VLAs from the kernel[1] this moves buffers
off the stack. In the second instance, this collapses two separately
allocated buffers into a single buffer, since they are used
consecutively, which saves 256 bytes (QUERY_DESC_MAX_SIZE + 1) of stack
space.

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 10:36:09 -04:00
Alexander Potapenko
a45b599ad8 scsi: sg: allocate with __GFP_ZERO in sg_build_indirect()
This shall help avoid copying uninitialized memory to the userspace when
calling ioctl(fd, SG_IO) with an empty command.

Reported-by: syzbot+7d26fc1eea198488deab@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 10:26:01 -04:00
Christoph Hellwig
b8b1483d79 sg: simplify procfs code
Use remove_proc_subtree to remove the whole subtree on cleanup, and
unwind the registration loop into individual calls.  Switch to use
proc_create_seq where applicable.

Also don't bother handling proc_create* failures - the driver works
perfectly fine without the proc files, and the cleanup will handle
missing files gracefully.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16 07:24:30 +02:00
Christoph Hellwig
f7680bec04 megaraid: simplify procfs code
Use remove_proc_subtree to remove the whole subtree on cleanup, and
unwind the registration loop into individual calls.  Switch to use
proc_create_single.

Also don't bother handling proc_create* failures - the driver works
perfectly fine without the proc files, and the cleanup will handle
missing files gracefully.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16 07:24:30 +02:00
Randy Dunlap
a406b0a069 scsi: core: clean up generated file scsi_devinfo_tbl.c
"make clean" should remove the generated file "scsi_devinfo_tbl.c", so
list it in the clean-files variable so that the file gets cleaned up.

Fixes: 345e29608b ("scsi: scsi: Export blacklist flags to sysfs")
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-14 22:48:03 -04:00
Wen Xiong
81471b07b7 scsi: ipr: new IOASC update
This patch adds new adapter error log for P9 system with the new AZ SAS
cable.

Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-14 22:42:53 -04:00
Colin Ian King
eaa6d0df67 scsi: esas2r: fix spelling mistake: "requestss" -> "requests"
Trivial fix to spelling mistake in esas2r_debug message.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-14 22:40:30 -04:00
Kees Cook
97e066184b scsi: libosd: Remove VLA usage
On the quest to remove all VLAs from the kernel[1] this rearranges the
code to avoid a VLA warning under -Wvla (gcc doesn't recognize "const"
variables as not triggering VLA creation). Additionally cleans up
variable naming to avoid 80 character column limit.

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Boaz Harrosh <ooo@electrozaur.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-14 22:32:45 -04:00
Christoph Hellwig
0eb0b63c1d block: consistently use GFP_NOIO instead of __GFP_NORECLAIM
Same numerical value (for now at least), but a much better documentation
of intent.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-14 08:55:18 -06:00
Christoph Hellwig
4accf5fc79 block: pass an explicit gfp_t to get_request
blk_old_get_request already has it at hand, and in blk_queue_bio, which
is the fast path, it is constant.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-14 08:55:14 -06:00
Christoph Hellwig
ff005a0662 block: sanitize blk_get_request calling conventions
Switch everyone to blk_get_request_flags, and then rename
blk_get_request_flags to blk_get_request.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-14 08:55:12 -06:00
Christoph Hellwig
ac613e4566 scsi/osd: remove the gfp argument to osd_start_request
Always GFP_KERNEL, and keeping it would cause serious complications for
the next change.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-14 08:55:09 -06:00
Wenwen Wang
9899e4d352 scsi: 3w-xxxx: fix a missing-check bug
In tw_chrdev_ioctl(), the length of the data buffer is firstly copied
from the userspace pointer 'argp' and saved to the kernel object
'data_buffer_length'. Then a security check is performed on it to make
sure that the length is not more than 'TW_MAX_IOCTL_SECTORS *
512'. Otherwise, an error code -EINVAL is returned. If the security
check is passed, the entire ioctl command is copied again from the
'argp' pointer and saved to the kernel object 'tw_ioctl'. Then, various
operations are performed on 'tw_ioctl' according to the 'cmd'. Given
that the 'argp' pointer resides in userspace, a malicious userspace
process can race to change the buffer length between the two
copies. This way, the user can bypass the security check and inject
invalid data buffer length. This can cause potential security issues in
the following execution.

This patch checks for capable(CAP_SYS_ADMIN) in tw_chrdev_open() to
avoid the above issues.

Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Acked-by: Adam Radford <aradford@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:32:43 -04:00
Wenwen Wang
c9318a3e02 scsi: 3w-9xxx: fix a missing-check bug
In twa_chrdev_ioctl(), the ioctl driver command is firstly copied from
the userspace pointer 'argp' and saved to the kernel object
'driver_command'.  Then a security check is performed on the data buffer
size indicated by 'driver_command', which is
'driver_command.buffer_length'. If the security check is passed, the
entire ioctl command is copied again from the 'argp' pointer and saved
to the kernel object 'tw_ioctl'. Then, various operations are performed
on 'tw_ioctl' according to the 'cmd'. Given that the 'argp' pointer
resides in userspace, a malicious userspace process can race to change
the buffer size between the two copies. This way, the user can bypass
the security check and inject invalid data buffer size. This can cause
potential security issues in the following execution.

This patch checks for capable(CAP_SYS_ADMIN) in twa_chrdev_open()t o
avoid the above issues.

Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Acked-by: Adam Radford <aradford@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:32:18 -04:00
Dan Carpenter
27e833daba scsi: megaraid: silence a static checker bug
If we had more than 32 megaraid cards then it would cause memory
corruption.  That's not likely, of course, but it's handy to enforce it
and make the static checker happy.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:29:00 -04:00
Colin Ian King
7afc0ce912 scsi: lpfc: fix spelling mistakes: "mabilbox" and "maibox"
Trivial fix to spelling mistakes in lpfc_printf_log log message

"mabilbox" -> "mailbox"
"maibox" -> "mailbox"

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:26:44 -04:00
Andrei Vagin
4b83cb8b06 scsi: qla2xxx: remove the unused tcm_qla2xxx_cmd_wq
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:24:25 -04:00
Xiaofei Tan
f70c1251de scsi: hisi_sas: workaround a v3 hw hilink bug
There is an SoC bug of v3 hw development version. When hot- unplugging a
directly attached disk, the PHY down interrupt may not happen. It is
very easy to appear on some boards.

When this issue occurs, the controller will receive many invalid dword
frames, and the "alos" fields of register HILINK_ERR_DFX can indicate
that disk was unplugged.

As an workaround solution, this patch detects this issue in the channel
interrupt, and workaround it by following steps:

 - Disable the PHY
 - Clear error code and interrupt
 - Enable the PHY

Then the HW will reissue PHY down interrupt.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:10:44 -04:00
John Garry
9b8addf302 scsi: hisi_sas: add readl poll timeout helper wrappers
It is common to use readl poll timeout helpers in the driver, so create
custom wrappers.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:10:44 -04:00
Xiaofei Tan
bf081d5da4 scsi: hisi_sas: remove redundant handling to event95 for v3
Event95 is used for DFX purpose. The relevant bit for this interrupt in
the ENT_INT_SRC_MSK3 register has been disabled, so remove the
processing.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:10:44 -04:00
Xiang Chen
9413532788 scsi: hisi_sas: config ATA de-reset as an constrained command for v3 hw
As a unconstrained command, a command can be sent to SATA disk even if
SATA disk status is BUSY, ERR or DRQ.

If an ATA reset assert is successful but ATA reset de-assert fails, then
it will retry the reset de-assert. If reset de- assert retry is
successful, we think it is okay to probe the device but actually it
still has Err status.

Apparently we need to retry the ATA reset assertion and de- assertion
instead for this mentioned scenario.

As such, we config ATA reset assert as a constrained command, if ATA
reset de-assert fails, then ATA reset de-assert retry will also
fail. Then we will retry the proper process of ATA reset assert and
de-assert again.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:10:44 -04:00
Xiang Chen
c2c1d9ded0 scsi: hisi_sas: update PHY linkrate after a controller reset
After the controller is reset, we currently may not honour the PHY max
linkrate set via sysfs, in that after a reset we always revert to max
linkrate of 12Gbps, ignoring the value set via sysfs.

This patch modifies to policy to set the programmed PHY linkrate,
honouring the max linkrate programmed via sysfs.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:10:44 -04:00
John Garry
6f7c32d605 scsi: hisi_sas: stop controller timer for reset
We should only have the timer enabled after PHY up after controller
reset, so disable prior to reset.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:10:44 -04:00
Xiang Chen
c6ef895472 scsi: hisi_sas: check sas_dev gone earlier in hisi_sas_abort_task()
It is possible to dereference a NULL-pointer in hisi_sas_abort_task() in
special scenario when the device has been removed.

If an SMP task times-out, it will call hisi_sas_abort_task() to
recover. And currently there is a check in hisi_sas_abort_task() to
avoid the situation of processing the abort for the removed device.

However we have an ordering problem, in that we may reference a task for
the removed device before checking if the device has been removed.

Fix this by only referencing the sas_dev after we know it is still
present.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:10:44 -04:00
Xiang Chen
a14da7a20d scsi: hisi_sas: fix PI memory size
There are 28 bytes of protection information record of SSP for v3 hw, 16
bytes for v2 hw, and probably 24 for v1 hw (forgotten now).

So use a value big enough in hisi_sas_command_table_ssp.prot to cover
all cases.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:10:44 -04:00
Xiang Chen
cd938e535e scsi: hisi_sas: check host frozen before calling "done" function
When the host is frozen in SCSI EH state, at any point after the LLDD
sets SAS_TASK_STATE_DONE for the sas_task task state, libsas may free
the task; see sas_scsi_find_task().

This puts the LLDD in a difficult position, in that once it sets
SAS_TASK_STATE_DONE for the task state it should not reference the
sas_task again. But the LLDD needs will check the sas_task indirectly in
calling task->task_done()->sas_scsi_task_done() or sas_ata_task_done()
(to check if the host is frozen state actually).

And the LLDD cannot set SAS_TASK_STATE_DONE for the task state after
task->task_done() is called (as the sas_task is free'd at this point).

This situation would seem to be a problem made by libsas.

To work around, check in the LLDD whether the host is in frozen state to
ensure it is ok to call task->task_done() function. If in the frozen
state, we rely on SCSI EH and libsas to free the sas_task directly.

We do not do this for the following IO types:

 - SMP - they are managed in libsas directly, outside SCSI EH
 - Any internally originated IO, for similar reason

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:10:44 -04:00
Xiang Chen
b81b6cce58 scsi: hisi_sas: Add some checks to avoid free'ing a sas_task twice
If the SCSI host enters EH, any pending IO will be processed by SCSI
EH. However it is possible that SCSI EH will try to abort the IO and
also at the same time the IO completes in the driver. In this situation
there is a small chance of freeing the sas_task twice.

Then if another IO re-uses freed sas_task before the second time of
free'ing sas_task, it is possible to free incorrect sas_task.

To avoid this situation, add some checks to increase reliability.  The
sas_task task state flag SAS_TASK_STATE_ABORTED is used to mutually
protect the LLDD and libsas freeing the task.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:10:43 -04:00
Xiang Chen
24cf43612d scsi: hisi_sas: optimise the usage of DQ locking
In the DQ tasklet processing it is not necessary to take the DQ lock, as
there is no contention between adding slots to the CQ and removing slots
from the matching DQ.

In addition, since we run each DQ in a separate tasklet context, there
would be no possible contention between DQ processing running for the
same queue in parallel.

It is still necessary to take hisi_hba lock when free'ing slots.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:10:43 -04:00
James Smart
3e21d1cb0f scsi: lpfc: Comment cleanup regarding Broadcom copyright header
Fix small formatting and wording nits in Broadcom copyright header

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:03:16 -04:00
James Smart
e65d8c3411 scsi: lpfc: update driver version to 12.0.0.3
Update the driver version to 12.0.0.3

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:03:16 -04:00
James Smart
11f0e34ff4 scsi: lpfc: Enhance log messages when reporting CQE errors
Enhance log messages for CQEs as they were not reporting certain fields.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:03:16 -04:00
James Smart
44c2757b76 scsi: lpfc: Fix up log messages and stats counters in IO submit code path
Fix up log messages and add an fcp error stat counter in the IO submit
code path to make diagnosing problems easier

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:03:16 -04:00
James Smart
d38f33b304 scsi: lpfc: Driver NVME load fails when CPU cnt > WQ resource cnt
If the cpu count is larger than the number of WQ resources available,
adapter attachment eventually failes due to a WQ_CREATE failure.

Calculate the number of WQs desired (which initializes to cpu count)
after accounting for the number of queues the adapter supports and the
number allocated to SCSI and the control/ELS path, and scale down if
necessary.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:03:16 -04:00
James Smart
23288b78a1 scsi: lpfc: Handle new link fault code returned by adapter firmware.
The driver encounters a link event ACQE with a fault code it doesn't
recognize, it logs an "Invalid" fault type and futher treats the unknown
value as a mailbox command failure.  First off, there is no "invalid"
value, only values that are unknown. Secondly, the fault code doesn't
indicate status - the rest of the ACQE contains that status so there is
no reason to "fail the commands".

Change the "Invalid" to "Unknown". There is no "invalid" code value.

Separate fault code parsing and message genaration from any mbx handling
status.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:03:16 -04:00
James Smart
a72d56b2a6 scsi: lpfc: Correct fw download error message
In situations when the firmware image in inappropriate for the chip
type, initial validation checks were light, allowing the checks to pass,
thus allowing the firmware to be downloaded.  Eventually, after the
download, the chip rejects the firmware but it is logged as a generic
firmware download error.

Revise the initial checks to validate the image vs asic type so that the
correct message is displayed and the download process is avoided.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:03:16 -04:00
James Smart
48f8fdb4b4 scsi: lpfc: enhance LE data structure copies to hardware
The driver builds the control structures in host memory using
definitions that are based on 32-bit words. After building the structure
it is then written to the adapter.

This patch slightly optimizes LE hosts by copying the structures via
64-bit copies.  This is doable as the adapter interface is LE thus there
is no byteswapping as the copy is performed.

The same optimization would be nice on BE systems, but when byteswapping
occurs, it swaps 32-bit words as well, thus trashing the control
structure. Given amount of code that is dependent upon the 32-bit word
definition, it was decided to not change things for the minor
optimization. Thus PPC 64-bit systems sticks with doing 32-bit copies.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:03:15 -04:00
James Smart
cd2400715c scsi: lpfc: Change IO submit return to EBUSY if remote port is recovering
I/O submission paths in the lpfc nvme path are rejecting the io with an
error code that reflects back to the callee as a hard io failure. Many
of these conditions are transient and would likely resolve if retried.

Correct by returning -EBUSY, which the FC transport triggers off of to
return busy status codes to the blk-mq layer.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:03:15 -04:00
Chad Dupuis
ab7ad49d01 scsi: qedf: Update version number to 8.33.16.20
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:57:11 -04:00
Chad Dupuis
5d1c8b5ba0 scsi: qedf: Update copyright for 2018
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:57:11 -04:00
Chad Dupuis
f3690a89f9 scsi: qedf: Add more defensive checks for concurrent error conditions
During an uplink toggle test all error handling is done via timeout and
firmware error conditions which can occur concurrently:

 - SCSI layer timeouts
 - Error detect CQEs
 - Firmware detected underruns
 - ABTS timeouts

All these concurrent events require more defensive checks in the driver
including:

 - Check both internally and externally generated aborts to make sure the
   xid is not already been aborted in another context or in cleanup.

 - Check back pointers in qedf_cmd_timeout to verify the context of the
   io_req, fcport and qedf_ctx

 - Check rport state in host reset handler to not reset the whole host
   if the rport is already uploaded or in the process of relogin

 - Check to state for an fcport before initiating a middle path ELS
   request

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:57:11 -04:00
Chad Dupuis
4f4616ceeb scsi: qedf: Set the UNLOADING flag when removing a vport
Similar to what we do when we remove a PCI function, set the
QEDF_UNLOADING flag to prevent any requests from being queued while a
vport is being deleted.  This prevents any requests from getting stuck
in limbo when the vport is unloaded or deleted.

Fixes the crash:

PID: 106676  TASK: ffff9a436aa90000  CPU: 12  COMMAND: "multipathd"
 #0 [ffff9a43567d3550] machine_kexec+522 at ffffffffaca60b2a
 #1 [ffff9a43567d35b0] __crash_kexec+114 at ffffffffacb13512
 #2 [ffff9a43567d3680] crash_kexec+48 at ffffffffacb13600
 #3 [ffff9a43567d3698] oops_end+168 at ffffffffad117768
 #4 [ffff9a43567d36c0] no_context+645 at ffffffffad106f52
 #5 [ffff9a43567d3710] __bad_area_nosemaphore+116 at ffffffffad106fe9
 #6 [ffff9a43567d3760] bad_area+70 at ffffffffad107379
 #7 [ffff9a43567d3788] __do_page_fault+1247 at ffffffffad11a8cf
 #8 [ffff9a43567d37f0] do_page_fault+53 at ffffffffad11a915
 #9 [ffff9a43567d3820] page_fault+40 at ffffffffad116768
    [exception RIP: qedf_init_task+61]
    RIP: ffffffffc0e13c2d  RSP: ffff9a43567d38d0  RFLAGS: 00010046
    RAX: 0000000000000000  RBX: ffffbe920472c738  RCX: ffff9a434fa0e3e8
    RDX: ffff9a434f695280  RSI: ffffbe920472c738  RDI: ffff9a43aa359c80
    RBP: ffff9a43567d3950   R8: 0000000000000c15   R9: ffff9a3fb09b9880
    R10: ffff9a434fa0e3e8  R11: ffff9a43567d35ce  R12: 0000000000000000
    R13: ffff9a434f695280  R14: ffff9a43aa359c80  R15: ffff9a3fb9e005c0
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:57:11 -04:00
Chad Dupuis
92bbccdf72 scsi: qedf: Add additional checks when restarting an rport due to ABTS timeout
There are a couple of kernel cases when we restart a remote port due to
ABTS timeout that we need to handle:

 1. Flush any outstanding ABTS requests when flushing I/Os so that we do
    not hold up the eh_abort handler indefinitely causing process hangs.

 2. Check if we are currently uploading a connection before issuing an
    ABTS.

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:57:11 -04:00
Chad Dupuis
96673e1e22 scsi: qedf: If qed fails to enable MSI-X fail PCI probe
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:57:11 -04:00
Chad Dupuis
65b7beca42 scsi: qedf: Honor default_prio module parameter even if DCBX does not converge
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:57:10 -04:00
Chad Dupuis
4b9b7fabb3 scsi: qedf: Improve firmware debug dump handling
Get all firmware debug data instead of just a grc dump.

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:57:10 -04:00
Saurav Kashyap
f9a4a7f2c0 scsi: qedf: Remove setting DCBX pending during soft context reset
PROBLEM DESCRIPTION:

According to the logs, STAG was changing and it was triggering soft
reset.  In soft reset we used to virtual link down and up and also we
were disabling DCBx flag. Since this was virtual link flap, DCBx never
used to converge again.

SOLUTION:

Code change is to remove disabling DCBx flag from soft reset.

Signed-off-by: Saurav Kashyap <saurav.kashyap@cavium.com>
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:57:10 -04:00
Chad Dupuis
8025c84208 scsi: qedf: Add task id to kref_get_unless_zero() debug messages when flushing requests
Helps to corroborate which requests we can't get reference on and if
it's real bug or not.

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:57:10 -04:00
Chad Dupuis
3f9de7f041 scsi: qedf: Check if link is already up when receiving a link up event from qed
[mkp: typo]

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:57:10 -04:00
Chad Dupuis
a8f192bce1 scsi: qedf: Return request as DID_NO_CONNECT if MSI-X is not enabled
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:57:10 -04:00
Chad Dupuis
adf4884252 scsi: qedf: Release RRQ reference correctly when RRQ command times out
When an RRQ request times out the reference is not getting decremented
correctly as there are still ELS commands leftover when we flush any
pending I/Os during offload:

[  281.788553] [0000:21:00.3]:[qedf_cmd_timeout:58]:4: ELS timeout, xid=0x96a.
...
[  281.788553] [0000:21:00.3]:[qedf_cmd_timeout:58]:4: ELS timeout, xid=0x96a.
[  281.788772] [0000:21:00.3]:[qedf_rrq_compl:182]:4: Entered.
[  281.788774] [0000:21:00.3]:[qedf_rrq_compl:200]:4: rrq_compl: orig io = ffffc90004c556f8, orig xid = 0x81b, rrq_xid = 0x96a, refcount=1
...
[  331.448032] [0000:21:00.3]:[qedf_flush_els_req:1512]:4: Flushing ELS request xid=0x96a refcount=2.

The fix is to call kref_put on the rrq_req in case of timeout as the
timeout handler will call rrq_compl directly vs. a normal completion
where it is call from els_compl.

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:57:10 -04:00
Chad Dupuis
84b2ba6e42 scsi: qedf: Honor priority from DCBX FCoE App tag
We currently hard code the priority in the 8021q tag to 3 for FCoE
traffic.  The vast majority of the time this is fine but if the priority
is something else besides 3, any VLAN ID comparison either in the
non-offload path or offload path will fail and cause dropped frames
where none are expected.

Change the behavior so that the driver default is 3 if we do not get any
DCBX convergence.

If DCBX does converge, then set the FIP/FCoE priority in the following
manner:

 1. If the qedf_default_prio modparam is set use that
 2. If the DCBX FCoE priority is not in range (0..7) use 3
 3. Use the DCBX FCoE priority we get in the driver's DCBX handler

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:57:10 -04:00
Chad Dupuis
ba17d379c2 scsi: qedf: Add dcbx_not_wait module parameter so we won't wait for DCBX convergence to start discovery
This module parameter is to work around cases where we do not receive
the DCBX handler notification from qed but discovery is still possible
if we send out a FIP VLAN request irregardless of the DCBX state.

[mkp: zeroday warning]

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:57:10 -04:00
Chad Dupuis
a93755cf7e scsi: qedf: Sanity check FCoE/FIP priority value to make sure it's between 0 and 7
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:57:10 -04:00
Chad Dupuis
766639cab0 scsi: qedf: Add check for offload before flushing I/Os for target
We need to check that a fcport is offloaded before we try to flush any
requests.  No doing so could lead to undefined results and most likely a
crash.

Fixes the oops:

[  343.971886] [0000:42:00.3]:[qedf_execute_tmf:2070]:8: wait for tm_cmpl timeout!
[  343.971933] BUG: unable to handle kernel paging request at 00000000000024a8
[  343.971949] IP: [<ffffffffa06b8cc6>] qedf_flush_active_ios+0x46/0x260 [qedf]
[  343.971952] PGD 42c569067 PUD 4160fe067 PMD 0
[  343.971954] Oops: 0000 [#1] SMP
[  343.972008] Modules linked in: qedf(OEX) qed(OEX) bnx2i cnic fuse af_packet iscsi_ibft msr xfs intel_rapl sb_edac edac_core x86_pkg_temp_thermal bnx2x geneve intel_powerclamp vxlan coretemp ipmi_ssif ipmi_devintf kvm_intel kvm libiscsi joydev irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel tg3 ip6_udp_tunnel udp_tunnel mdio libcrc32c iTCO_wdt scsi_transport_iscsi uio drbg iTCO_vendor_support iscsi_boot_sysfs dcdbas(X) ipmi_si ansi_cprng aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper ptp pps_core pcspkr libphy lpc_ich mfd_core cryptd fjes wmi ipmi_msghandler button crc8 libfcoe libfc scsi_transport_fc mei_me mei shpchp processor acpi_pad btrfs xor hid_generic usbhid raid6_pq sd_mod sr_mod cdrom mgag200 crc32c_intel i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
[  343.972020]  fb_sys_fops ttm ahci ehci_pci libahci ehci_hcd drm libata usbcore megaraid_sas usb_common sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4 [last unloaded: qedf]
[  343.972022] Supported: Yes, External
[  343.972026] CPU: 30 PID: 12777 Comm: sg_reset Tainted: G        W  OE    X 4.4.73-5-default #1
[  343.972027] Hardware name: Dell Inc. PowerEdge R720/0X3D66, BIOS 2.1.3 11/20/2013
[  343.972029] task: ffff88018dfc0e80 ti: ffff88042bd7c000 task.ti: ffff88042bd7c000
[  343.972036] RIP: 0010:[<ffffffffa06b8cc6>]  [<ffffffffa06b8cc6>] qedf_flush_active_ios+0x46/0x260 [qedf]
[  343.972038] RSP: 0018:ffff88042bd7fbe0  EFLAGS: 00010286
[  343.972039] RAX: 0000000000000000 RBX: ffff88042ce37800 RCX: 0000000000000400
[  343.972040] RDX: 000000000000060e RSI: ffffffffa06be830 RDI: ffff8807e5072cc0
[  343.972041] RBP: 0000000000001000 R08: ffffffffa06bff4d R09: ffff88018dd84580
[  343.972042] R10: 000000000000018b R11: 0000000000000002 R12: 0000000000002003
[  343.972043] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8807e5072cc0
[  343.972046] FS:  00007fc1c8809700(0000) GS:ffff88042fbc0000(0000) knlGS:0000000000000000
[  343.972048] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  343.972049] CR2: 00000000000024a8 CR3: 00000004236ec000 CR4: 00000000001406e0
[  343.972050] Stack:
[  343.972053]  504c78750607e154 ffffffff810a7d10 ffff88042ce37800 0000000000000010
[  343.972055]  0000000000002003 ffff8807ff480c48 ffff8807e5072cc0 ffffc90004ec4ff8
[  343.972057]  ffffffffa06b9b86 ffff880800000010 0000000000000282 ffff88042ce37800
[  343.972058] Call Trace:
[  343.972094]  [<ffffffffa06b9b86>] qedf_initiate_tmf+0x346/0x3e0 [qedf]
[  343.972120]  [<ffffffffa000fa06>] scsi_try_bus_device_reset+0x26/0x40 [scsi_mod]
[  343.972133]  [<ffffffffa001038e>] scsi_ioctl_reset+0x13e/0x260 [scsi_mod]
[  343.972145]  [<ffffffffa000f416>] scsi_ioctl+0x136/0x3d0 [scsi_mod]
[  343.972154]  [<ffffffff812ff6eb>] blkdev_ioctl+0x6bb/0x950
[  343.972164]  [<ffffffff8123cfed>] block_ioctl+0x3d/0x40
[  343.972170]  [<ffffffff81217e2d>] do_vfs_ioctl+0x2cd/0x4a0
[  343.972186]  [<ffffffff81218074>] SyS_ioctl+0x74/0x80
[  343.972193]  [<ffffffff8160916e>] entry_SYSCALL_64_fastpath+0x12/0x6d
[  343.975285] DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x12/0x6d

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:57:10 -04:00
Chad Dupuis
15a93de7e9 scsi: qedf: Fix VLAN display when printing sent FIP frames
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:57:10 -04:00
Chad Dupuis
f32803bb45 scsi: qedf: Add missing skb frees in error path
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:57:09 -04:00
Chad Dupuis
c3ef86f3ec scsi: qedf: Increase the number of default FIP VLAN request retries to 60
Some configurations need more than 30 seconds to respond to a FIP VLAN
request so increase the default to 60 seconds.

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:57:09 -04:00
Chad Dupuis
44c7c85911 scsi: qedf: Synchronize rport restarts when multiple ELS commands time out
If multiple ELS commands time out, such as aborts, they could all try to
restart the same rport and the same time.  This could mean multiple
multiple processes trying to clean up any outstanding commands or trying
to upload the same port.

Add a new flag (QEDF_RPORT_IN_RESET) and check other fcport state flags
before trying to reset the port.

Fixes the crash:

[17501.824701] ------------[ cut here ]------------
[17501.824733] kernel BUG at include/asm-generic/dma-mapping-common.h:65!
[17501.824760] invalid opcode: 0000 [#1] SMP
[17501.824781] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ses enclosure dm_service_time vfat fat sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass joydev btrfs hpilo raid6_pq iTCO_wdt iTCO_vendor_support xor hpwdt ipmi_ssif sg crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul ioatdma lpc_ich glue_helper ablk_helper i2c_i801 shpchp cryptd ipmi_si pcspkr acpi_power_meter ipmi_devintf pcc_cpufreq dca wmi ipmi_msghandler dm_multipath nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sr_mod cdrom sd_mod
[17501.825119]  crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm qedf(OE) drm libfcoe ahci qedi(OE) crct10dif_pclmul libfc libahci uio crct10dif_common crc32c_intel libiscsi libata scsi_transport_iscsi scsi_transport_fc tg3 qede(OE) scsi_tgt hpsa qed(OE) i2c_core ptp scsi_transport_sas pps_core iscsi_boot_sysfs dm_mirror dm_region_hash dm_log dm_mod
[17501.825292] CPU: 8 PID: 10531 Comm: kworker/u96:1 Tainted: G           OE  ------------   3.10.0-693.el7.x86_64 #1
[17501.825330] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 06/02/2016
[17501.825372] Workqueue: fc_rport_eq fc_rport_work [libfc]
[17501.825395] task: ffff88101bca8000 ti: ffff881025278000 task.ti: ffff881025278000
[17501.825424] RIP: 0010:[<ffffffffc042def9>]  [<ffffffffc042def9>] qedf_unmap_sg_list.isra.15+0x89/0x90 [qedf]
[17501.825471] RSP: 0018:ffff88102527bb98  EFLAGS: 00010212
[17501.825493] RAX: ffff8800224eac00 RBX: ffffc9000cd05210 RCX: 0000000000001000
[17501.825520] RDX: 000000007e655e40 RSI: 0000000000001000 RDI: ffff88107fe3b098
[17501.826683] RBP: ffff88102527bba0 R08: ffffffff81a13200 R09: 0000000000000286
[17501.827747] R10: 0000000000000004 R11: 0000000000000005 R12: ffffc9000cd051b8
[17501.828804] R13: ffff881037640c28 R14: 0000000000000007 R15: ffffc9000cd05200
[17501.829850] FS:  0000000000000000(0000) GS:ffff88103fa00000(0000) knlGS:0000000000000000
[17501.830910] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[17501.831966] CR2: 00007f9b94005f38 CR3: 00000000019f2000 CR4: 00000000003407e0
[17501.833027] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[17501.834087] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[17501.835142] Stack:
[17501.836201]  ffff881033ddbb80 ffff88102527bc30 ffffffffc042f834 0000000000002710
[17501.837264]  ffff88102527bbd0 ffffffff8133d9dd ffffc9000cd052a0 ffff88102527bc30
[17501.838325]  ffffffff816a9c65 0000000000000001 ffff88101bca8000 ffffffff810c4810
[17501.839388] Call Trace:
[17501.840446]  [<ffffffffc042f834>] qedf_scsi_done+0x54/0x1d0 [qedf]
[17501.841504]  [<ffffffff8133d9dd>] ? list_del+0xd/0x30
[17501.842537]  [<ffffffff816a9c65>] ? wait_for_completion_timeout+0x125/0x140
[17501.843560]  [<ffffffff810c4810>] ? wake_up_state+0x20/0x20
[17501.844577]  [<ffffffffc0430311>] qedf_initiate_cleanup+0x2e1/0x310 [qedf]
[17501.845587]  [<ffffffffc04305fe>] qedf_flush_active_ios+0x10e/0x260 [qedf]
[17501.846612]  [<ffffffffc042892f>] qedf_cleanup_fcport+0x5f/0x370 [qedf]
[17501.847613]  [<ffffffffc04292d8>] qedf_rport_event_handler+0x398/0x950 [qedf]
[17501.848602]  [<ffffffff810cdc7c>] ? dequeue_entity+0x11c/0x5d0
[17501.849581]  [<ffffffff81098a2b>] ? __internal_add_timer+0xab/0x130
[17501.850555]  [<ffffffff810ce54e>] ? dequeue_task_fair+0x41e/0x660
[17501.851528]  [<ffffffffc03241a4>] fc_rport_work+0xf4/0x6c0 [libfc]
[17501.852490]  [<ffffffff810a881a>] process_one_work+0x17a/0x440
[17501.853446]  [<ffffffff810a94e6>] worker_thread+0x126/0x3c0

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:57:09 -04:00
himanshu.madhani@cavium.com
3f9da25602 scsi: qla2xxx: Update driver version to 10.00.00.07-k
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:46:12 -04:00
Quinn Tran
84905dfe78 scsi: qla2xxx: Fix TMF and Multi-Queue config
For target mode, task management command is queued to specific cpu base
on where the SCSI command is residing.  This prevent race condition of
task management command getting ahead of regular scsi command.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:46:12 -04:00
himanshu.madhani@cavium.com
fc31b7a803 scsi: qla2xxx: Prevent relogin loop by removing stale code
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:46:12 -04:00
Quinn Tran
36d49c92ef scsi: qla2xxx: Remove stale debug value for login_retry flag
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:46:12 -04:00
Quinn Tran
e25f76549b scsi: qla2xxx: Use predefined get_datalen_for_atio() inline function
- Uses predefine inline function to access add_cdb_len field in ATIO.

 - Return SS_RESIDUAL_UNDER status when sending BUSY

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:46:11 -04:00
Quinn Tran
8ea4faf829 scsi: qla2xxx: Fix Inquiry command being dropped in Target mode
When a connection is established, the target core session may not be
created immediately. Current code will drop/terminate the command based
on the session state. This patch will return BUSY status for any
commands arriving on wire before the session is created.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:46:11 -04:00
Quinn Tran
cc28e0ace9 scsi: qla2xxx: Move GPSC and GFPNID out of session management
Move GPSC & GFPNID commands out of session management to reduce time lag
in reporting the session state to remote port. These commands are not
essential when it comes to maintaining the rport state. Delay sending
these commands after rport state is set to Online.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:46:11 -04:00
Quinn Tran
bee8b84686 scsi: qla2xxx: Reduce redundant ADISC command for RSCNs
For each RSCN that triggers a rescan of the fabric, ADISC is used to
revalidate an existing session. If the RSCN is not affecting all
existing sessions, then driver should not send redundant ADISC for all
existing sessions.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:46:11 -04:00
Quinn Tran
1d317b2123 scsi: qla2xxx: Delete session for nport id change
This patch fixes regression introduced by commit a4239945b8 ("scsi:
qla2xxx: Add switch command to simplify fabric discovery") by scheduling
session deletion when Nport ID changes.

[mkp: clarified commit]

Fixes: a4239945b8 ("scsi: qla2xxx: Add switch command to simplify fabric discovery")
Cc: <stable@vger.kernel.org>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:46:11 -04:00
Quinn Tran
29528491cc scsi: qla2xxx: Fix Rport and session state getting out of sync
This patch fixes rport state and session state getting out of sync.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:46:11 -04:00
Quinn Tran
625a1caefe scsi: qla2xxx: Fix sending ADISC command for login
This patch fixes login_retry login for ADISC command.

when login_retry count reaches 0, further attempt to send ADISC command
is ignored by the code. Remove this redundant login_retry count check
from qla24xx_fcport_handle_login()

[mkp: fix typo]

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:46:11 -04:00
Chaitra P B
f6972d7180 scsi: mpt3sas: Update driver version "25.100.00.00"
Update driver version to match OOB/internal driver version.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:40:05 -04:00
Chaitra P B
87b3576e9e scsi: mpt3sas: fix possible memory leak.
In ioctl exit path driver refers ioc_list to free memory associated with
diag buffers and event_log pointer used to save events by driver.
If ctl_exit() func is called after unregistering driver, then ioc_list will
be empty and hence driver will not be able to free the allocated memory
which in turn causes memory leak.
So call ctl_exit() function before unregistering mpt3sas driver.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:40:05 -04:00
Chaitra P B
c1a6c5ac42 scsi: mpt3sas: For NVME device, issue a protocol level reset
1) Manufacturing Page 11 contains parameters to control internal
   firmware behavior. Based on AddlFlags2 field FW/Driver behaviour can
   be changed, (flag tm_custom_handling is used for this)

a) For PCIe device, protocol level reset should be used if flag
   tm_custom_handling is 0.  Since Abort Task Set, LUN reset and Target
   reset will result in a protocol level reset. Drivers should issue
   only one type of this reset, if that fails then it should escalate to
   a controller reset (diag reset/OCR).

b) If the driver has control over the TM reset timeout value, then
   driver should use the value exposed in PCIe Device Page 2 for pcie
   device (field ControllerResetTO).

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:39:49 -04:00
Chaitra P B
65928d1f41 scsi: mpt3sas: Update MPI Headers
Update MPI Files to support protocol level reset for NVMe device.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:34:20 -04:00
Chaitra P B
3d29ed85fc scsi: mpt3sas: Report Firmware Package Version from HBA Driver.
Added function _base_display_fwpkg_version, which sends FWUpload request
to pull FW package version from FW Image Header.  Now driver prints FW
package version in addition to FW version if the PackageVersion is
valid.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:34:20 -04:00
Chaitra P B
22a923c315 scsi: mpt3sas: Cache enclosure pages during enclosure add.
In function _scsih_add_device, for each device connected to an
enclosure, driver reads the enclosure page(To get details like enclosure
handle, enclosure logical ID, enclosure level etc.)

With this patch, instead of reading enclosure page everytime, driver
maintains a list for enclosure device(During enclosure add event,
enclosure device is added to the list and removed from the list on
delete events) and uses the enclosure page from the list.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:34:20 -04:00
Chaitra P B
79eb96d6ca scsi: mpt3sas: Allow processing of events during driver unload.
Events were not processed during driver unload, hence unloading of
driver doesn't complete when drives are disconnected while unloading of
driver.  So don't block events in ISR path, i,e., remove the flag
ioc->remove_host so that events are getting processed during driver
unload.  Thus allowing driver unload to complete by processing drive
removal events during driver unload.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:34:20 -04:00
Chaitra P B
1537d1bfc5 scsi: mpt3sas: Increase event log buffer to support 24 port HBA's.
For 24 port HBA's events generated by IOC are more in certain cases and
the current circular buffer may be overwritten.Hence increased the event
log buffer to accommodate more events.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:34:20 -04:00
Chaitra P B
95540b8eaf scsi: mpt3sas: Added support for SAS Device Discovery Error Event.
The SAS Device Discovery Error Event is sent to the host when discovery
for a particular device is failed during discovery, even after maximum
retries by the IOC.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:34:20 -04:00
Chaitra P B
e21fef6f33 scsi: mpt3sas: Enhanced handling of Sense Buffer.
Enhanced DMA allocation for Sense Buffer, if the allocation does not fit
within same 4GB.Introduced is_MSB_are_same function to check if allocted
buffer within 4GB range or not.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:34:19 -04:00
Chaitra P B
74522a92bb scsi: mpt3sas: Optimize I/O memory consumption in driver.
For every IO, memory of PAGE size is allocated for handling NVMe native
PRPS. And in addition to that for every IO (chains need per IO * chain
buffer size, e.g. 38 * 128byte) amount of memory is allocated for chain
buffers.

However, at any point of time; the IO request can be for NVMe target
device (where PRP's page is used for framing PRP's) or can be for SCSI
target device (where chain buffers are used for framing chain
SGE's). This patch modifies the driver to reuse same pre-allocated PRP
page buffers as a chain buffer for IO's targeted for SCSI target
devices. No need to allocate separate buffers for chain SGE's buffers.

Suppose if the number of chain buffers need for IO doesn't fit in the
PRP Page size then driver maintain's separate buffers for those extra
chain buffers that exceeds the PRP page size. For example consider PRP
page size as 4K and chain buffer size as 128 bytes, then number of chain
buffers that can fit in PRP page is 4096/128 => 32. if the number of
chain buffer need per IO exceeds 32; for example consider number of
chains need per IO is 36 then for remaining 4 chain buffer's driver
allocates them individual.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:34:19 -04:00
Chaitra P B
93204b782a scsi: mpt3sas: Lockless access for chain buffers.
Introduces Chain lookup table/tracker and implements accessing chain
buffer using smid.  Removed link list based access of chain buffer which
requires lock and allocated as many chains needed.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:34:19 -04:00
Chaitra P B
cd33223b59 scsi: mpt3sas: Pre-allocate RDPQ Array at driver boot time.
Instead of allocating RDPQ array (This stores the address's of each RDPQ
pools) at run time, now it will be allocated once during driver load
time and same will be reused during host reset operation also (instead
of allocating & freeing this buffer on the fly during every host reset
operation) and then freed during driver unload.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:34:19 -04:00
Chaitra P B
cf6bf9710c scsi: mpt3sas: Bug fix for big endian systems.
This patch fixes sparse warnings and bugs on big endian systems.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 00:34:19 -04:00
Sudarsana Reddy Kalluru
cac6f69154 qed: Add support for Unified Fabric Port.
This patch adds driver changes for supporting the Unified Fabric Port
(UFP). This is a new paritioning mode wherein MFW provides the set of
parameters to be used by the device such as traffic class, outer-vlan
tag value, priority type etc. Drivers receives this info via notifications
from mfw and configures the hardware accordingly.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-07 23:46:10 -04:00
Christoph Hellwig
21e07dba9f scsi: reduce use of block bounce buffers
We can rely on the dma-mapping code to handle any DMA limits that is
bigger than the ISA DMA mask for us (either using an iommu or swiotlb),
so remove setting the block layer bounce limit for anything but the
unchecked_isa_dma case, or the bouncing for highmem pages.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
2018-05-07 07:15:41 +02:00
Colin Ian King
23b389c231 scsi: mpt3sas: fix spelling mistake: "disbale" -> "disable"
Trivial fix to spelling mistake in module parameter description text

[mkp: applied by hand]

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-01 23:35:51 -04:00
Colin Ian King
55d9a1d241 scsi: megaraid_sas: fix spelling mistake: "disbale" -> "disable"
Trivial fix to spelling mistake in module parameter description text

[mkp: applied by hand]

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-01 23:35:43 -04:00
Colin Ian King
1895bd7bce scsi: esas2r: fix spelling mistake: "asynchromous" -> "asynchronous"
Trivial fix to spelling mistake in module description text

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-01 23:32:31 -04:00
Colin Ian King
35dc0b07b3 scsi: isci: remove redundant check on in_connection_align_insertion_frequency
The sanity check on u->in_connection_align_insertion_frequency is being
performed twice and hence the first check can be removed since it is
redundant. Cleans up cppcheck warning:

drivers/scsi/ibmvscsi/ibmvscsi.c:1711: (warning) Identical inner 'if'
condition is always true.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-01 23:31:45 -04:00
YueHaibing
9407253f4a scsi: a100u2w: Use module_pci_driver
Remove boilerplate code by using macro module_pci_driver.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-01 23:31:40 -04:00
YueHaibing
3e1bbc5685 scsi: wd719x: Use module_pci_driver
Remove boilerplate code by using macro module_pci_driver.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-01 23:30:12 -04:00
YueHaibing
5f1c721196 scsi: am53c974: Use module_pci_driver
Remove boilerplate code by using macro module_pci_driver.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-01 23:29:41 -04:00
Dave Carroll
7d3af7d96a scsi: aacraid: Correct hba_send to include iu_type
commit b60710ec7d ("scsi: aacraid: enable sending of TMFs from
aac_hba_send()") allows aac_hba_send() to send scsi commands, and TMF
requests, but the existing code only updates the iu_type for scsi
commands. For TMF requests we are sending an unknown iu_type to
firmware, which causes a fault.

Include iu_type prior to determining the validity of the command

Reported-by: Noah Misner <nmisner@us.ibm.com>
Fixes: b60710ec7d ("aacraid: enable sending of TMFs from aac_hba_send()")
Fixes: 423400e64d ("aacraid: Include HBA direct interface")
Tested-by: Noah Misner <nmisner@us.ibm.com>
cc: stable@vger.kernel.org
Signed-off-by: Dave Carroll <david.carroll@microsemi.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-01 23:27:18 -04:00
Jim Gill
f4b024271a scsi: vmw-pvscsi: return DID_BUS_BUSY for adapter-initated aborts
The vmw_pvscsi driver returns DID_ABORT for commands aborted internally
by the adapter, leading to the filesystem going read-only. Change the
result to DID_BUS_BUSY, causing the kernel to retry the command.

Signed-off-by: Jim Gill <jgill@vmware.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-01 23:15:40 -04:00
Christoph Hellwig
63aed100e2 scsi: scsi_transport_sas: don't bounce highmem pages for the smp handler
All three instance of ->smp_handler deal with highmem backed requests
just fine.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-01 23:11:15 -04:00
Arnd Bergmann
f990bee3f1 scsi: ips: fix firmware timestamps for 32-bit
do_gettimeofday() is deprecated since it will stop working in 2038 on
32-bit platforms, leading to incorrect times passed to the firmware.
On 64-bit platforms the current code appears to be fine, as the
calculation passes an 8-bit century number into the firmware that can
represent times long in the future (possibly until 25599).

Using ktime_get_real_seconds() to get a 64-bit seconds value and
time64_to_tm() to convert it into the firmware format greatly simplifies
the ips timekeeping code, makes 32-bit and 64-bit behave the same way
here, and gets us closer to removing the deprecated interfaces.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-20 19:40:17 -04:00
Arnd Bergmann
feeeca4ce2 scsi: esas2r: use ktime_get_real_seconds()
do_gettimeofday() is deprecated because of the y2038 overflow.  Here, we
use the result to pass into a 32-bit field in the firmware, which still
risks an overflow, but if the firmware is written to expect unsigned
values, it can at least last until y2106, and there is not much we can
do about it.

This changes do_gettimeofday() to ktime_get_real_seconds(), which at
least simplifies the code a bit, and avoids the deprecated
interface. I'm adding a comment about the overflow to document what
happens.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-20 19:40:16 -04:00
YueHaibing
f9c25ccfc1 scsi: mvumi: Using module_pci_driver
Remove boilerplate code by using macro module_pci_driver.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-20 19:40:11 -04:00
Colin Ian King
4bc83b3f27 scsi: isci: Fix infinite loop in while loop
In the case when the phy_mask is bitwise anded with the phy_index bit is
zero the continue statement currently jumps to the next iteration of the
while loop and phy_index is never actually incremented, potentially
causing an infinite loop if phy_index is less than SCI_MAX_PHS. Fix this
by turning the while loop into a for loop.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-20 19:23:32 -04:00
Jia-Ju Bai
4011f07660 scsi: st: Replace GFP_ATOMIC with GFP_KERNEL in new_tape_buffer
new_tape_buffer() is never called in atomic context. new_tape_buffer()
is only called by st_probe(), which is only set as ".probe" in struct
scsi_driver.

Despite never getting called from atomic context, new_tape_buffer()
calls kzalloc() with GFP_ATOMIC, which does not sleep for allocation.
GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL, which
can sleep and improve the possibility of sucessful allocation.

This is found by a static analysis tool named DCNS written by myself.
And I also manually check it.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-20 19:14:37 -04:00
Jia-Ju Bai
1f618aac2f scsi: st: Replace GFP_ATOMIC with GFP_KERNEL in st_probe
st_probe() is never called in atomic context. st_probe() is only set as
".probe" in struct scsi_driver.

Despite never getting called from atomic context, st_probe() calls
kzalloc() with GFP_ATOMIC, which does not sleep for allocation.
GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL, which
can sleep and improve the possibility of sucessful allocation.

This is found by a static analysis tool named DCNS written by myself.
And I also manually check it.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-20 19:14:37 -04:00
Martin Wilck
c360652006 scsi: devinfo: BLIST_RETRY_ASC_C1 for Fujitsu ETERNUS
On Fujitsu ETERNUS systems, sense code ABORTED COMMAND with ASC/Q C1/01
is used to indicate temporary condition where the storage-internal path
to a target is switched from one controller to another. SCSI commands
that return with this error code must be retried unconditionally
(i.e. without the "maybe_retry" logic in scsi_decide_disposition);
otherwise dm-multipath might initiate a failover from a healthy path
e.g. for REQ_FAILFAST_DEV commands.

Introduce a new blist flag for this case.

[mkp: applied by hand]

Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-20 19:14:36 -04:00
Martin Wilck
29cfc2ab71 scsi: devinfo: add BLIST_RETRY_ITF for EMC Symmetrix
EMC Symmetrix returns 'internal target error' for a variety of
conditions, most of which will be transient.  So we should always retry
it, even with failfast set.  Otherwise we'd get spurious path flaps with
multipath.

Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-20 19:14:36 -04:00
Martin Wilck
358fda5ff4 scsi: devinfo: warn on undefined blist flags
Warn if a device (or the user) sets blist flags which are unknown
or have been removed. This should enable us to reuse freed blist
bits in later releases.

Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-20 19:14:35 -04:00
Martin Wilck
1409880357 scsi: devinfo: change blist_flag_t to 64bit
Space for SCSI blist flags is gradually running out. Change the type to
__u64 and fix a checkpatch complaint about symbolic mode flags in
scsi_devinfo.c.

Make checkpatch happy by replacing simple_strtoul() with kstrtoull().

Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-20 19:14:35 -04:00
Martin Wilck
659c1c1b29 scsi: devinfo: use const_ilog2 for array indices
Use the just introduced const_ilog2() macro to avoid sparse errors.

Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-20 19:14:28 -04:00
Long Li
2217a47de4 scsi: storvsc: Select channel based on available percentage of ring buffer to write
This is a best effort for estimating on how busy the ring buffer is for
that channel, based on available buffer to write in percentage. It is
still possible that at the time of actual ring buffer write, the space
may not be available due to other processes may be writing at the time.

Selecting a channel based on how full it is can reduce the possibility
that a ring buffer write will fail, and avoid the situation a channel is
over busy.

Now it's possible that storvsc can use a smaller ring buffer size
(e.g. 40k bytes) to take advantage of cache locality.

Signed-off-by: Long Li <longli@microsoft.com>
Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-20 15:38:38 -04:00
Long Li
f286299c1d scsi: storvsc: Set up correct queue depth values for IDE devices
Unlike SCSI and FC, we don't use multiple channels for IDE.  Also fix
the calculation for sub-channels.

Signed-off-by: Long Li <longli@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-20 15:36:02 -04:00
Bart Van Assche
ccce20fc79 scsi: sd_zbc: Avoid that resetting a zone fails sporadically
Since SCSI scanning occurs asynchronously, since sd_revalidate_disk() is
called from sd_probe_async() and since sd_revalidate_disk() calls
sd_zbc_read_zones() it can happen that sd_zbc_read_zones() is called
concurrently with blkdev_report_zones() and/or blkdev_reset_zones().  That can
cause these functions to fail with -EIO because sd_zbc_read_zones() e.g. sets
q->nr_zones to zero before restoring it to the actual value, even if no drive
characteristics have changed.  Avoid that this can happen by making the
following changes:

- Protect the code that updates zone information with blk_queue_enter()
  and blk_queue_exit().
- Modify sd_zbc_setup_seq_zones_bitmap() and sd_zbc_setup() such that
  these functions do not modify struct scsi_disk before all zone
  information has been obtained.

Note: since commit 055f6e18e0 ("block: Make q_usage_counter also track
legacy requests"; kernel v4.15) the request queue freezing mechanism also
affects legacy request queues.

Fixes: 89d9475610 ("sd: Implement support for ZBC devices")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: stable@vger.kernel.org # v4.16
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-19 00:04:10 -04:00
Bart Van Assche
c976562162 scsi: sd_zbc: Let the SCSI core handle ILLEGAL REQUEST / ASC 0x21
scsi_io_completion() translates the sense key ILLEGAL REQUEST / ASC 0x21 into
ACTION_FAIL. That means that setting cmd->allowed to zero in sd_zbc_complete()
for this sense code / ASC combination is not necessary. Hence remove the code
that resets cmd->allowed from sd_zbc_complete().

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-19 00:00:44 -04:00
Bart Van Assche
354f113205 scsi: sd_zbc: Change the type of the ZBC fields into u32
This patch does not change any functionality but makes it clear that it is on
purpose that these fields are 32 bits wide.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-19 00:00:44 -04:00
Christoph Hellwig
9027b15d51 scsi: storsvc: don't set a bounce limit
The default already is to never bounce, so the call is a no-op.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-19 00:00:44 -04:00
Christoph Hellwig
3e58c5cf16 scsi: iscsi_tcp: don't set a bounce limit
The default already is to never bounce, so the call is a no-op.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-19 00:00:44 -04:00
Souptick Joarder
0cac8e1bba scsi: sg: Change return type to vm_fault_t
Use new return type vm_fault_t for fault handler in struct
vm_operations_struct.

Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-19 00:00:44 -04:00
Michael Schmitz
3109e5ae03 scsi: zorro_esp: New driver for Amiga Zorro NCR53C9x boards
New combined SCSI driver for all ESP based Zorro SCSI boards for m68k Amiga.

Code largely based on board specific parts of the old drivers (blz1230.c,
blz2060.c, cyberstorm.c, cyberstormII.c, fastlane.c which were removed after
the 2.6 kernel series for lack of maintenance) with contributions by Tuomas
Vainikka (TCQ bug tests and workaround) and Finn Thain (TCQ bugfix by use of
PIO in extended message in transfer).

New Kconfig option and Makefile entries for new Amiga Zorro ESP SCSI driver
included in this patch.

Use DMA transfers wherever possible, with board-specific DMA set-up functions
copied from the old driver code. Three byte reselection messages do appear to
cause DMA timeouts. So wire up a PIO transfer routine for these
instead. esp_reselect_with_tag explicitly sets
esp->cmd_block_dma as target address for the message bytes but PIO
requires a virtual address.  Substiute kernel virtual address
esp->cmd_block in PIO transfer call if DMA address is esp->cmd_block_dma
and phase is message in.

PIO code taken from mac_esp.c where the reselection timeout issue was debugged
and fixed first, with minor macro and function rename.

Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
Reviewed-by: Finn Thain <fthain@telegraphics.com.au>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Christian T. Steigies <cts@debian.org>
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-19 00:00:44 -04:00
Mahesh Rajashekhara
505aa4b6a8 scsi: sd: Defer spinning up drive while SANITIZE is in progress
A drive being sanitized will return NOT READY / ASC 0x4 / ASCQ
0x1b ("LOGICAL UNIT NOT READY. SANITIZE IN PROGRESS").

Prevent spinning up the drive until this condition clears.

[mkp: tweaked commit message]

Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microsemi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 23:37:40 -04:00
Vinson Lee
fb1633d56b scsi: megaraid_sas: Do not log an error if FW successfully initializes.
Fixes: 2d2c233167 ("scsi: megaraid_sas: modified few prints in OCR and IOC INIT path")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 23:37:40 -04:00
Ohad Sharabi
6667e6d91c scsi: ufs: add trace event for ufs upiu
Add UFS Protocol Information Units(upiu) trace events for ufs driver,
used to trace various ufs transaction types- command, task-management
and device management.
The trace-point format is generic and can be easily adapted to trace
other upius if needed.
Currently tracing ufs transaction of type 'device management', which
this patch introduce, cannot be obtained from any other trace.
Device management transactions are used for communication with the
device such as reading and writing descriptor or attributes etc.

Signed-off-by: Ohad Sharabi <ohad.sharabi@sandisk.com>
Reviewed-by: Stanislav Nijnikov <stanislav.nijnikov@wdc.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 23:37:39 -04:00
Colin Ian King
0cfce53a78 scsi: fnic: fix spelling mistake in fnic stats "Abord" -> "Abort"
Trivial fix to spelling mistake in fnic stats message text.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 22:58:49 -04:00
Douglas Gilbert
4f2c8bf6bd scsi: scsi_debug: IMMED related delay adjustments
A patch titled: "[PATCH v2] scsi_debug: implement IMMED bit" introduced
long delays to the Start stop unit (SSU) and Synchronize cache (SC)
commands when the IMMED bit is clear.  This patch makes those delays
more realistic. It causes SSU to only delay when the start stop state is
changed; SC only delays when there's been a write since the previous
SC. It also reduced the SC delay from 1 second to 50 milliseconds.

Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
Reported-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 22:58:49 -04:00
Chris Leech
af17092810 scsi: iscsi: respond to netlink with unicast when appropriate
Instead of always multicasting responses, send a unicast netlink message
directed at the correct pid.  This will be needed if we ever want to
support multiple userspace processes interacting with the kernel over
iSCSI netlink simultaneously.  Limitations can currently be seen if you
attempt to run multiple iscsistart commands in parallel.

We've fixed up the userspace issues in iscsistart that prevented
multiple instances from running, so now attempts to speed up booting by
bringing up multiple iscsi sessions at once in the initramfs are just
running into misrouted responses that this fixes.

Signed-off-by: Chris Leech <cleech@redhat.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 22:58:49 -04:00
Xose Vazquez Perez
37b37d2609 scsi: scsi_dh: replace too broad "TP9" string with the exact models
SGI/TP9100 is not an RDAC array:
    ^^^
https://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=blob;f=libmultipath/hwtable.c;h=88b4700beb1d8940008020fbe4c3cd97d62f4a56;hb=HEAD#l235

This partially reverts commit 35204772ea ("[SCSI] scsi_dh_rdac :
Consolidate rdac strings together")

[mkp: fixed up the new entries to align with rest of struct]

Cc: NetApp RDAC team <ng-eseries-upstream-maintainers@netapp.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: SCSI ML <linux-scsi@vger.kernel.org>
Cc: DM ML <dm-devel@redhat.com>
Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:08 -04:00
Xose Vazquez Perez
b15578ddab scsi: devinfo: delete duplicate "Generic"/"USB Storage-SMC" device
The revision field is currently unused by the devinfo pattern matching
code. Combine two blacklist entries into one.

$ egrep "Generic.*Storage-SMC" /proc/scsi/device_info
'Generic' 'USB Storage-SMC' 0x402
'Generic' 'USB Storage-SMC' 0x402

[mkp: tweaked commit desc]

Cc: Hannes Reinecke <hare@suse.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
Cc: SCSI ML <linux-scsi@vger.kernel.org>
Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:08 -04:00
James Smart
40e4a2e15c scsi: lpfc: update driver version to 12.0.0.2
Update the driver version to 12.0.0.2

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:07 -04:00
James Smart
b0a00d8d2b scsi: lpfc: Correct missing remoteport registration during link bounces
Remote port disappearance/reappearances would cause a series of RSCN
events to be delivered to the driver. During the resulting GID_FT
handling, the driver clears the fc4 settings on the remote port, which
makes it skip registration. As such, the nvme associations eventually
fail and return io errors to the applications.

Correct by not clearng the nlp_fc4_types for all nodes in
lpfc_issue_gidft.  Instead, when the GID_FT response is handled, clear
the nlp_fc4_types of FCP and NVME prior to evaluating the fc4_type
returned by the GID_FT response.  This approach leaves "skipped" nodes
with their nlp_fc4_types intacted.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:06 -04:00
James Smart
66a85155d4 scsi: lpfc: Fix NULL pointer reference when resetting adapter
Points referencing local port structures didn't accommodate cases where
the localport may not be registered yet.

Add NULL pointer checks to logic.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:06 -04:00
James Smart
b15bd3e621 scsi: lpfc: Fix nvme remoteport registration race conditions
On tests adding and removing a remote port, calls to nvme_info would
eventually show fewer target ports discovered than were present in the
san. Additionally, the following error messages were seen:

  6031 RemotePort Registration failed err: -116, DID x471301

There is a race condition that exists between the driver and the nvme
transport on remote port unregister vs the confirmed deletion. It's
possible that the driver may rediscover the remote port and reregister
the remote port before a prior unregister delete callback was made (as
it rebinded to the prior remoteport structure). However, the driver was
coded to expect the callback before seeing the remote port again thus a
new registration. The logic results in the driver having an invalid
remoteport pointer set.

Correct by tracking when waiting for the delete callback. In cases where
the ndlp remoteport pointer is updated, it is only cleared when the wait
has not been superceded by a prior registration.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:05 -04:00
James Smart
b04744ce52 scsi: lpfc: Fix driver not recovering NVME rports during target link faults
During target-side port faults, the driver would not recover all target
port logins. This resulted in a loss of nvme device discovery.

The driver is coded to wait for all GID_FT requests to complete before
restarting discovery. A fault is seen where the outstanding GIT_FT
counts are not properly decremented, thus discovery would never
start. Another fault was found in the clearing of the gidft_inp counter
that would be skipped in this condition. And a third fault found with
lpfc_nvme_register_port that would remove a reverence on the ndlp which
then allows a node swap on a port address change to prematurely remove
the reference and release the ndlp.

The following changes are made:

 - Correct the decrementing of the outstanding GID_FT counters.

 - In RSCN handling, no longer zero the counter before calling to issue
   another GID_FT.

 - No longer remove the reference on the dlp when the ndlp->nrport value
   is not yet null.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:05 -04:00
James Smart
bf316c7851 scsi: lpfc: Fix WQ/CQ creation for older asic's.
The patch to enlarge WQ/CQ creation keys off of an adapter response that
indicates support for the larger values. Older adapters return an
incorrect response and are limited in size.  Thus the adapters fail the
WQ creation steps.

Augment the WQ sizing checks with a check on the older adapter types and
limit them to the restricted sizes.

Fixes: c176ffa084 ("scsi: lpfc: Increase CQ and WQ sizes for SCSI")
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:04 -04:00
James Smart
01466024d2 scsi: lpfc: Fix NULL pointer access in lpfc_nvme_info_show
After making remoteport unregister requests, the ndlp nrport pointer was
stale.

Track when waiting for waiting for unregister completion callback and
adjust nldp pointer assignment.  Add a few safety checks for NULL
pointer values.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:04 -04:00
James Smart
0cdb84ec26 scsi: lpfc: Fix lingering lpfc_wq resource after driver unload
After driver unloads, lpfc_wq remains active. The destroy_workqueue
calls were not being made in driver unload.  Additionally, SLI3 is
allocating lpfc_wq resources, but never uses it.

Make the destroy_workqueue calls on driver unload.  Modify the SLI3 code
path no longer allocate lpfc_wq resources.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:03 -04:00
James Smart
59c68eaad7 scsi: lpfc: Fix Abort request WQ selection
When running loads that generated aborts, io errors where seen.  Turns
out the abort requests where not placed on the proper WQ resulting in
the errors. Closer inspection inspection of this error also showed
improper spinlock api use.

Correct the WQ selection policy for the abort requests.  Correct
spin_lock/spin_lock_irq/spin_lock_irqsave usage.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:02 -04:00
James Smart
2448e48425 scsi: lpfc: Enlarge nvmet asynchronous receive buffer counts
Under large io load, the current sizing of asynchronous buffer counts
could be exceeded, indicated by a 2885 log message:

  2885 Port Status Event: port status reg 0x81800000, port smphr
      reg 0xc000, error 1=0x52004a01, error 2=0x0

Enlarge the async receive queue size.  Allow for a configurable number
of buffers to be posted to each RQ, using the new attribute
lpfc_nvmet_mrq_post.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:02 -04:00
James Smart
66a210ffb8 scsi: lpfc: Add per io channel NVME IO statistics
When debugging various issues, per IO channel IO statistics were useful
to understand what was happening. However, many of the stats were on a
port basis rather than an io channel basis.

Move statistics to an io channel basis.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:01 -04:00
James Smart
f91bc594ba scsi: lpfc: Correct target queue depth application changes
The max_scsicmpl_time parameter can be used to perform scsi cmd queue
depth mgmt based on io completion time: the queue depth is reduced to
make completion time shorter. However, as soon as an io completes and
the completion time is within limits, the code immediately bumps the
queue depth limit back up to the target queue depth. Thus the procedure
restarts, effectively limiting the usefulness of adjusting queue depth
to help completion time.

This patch makes the following changes:

 - Removes the code at io completion that resets the queue depth as soon
   as within limits.

 - As the code removed was where the target queue depth was first
   applied, change target queue depth application so that it occurs when
   the parameter is changed.

 - Makes target queue depth a standard parameter: both a module
   parameter and a sysfs parameter.

 - Optimizes the command pending count by using atomics rather than
   locks.

 - Updates the debugfs nodelist stats to allow better debugging of
   pending command counts.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:01 -04:00
James Smart
118c0415ee scsi: lpfc: Fix multiple PRLI completion error path
Nodelist entry for SCSI array ends up in UNMAPPED state. This is due to
illegal discovery State machine transition because of two PRLIs and the
first one failing with LS_RJT. Also, the error path was designed
assuming the PRLIs complete in the order they were sent, FCP first, then
NVME. In a failing case, the array thinks about the first PRLI (FCP),
but issues LS_RJT for the 2nd PRLI immediately.

Fix PRLI completion error path for the ordering expectation.  Ensure the
discovery state machine update is not set until all outstanding PRLIs
are complete.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:00 -04:00
Shivasharan S
67c5490ace scsi: megaraid_sas: driver version upgrade
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:33:59 -04:00
Shivasharan S
3239b8cd28 scsi: megaraid_sas: Increase timeout by 1 sec for non-RAID fastpath IOs
Hardware could time out Fastpath IOs one second earlier than the timeout
provided by the host.

For non-RAID devices, driver provides timeout value based on OS provided
timeout value. Under certain scenarios, if the OS provides a timeout
value of 1 second, due to above behavior hardware will timeout
immediately.

Increase timeout value for non-RAID fastpath IOs by 1 second.

Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:33:59 -04:00
Himanshu Jha
3c6c122cfc scsi: megaraid_sas: Use zeroing memory allocator than allocator/memset
Use pci_zalloc_consistent for allocating zeroed memory and remove
unnecessary memset function.

Done using Coccinelle.
Generated by: scripts/coccinelle/api/alloc/kzalloc-simple.cocci

Suggested-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:33:58 -04:00
Jason Yan
b6240a4df0 scsi: libsas: add transport class for ATA devices
Now ata devices attached with sas controller do not have transport
class, so that we can not see any information of these ata devices in
/sys/class/ata_port(or ata_link or ata_device).

Add transport class for the ata devices attached with sas controller.
The /sys/class directory will show the infomation of the ata devices
as follows:

localhost:/sys/class # ls ata*
ata_device:
dev1.0  dev2.0

ata_link:
link1  link2

ata_port:
ata1  ata2

No functional change of the device scanning and io path. The ata
transport class was deleted when destroying the sas devices.

Signed-off-by: Jason Yan <yanaijie@huawei.com>
CC: Dan Williams <dan.j.williams@intel.com>
CC: Tejun Heo <tj@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:51 -04:00
John Garry
c90a0bea4f scsi: hisi_sas: remove some unneeded structure members
This patch removes unneeded structure elements:

- hisi_sas_phy.dev_sas_addr: only ever written
	- Also remove associated function which writes it,
	  hisi_sas_init_add().

- hisi_sas_device.attached_phy: only ever written
	- Also remove code to set it in hisi_sas_dev_found()

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:51 -04:00
John Garry
381ed6c081 scsi: hisi_sas: print device id for errors
When we find an erroneous slot completion, to help aid debugging add the
device index to the current debug log.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:51 -04:00
Xiaofei Tan
327f242fa8 scsi: hisi_sas: check IPTT is valid before using it for v3 hw
There is a bug of v3 hw development version. When AXI error happen, hw
may return an abnormal CQ that IPTT value is 0xffff.  This will cause
IPTT out-of-bounds reference.

This patch adds a check of IPTT in cq_tasklet_v3_hw() and discards
invalid slot. This workaround scheme is just to enhance fault-tolerance
of the driver. So, we will apply this scheme for all version of v3 hw,
although release version has fixed this SoC bug.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:51 -04:00
Xiaofei Tan
3ff0f0b657 scsi: hisi_sas: consolidate command check in hisi_sas_get_ata_protocol()
Currently we check the fis->command value in 2 locations in
hisi_sas_get_ata_protocol() switch statement. Fix this by consolidating
the check for fis->command value to 1 location only.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:51 -04:00
Xiang Chen
4f4e21b8ff scsi: hisi_sas: use dma_zalloc_coherent()
This is a warning coming from Coccinelle, and need to use new interface
dma_zalloc_coherent() instead of dma_alloc_coherent()/memset().

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:51 -04:00
Xiang Chen
5df41af4b1 scsi: hisi_sas: delete timer when removing hisi_sas driver
Delete timer for v1 and v3 hw when removing hisi_sas driver.

Signed-off-by: Xiang chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:51 -04:00
Xiaofei Tan
6157363091 scsi: hisi_sas: update RAS feature for later revision of v3 HW
There is an modification for later revision of v3 hw. More HW errors are
reported through RAS interrupt. These errors were originally reported
only through MSI.

When report to RAS, some combinations are done to port AXI errors and
FIFO OMIT errors. For example, each port has 4 AXI errors, and they are
combined to one when report to RAS.

This patch does two things:

1. Enable RAS interrupt of these errors and handle them in PCI
   error handlers.

2. Disable MSI interrupts of these errors for this later revision hw.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:51 -04:00
Xiang Chen
8b8d665315 scsi: hisi_sas: make SAS address of SATA disks unique
When directly connected with SATA disks in different SAS cores, fill SAS
address with scsi_host's id to make it's fake SAS address unique.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:51 -04:00
Uma Krishnan
d2d354a606 scsi: cxlflash: Handle spurious interrupts
The following Oops can occur when there is heavy I/O traffic and the host is
reset by a tool such as sg_reset.

[c000200fff3fbc90] c00800001690117c process_cmd_doneq+0x104/0x500
                                       [cxlflash] (unreliable)
[c000200fff3fbd80] c008000016901648 cxlflash_rrq_irq+0xd0/0x150 [cxlflash]
[c000200fff3fbde0] c000000000193130 __handle_irq_event_percpu+0xa0/0x310
[c000200fff3fbea0] c0000000001933d8 handle_irq_event_percpu+0x38/0x90
[c000200fff3fbee0] c000000000193494 handle_irq_event+0x64/0xb0
[c000200fff3fbf10] c000000000198ea0 handle_fasteoi_irq+0xc0/0x230
[c000200fff3fbf40] c00000000019182c generic_handle_irq+0x4c/0x70
[c000200fff3fbf60] c00000000001794c __do_irq+0x7c/0x1c0
[c000200fff3fbf90] c00000000002a390 call_do_irq+0x14/0x24
[c000200e5828fab0] c000000000017b2c do_IRQ+0x9c/0x130
[c000200e5828fb00] c000000000009b04 h_virt_irq_common+0x114/0x120

When a context is reset, the pending commands are flushed and the AFU is
notified. Before the AFU handles this request there could be command
completion interrupts queued to PHB which are yet to be delivered to the
context. In this scenario, a context could receive an interrupt for a command
that has been flushed, leading to a possible crash when the memory for the
flushed command is accessed.

To resolve this problem, a boolean will indicate if the hardware queue is
ready to process interrupts or not. This can be evaluated in the interrupt
handler before proessing an interrupt.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:51 -04:00
Uma Krishnan
9a597cd4c0 scsi: cxlflash: Remove commmands from pending list on timeout
The following Oops can occur if an internal command sent to the AFU does not
complete within the timeout:

[c000000ff101b810] c008000016020d94 term_mc+0xfc/0x1b0 [cxlflash]
[c000000ff101b8a0] c008000016020fb0 term_afu+0x168/0x280 [cxlflash]
[c000000ff101b930] c0080000160232ec cxlflash_pci_error_detected+0x184/0x230
                                       [cxlflash]
[c000000ff101b9e0] c00800000d95d468 cxl_vphb_error_detected+0x90/0x150[cxl]
[c000000ff101ba20] c00800000d95f27c cxl_pci_error_detected+0xa4/0x240 [cxl]
[c000000ff101bac0] c00000000003eaf8 eeh_report_error+0xd8/0x1b0
[c000000ff101bb20] c00000000003d0b8 eeh_pe_dev_traverse+0x98/0x170
[c000000ff101bbb0] c00000000003f438 eeh_handle_normal_event+0x198/0x580
[c000000ff101bc60] c00000000003fba4 eeh_handle_event+0x2a4/0x338
[c000000ff101bd10] c0000000000400b8 eeh_event_handler+0x1f8/0x200
[c000000ff101bdc0] c00000000013da48 kthread+0x1a8/0x1b0
[c000000ff101be30] c00000000000b528 ret_from_kernel_thread+0x5c/0xb4

When an internal command times out, the command buffer is freed while it is
still in the pending commands list of the context. This corrupts the list and
when the context is cleaned up, a crash is encountered.

To resolve this issue, when an AFU command or TMF command times out, the
command should be deleted from the hardware queue pending command list before
freeing the buffer.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:50 -04:00
Uma Krishnan
a3feb6ef50 scsi: cxlflash: Synchronize reset and remove ops
The following Oops can be encountered if a device removal or system shutdown
is initiated while an EEH recovery is in process:

[c000000ff2f479c0] c008000015256f18 cxlflash_pci_slot_reset+0xa0/0x100
                                      [cxlflash]
[c000000ff2f47a30] c00800000dae22e0 cxl_pci_slot_reset+0x168/0x290 [cxl]
[c000000ff2f47ae0] c00000000003ef1c eeh_report_reset+0xec/0x170
[c000000ff2f47b20] c00000000003d0b8 eeh_pe_dev_traverse+0x98/0x170
[c000000ff2f47bb0] c00000000003f80c eeh_handle_normal_event+0x56c/0x580
[c000000ff2f47c60] c00000000003fba4 eeh_handle_event+0x2a4/0x338
[c000000ff2f47d10] c0000000000400b8 eeh_event_handler+0x1f8/0x200
[c000000ff2f47dc0] c00000000013da48 kthread+0x1a8/0x1b0
[c000000ff2f47e30] c00000000000b528 ret_from_kernel_thread+0x5c/0xb4

The remove handler frees AFU memory while the EEH recovery is in progress,
leading to a race condition. This can result in a crash if the recovery thread
tries to access this memory.

To resolve this issue, the cxlflash remove handler will evaluate the device
state and yield to any active reset or probing threads.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:50 -04:00
Uma Krishnan
07d0c52f87 scsi: cxlflash: Enable OCXL operations
This commit enables the OCXL operations for the OCXL devices.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:50 -04:00
Uma Krishnan
9433fb32b7 scsi: cxlflash: Support AFU reset
The cxlflash core driver resets the AFU when the master contexts are created
in the initialization or recovery paths. Today, the OCXL provider service to
perform this operation is pending implementation.  To avoid a crash due to a
missing fop, log an error once and return success to continue with execution.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:50 -04:00
Uma Krishnan
66ae644b92 scsi: cxlflash: Register for translation errors
While enabling a context on the link, a predefined callback can be registered
with the OCXL provider services to be notified on translation errors. These
errors can in turn be passed back to the user on a read operation.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:50 -04:00
Uma Krishnan
f81face725 scsi: cxlflash: Introduce OCXL context state machine
In order to protect the OCXL hardware contexts from getting clobbered, a
simple state machine is added to indicate when a context is in open, close or
start state. The expected states are validated throughout the code to prevent
illegal operations on a context. A mutex is added to protect writes to the
context state field.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:50 -04:00
Uma Krishnan
d91dd3a7d1 scsi: cxlflash: Update synchronous interrupt status bits
The SISLite specification has been updated to define new synchronous interrupt
status bits. These bits are set by the AFU when a given PASID or EA is bad and
a synchronous interrupt is triggered.

The SISLite header file is updated to support these new bits. Note that there
are also some formatting updates to some of the existing bits to allow all of
the definitions to line up uniformly.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:50 -04:00
Uma Krishnan
d44af4b090 scsi: cxlflash: Setup LISNs for master contexts
Similar to user contexts, master contexts also require that the per-context
LISN registers be programmed for certain AFUs. The mapped trigger page is
obtained from underlying transport and registered with AFU for each master
context.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:50 -04:00
Uma Krishnan
23239eeccb scsi: cxlflash: Setup LISNs for user contexts
The SISLite specification has been updated for OCXL to support communicating
data to generate AFU interrupts to the AFU. This includes a new capability bit
that is advertised for OCXL AFUs and new registers to hold the object handle
and translation PASID of each interrupt. For Power, the object handle is the
mapped trigger page. Note that because these mappings are kernel only, the
PASID of a kernel context must be used to satisfy the translation.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:50 -04:00
Uma Krishnan
402a55ea47 scsi: cxlflash: Introduce object handle fop
OCXL requires that AFUs use an opaque object handle to represent an AFU
interrupt. The specification does not provide a common means to communicate
the object handle to the AFU - each AFU must define this within the AFU
specification. To support this model, the object handle must be passed back to
the core driver as it manages the AFU specification (SISLite) for cxlflash.
Note that for Power systems, the object handle is the effective address of the
trigger page.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:50 -04:00
Uma Krishnan
e117c3c731 scsi: cxlflash: Support file descriptor mapping
The cxlflash core fop API requires a way to invoke the fault and release
handlers of underlying transports using their native file-based APIs. This
provides the core with the ability to insert selectively itself into the
processing stream of these operations for cleanup. Implement these two fops to
map and release when requested.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:50 -04:00
Uma Krishnan
93b8f8df55 scsi: cxlflash: Support adapter context mmap and release
The cxlflash userspace API requires that users be able to mmap and release the
adapter context. Support mapping by implementing the AFU mmap fop to map the
context MMIO space and install the corresponding page table entry upon page
fault. Similarly, implement the AFU release fop to terminate and clean up the
context when invoked.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:49 -04:00
Uma Krishnan
03aa9c519c scsi: cxlflash: Support adapter context reading
The cxlflash userspace API requires that users be able to read the adapter
context for any pending events or interrupts from the AFU. Support reading
various events by implementing the AFU read fop to copy out event data.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:49 -04:00
Uma Krishnan
56f1db1a2a scsi: cxlflash: Support adapter context polling
The cxlflash userspace API requires that users be able to poll the adapter
context for any pending events or interrupts from the AFU. Support polling on
various events by implementing the AFU poll fop using a waitqueue.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:49 -04:00
Uma Krishnan
762c7e9332 scsi: cxlflash: Support starting user contexts
User contexts request interrupts and are started using the "start work"
interface. Populate the start_work() fop to allocate and map interrupts before
starting the user context. As part of starting the context, update the user
process identification logic to properly derive the data required by the
SPA. Also, introduce a skeleton interrupt handler using a bitmap, flag, and
spinlock to track interrupts. This handler will be expanded in future commits.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:49 -04:00
Uma Krishnan
a06b1cfc04 scsi: cxlflash: Support AFU interrupt mapping and registration
Add support to map and unmap the irq space and manage irq registrations with
the kernel for each allocated AFU interrupt. Also support mapping the physical
trigger page to obtain an effective address that will be provided to the
cxlflash core in a future commit.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:49 -04:00
Uma Krishnan
bc65c1c7bf scsi: cxlflash: Support AFU interrupt management
Add support to allocate and free AFU interrupts using the OCXL provider
services. The trigger page returned upon successful allocation will be mapped
and exposed to the cxlflash core in a future commit.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:49 -04:00
Uma Krishnan
c207b57143 scsi: cxlflash: Support process element lifecycle
As part of the context lifecycle, the associated process element within the
Shared Process Area (SPA) of the link must be updated. Each process is defined
by various parameters (pid, tid, PASID mm) that are stored in the SPA upon
starting a context and invalidated when a context is stopped.

Use the OCXL provider services to configure the SPA with the appropriate data
that is unique to the process when starting a context. Initially only kernel
contexts are supported and therefore these process values are not applicable.
Note that the OCXL service used has an optional callback for translation fault
error notification. While not used here, it will be expanded in a future
commit.

Also add a service to stop a context by terminating the corresponding PASID
and remove the process element from the SPA.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:49 -04:00
Uma Krishnan
c52bf5b384 scsi: cxlflash: Setup OCXL transaction layer
The first function of the link needs to configure the transaction layer
between the host and device. This is accomplished by a call to the OCXL
provider services.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:49 -04:00
Uma Krishnan
7390482376 scsi: cxlflash: Setup function OCXL link
After reading and modifying the function configuration, setup the OCXL link
using the OCXL provider services. The link is released when the adapter is
unconfigured.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:49 -04:00
Uma Krishnan
119c920073 scsi: cxlflash: Support reading adapter VPD data
Use the PCI VPD services to support reading the VPD data of the underlying
adapter.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:49 -04:00
Uma Krishnan
3351e4f025 scsi: cxlflash: Support AFU state toggling
The AFU should be enabled following a successful configuration and disabled
near the end of the cleanup path.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:49 -04:00
Uma Krishnan
012f394cb8 scsi: cxlflash: Support process specific mappings
Once the context is started, the assigned MMIO space can be mapped and
unmapped. Provide means to map and unmap the context MMIO space.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:49 -04:00
Uma Krishnan
6b938ac910 scsi: cxlflash: Support starting an adapter context
Once the adapter context is created, it needs to be started by assigning the
MMIO space for the context and by enabling the process element in the
link. This commit adds the skeleton for starting the context and assigns the
context specific MMIO space. Master contexts have access to the global MMIO
space while the rest have access to the context specific space.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:48 -04:00
Uma Krishnan
54370503a7 scsi: cxlflash: MMIO map the AFU
When the AFU is configured, the global and per process MMIO regions are
presented by the configuration space. Save these regions and map the global
MMIO region that is used to access all of the control and provisioning data in
the AFU.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:48 -04:00
Uma Krishnan
8b7a552150 scsi: cxlflash: Support image reload policy modification
On a PERST, the AFU image can be reloaded or left intact. Provide means to set
this image reload policy.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:48 -04:00
Uma Krishnan
b18718c626 scsi: cxlflash: Support adapter context discovery
Provide means to obtain the process element of an adapter context as well as
locate an adapter context by file.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:48 -04:00
Uma Krishnan
926a62f9bd scsi: cxlflash: Support adapter file descriptors for OCXL
Allocate a file descriptor for an adapter context when requested. In order to
allocate inodes for the file descriptors, a pseudo filesystem is created and
used.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:48 -04:00
Uma Krishnan
429ebfa69b scsi: cxlflash: Use IDR to manage adapter contexts
A range of PASIDs are used as identifiers for the adapter contexts. These
contexts may be destroyed and created randomly. Use an IDR to keep track of
contexts that are in use and assign a unique identifier to new ones.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:48 -04:00
Uma Krishnan
f6b4557c98 scsi: cxlflash: Adapter context support for OCXL
Add support to create and release the adapter contexts for OCXL and provide
means to specify certain contexts as a master.

The existing cxlflash core has a design requirement that each host will have a
single host context available by default. To satisfy this requirement, one
host adapter context is created when the hardware AFU is initialized. This is
returned by the get_context() fop.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:48 -04:00
Uma Krishnan
41df40d817 scsi: cxlflash: Setup AFU PASID
Per the OCXL specification, the maximum PASID supported by the AFU is
indicated by a field within the configuration space. Similar to acTags,
implementations can choose to use any sub-range of PASID within their assigned
range. For cxlflash, the entire range is used.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:48 -04:00
Uma Krishnan
d926519e8f scsi: cxlflash: Setup AFU acTag range
The OCXL specification supports distributing acTags amongst different AFUs and
functions on the link. As cxlflash devices are expected to only support a
single AFU per function, the entire range that was assigned to the function is
also assigned to the AFU.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:48 -04:00
Uma Krishnan
9cc84291be scsi: cxlflash: Read host AFU configuration
The host AFU configuration is read on the initialization path to identify the
features and configuration of the AFU. This data is cached for use in later
configuration steps.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:48 -04:00
Uma Krishnan
2e222779ae scsi: cxlflash: Setup function acTag range
The OCXL specification supports distributing acTags amongst different AFUs and
functions on the link. The platform-specific acTag range for the link is
obtained using the OCXL provider services and then assigned to the host
function based on implementation.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:48 -04:00
Uma Krishnan
e9dfceda92 scsi: cxlflash: Read host function configuration
Per the OCXL specification, the underlying host can have multiple AFUs per
function with each function supporting its own configuration. The host
function configuration is read on the initialization path to evaluate the
number of functions present and identify the features and configuration of the
functions present. This data is cached for use in later configuration
steps. Note that for the OCXL hardware supported by the cxlflash driver, only
one AFU per function is expected.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:47 -04:00
Uma Krishnan
48e077dbb4 scsi: cxlflash: Hardware AFU for OCXL
When an adapter is initialized, transport specific configuration and MMIO
mapping details need to be saved. For CXL, this data is managed by the
underlying kernel module. To maintain a separation between the cxlflash core
and underlying transports, introduce a new structure to store data specific to
the OCXL AFU.

Initially only the pointers to underlying PCI and generic devices are added to
this new structure - it will be expanded further in future commits. Services
to create and destroy this hardware AFU are added and integrated in the probe
and exit paths of the driver.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:47 -04:00
Uma Krishnan
76ebe01fce scsi: cxlflash: Introduce OCXL backend
Add initial infrastructure to support a new cxlflash transport, OCXL.

Claim a dependency on OCXL and add a new file, ocxl_hw.c, which will host the
backend routines that are specific to OCXL.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:47 -04:00
Uma Krishnan
fb77e52804 scsi: cxlflash: Add argument identifier names
Checkpatch throws a warning when the argument identifier names are not
included in the function definitions.

To avoid these warnings, argument identifiers are added in the existing
function definitions.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:47 -04:00
Matthew R. Ochs
465891fe92 scsi: cxlflash: Avoid clobbering context control register value
The SISLite specification originally defined the context control register with
a single field of bits to represent the LISN and also stipulated that the
register reset value be 0. The cxlflash driver took advantage of this when
programming the LISN for the master contexts via an unconditional write - no
other bits were preserved.

When unmap support was added, SISLite was updated to define bit 0 of the
context control register as a way for the AFU to notify the context owner that
unmap operations were supported. Thus the assumptions under which the register
is setup changed and the existing unconditional write is clobbering the unmap
state for master contexts. This is presently not an issue due to the order in
which the context control register is programmed in relation to the unmap bit
being queried but should be addressed to avoid a future regression in the
event this code is moved elsewhere.

To remedy this issue, preserve the bits when programming the LISN field in the
context control register. Since the LISN will now be programmed using a read
value, assert that the initial state of the LISN field is as described in
SISLite (0).

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:47 -04:00
Uma Krishnan
e11e0ff870 scsi: cxlflash: Preserve number of interrupts for master contexts
The number of interrupts requested for user contexts are stored in the context
specific structures and utilized to manage the interrupts. For the master
contexts, this number is only used once and therefore not saved.

To prepare for future commits where the number of interrupts will be required
in more than one place, preserve the value in the master context structure.

[mkp: typo in comment]

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:32:47 -04:00
Linus Torvalds
f0d98d8583 SCSI fixes on 20180415
This is a set of minor (and safe changes) that didn't make the initial
 pull request plus some bug fixes.  The status handling code is
 actually a running regression from the previous merge window which had
 an incomplete fix (now reverted) and most of the remaining bug fixes
 are for problems older than the current merge window.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCWtMW7SYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishYdtAP97FhqR
 x2lDO7J6QT8hMVqwPeQS0Xh5ZPbZLedPmfx9BAD+K1HauGv8J/eMggMDPGrWa/CP
 tGrg2UorMrokLLdIbyA=
 =fOJs
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "This is a set of minor (and safe changes) that didn't make the initial
  pull request plus some bug fixes.

  The status handling code is actually a running regression from the
  previous merge window which had an incomplete fix (now reverted) and
  most of the remaining bug fixes are for problems older than the
  current merge window"

[ Side note: this merge also takes the base kernel git repository to 6+
  million objects for the first time. Technically we hit it a couple of
  merges ago already if you count all the tag objects, but now it
  reaches 6M+ objects reachable from HEAD.

  I was joking around that that's when I should switch to 5.0, because
  3.0 happened at the 2M mark, and 4.0 happened at 4M objects. But
  probably not, even if numerology is about as good a reason as any.

                                                              - Linus ]

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: devinfo: Add Microsoft iSCSI target to 1024 sector blacklist
  scsi: cxgb4i: silence overflow warning in t4_uld_rx_handler()
  scsi: dpt_i2o: Use after free in I2ORESETCMD ioctl
  scsi: core: Make scsi_result_to_blk_status() recognize CONDITION MET
  scsi: core: Rename __scsi_error_from_host_byte() into scsi_result_to_blk_status()
  Revert "scsi: core: return BLK_STS_OK for DID_OK in __scsi_error_from_host_byte()"
  scsi: aacraid: Insure command thread is not recursively stopped
  scsi: qla2xxx: Correct setting of SAM_STAT_CHECK_CONDITION
  scsi: qla2xxx: correctly shift host byte
  scsi: qla2xxx: Fix race condition between iocb timeout and initialisation
  scsi: qla2xxx: Avoid double completion of abort command
  scsi: qla2xxx: Fix small memory leak in qla2x00_probe_one on probe failure
  scsi: scsi_dh: Don't look for NULL devices handlers by name
  scsi: core: remove redundant assignment to shost->use_blk_mq
2018-04-15 17:24:12 -07:00
Jens Axboe
2d097c5021 sr: get/drop reference to device in revalidate and check_events
We can't just use scsi_cd() to get the scsi_cd structure, we have
to grab a live reference to the device. For both callbacks, we're
not inside an open where we already hold a reference to the device.

This fixes device removal/addition under concurrent device access,
which otherwise could result in the below oops.

NULL pointer dereference at 0000000000000010
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in:
sr 12:0:0:0: [sr2] scsi-1 drive
 scsi_debug crc_t10dif crct10dif_generic crct10dif_common nvme nvme_core sb_edac xl
sr 12:0:0:0: Attached scsi CD-ROM sr2
 sr_mod cdrom btrfs xor zstd_decompress zstd_compress xxhash lzo_compress zlib_defc
sr 12:0:0:0: Attached scsi generic sg7 type 5
 igb ahci libahci i2c_algo_bit libata dca [last unloaded: crc_t10dif]
CPU: 43 PID: 4629 Comm: systemd-udevd Not tainted 4.16.0+ #650
Hardware name: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.3.4 11/09/2016
RIP: 0010:sr_block_revalidate_disk+0x23/0x190 [sr_mod]
RSP: 0018:ffff883ff357bb58 EFLAGS: 00010292
RAX: ffffffffa00b07d0 RBX: ffff883ff3058000 RCX: ffff883ff357bb66
RDX: 0000000000000003 RSI: 0000000000007530 RDI: ffff881fea631000
RBP: 0000000000000000 R08: ffff881fe4d38400 R09: 0000000000000000
R10: 0000000000000000 R11: 00000000000001b6 R12: 000000000800005d
R13: 000000000800005d R14: ffff883ffd9b3790 R15: 0000000000000000
FS:  00007f7dc8e6d8c0(0000) GS:ffff883fff340000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 0000003ffda98005 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ? __invalidate_device+0x48/0x60
 check_disk_change+0x4c/0x60
 sr_block_open+0x16/0xd0 [sr_mod]
 __blkdev_get+0xb9/0x450
 ? iget5_locked+0x1c0/0x1e0
 blkdev_get+0x11e/0x320
 ? bdget+0x11d/0x150
 ? _raw_spin_unlock+0xa/0x20
 ? bd_acquire+0xc0/0xc0
 do_dentry_open+0x1b0/0x320
 ? inode_permission+0x24/0xc0
 path_openat+0x4e6/0x1420
 ? cpumask_any_but+0x1f/0x40
 ? flush_tlb_mm_range+0xa0/0x120
 do_filp_open+0x8c/0xf0
 ? __seccomp_filter+0x28/0x230
 ? _raw_spin_unlock+0xa/0x20
 ? __handle_mm_fault+0x7d6/0x9b0
 ? list_lru_add+0xa8/0xc0
 ? _raw_spin_unlock+0xa/0x20
 ? __alloc_fd+0xaf/0x160
 ? do_sys_open+0x1a6/0x230
 do_sys_open+0x1a6/0x230
 do_syscall_64+0x5a/0x100
 entry_SYSCALL_64_after_hwframe+0x3d/0xa2

Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-04-11 11:26:09 -06:00
Ross Lagerwall
4d42680330 scsi: devinfo: Add Microsoft iSCSI target to 1024 sector blacklist
The Windows Server 2016 iSCSI target doesn't work with the Linux kernel
initiator since the kernel started sending larger requests by default,
nor does it implement the block limits VPD page. Apply the sector limit
workaround for these targets.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Acked-by: KY Srinivasan <kys@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-09 22:07:05 -04:00
Dan Carpenter
ccc495efb3 scsi: cxgb4i: silence overflow warning in t4_uld_rx_handler()
Smatch marks skb->data as untrusted so it complains that there is a
potential overflow here:

	drivers/scsi/cxgbi/cxgb4i/cxgb4i.c:2111 t4_uld_rx_handler()
	error: buffer overflow 'cxgb4i_cplhandlers' 239 <= 255.

In this case, skb->data comes from the hardware or firmware so it's not
going to overflow unless there is a firmware bug.

[mkp: fixed braces]

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-09 21:32:45 -04:00
Dan Carpenter
7709e9bdee scsi: dpt_i2o: Use after free in I2ORESETCMD ioctl
Here is another use after free if we reset the card.  The adpt_hba_reset()
function frees "pHba" on error.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-09 21:31:37 -04:00
Bart Van Assche
f4abab3f18 scsi: core: Make scsi_result_to_blk_status() recognize CONDITION MET
Ensure that CONDITION MET and other non-zero status values that indicate
success are translated into BLK_STS_OK.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Lee Duncan <lduncan@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-09 21:31:37 -04:00
Bart Van Assche
a77b32d8b1 scsi: core: Rename __scsi_error_from_host_byte() into scsi_result_to_blk_status()
Since the next patch will modify this function such that it checks more than
just the host byte of the SCSI result, rename __scsi_error_from_host_byte()
into scsi_result_to_blk_status().  This patch does not change any
functionality.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Lee Duncan <lduncan@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-09 21:31:37 -04:00
Bart Van Assche
cbe095e2b5 Revert "scsi: core: return BLK_STS_OK for DID_OK in __scsi_error_from_host_byte()"
The description of commit e39a97353e is wrong: it mentions that commit
2a842acab1 introduced a bug in __scsi_error_from_host_byte() although that
commit did not change the behavior of that function.  Additionally, commit
e39a97353e introduced a bug: it causes commands that fail with
hostbyte=DID_OK and driverbyte=DRIVER_SENSE to be completed with
BLK_STS_OK. Hence revert that commit.

Fixes: e39a97353e ("scsi: core: return BLK_STS_OK for DID_OK in __scsi_error_from_host_byte()")
Reported-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Lee Duncan <lduncan@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-09 21:31:37 -04:00
Dave Carroll
1c6b41fb92 scsi: aacraid: Insure command thread is not recursively stopped
If a recursive IOP_RESET is invoked, usually due to the eh_thread
handling errors after the first reset, be sure we flag that the command
thread has been stopped to avoid an Oops of the form;

 [ 336.620256] CPU: 28 PID: 1193 Comm: scsi_eh_0 Kdump: loaded Not tainted 4.14.0-49.el7a.ppc64le #1
 [ 336.620297] task: c000003fd630b800 task.stack: c000003fd61a4000
 [ 336.620326] NIP: c000000000176794 LR: c00000000013038c CTR: c00000000024bc10
 [ 336.620361] REGS: c000003fd61a7720 TRAP: 0300 Not tainted (4.14.0-49.el7a.ppc64le)
 [ 336.620395] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 22084022 XER: 20040000
 [ 336.620435] CFAR: c000000000130388 DAR: 0000000000000000 DSISR: 40000000 SOFTE: 1
 [ 336.620435] GPR00: c00000000013038c c000003fd61a79a0 c0000000014c7e00 0000000000000000
 [ 336.620435] GPR04: 000000000000000c 000000000000000c 9000000000009033 0000000000000477
 [ 336.620435] GPR08: 0000000000000477 0000000000000000 0000000000000000 c008000010f7d940
 [ 336.620435] GPR12: c00000000024bc10 c000000007a33400 c0000000001708a8 c000003fe3b881d8
 [ 336.620435] GPR16: c000003fe3b88060 c000003fd61a7d10 fffffffffffff000 000000000000001e
 [ 336.620435] GPR20: 0000000000000001 c000000000ebf1a0 0000000000000001 c000003fe3b88000
 [ 336.620435] GPR24: 0000000000000003 0000000000000002 c000003fe3b88840 c000003fe3b887e8
 [ 336.620435] GPR28: c000003fe3b88000 c000003fc8181788 0000000000000000 c000003fc8181700
 [ 336.620750] NIP [c000000000176794] exit_creds+0x34/0x160
 [ 336.620775] LR [c00000000013038c] __put_task_struct+0x8c/0x1f0
 [ 336.620804] Call Trace:
 [ 336.620817] [c000003fd61a79a0] [c000003fe3b88000] 0xc000003fe3b88000 (unreliable)
 [ 336.620853] [c000003fd61a79d0] [c00000000013038c] __put_task_struct+0x8c/0x1f0
 [ 336.620889] [c000003fd61a7a00] [c000000000171418] kthread_stop+0x1e8/0x1f0
 [ 336.620922] [c000003fd61a7a40] [c008000010f7448c] aac_reset_adapter+0x14c/0x8d0 [aacraid]
 [ 336.620959] [c000003fd61a7b00] [c008000010f60174] aac_eh_host_reset+0x84/0x100 [aacraid]
 [ 336.621010] [c000003fd61a7b30] [c000000000864f24] scsi_try_host_reset+0x74/0x180
 [ 336.621046] [c000003fd61a7bb0] [c000000000867ac0] scsi_eh_ready_devs+0xc00/0x14d0
 [ 336.625165] [c000003fd61a7ca0] [c0000000008699e0] scsi_error_handler+0x550/0x730
 [ 336.632101] [c000003fd61a7dc0] [c000000000170a08] kthread+0x168/0x1b0
 [ 336.639031] [c000003fd61a7e30] [c00000000000b528] ret_from_kernel_thread+0x5c/0xb4
 [ 336.645971] Instruction dump:
 [ 336.648743] 384216a0 7c0802a6 fbe1fff8 f8010010 f821ffd1 7c7f1b78 60000000 60000000
 [ 336.657056] 39400000 e87f0838 f95f0838 7c0004ac <7d401828> 314affff 7d40192d 40c2fff4
 [ 336.663997] -[ end trace 4640cf8d4945ad95 ]-

So flag when the thread is stopped by setting the thread pointer to NULL.

Signed-off-by: Dave Carroll <david.carroll@microsemi.com>
Reviewed-by: Raghava Aditya Renukunta <raghavaaditya.renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-09 21:08:30 -04:00
Johannes Thumshirn
584d7aad28 scsi: qla2xxx: Correct setting of SAM_STAT_CHECK_CONDITION
Bart reports that in qla_isr.c's qla2x00_handle_dif_error we're wrongly
shifting the SAM_STAT_CHECK_CONDITION by one instead of directly ORing it
onto the SCSI command's result.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reported-by: Bart Van Assche <Bart.VanAssche@wdc.com>
Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-09 21:04:36 -04:00
Johannes Thumshirn
f7d5182c8f scsi: qla2xxx: correctly shift host byte
The SCSI host byte has to be shifted by 16 not 6.

As Bart pointed out this patch does not change any functionality because
DID_OK == 0, but a wrong shift is irritating for the reviewer.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-09 21:04:36 -04:00
Ben Hutchings
e74e7d9587 scsi: qla2xxx: Fix race condition between iocb timeout and initialisation
qla2x00_init_timer() calls add_timer() on the iocb timeout timer, which
means the timeout function pointer and any data that the function depends on
must be initialised beforehand.

Move this initialisation before each call to qla2x00_init_timer().  In some
cases qla2x00_init_timer() initialises a completion structure needed by the
timeout function, so move the call to add_timer() after that.

Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-09 21:04:36 -04:00
Ben Hutchings
3a9910d7b6 scsi: qla2xxx: Avoid double completion of abort command
qla2x00_tmf_sp_done() now deletes the timer that will run
qla2x00_tmf_iocb_timeout(), but doesn't check whether the timer already
expired.  Check the return value from del_timer() to avoid calling
complete() a second time.

Fixes: 4440e46d5d ("[SCSI] qla2xxx: Add IOCB Abort command asynchronous ...")
Fixes: 1514839b36 ("scsi: qla2xxx: Fix NULL pointer crash due to active ...")
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-09 21:03:41 -04:00
Bill Kuzeja
6d6340672b scsi: qla2xxx: Fix small memory leak in qla2x00_probe_one on probe failure
The code that fixes the crashes in the following commit introduced a small
memory leak:

commit 6a2cf8d366 ("scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failure")

Fixing this requires a bit of reworking, which I've explained. Also provide
some code cleanup.

There is a small window in qla2x00_probe_one where if qla2x00_alloc_queues
fails, we end up never freeing req and rsp and leak 0xc0 and 0xc8 bytes
respectively (the sizes of req and rsp).

I originally put in checks to test for this condition which were based on
the incorrect assumption that if ha->rsp_q_map and ha->req_q_map were
allocated, then rsp and req were allocated as well. This is incorrect.
There is a window between these allocations:

       ret = qla2x00_mem_alloc(ha, req_length, rsp_length, &req, &rsp);
                goto probe_hw_failed;

[if successful, both rsp and req allocated]

       base_vha = qla2x00_create_host(sht, ha);
                goto probe_hw_failed;

       ret = qla2x00_request_irqs(ha, rsp);
                goto probe_failed;

       if (qla2x00_alloc_queues(ha, req, rsp)) {
                goto probe_failed;

[if successful, now ha->rsp_q_map and ha->req_q_map allocated]

To simplify this, we should just set req and rsp to NULL after we free
them. Sounds simple enough? The problem is that req and rsp are pointers
defined in the qla2x00_probe_one and they are not always passed by reference
to the routines that free them.

Here are paths which can free req and rsp:

PATH 1:
qla2x00_probe_one
   ret = qla2x00_mem_alloc(ha, req_length, rsp_length, &req, &rsp);
   [req and rsp are passed by reference, but if this fails, we currently
    do not NULL out req and rsp. Easily fixed]

PATH 2:
qla2x00_probe_one
   failing in qla2x00_request_irqs or qla2x00_alloc_queues
      probe_failed:
         qla2x00_free_device(base_vha);
            qla2x00_free_req_que(ha, req)
            qla2x00_free_rsp_que(ha, rsp)

PATH 3:
qla2x00_probe_one:
   failing in qla2x00_mem_alloc or qla2x00_create_host
      probe_hw_failed:
         qla2x00_free_req_que(ha, req)
         qla2x00_free_rsp_que(ha, rsp)

PATH 1: This should currently work, but it doesn't because rsp and rsp are
not set to NULL in qla2x00_mem_alloc. Easily remedied.

PATH 2: req and rsp aren't passed in at all to qla2x00_free_device but are
derived from ha->req_q_map[0] and ha->rsp_q_map[0]. These are only set up if
qla2x00_alloc_queues succeeds.

In qla2x00_free_queues, we are protected from crashing if these don't exist
because req_qid_map and rsp_qid_map are only set on their allocation. We are
guarded in this way:

        for (cnt = 0; cnt < ha->max_req_queues; cnt++) {
                if (!test_bit(cnt, ha->req_qid_map))
                        continue;

PATH 3: This works. We haven't freed req or rsp yet (or they were never
allocated if qla2x00_mem_alloc failed), so we'll attempt to free them here.

To summarize, there are a few small changes to make this work correctly and
(and for some cleanup):

1) (For PATH 1) Set *rsp and *req to NULL in case of failure in
qla2x00_mem_alloc so these are correctly set to NULL back in
qla2x00_probe_one

2) After jumping to probe_failed: and calling qla2x00_free_device,
explicitly set rsp and req to NULL so further calls with these pointers do
not crash, i.e. the free queue calls in the probe_hw_failed section we fall
through to.

3) Fix return code check in the call to qla2x00_alloc_queues. We currently
drop the return code on the floor. The probe fails but the caller of the
probe doesn't have an error code, so it attaches to pci. This can result in
a crash on module shutdown.

4) Remove unnecessary NULL checks in qla2x00_free_req_que,
qla2x00_free_rsp_que, and the egregious NULL checks before kfrees and vfrees
in qla2x00_mem_free.

I tested this out running a scenario where the card breaks at various times
during initialization. I made sure I forced every error exit path in
qla2x00_probe_one.

Cc: <stable@vger.kernel.org> # v4.16
Fixes: 6a2cf8d366 ("scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failure")
Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-09 16:35:49 -04:00
Johannes Thumshirn
2ee5671e3a scsi: scsi_dh: Don't look for NULL devices handlers by name
Currently scsi_dh_lookup() doesn't check for NULL as a device name. This
combined with nvme over dm-mpath results in the following messages
emitted by device-mapper:

 device-mapper: multipath: Could not failover device 259:67: Handler scsi_dh_(null) error 14.

Let scsi_dh_lookup() fail fast on NULL names.

[mkp: typo fix]

Cc: <stable@vger.kernel.org> # v4.16
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-09 16:35:12 -04:00
Colin Ian King
cbee67c2d7 scsi: core: remove redundant assignment to shost->use_blk_mq
The first assignment to shost->use_blk_mq is redundant as it is
overwritten by the following statement. Remove this redundant code.

Detected by CoverityScan, CID#1466993 ("Unused value")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-09 16:34:41 -04:00
Linus Torvalds
052c220da3 SCSI for-linus on 20180404
This is mostly updates of the usual drivers: arcmsr, qla2xx, lpfc,
 ufs, mpt3sas, hisi_sas.  In addition we have removed several really
 old drivers: sym53c416, NCR53c406a, fdomain, fdomain_cs and removed
 the old scsi_module.c initialization from all remaining drivers.  Plus
 an assortment of bug fixes, initialization errors and other minor
 fixes.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCWsVSnSYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishbvbAP9ErpTZ
 OR5iJ5HIz4W3Bd8aTfEpJrDyeYwSUC+sra5SKQD/ZWyVB3fYFSg+ZROyT26pmtmd
 SdImhG7hLaHgVvF5qRQ=
 =SQ/n
 -----END PGP SIGNATURE-----

Merge tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly updates of the usual drivers: arcmsr, qla2xx, lpfc,
  ufs, mpt3sas, hisi_sas.

  In addition we have removed several really old drivers: sym53c416,
  NCR53c406a, fdomain, fdomain_cs and removed the old scsi_module.c
  initialization from all remaining drivers.

  Plus an assortment of bug fixes, initialization errors and other minor
  fixes"

* tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (168 commits)
  scsi: ufs: Add support for Auto-Hibernate Idle Timer
  scsi: ufs: sysfs: reworking of the rpm_lvl and spm_lvl entries
  scsi: qla2xxx: fx00 copypaste typo
  scsi: qla2xxx: fix error message on <qla2400
  scsi: smartpqi: update driver version
  scsi: smartpqi: workaround fw bug for oq deletion
  scsi: arcmsr: Change driver version to v1.40.00.05-20180309
  scsi: arcmsr: Sleep to avoid CPU stuck too long for waiting adapter ready
  scsi: arcmsr: Handle adapter removed due to thunderbolt cable disconnection.
  scsi: arcmsr: Rename ACB_F_BUS_HANG_ON to ACB_F_ADAPTER_REMOVED for adapter hot-plug
  scsi: qla2xxx: Update driver version to 10.00.00.06-k
  scsi: qla2xxx: Fix Async GPN_FT for FCP and FC-NVMe scan
  scsi: qla2xxx: Cleanup code to improve FC-NVMe error handling
  scsi: qla2xxx: Fix FC-NVMe IO abort during driver reset
  scsi: qla2xxx: Fix retry for PRLI RJT with reason of BUSY
  scsi: qla2xxx: Remove nvme_done_list
  scsi: qla2xxx: Return busy if rport going away
  scsi: qla2xxx: Fix n2n_ae flag to prevent dev_loss on PDB change
  scsi: qla2xxx: Add FC-NVMe abort processing
  scsi: qla2xxx: Add changes for devloss timeout in driver
  ...
2018-04-05 15:05:53 -07:00
Linus Torvalds
3526dd0c78 for-4.17/block-20180402
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJawr05AAoJEPfTWPspceCmT2UP/1uuaqwzyl4VjFNb/k7KS7UM
 +Cs/1HBlGomgMA8orDTGqtWqLRdR3z4RSh0+MvXTzQ78HpFVYz7CbDc9itHm+G9M
 X0ypD4kF/JGCFb5cxk+x6qv28uO2nv4DP3+0hHqJWLH4UVJBWDY6bs4BPShsf9QB
 I6XjioNMhoqylXgdOITLODJZz+TcChlJMDAqwhpJwh9TH1wjobleAZ6AdmCPfgi5
 h0UCKMUKzcVJlNZwQUrzrs2cxcx9Uhunnbz7HK0ZV4n/FKFtDpGynFpQQ71pZxKe
 Be0ZOBPCQvC3ykOM/egCIvC/e5y7FgrjORD6jxyu1PTwAugI5E1VYSMxHkXvgPAx
 zOo9A7RT4GPO2tDQv+DbzNFpqeSAclTgSmr+/y1wmheBs8DiSt7MPVBiNM4zdCNv
 NLk9z7IEjFhdmluSB/LbTb1aokypMb/q7QTLouPHdwGn80k7yrhFyLHgdjpNTQ2K
 UHfHZvGxkOX6SmFhBNOtIFUkuSceenh64a0RkRle7filx+ImpbCVm2/GYi9zZNCu
 EtctgzLbLmz40zMiyDaZS2bxBgGzfn6yf4xd9LsaAJPMhvZnmXogT0D9ctWXB0WU
 mMaS7sOkLnNjnGkzF1fHkeiZ/oigrstJbe+CA7BtOdwxpWn6MZBgKEoFQ6iA2b3X
 5J1axMgVH5LAsIEcEQVq
 =RVhK
 -----END PGP SIGNATURE-----

Merge tag 'for-4.17/block-20180402' of git://git.kernel.dk/linux-block

Pull block layer updates from Jens Axboe:
 "It's a pretty quiet round this time, which is nice. This contains:

   - series from Bart, cleaning up the way we set/test/clear atomic
     queue flags.

   - series from Bart, fixing races between gendisk and queue
     registration and removal.

   - set of bcache fixes and improvements from various folks, by way of
     Michael Lyle.

   - set of lightnvm updates from Matias, most of it being the 1.2 to
     2.0 transition.

   - removal of unused DIO flags from Nikolay.

   - blk-mq/sbitmap memory ordering fixes from Omar.

   - divide-by-zero fix for BFQ from Paolo.

   - minor documentation patches from Randy.

   - timeout fix from Tejun.

   - Alpha "can't write a char atomically" fix from Mikulas.

   - set of NVMe fixes by way of Keith.

   - bsg and bsg-lib improvements from Christoph.

   - a few sed-opal fixes from Jonas.

   - cdrom check-disk-change deadlock fix from Maurizio.

   - various little fixes, comment fixes, etc from various folks"

* tag 'for-4.17/block-20180402' of git://git.kernel.dk/linux-block: (139 commits)
  blk-mq: Directly schedule q->timeout_work when aborting a request
  blktrace: fix comment in blktrace_api.h
  lightnvm: remove function name in strings
  lightnvm: pblk: remove some unnecessary NULL checks
  lightnvm: pblk: don't recover unwritten lines
  lightnvm: pblk: implement 2.0 support
  lightnvm: pblk: implement get log report chunk
  lightnvm: pblk: rename ppaf* to addrf*
  lightnvm: pblk: check for supported version
  lightnvm: implement get log report chunk helpers
  lightnvm: make address conversions depend on generic device
  lightnvm: add support for 2.0 address format
  lightnvm: normalize geometry nomenclature
  lightnvm: complete geo structure with maxoc*
  lightnvm: add shorten OCSSD version in geo
  lightnvm: add minor version to generic geometry
  lightnvm: simplify geometry structure
  lightnvm: pblk: refactor init/exit sequences
  lightnvm: Avoid validation of default op value
  lightnvm: centralize permission check for lightnvm ioctl
  ...
2018-04-05 14:27:02 -07:00
Linus Torvalds
672a9c1069 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
  kfifo: fix inaccurate comment
  tools/thermal: tmon: fix for segfault
  net: Spelling s/stucture/structure/
  edd: don't spam log if no EDD information is present
  Documentation: Fix early-microcode.txt references after file rename
  tracing: Block comments should align the * on each line
  treewide: Fix typos in printk
  GenWQE: Fix a typo in two comments
  treewide: Align function definition open/close braces
2018-04-05 11:56:35 -07:00
James Bottomley
2e1f44f6ad Merge branch 'fixes' into misc
Somewhat nasty merge due to conflicts between "33b28357dd00 scsi:
qla2xxx: Fix Async GPN_FT for FCP and FC-NVMe scan" and "2b5b96473efc
scsi: qla2xxx: Fix FC-NVMe LUN discovery"

Merge is non-trivial and has been verified by Qlogic (Cavium)

Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
2018-04-03 17:38:39 -07:00
David S. Miller
c0b458a946 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor conflicts in drivers/net/ethernet/mellanox/mlx5/core/en_rep.c,
we had some overlapping changes:

1) In 'net' MLX5E_PARAMS_LOG_{SQ,RQ}_SIZE -->
   MLX5E_REP_PARAMS_LOG_{SQ,RQ}_SIZE

2) In 'net-next' params->log_rq_size is renamed to be
   params->log_rq_mtu_frames.

3) In 'net-next' params->hard_mtu is added.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01 19:49:34 -04:00
Keith Busch
f23f5bece6 blk-mq: Allow PCI vector offset for mapping queues
The PCI interrupt vectors intended to be associated with a queue may
not start at 0; a driver may allocate pre_vectors for special use. This
patch adds an offset parameter so blk-mq may find the intended affinity
mask and updates all drivers using this API accordingly.

Cc: Don Brace <don.brace@microsemi.com>
Cc: <qla2xxx-upstream@qlogic.com>
Cc: <linux-scsi@vger.kernel.org>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-27 21:25:36 -06:00
Linus Torvalds
fd9adc402b SCSI fixes on 20180327
Two driver fixes (ibmvfc, iscsi_tcp) and a USB fix for devices that
 give the wrong return to Read Capacity and cause a huge log spew.  The
 remaining 5 patches all try to fix commit 84676c1f21
 "genirq/affinity: assign vectors to all possible CPUs") which broke
 the non-mq I/O path.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCWrpSmyYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishY+MAP9Zvin/
 AUc3xkvOPdIzRPp2aXQHJKC+NGmNFr6MiXIHiAD/TvjbkxEjbUTjnr+gZNaloDma
 d/I4i9xaBNSvqNJpzT0=
 =mLIX
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Two driver fixes (ibmvfc, iscsi_tcp) and a USB fix for devices that
  give the wrong return to Read Capacity and cause a huge log spew.

  The remaining five patches all try to fix commit 84676c1f21
  ("genirq/affinity: assign vectors to all possible CPUs") which broke
  the non-mq I/O path"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: iscsi_tcp: set BDI_CAP_STABLE_WRITES when data digest enabled
  scsi: sd: Remember that READ CAPACITY(16) succeeded
  scsi: ibmvfc: Avoid unnecessary port relogin
  scsi: virtio_scsi: unify scsi_host_template
  scsi: virtio_scsi: fix IO hang caused by automatic irq vector affinity
  scsi: core: introduce force_blk_mq
  scsi: megaraid_sas: fix selection of reply queue
  scsi: hpsa: fix selection of reply queue
2018-03-27 14:11:46 -10:00
Masanari Iida
bc8282a730 treewide: Fix typos in printk
This patch fixes spelling typos found in printk.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-03-27 09:51:22 +02:00
Joe Perches
447a5647c9 treewide: Align function definition open/close braces
Some functions definitions have either the initial open brace and/or
the closing brace outside of column 1.

Move those braces to column 1.

This allows various function analyzers like gnu complexity to work
properly for these modified functions.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-03-26 11:13:09 +02:00
David S. Miller
03fe2debbb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Fun set of conflict resolutions here...

For the mac80211 stuff, these were fortunately just parallel
adds.  Trivially resolved.

In drivers/net/phy/phy.c we had a bug fix in 'net' that moved the
function phy_disable_interrupts() earlier in the file, whilst in
'net-next' the phy_error() call from this function was removed.

In net/ipv4/xfrm4_policy.c, David Ahern's changes to remove the
'rt_table_id' member of rtable collided with a bug fix in 'net' that
added a new struct member "rt_mtu_locked" which needs to be copied
over here.

The mlxsw driver conflict consisted of net-next separating
the span code and definitions into separate files, whilst
a 'net' bug fix made some changes to that moved code.

The mlx5 infiniband conflict resolution was quite non-trivial,
the RDMA tree's merge commit was used as a guide here, and
here are their notes:

====================

    Due to bug fixes found by the syzkaller bot and taken into the for-rc
    branch after development for the 4.17 merge window had already started
    being taken into the for-next branch, there were fairly non-trivial
    merge issues that would need to be resolved between the for-rc branch
    and the for-next branch.  This merge resolves those conflicts and
    provides a unified base upon which ongoing development for 4.17 can
    be based.

    Conflicts:
            drivers/infiniband/hw/mlx5/main.c - Commit 42cea83f95
            (IB/mlx5: Fix cleanup order on unload) added to for-rc and
            commit b5ca15ad7e (IB/mlx5: Add proper representors support)
            add as part of the devel cycle both needed to modify the
            init/de-init functions used by mlx5.  To support the new
            representors, the new functions added by the cleanup patch
            needed to be made non-static, and the init/de-init list
            added by the representors patch needed to be modified to
            match the init/de-init list changes made by the cleanup
            patch.
    Updates:
            drivers/infiniband/hw/mlx5/mlx5_ib.h - Update function
            prototypes added by representors patch to reflect new function
            names as changed by cleanup patch
            drivers/infiniband/hw/mlx5/ib_rep.c - Update init/de-init
            stage list to match new order from cleanup patch
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-23 11:31:58 -04:00
Adrian Hunter
ad44837882 scsi: ufs: Add support for Auto-Hibernate Idle Timer
UFS host controllers may support an autonomous power management feature
called the Auto-Hibernate Idle Timer. The timer is set to the number of
microseconds of idle time before the UFS host controller will autonomously
put the link into Hibernate state. That will save power at the expense of
increased latency. Any access to the host controller interface registers
will automatically put the link out of Hibernate state. So once configured,
the feature is transparent to the driver.

Expose the Auto-Hibernate Idle Timer value via SysFS to allow users to
choose between power efficiency or lower latency. Set a default value of
150 ms.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Stanislav Nijnikov <stanislav.nijnikov@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 21:21:25 -04:00
Stanislav Nijnikov
114c1aa210 scsi: ufs: sysfs: reworking of the rpm_lvl and spm_lvl entries
Read from these files will return the integer value of the chosen power
management level now. Separate entries were added to show the target UFS
device and UIC link states. The description of the possible power
managements levels was added to the ABI file. The on-write behaviour of
these entries wasn't changed.

[mkp: typo]

Signed-off-by: Stanislav Nijnikov <stanislav.nijnikov@wdc.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 21:21:25 -04:00
Meelis Roos
3f6c9be27a scsi: qla2xxx: fx00 copypaste typo
Fix an obvious copy-paste error in freeing QLAFX00 response queue - the
code checked for rsp->ring but freed rsp->ring_fx00.

[mkp: applied by hand]

Signed-off-by: Meelis Roos <mroos@linux.ee>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 19:03:07 -04:00
Meelis Roos
f7e59e994f scsi: qla2xxx: fix error message on <qla2400
This patch fixes IO traps caught by hardware when mailbox command fails
on qla2200. The error handler assumes newer firmware that is available
on 2400 and newer HBA-s.

This causes ugly crashes on sparc64.

Fix it with separate debug prints on different firmware generations like
most other places do.

[mkp: updated based on feedback from Himanshu]

Signed-off-by: Meelis Roos <mroos@linux.ee>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 18:59:31 -04:00
Don Brace
61c187e46e scsi: smartpqi: update driver version
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: Gerry Morong <gerry.morong@microsemi.com>
Reviewed-by: Scott Benesh <scott.benesh@microsemi.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 18:51:37 -04:00
Kevin Barnett
339faa8150 scsi: smartpqi: workaround fw bug for oq deletion
Skip deleting PQI operational queues when there is an error creating a
new queue group. It's not really necessary to delete the queues anyway
because they get deleted during the PQI reset that is part of the error
recovery path.

Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 18:51:37 -04:00
Ching Huang
45dce24df5 scsi: arcmsr: Change driver version to v1.40.00.05-20180309
Change driver version to v1.40.00.05-20180309

Signed-off-by: Ching Huang <ching2048@areca.com.tw>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 18:46:30 -04:00
Ching Huang
c2c62ebca1 scsi: arcmsr: Sleep to avoid CPU stuck too long for waiting adapter ready
Sleep to avoid CPU stuck too long for waiting adapter ready.

Signed-off-by: Ching Huang <ching2048@areca.com.tw>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 18:46:30 -04:00
Ching Huang
c4c1adb349 scsi: arcmsr: Handle adapter removed due to thunderbolt cable disconnection.
Handle adapter removed due to thunderbolt cable disconnection.

Signed-off-by: Ching Huang <ching2048@areca.com.tw>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 18:46:30 -04:00
Ching Huang
50b08240de scsi: arcmsr: Rename ACB_F_BUS_HANG_ON to ACB_F_ADAPTER_REMOVED for adapter hot-plug
Rename ACB_F_BUS_HANG_ON to ACB_F_ADAPTER_REMOVED for adapter hot-plug.

Signed-off-by: Ching Huang <ching2048@areca.com.tw>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 18:46:30 -04:00
himanshu.madhani@cavium.com
9b14323125 scsi: qla2xxx: Update driver version to 10.00.00.06-k
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 18:38:55 -04:00
Quinn Tran
33b28357dd scsi: qla2xxx: Fix Async GPN_FT for FCP and FC-NVMe scan
This patch combines FCP and FC-NVMe scan into single scan when
driver detects FC-NVMe capability on same port.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 18:38:55 -04:00
Darren Trapp
60dd6e8e42 scsi: qla2xxx: Cleanup code to improve FC-NVMe error handling
This patch cleans up ABTS handling for FC-NVMe by

- Removing allocation of sp, instead pass the sp pointer for abort IOCB
- Fix error handling from Trasport failure
- set outstanding_cmds array to NULL for nvme completion

Signed-off-by: Darren Trapp <darren.trapp@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 18:38:54 -04:00
Darren Trapp
623ee824e5 scsi: qla2xxx: Fix FC-NVMe IO abort during driver reset
Signed-off-by: Darren Trapp <darren.trapp@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 18:38:54 -04:00
Darren Trapp
1cbc0efcd9 scsi: qla2xxx: Fix retry for PRLI RJT with reason of BUSY
Signed-off-by: Darren Trapp <darren.trapp@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 18:38:54 -04:00
Darren Trapp
2e4c5d2ef7 scsi: qla2xxx: Remove nvme_done_list
Signed-off-by: Darren Trapp <darren.trapp@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 18:38:54 -04:00
Darren Trapp
870fe24f3c scsi: qla2xxx: Return busy if rport going away
This patch adds mechanism to return EBUSY if rport is going away
to prevent exhausting FC-NVMe layer's retry counter.

Signed-off-by: Darren Trapp <darren.trapp@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 18:38:54 -04:00
Darren Trapp
1763c1fd76 scsi: qla2xxx: Fix n2n_ae flag to prevent dev_loss on PDB change
On a port db changes, this patch will set n2n_ae flag for N2N
connection when requesting for Report ID Acquition MBX, instead
of Loop Initialization or point to point asynchronous events.

Signed-off-by: Darren Trapp <darren.trapp@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 18:38:54 -04:00
Darren Trapp
e473b30741 scsi: qla2xxx: Add FC-NVMe abort processing
Signed-off-by: Darren Trapp <darren.trapp@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 18:38:54 -04:00
Darren Trapp
9dd9686b14 scsi: qla2xxx: Add changes for devloss timeout in driver
Add support for error recovery within devloss timeout, now that
FC-NVMe transport support devloss timeout.

Signed-off-by: Darren Trapp <darren.trapp@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 18:38:54 -04:00
Darren Trapp
dbe18018e3 scsi: qla2xxx: Set IIDMA and fcport state before qla_nvme_register_remote()
Signed-off-by: Darren Trapp <darren.trapp@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 18:38:54 -04:00
himanshu.madhani@cavium.com
1d4614e1e6 scsi: qla2xxx: Remove unneeded message and minor cleanup for FC-NVMe
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 18:38:54 -04:00
Darren Trapp
8d7d777526 scsi: qla2xxx: Restore ZIO threshold setting
Signed-off-by: Darren Trapp <darren.trapp@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-21 18:38:54 -04:00