Commit Graph

68 Commits

Author SHA1 Message Date
Damien Le Moal
9b52c1c6ca scsi: libsas: Declare sas_set_phy_speed() static
sas_set_phy_speed() is used only within sas_init.c. Declare this function
as static.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230912230551.454357-3-dlemoal@kernel.org
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-13 21:06:44 -04:00
John Garry
1136a0225d scsi: libsas: Delete struct scsi_core
Since commit 79855d1785 ("libsas: remove task_collector mode"), struct
scsi_core only contains a reference to the shost. struct scsi_core is only
used in sas_ha_struct.core, so delete scsi_core and replace with a
reference to the shost there.

Signed-off-by: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20230815115156.343535-5-john.g.garry@oracle.com
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-08-21 17:50:58 -04:00
John Garry
8e8d43642f scsi: libsas: Make sas_{alloc, alloc_slow, free}_task() private
We have no users outside libsas any longer, so make sas_alloc_task(),
sas_alloc_slow_task(), and sas_free_task() private.

Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1665998435-199946-8-git-send-email-john.garry@huawei.com
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Tested-by: Niklas Cassel <niklas.cassel@wdc.com> # pm80xx
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18 02:37:45 +00:00
Xiang Chen
1e82e4627a scsi: libsas: Resume SAS host for phy reset or enable via sysfs
Currently if a phy reset or enable phy is issued via sysfs when controller
is suspended, those operations will be ignored as SAS_HA_REGISTERED is
cleared. If RPM is enabled then we may aggressively suspend automatically.
In this case it may be difficult to enable or reset a phy via sysfs, so
resume the host in these scenarios.

Link: https://lore.kernel.org/r/1657823002-139010-6-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-18 23:04:12 -04:00
Xiang Chen
bf19aea460 scsi: libsas: Defer works of new phys during suspend
During the processing of event PORT_BYTES_DMAED, the driver queues work
DISCE_DISCOVER_DOMAIN and then flushes workqueue ha->disco_q.  If a new
phyup event occurs during resuming the controller, the work
PORTE_BYTES_DMAED of new phy occurs before suspended phy's. The work
DISCE_DISCOVER_DOMAIN of new phy requires an active SAS controller (it
needs to resume SAS controller by function scsi_sysfs_add_sdev() and some
other functions such as function add_device_link()). However, the
activation of the SAS controller requires completion of work
PORTE_BYTES_DMAED of suspended phys while it is blocked by new phy's work
on ha->event_q. So there is a deadlock and it is released only after resume
timeout.

To solve the issue, defer works of new phys during suspend and queue those
defer works after SAS controller becomes active.

Link: https://lore.kernel.org/r/1639999298-244569-13-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-22 23:38:30 -05:00
Xiang Chen
4ea775abbb scsi: libsas: Add flag SAS_HA_RESUMING
Add a flag SAS_HA_RESUMING and use it to indicate the state of resuming the
host controller.

Link: https://lore.kernel.org/r/1639999298-244569-11-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-22 23:38:30 -05:00
Xiang Chen
e31e18128e scsi: libsas: Insert PORTE_BROADCAST_RCVD event for resuming host
If a new disk is inserted through an expander when the host was suspended,
it will not necessarily be detected as the topology is not re-scanned
during resume.  To detect possible changes in topology during suspension,
insert a PORTE_BROADCAST_RCVD event per port when resuming to trigger a
revalidation.

Link: https://lore.kernel.org/r/1639999298-244569-8-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-22 23:38:30 -05:00
John Garry
fbefe22811 scsi: libsas: Don't always drain event workqueue for HA resume
For the hisi_sas driver, if a directly attached disk is removed during
suspend, a hang will occur in the resume process:

The background is that in commit 16fd4a7c59 ("scsi: hisi_sas: Add device
link between SCSI devices and hisi_hba"), it is ensured that the HBA device
cannot be runtime suspended when any SCSI device associated is active.

Other drivers which use libsas don't worry about this as none support
runtime suspend.

The mentioned hang occurs when an disk is removed during suspend. In the
removal process - from PHYE_RESUME_TIMEOUT event processing - we call into
scsi_remove_device(), which is being processed in the HA event workqueue.
Here we wait for all suppliers of the SCSI device to resume, which includes
the HBA device (from the above commit). However the HBA device cannot
resume, as it is waiting for the PHYE_RESUME_TIMEOUT to be processed (from
calling sas_resume_ha() -> sas_drain_work()). This is the deadlock.

There does not appear to be any need for the sas_drain_work() to be called
at all in sas_resume_ha() as it is not syncing against anything, so allow
LLDDs to avoid this by providing a variant of sas_resume_ha() which does
"sync", i.e. doesn't drain the event workqueue.

Link: https://lore.kernel.org/r/1639999298-244569-2-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-22 23:38:29 -05:00
Luo Jiaxing
00aeaf329a scsi: libsas: Export sas_phy_enable()
Export sas_phy_enable() so LLDDs can directly use it to control remote
phys.

We already do this for companion function sas_phy_reset().

Link: https://lore.kernel.org/r/1634041588-74824-4-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-12 22:46:06 -04:00
John Garry
ce4fc333e5 scsi: libsas: Co-locate exports with symbols
It is standard practice to co-locate export declarations with the symbol
which is being exported. Or at least in the same file - see
sas_phy_reset().

Modify libsas to follow this practice consistently.

Link: https://lore.kernel.org/r/1631530296-32358-1-git-send-email-john.garry@huawei.com
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13 23:57:39 -04:00
Jason Yan
e15f669cd9 scsi: libsas: Allow libsas to include SCSI header files directly
libsas needs to include some header files in the scsi directory. However
these are currently hardcoded with the path "../" in the C files. Do this
in the Makefile to avoid hardcoding the path.

Link: https://lore.kernel.org/r/20210716074551.771312-1-yanaijie@huawei.com
Cc: John Garry <john.garry@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-20 23:11:17 -04:00
Ahmed S. Darwish
65f7cfba61 scsi: libsas: Remove temporarily-added _gfp() API variants
These variants were added for bisectability. Remove them, as all call sites
have now been convertd to use the original API.

Link: https://lore.kernel.org/r/20210118100955.1761652-20-a.darwish@linutronix.de
Cc: Jason Yan <yanaijie@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:31:10 -05:00
Ahmed S. Darwish
f76d9f1a15 scsi: libsas: Switch back to original event notifiers API
libsas event notifiers required an extension where gfp_t flags must be
explicitly passed. For bisectability, a temporary _gfp() variant of such
functions were added. All call sites then got converted use the _gfp()
variants and explicitly pass GFP context. Having no callers left, the
original libsas notifiers were then modified to accept gfp_t flags by
default.

Switch back to the original event notifiers API, while still passing GFP
context.  The _gfp() notifier variants will be removed afterwards.

Link: https://lore.kernel.org/r/20210118100955.1761652-17-a.darwish@linutronix.de
Cc: Jason Yan <yanaijie@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:31:09 -05:00
Ahmed S. Darwish
5d6a75a1ed scsi: libsas: Add gfp_t flags parameter to event notifications
All call-sites of below libsas APIs:

  - sas_alloc_event()
  - sas_notify_port_event()
  - sas_notify_phy_event()

have been converted to use the _gfp()-suffixed version.  Modify the
original APIs above to take a gfp_t flags parameter by default.

For bisectability, call-sites will be modified again to use the original
libsas APIs (while passing gfp_t). The temporary _gfp()-suffixed versions
can then be removed.

Link: https://lore.kernel.org/r/20210118100955.1761652-13-a.darwish@linutronix.de
Cc: Jason Yan <yanaijie@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:31:09 -05:00
Ahmed S. Darwish
19a39831ff scsi: libsas: Pass gfp_t flags to event notifiers
Use the new libsas event notifiers API, which requires callers to
explicitly pass the gfp_t memory allocation flags.

Context analysis:

  - sas_enable_revalidation(): process, acquires mutex
  - sas_resume_ha(): process, calls wait_event_timeout()

Link: https://lore.kernel.org/r/20210118100955.1761652-9-a.darwish@linutronix.de
Cc: Jason Yan <yanaijie@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:31:08 -05:00
Ahmed S. Darwish
c2d0f1a65a scsi: libsas: Introduce a _gfp() variant of event notifiers
sas_alloc_event() uses in_interrupt() to decide which allocation should be
used.

The usage of in_interrupt() in drivers is phased out and Linus clearly
requested that code which changes behaviour depending on context should
either be separated or the context be conveyed in an argument passed by the
caller, which usually knows the context.

The in_interrupt() check is also only partially correct, because it fails
to choose the correct code path when just preemption or interrupts are
disabled. For example, as in the following call chain:

  mvsas/mv_sas.c: mvs_work_queue() [process context]
  spin_lock_irqsave(mvs_info::lock, )
    -> libsas/sas_event.c: sas_notify_phy_event()
      -> sas_alloc_event()
        -> in_interrupt() = false
          -> invalid GFP_KERNEL allocation
    -> libsas/sas_event.c: sas_notify_port_event()
      -> sas_alloc_event()
        -> in_interrupt() = false
          -> invalid GFP_KERNEL allocation

Introduce sas_alloc_event_gfp(), sas_notify_port_event_gfp(), and
sas_notify_phy_event_gfp(), which all behave like the non _gfp() variants
but use a caller-passed GFP mask for allocations.

For bisectability, all callers will be modified first to pass GFP context,
then the non _gfp() libsas API variants will be modified to take a gfp_t by
default.

Link: https://lore.kernel.org/r/20210118100955.1761652-4-a.darwish@linutronix.de
Fixes: 1c393b970e ("scsi: libsas: Use dynamic alloced work to avoid sas event lost")
Cc: Jason Yan <yanaijie@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:31:07 -05:00
John Garry
121181f3f8 scsi: libsas: Remove notifier indirection
LLDDs report events to libsas with .notify_port_event and .notify_phy_event
callbacks.

These callbacks are fixed and so there is no reason why the functions
cannot be called directly, so do that.

This neatens the code slightly, makes it more obvious, and reduces function
pointer usage, which is generally a good thing. Downside is that there are
2x more symbol exports.

[a.darwish@linutronix.de: Remove the now unused "sas_ha" local variables]

Link: https://lore.kernel.org/r/20210118100955.1761652-3-a.darwish@linutronix.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:31:07 -05:00
Linus Torvalds
ba6d10ab80 SCSI misc on 20190709
This is mostly update of the usual drivers: qla2xxx, hpsa, lpfc, ufs,
 mpt3sas, ibmvscsi, megaraid_sas, bnx2fc and hisi_sas as well as the
 removal of the osst driver (I heard from Willem privately that he
 would like the driver removed because all his test hardware has
 failed).  Plus number of minor changes, spelling fixes and other
 trivia.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXSTl4yYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdcxAQDCJVbd
 fPUX76/V1ldupunF97+3DTharxxbst+VnkOnCwD8D4c0KFFFOI9+F36cnMGCPegE
 fjy17dQLvsJ4GsidHy8=
 =aS5B
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly update of the usual drivers: qla2xxx, hpsa, lpfc, ufs,
  mpt3sas, ibmvscsi, megaraid_sas, bnx2fc and hisi_sas as well as the
  removal of the osst driver (I heard from Willem privately that he
  would like the driver removed because all his test hardware has
  failed). Plus number of minor changes, spelling fixes and other
  trivia.

  The big merge conflict this time around is the SPDX licence tags.
  Following discussion on linux-next, we believe our version to be more
  accurate than the one in the tree, so the resolution is to take our
  version for all the SPDX conflicts"

Note on the SPDX license tag conversion conflicts: the SCSI tree had
done its own SPDX conversion, which in some cases conflicted with the
treewide ones done by Thomas & co.

In almost all cases, the conflicts were purely syntactic: the SCSI tree
used the old-style SPDX tags ("GPL-2.0" and "GPL-2.0+") while the
treewide conversion had used the new-style ones ("GPL-2.0-only" and
"GPL-2.0-or-later").

In these cases I picked the new-style one.

In a few cases, the SPDX conversion was actually different, though.  As
explained by James above, and in more detail in a pre-pull-request
thread:

 "The other problem is actually substantive: In the libsas code Luben
  Tuikov originally specified gpl 2.0 only by dint of stating:

  * This file is licensed under GPLv2.

  In all the libsas files, but then muddied the water by quoting GPLv2
  verbatim (which includes the or later than language). So for these
  files Christoph did the conversion to v2 only SPDX tags and Thomas
  converted to v2 or later tags"

So in those cases, where the spdx tag substantially mattered, I took the
SCSI tree conversion of it, but then also took the opportunity to turn
the old-style "GPL-2.0" into a new-style "GPL-2.0-only" tag.

Similarly, when there were whitespace differences or other differences
to the comments around the copyright notices, I took the version from
the SCSI tree as being the more specific conversion.

Finally, in the spdx conversions that had no conflicts (because the
treewide ones hadn't been done for those files), I just took the SCSI
tree version as-is, even if it was old-style.  The old-style conversions
are perfectly valid, even if the "-only" and "-or-later" versions are
perhaps more descriptive.

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (185 commits)
  scsi: qla2xxx: move IO flush to the front of NVME rport unregistration
  scsi: qla2xxx: Fix NVME cmd and LS cmd timeout race condition
  scsi: qla2xxx: on session delete, return nvme cmd
  scsi: qla2xxx: Fix kernel crash after disconnecting NVMe devices
  scsi: megaraid_sas: Update driver version to 07.710.06.00-rc1
  scsi: megaraid_sas: Introduce various Aero performance modes
  scsi: megaraid_sas: Use high IOPS queues based on IO workload
  scsi: megaraid_sas: Set affinity for high IOPS reply queues
  scsi: megaraid_sas: Enable coalescing for high IOPS queues
  scsi: megaraid_sas: Add support for High IOPS queues
  scsi: megaraid_sas: Add support for MPI toolbox commands
  scsi: megaraid_sas: Offload Aero RAID5/6 division calculations to driver
  scsi: megaraid_sas: RAID1 PCI bandwidth limit algorithm is applicable for only Ventura
  scsi: megaraid_sas: megaraid_sas: Add check for count returned by HOST_DEVICE_LIST DCMD
  scsi: megaraid_sas: Handle sequence JBOD map failure at driver level
  scsi: megaraid_sas: Don't send FPIO to RL Bypass queue
  scsi: megaraid_sas: In probe context, retry IOC INIT once if firmware is in fault
  scsi: megaraid_sas: Release Mutex lock before OCR in case of DCMD timeout
  scsi: megaraid_sas: Call disable_irq from process IRQ poll
  scsi: megaraid_sas: Remove few debug counters from IO path
  ...
2019-07-11 15:14:01 -07:00
Thomas Gleixner
5078709e89 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 59
Based on 1 normalized pattern(s):

  this file is licensed under gplv2 this program is free software you
  can redistribute it and or modify it under the terms of the gnu
  general public license as published by the free software foundation
  either version 2 of the license or at your option any later version
  this program is distributed in the hope that it will be useful but
  without any warranty without even the implied warranty of
  merchantability or fitness for a particular purpose see the gnu
  general public license for more details you should have received a
  copy of the gnu general public license along with this program if
  not write to the free software foundation inc 59 temple place suite
  330 boston ma 02111 1307 usa

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 5 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190520071858.561902672@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24 17:36:44 +02:00
Christoph Hellwig
86b89cb0d2 scsi: libsas: switch remaining files to SPDX tags
Use the the GPLv2 SPDX tag instead of verbose boilerplate text.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-21 06:16:22 -04:00
John Garry
3c236f8cc6 scsi: libsas: Print expander PHY indexes in decimal
Currently we print expander PHY indexes in a mix of decimal and hex.

It is more consistent and also more convenient to read decimal, so
make this change.

We use width of 2 for expander and 1 for root PHYs prints.

Some lines which were needlessly spilling multiple lines are unified.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-15 18:55:01 -04:00
John Garry
7b27c5fe24 scsi: libsas: Stop hardcoding SAS address length
Many times we use 8 for SAS address length, while we already have a macro
for this - SAS_ADDR_SIZE.

Replace instances of this with the macro. However, don't touch the SAS
address array sizes sas.h, as these are defined according to the SAS spec.

Some missing whitespaces are also added, and whitespace indentation
in sas_hash_addr() is also fixed (see sas_hash_addr()).

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-15 18:55:00 -04:00
John Garry
15ba7806c3 scsi: libsas: Drop SAS_DPRINTK() and revise logs levels
Like sas_printk() did previously, SAS_DPRINTK() offers little value now
that libsas logs already have the "sas" prefix through pr_fmt(fmt). So it
can be dropped.

However, after reviewing some logs in libsas, it is noticed that debug
level is too low in many instances.

So this change drops SAS_DPRINTK() and revises some logs to a more
appropriate level. However many stay at debug level, although some
are significantly promoted.

We add -DDEBUG for compilation so that we keep the debug messages by
default, as before.

All the pre-existing checkpatch errors for spanning messages across
multiple lines are also fixed.

Finally, all other references to printk() [apart from special formatting
in sas_ata.c] are removed and replaced with appropriate pr_xxx().

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-15 14:37:06 -05:00
John Garry
71a4a99231 scsi: libsas: Drop sas_printk()
The printk wrapper sas_printk() adds little value now that libsas logs
already have the "sas" prefix through pr_fmt(fmt), so just use pr_notice()
directly.

In addition, strings which span multiple lines are reunited.

Originally-from: Joe Perches <joe@perches.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-15 14:37:06 -05:00
Bart Van Assche
121246ae93 scsi: libsas: Fix kernel-doc headers
Avoid that building with W=1 causes the kernel-doc tool to complain
about function arguments that have not been documented in the libsas
kernel-doc headers. Avoid that the short description starts with a
hyphen by changing "--" into "-" in the first line of the kernel-doc
headers.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: John Garry <john.garry@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-27 21:15:16 -05:00
Jason Yan
93bdbd06b1 scsi: libsas: Use new workqueue to run sas event and disco event
Now all libsas works are queued to scsi host workqueue, include sas
event work post by LLDD and sas discovery work, and a sas hotplug flow
may be divided into several works, e.g libsas receive a
PORTE_BYTES_DMAED event, currently we process it as following steps:

sas_form_port  --- run in work in shost workq
	sas_discover_domain  --- run in another work in shost workq
		...
		sas_probe_devices  --- run in new work in shost workq
We found during hot-add a device, libsas may need run several
works in same workqueue to add device in system, the process is
not atomic, it may interrupt by other sas event works, like
PHYE_LOSS_OF_SIGNAL.

This patch is preparation of execute libsas sas event in sync. We need
to use different workqueue to run sas event and disco event. Otherwise
the work will be blocked for waiting another chained work in the same
workqueue.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
CC: John Garry <john.garry@huawei.com>
CC: Johannes Thumshirn <jthumshirn@suse.de>
CC: Ewan Milne <emilne@redhat.com>
CC: Christoph Hellwig <hch@lst.de>
CC: Tomas Henzl <thenzl@redhat.com>
CC: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-08 21:59:28 -05:00
Jason Yan
8eea9dd84e scsi: libsas: make the event threshold configurable
Add a sysfs attr that LLDD can configure it for every host. We made an
example in hisi_sas. Other LLDDs using libsas can implement it if they
want.

Suggested-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
CC: John Garry <john.garry@huawei.com>
CC: Johannes Thumshirn <jthumshirn@suse.de>
CC: Ewan Milne <emilne@redhat.com>
CC: Christoph Hellwig <hch@lst.de>
CC: Tomas Henzl <thenzl@redhat.com>
CC: Dan Williams <dan.j.williams@intel.com>
Acked-by: John Garry <john.garry@huawei.com> #for hisi_sas part
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-08 21:59:28 -05:00
Jason Yan
f12486e06a scsi: libsas: shut down the PHY if events reached the threshold
If the PHY burst too many events, we will alloc a lot of events for the
worker. This may leads to memory exhaustion.

Dan Williams suggested to shut down the PHY if the events reached the
threshold, because in this case the PHY may have gone into some
erroneous state. Users can re-enable the PHY by sysfs if they want.

We cannot use the fixed memory pool because if we run out of events, the
shut down event and loss of signal event will lost too. The events still
need to be allocated and processed in this case.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
CC: John Garry <john.garry@huawei.com>
CC: Johannes Thumshirn <jthumshirn@suse.de>
CC: Ewan Milne <emilne@redhat.com>
CC: Christoph Hellwig <hch@lst.de>
CC: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-08 21:59:28 -05:00
Jason Yan
1c393b970e scsi: libsas: Use dynamic alloced work to avoid sas event lost
Now libsas hotplug work is static, every sas event type has its own
static work, LLDD driver queues the hotplug work into shost->work_q.  If
LLDD driver burst posts lots hotplug events to libsas, the hotplug
events may pending in the workqueue like

shost->work_q
new work[PORTE_BYTES_DMAED] --> |[PHYE_LOSS_OF_SIGNAL][PORTE_BYTES_DMAED] -> processing
                                |<-------wait worker to process-------->|

In this case, a new PORTE_BYTES_DMAED event coming, libsas try to queue
it to shost->work_q, but this work is already pending, so it would be
lost. Finally, libsas delete the related sas port and sas devices, but
LLDD driver expect libsas add the sas port and devices(last sas event).

This patch use dynamic allocated work to avoid this issue.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
CC: John Garry <john.garry@huawei.com>
CC: Johannes Thumshirn <jthumshirn@suse.de>
CC: Ewan Milne <emilne@redhat.com>
CC: Christoph Hellwig <hch@lst.de>
CC: Tomas Henzl <thenzl@redhat.com>
CC: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-08 21:59:28 -05:00
Linus Torvalds
670ffccb2f SCSI misc on 20171114
This is mostly updates of the usual suspects: lpfc, qla2xxx, hisi_sas,
 megaraid_sas, pm80xx, mpt3sas, be2iscsi, hpsa. and a host of minor
 updates.
 
 There's no major behaviour change or additions to the core in all of
 this, so the potential for regressions should be small (biggest
 potential being in the scsi error handler changes).
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJaCxtCAAoJEAVr7HOZEZN4d9EQAI+OHP6ss6zjKKC21c9jNPcH
 NhLrNv37gHg/LA2VXeUEL9RGUjCGLIUrI4HsrxzkFAMLKP4TkshMs8/2RvczY+Sa
 VpayPqVybEKLIS6ipQyM1SLIQff2nvtDVcN/T+8z1lkk45TrbA6ZGuwUwd2aJyEA
 2V2wtg51ObnL0Nr9QPPll0JrtL1AnCZyRlu9XrwTZuuSBZwk93opIuuvbZm/3dVg
 Ir4GSS4Y+PuHIfu4cxqdsPMdzRdY9I2me1YiE4jeFSn1/VTAjL4HBz7fO9eITT42
 VhXSpDz1XvFsa9dJ0ubkqoALpJzCfOcBw+EuGvSydLEvOBoEVwMccdfaD9lT1zc5
 L9e1Z5qqJoq7hTA6xTXCYfWG73I9HYvljtmc8yudKHhADOdnSTUXhaO6uBF0RNqD
 OxPSA1RZwRx3c6lDOcK6BTtvLAkTEuYKdrWSKJi0w+QXJAyQ6etqbmsKpmPdRim7
 Z4ZSpJFro2gyo9gcdJO0ykTG+z3U7Z/ay1sNgnuprsv+eU/QjUdlAPl18o79EkRf
 H54zZggZ4wC6q/cFVVt4Vx+V+oqIeu38s7NDXS9UltLoTZPm2EzDW6pXd/38Z4Tf
 a1oBAUET8kYLC90P8sVZxUIHZjITlpgDbyE2Lq00PMYXhk8S4IxF0aMN5RvVqzUv
 +7N2HrHkSSgG1nhw1t+E
 =3O85
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly updates of the usual suspects: lpfc, qla2xxx, hisi_sas,
  megaraid_sas, pm80xx, mpt3sas, be2iscsi, hpsa. and a host of minor
  updates.

  There's no major behaviour change or additions to the core in all of
  this, so the potential for regressions should be small (biggest
  potential being in the scsi error handler changes)"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (203 commits)
  scsi: lpfc: Fix hard lock up NMI in els timeout handling.
  scsi: mpt3sas: remove a stray KERN_INFO
  scsi: mpt3sas: cleanup _scsih_pcie_enumeration_event()
  scsi: aacraid: use timespec64 instead of timeval
  scsi: scsi_transport_fc: add 64GBIT and 128GBIT port speed definitions
  scsi: qla2xxx: Suppress a kernel complaint in qla_init_base_qpair()
  scsi: mpt3sas: fix dma_addr_t casts
  scsi: be2iscsi: Use kasprintf
  scsi: storvsc: Avoid excessive host scan on controller change
  scsi: lpfc: fix kzalloc-simple.cocci warnings
  scsi: mpt3sas: Update mpt3sas driver version.
  scsi: mpt3sas: Fix sparse warnings
  scsi: mpt3sas: Fix nvme drives checking for tlr.
  scsi: mpt3sas: NVMe drive support for BTDHMAPPING ioctl command and log info
  scsi: mpt3sas: Add-Task-management-debug-info-for-NVMe-drives.
  scsi: mpt3sas: scan and add nvme device after controller reset
  scsi: mpt3sas: Set NVMe device queue depth as 128
  scsi: mpt3sas: Handle NVMe PCIe device related events generated from firmware.
  scsi: mpt3sas: API's to remove nvme drive from sml
  scsi: mpt3sas: API 's to support NVMe drive addition to SML
  ...
2017-11-14 16:23:44 -08:00
Kees Cook
77570eedd9 scsi: sas: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly. This requires adding a pointer to
hold the timer's target task, as there isn't a link back from slow_task.

Cc: John Garry <john.garry@huawei.com>
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Jack Wang <jinpu.wang@profitbricks.com>
Cc: lindar_liu@usish.com
Cc: Jens Axboe <axboe@fb.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Benjamin Block <bblock@linux.vnet.ibm.com>
Cc: Baoyou Xie <baoyou.xie@linaro.org>
Cc: Wei Yongjun <weiyongjun1@huawei.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: John Garry <john.garry@huawei.com> # for hisi_sas part
Tested-by: John Garry <john.garry@huawei.com> # basic sanity test for hisi_sas
Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com>
2017-11-01 11:43:47 -07:00
Jason Yan
042ebd293b scsi: libsas: kill useless ha_event and do some cleanup
The ha_event now has only one event HAE_RESET, and this event does
nothing. Kill it and do some cleanup.

This is a preparation for enhance libsas hotplug feature in the next
patches.

Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
CC: Johannes Thumshirn <jthumshirn@suse.de>
CC: Ewan Milne <emilne@redhat.com>
CC: Christoph Hellwig <hch@lst.de>
CC: Tomas Henzl <thenzl@redhat.com>
CC: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-15 21:32:58 -04:00
Johannes Thumshirn
8690218a4c scsi: sas: remove sas_domain_release_transport
sas_domain_release_transport is unused since at least v3.13, remove it.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-04 20:16:38 -04:00
Christoph Hellwig
28917d40e6 scsi: libsas: remove sas_scsi_timed_out
EH_NOT_HANDLED is the default case if no eh_timed_out method is
provided, so there is no need to supply it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-06 19:09:12 -05:00
Christoph Hellwig
79855d1785 libsas: remove task_collector mode
The task_collector mode (or "latency_injector", (C) Dan Willians) is an
optional I/O path in libsas that queues up scsi commands instead of
directly sending it to the hardware.  It generall increases latencies
to in the optiomal case slightly reduce mmio traffic to the hardware.

Only the obsolete aic94xx driver and the mvsas driver allowed to use
it without recompiling the kernel, and most drivers didn't support it
at all.

Remove the giant blob of code to allow better optimizations for scsi-mq
in the future.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Dan Williams <dan.j.williams@intel.com>
2014-11-27 16:40:24 +01:00
Dan Williams
303694eeee [SCSI] libsas: suspend / resume support
libsas power management routines to suspend and recover the sas domain
based on a model where the lldd is allowed and expected to be
"forgetful".

sas_suspend_ha - disable event processing allowing the lldd to take down
                 links without concern for causing hotplug events.
                 Regardless of whether the lldd actually posts link down
                 messages libsas notifies the lldd that all
                 domain_devices are gone.

sas_prep_resume_ha - on the way back up before the lldd starts link
                     training clean out any spurious events that were
                     generated on the way down, and re-enable event
                     processing

sas_resume_ha - after the lldd has started and decided that all phys
		have posted link-up events this routine is called to let
		libsas start it's own timeout of any phys that did not
		resume.  After the timeout an lldd can cancel the
                phy teardown by posting a link-up event.

Storage for ex_change_count (u16) and phy_change_count (u8) are changed
to int so they can be set to -1 to indicate 'invalidated'.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jacek Danecki <jacek.danecki@intel.com>
Tested-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-08-24 13:10:23 +04:00
Dan Williams
f0bf750c2d [SCSI] libsas: trim sas_task of slow path infrastructure
The timer and the completion are only used for slow path tasks (smp, and
lldd tmfs), yet we incur the allocation space and cpu setup time for
every fast path task.

Cc: Xiangliang Yu <yuxiangl@marvell.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:54 +01:00
Dan Williams
5db45bdc87 [SCSI] libsas: enforce eh strategy handlers only in eh context
The strategy handlers may be called in places that are problematic for
libsas (i.e. sata resets outside of domain revalidation filtering /
libata link recovery), or problematic for userspace (non-blocking ioctl
to sleeping reset functions).  However, these routines are also called
for eh escalations and recovery of scsi_eh_prep_cmnd(), so permit them
as long as we are running in the host's error handler, otherwise arrange
for them to be triggered in eh_context.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:50 +01:00
Dan Williams
e4a9c3732c [SCSI] libata, libsas: introduce sched_eh and end_eh port ops
When managing shost->host_eh_scheduled libata assumes that there is a
1:1 shost-to-ata_port relationship.  libsas creates a 1:N relationship
so it needs to manage host_eh_scheduled cumulatively at the host level.
The sched_eh and end_eh port port ops allow libsas to track when domain
devices enter/leave the "eh-pending" state under ha->lock (previously
named ha->state_lock, but it is no longer just a lock for ha->state
changes).

Since host_eh_scheduled indicates eh without backing commands pinning
the device it can be deallocated at any time.  Move the taking of the
domain_device reference under the port_lock to guarantee that the
ata_port stays around for the duration of eh.

Reviewed-by: Jacek Danecki <jacek.danecki@intel.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:45 +01:00
Dan Williams
22b9153faa [SCSI] libsas: introduce sas_work to fix sas_drain_work vs sas_queue_work
When requeuing work to a draining workqueue the last work instance may
not be idle, so sas_queue_work() must not touch work->entry.  Introduce
sas_work with a drain_node list_head to have a private list for
collecting work deferred due to drain collision.

Fixes reports like:
  BUG: unable to handle kernel NULL pointer dereference at           (null)
  IP: [<ffffffff810410d4>] process_one_work+0x2e/0x338

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-04-23 12:03:39 +01:00
Dan Williams
26a2e68f81 [SCSI] libsas: don't recover end devices attached to disabled phys
If userspace has decided to disable a phy the kernel should honor that
and not inadvertantly re-enable the phy via error recovery.  This is
more straightforward in the sata case where link recovery (via
libata-eh) is separate from sas_task cancelling in libsas-eh.  Teach
libsas to accept -ENODEV as a successful response from I_T_nexus_reset
('successful' in terms of not escalating further).

This is a more comprehensive fix then "libsas: don't recover 'gone'
devices in sas_ata_hard_reset()", as it is no longer sata-specific.

aic94xx does check the return value from sas_phy_reset() so if the phy
is disabled we proceed with clearing the I_T_nexus.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-29 15:42:51 -06:00
Dan Williams
5d7f6d1071 [SCSI] libsas: fix sas_unregister_ports vs sas_drain_work
We need to hold drain_mutex across the unregistration as port down events
queue device removal as chained events, so we need to make sure no other
drainers are active.

[ 1118.673968] WARNING: at kernel/workqueue.c:996 __queue_work+0x11a/0x326()
[ 1118.681982] Hardware name: S2600CP
[ 1118.686193] Modules linked in: isci(-) libsas scsi_transport_sas nls_utf8
ipv6 uinput sg iTCO_wdt iTCO_vendor_support i2c_i801 i2c_core ioatdma dca
sd_mod sr_mod cdrom ahci libahci libata [last unloaded: scsi_transport_sas]
[ 1118.709893] Pid: 6831, comm: rmmod Not tainted 3.2.0-isci+ #1
[ 1118.716727] Call Trace:
[ 1118.719867]  [<ffffffff8103e9f5>] warn_slowpath_common+0x85/0x9d
[ 1118.727000]  [<ffffffff8103ea27>] warn_slowpath_null+0x1a/0x1c
[ 1118.733942]  [<ffffffff81056d44>] __queue_work+0x11a/0x326
[ 1118.740481]  [<ffffffff81056f99>] queue_work_on+0x1b/0x22
[ 1118.746925]  [<ffffffff81057106>] queue_work+0x37/0x3e
[ 1118.753105]  [<ffffffffa0120e05>] ? sas_discover_event+0x55/0x82 [libsas]
[ 1118.761094]  [<ffffffff813217c3>] scsi_queue_work+0x42/0x44
[ 1118.767717]  [<ffffffffa0120e19>] sas_discover_event+0x69/0x82 [libsas]
[ 1118.775509]  [<ffffffffa0120f5b>] sas_unregister_dev+0xc3/0xcc [libsas]
[ 1118.783319]  [<ffffffffa0120fae>] sas_unregister_domain_devices+0x4a/0xc8 [libsas]
[ 1118.792731]  [<ffffffffa0120071>] sas_deform_port+0x60/0x1a6 [libsas]
[ 1118.800339]  [<ffffffffa01201ea>] sas_unregister_ports+0x33/0x44 [libsas]
[ 1118.808342]  [<ffffffffa011f7e5>] sas_unregister_ha+0x41/0x6b [libsas]
[ 1118.816055]  [<ffffffffa0134055>] isci_unregister+0x22/0x4d [isci]
[ 1118.823384]  [<ffffffffa0143040>] isci_pci_remove+0x2e/0x60 [isci]

Reported-by: Jacek Danecki <jacek.danecki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-29 15:27:09 -06:00
Dan Williams
ab5266335b [SCSI] libsas: route local link resets through ata-eh
Similar to the conversion of the transport-class reset we want bsg
initiated resets to be managed by libata.

Reported-by: Jacek Danecki <jacek.danecki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-29 15:25:32 -06:00
Jeff Skirvin
1f4fe89c9c [SCSI] libsas: Remove redundant phy state notification calls.
In the case of an explicit sas_phy_enable call to disable a phy,
the LLDD provides the calls to sas_phy_disconnected and the
PHYE_LOSS_OF_SIGNAL event.

NOTE: This assumes that the lldd(s) generate the notification, which
appears to be the case, but only verfied on isci.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 14:20:09 -06:00
Dan Williams
2a559f4ba4 [SCSI] libsas: sas_phy_enable via transport_sas_phy_reset
Execute the link-reset triggered by sas_phy_enable via
transport_sas_phy_reset so that it can be managed by libata.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 14:18:01 -06:00
Dan Williams
81c757bc69 [SCSI] libsas: execute transport link resets with libata-eh via host workqueue
Link resets leave ata affiliations intact, so arrange for libsas to make
an effort to avoid dropping the device due to a slow-to-recover link.
Towards this end carry out reset in the host workqueue so that it can
check for ata devices and kick the reset request to libata.  Hard
resets, in contrast, bypass libata since they are meant for associating
an ata device with another initiator in the domain (tears down
affiliations).

Need to add a new transport_sas_phy_reset() since the current
sas_phy_reset() is a utility function to libsas lldds.  They are not
prepared for it to loop back into eh.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 14:13:51 -06:00
Dan Williams
0b3e09da13 [SCSI] libsas: perform sas-transport resets in shost->workq context
Extend the sas transport class to allow transport users to attach extra
data to a sas_phy (->hostdata).  Use this area in libsas to move resets
to workq context in preparation for scheduling ata device resets through
libata-eh.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 14:11:33 -06:00
Dan Williams
3944f50995 [SCSI] libsas: let libata handle command timeouts
libsas-eh if it successfully aborts an ata command will hide the timeout
condition (AC_ERR_TIMEOUT) from libata.  The command likely completes
with the all-zero task->task_status it started with.  Instead, interpret
a TMF_RESP_FUNC_COMPLETE as the end of the sas_task but keep the scmd
around for libata-eh to handle.

Tested-by: Andrzej Jakowski <andrzej.jakowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 14:07:15 -06:00
Dan Williams
87c8331fcf [SCSI] libsas: prevent domain rediscovery competing with ata error handling
libata error handling provides for a timeout for link recovery.  libsas
must not rescan for previously known devices in this interval otherwise
it may remove a device that is simply waiting for its link to recover.
Let libata-eh make the determination of when the link is stable and
prevent libsas (host workqueue) from taking action while this
determination is pending.

Using a mutex (ha->disco_mutex) to flush and disable revalidation while
eh is running requires any discovery action that may block on eh be
moved to its own context outside the lock.  Probing ATA devices
explicitly waits on ata-eh and the cache-flush-io issued during device
removal may also pend awaiting eh completion.  Essentially any rphy
add/remove activity needs to run outside the lock.

This adds two new cleanup states for sas_unregister_domain_devices()
'allocated-but-not-probed', and 'flagged-for-destruction'.  In the
'allocated-but-not-probed' state  dev->rphy points to a rphy that is
known to have not been through a sas_rphy_add() event.  At domain
teardown check if this device is still pending probe and cleanup
accordingly.  Similarly if a device has already been queued for removal
then sas_unregister_domain_devices has nothing to do.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 13:52:34 -06:00
Dan Williams
b1124cd3ec [SCSI] libsas: introduce sas_drain_work()
When an lldd invokes ->notify_port_event() it can trigger a chain of libsas
events to:

  1/ form the port and find the direct attached device

  2/ if the attached device is an expander perform domain discovery

A call to flush_workqueue() will only flush the initial port formation work.
Currently libsas users need to call scsi_flush_work() up to the max depth of
chain (which will grow from 2 to 3 when ata discovery is moved to its own
discovery event).  Instead of open coding multiple calls switch to use
drain_workqueue() to flush sas work.

drain_workqueue() does not handle new work submitted during the drain so
libsas needs a bit of infrastructure to hold off unchained work submissions
while a drain is in flight.  A lldd ->notify() event is considered 'unchained'
while a sas_discover_event() is 'chained'.  As Tejun notes:

  "For now, I think it would be best to add private wrapper in libsas to
   support deferring unchained work items while draining."

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 13:48:51 -06:00