linux

mainlining shenanigans

Steffen Maier 6f2ce1c6af scsi: zfcp: fix rport unblock race with LUN recovery		2016-12-14 15:17:20 -05:00

It is unavoidable that zfcp_scsi_queuecommand() has to finish requests with DID_IMM_RETRY (like fc_remote_port_chkready()) during the time window when zfcp detected an unavailable rport but fc_remote_port_delete(), which is asynchronous via zfcp_scsi_schedule_rport_block(), has not yet blocked the rport.

However, for the case when the rport becomes available again, we should prevent unblocking the rport too early. In contrast to other FCP LLDDs, zfcp has to open each LUN with the FCP channel hardware before it can send I/O to a LUN. So if a port already has LUNs attached and we unblock the rport just after port recovery, recoveries of LUNs behind this port can still be pending which in turn force zfcp_scsi_queuecommand() to unnecessarily finish requests with DID_IMM_RETRY.

This also opens a time window with unblocked rport (until the followup LUN reopen recovery has finished). If a scsi_cmnd timeout occurs during this time window fc_timed_out() cannot work as desired and such command would indeed time out and trigger scsi_eh. This prevents a clean and timely path failover. This should not happen if the path issue can be recovered on FC transport layer such as path issues involving RSCNs.

Fix this by only calling zfcp_scsi_schedule_rport_register(), to asynchronously trigger fc_remote_port_add(), after all LUN recoveries as children of the rport have finished and no new recoveries of equal or higher order were triggered meanwhile. Finished intentionally includes any recovery result no matter if successful or failed (still unblock rport so other successful LUNs work). For simplicity, we check after each finished LUN recovery if there is another LUN recovery pending on the same port and then do nothing. We handle the special case of a successful recovery of a port without LUN children the same way without changing this case's semantics.

For debugging we introduce 2 new trace records written if the rport unblock attempt was aborted due to still unfinished or freshly triggered recovery. The records are only written above the default trace level.

Benjamin noticed the important special case of new recovery that can be triggered between having given up the erp_lock and before calling zfcp_erp_action_cleanup() within zfcp_erp_strategy(). We must avoid the following sequence:

ERP thread                 rport_work      other context
-------------------------  --------------  --------------------------------
port is unblocked, rport still blocked,
 due to pending/running ERP action,
 so ((port->status & ...UNBLOCK) != 0)
 and (port->rport == NULL)
unlock ERP
zfcp_erp_action_cleanup()
case ZFCP_ERP_ACTION_REOPEN_LUN:
zfcp_erp_try_rport_unblock()
((status & ...UNBLOCK) != 0) [OLD!]
                                           zfcp_erp_port_reopen()
                                           lock ERP
                                           zfcp_erp_port_block()
                                           port->status clear ...UNBLOCK
                                           unlock ERP
                                           zfcp_scsi_schedule_rport_block()
                                           port->rport_task = RPORT_DEL
                                           queue_work(rport_work)
                           zfcp_scsi_rport_work()
                           (port->rport_task != RPORT_ADD)
                           port->rport_task = RPORT_NONE
                           zfcp_scsi_rport_block()
                           if (!port->rport) return
zfcp_scsi_schedule_rport_register()
port->rport_task = RPORT_ADD
queue_work(rport_work)
                           zfcp_scsi_rport_work()
                           (port->rport_task == RPORT_ADD)
                           port->rport_task = RPORT_NONE
                           zfcp_scsi_rport_register()
                           (port->rport == NULL)
                           rport = fc_remote_port_add()
                           port->rport = rport;

Now the rport was erroneously unblocked while the zfcp_port is blocked. This is another situation we want to avoid due to scsi_eh potential. This state would at least remain until the new recovery from the other context finished successfully, or potentially forever if it failed. In order to close this race, we take the erp_lock inside zfcp_erp_try_rport_unblock() when checking the status of zfcp_port or LUN.

With that, the possible corresponding rport state sequences would be:
(unblock[ERP thread],block[other context]) if the ERP thread gets erp_lock first and still sees ((port->status & ...UNBLOCK) != 0),
(block[other context],NOP[ERP thread]) if the ERP thread gets erp_lock after the other context has already cleared ...UNBLOCK from port->status.

Since checking fields of struct erp_action is unsafe because they could have been overwritten (re-used for new recovery) meanwhile, we only check status of zfcp_port and LUN since these are only changed under erp_lock elsewhere.

Regarding the check of the proper status flags (port or port_forced are similar to the shown adapter recovery):

[zfcp_erp_adapter_shutdown()]
zfcp_erp_adapter_reopen()
 zfcp_erp_adapter_block()
  * clear UNBLOCK ---------------------------------------+
 zfcp_scsi_schedule_rports_block()                       |
 write_lock_irqsave(&adapter->erp_lock, flags);-------+  |
 zfcp_erp_action_enqueue()                            |  |
  zfcp_erp_setup_act()                                |  |
   * set ERP_INUSE -----------------------------------|--|--+
 write_unlock_irqrestore(&adapter->erp_lock, flags);--+  |  |
.context-switch.                                         |  |
zfcp_erp_thread()                                        |  |
 zfcp_erp_strategy()                                     |  |
  write_lock_irqsave(&adapter->erp_lock, flags);------+  |  |
  ...                                                 |  |  |
  zfcp_erp_strategy_check_target()                    |  |  |
   zfcp_erp_strategy_check_adapter()                  |  |  |
    zfcp_erp_adapter_unblock()                        |  |  |
     * set UNBLOCK -----------------------------------|--+  |
  zfcp_erp_action_dequeue()                           |     |
   * clear ERP_INUSE ---------------------------------|-----+
  ...                                                 |
  write_unlock_irqrestore(&adapter->erp_lock, flags);-+

Hence, we should check for both UNBLOCK and ERP_INUSE because they are interleaved. Also we need to explicitly check ERP_FAILED for the link down case which currently does not clear the UNBLOCK flag in zfcp_fsf_link_down_info_eval().

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 8830271c48 ("[SCSI] zfcp: Dont fail SCSI commands when transitioning to blocked fc_rport")
Fixes: a2fa0aede0 ("[SCSI] zfcp: Block FC transport rports early on errors")
Fixes: 5f852be9e1 ("[SCSI] zfcp: Fix deadlock between zfcp ERP and SCSI")
Fixes: 338151e066 ("[SCSI] zfcp: make use of fc_remote_port_delete when target port is unavailable")
Fixes: 3859f6a248 ("[PATCH] zfcp: add rports to enable scsi_add_device to work again")
Cc: <stable@vger.kernel.org> #2.6.32+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
arch	arm64 updates for 4.10:	2016-12-13 16:39:21 -08:00
block	SCSI misc on 20161213	2016-12-14 10:49:33 -08:00
certs	certs: Add a secondary system keyring that can be added to dynamically	2016-04-11 22:48:09 +01:00
crypto	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2016-12-10 16:21:55 -05:00
Documentation	. various fixes and improvements to request-based DM and DM multipath	2016-12-14 11:01:00 -08:00
drivers	scsi: zfcp: fix rport unblock race with LUN recovery	2016-12-14 15:17:20 -05:00
firmware	WHENCE: use https://linuxtv.org for LinuxTV URLs	2015-12-04 10:35:11 -02:00
fs	This merge request includes the dax-4.0-iomap-pmd branch which is	2016-12-14 09:17:42 -08:00
include	. various fixes and improvements to request-based DM and DM multipath	2016-12-14 11:01:00 -08:00
init	Merge branch 'for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq	2016-12-13 12:59:57 -08:00
ipc	sched/wake_q: Rename WAKE_Q to DEFINE_WAKE_Q	2016-11-21 10:29:01 +01:00
kernel	Merge branch 'for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq	2016-12-13 12:59:57 -08:00
lib	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md	2016-12-14 10:58:17 -08:00
mm	This merge request includes the dax-4.0-iomap-pmd branch which is	2016-12-14 09:17:42 -08:00
net	Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2016-12-12 19:25:04 -08:00
samples	VFIO updates for v4.10-rc1	2016-12-13 09:23:56 -08:00
scripts	Char/Misc driver patches for 4.10-rc1	2016-12-13 12:11:01 -08:00
security	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2016-12-12 19:56:15 -08:00
sound	Main pull request for drm for 4.10 kernel	2016-12-13 09:35:09 -08:00
tools	arm64 updates for 4.10:	2016-12-13 16:39:21 -08:00
usr	usr/Kconfig: make initrd compression algorithm selection not expert	2014-12-13 12:42:52 -08:00
virt	Small release, the most interesting stuff is x86 nested virt improvements.	2016-12-13 15:47:02 -08:00
.cocciconfig	scripts: add Linux .cocciconfig for coccinelle	2016-07-22 12:13:39 +02:00
.get_maintainer.ignore	Add hch to .get_maintainer.ignore	2015-08-21 14:30:10 -07:00
.gitattributes	.gitattributes: set git diff driver for C source code files	2016-10-07 18:46:30 -07:00
.gitignore	Merge branch 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild	2016-08-02 16:48:52 -04:00
.mailmap	Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus	2016-10-15 09:26:12 -07:00
COPYING
CREDITS	Merge branch 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2016-12-12 15:23:02 -08:00
Kbuild	scripts/gdb: provide linux constants	2016-05-23 17:04:14 -07:00
Kconfig
MAINTAINERS	scsi: qedi: Add QLogic FastLinQ offload iSCSI driver framework.	2016-12-14 14:56:28 -05:00
Makefile	Linux 4.9	2016-12-11 11:17:54 -08:00
README	README: add a new README file, pointing to the Documentation/	2016-10-24 08:12:35 -02:00
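
The status check described in the zfcp commit message at the top of this listing (look at UNBLOCK, ERP_INUSE and ERP_FAILED of the port and its LUNs while holding the ERP lock, and only then schedule the rport registration) can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the driver code: every name prefixed with `model_` or `MODEL_` is hypothetical, a pthread mutex stands in for the adapter's erp_lock, and a boolean stands in for zfcp_scsi_schedule_rport_register().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags, modelled on UNBLOCK / ERP_INUSE / ERP_FAILED. */
enum {
	MODEL_UNBLOCKED  = 1u << 0,
	MODEL_ERP_INUSE  = 1u << 1,
	MODEL_ERP_FAILED = 1u << 2,
};

struct model_lun {
	unsigned int status;
};

struct model_port {
	unsigned int status;
	struct model_lun *luns;   /* LUN children of this port */
	size_t nr_luns;
	bool rport_add_scheduled; /* stand-in for scheduling the rport add */
};

/* Stand-in for the ERP lock; all status changes are assumed to hold it. */
static pthread_mutex_t model_erp_lock = PTHREAD_MUTEX_INITIALIZER;

/* A LUN whose recovery failed counts as finished, so it does not block. */
static bool model_lun_recovery_pending(const struct model_lun *lun)
{
	if (lun->status & MODEL_ERP_FAILED)
		return false;
	return !(lun->status & MODEL_UNBLOCKED) ||
	       (lun->status & MODEL_ERP_INUSE) != 0;
}

/* Returns true if the rport registration was scheduled. */
static bool model_try_rport_unblock(struct model_port *port)
{
	bool scheduled = false;
	size_t i;

	assert(port != NULL);
	pthread_mutex_lock(&model_erp_lock);

	/* Port blocked again, recovery in use, or recovery given up. */
	if (!(port->status & MODEL_UNBLOCKED) ||
	    (port->status & (MODEL_ERP_INUSE | MODEL_ERP_FAILED)))
		goto out;

	/* Any LUN child still recovering aborts the unblock attempt. */
	for (i = 0; i < port->nr_luns; i++)
		if (model_lun_recovery_pending(&port->luns[i]))
			goto out;

	port->rport_add_scheduled = true;
	scheduled = true;
out:
	pthread_mutex_unlock(&model_erp_lock);
	return scheduled;
}
```

In this model, a port whose LUN still carries `MODEL_ERP_INUSE` makes `model_try_rport_unblock()` return false and leaves `rport_add_scheduled` untouched. A port with no LUNs, or with only finished (including failed) LUNs, gets its registration scheduled. Because the check and the scheduling happen under one lock, a concurrent context that clears `MODEL_UNBLOCKED` under the same lock either runs entirely before the check or entirely after it.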

README

Linux kernel
============

This file was moved to Documentation/admin-guide/README.rst

Please note that there are several guides for kernel developers and users.
These guides can be rendered in a number of formats, like HTML and PDF.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.