linux

Author	SHA1	Message	Date
Shivasharan S	4959e61b83	scsi: megaraid_sas: Selectively apply stream detection based on IO type Performance improvement: Current driver calls stream detection unconditionally for all IOs. Stream Detection logic is not required for most of the fast path IO. To improve performance, avoid stream detection logic and do it only if required. Below are the cases where stream detection is required in driver: 1. All non-FastPath IOs (IOs going to FW) 2. Fast Path reads sent to ReadAhead capable VDs. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:18 -05:00
Shivasharan S	5f19f7c879	scsi: megaraid_sas: Update LD map after populating drv_map driver map copy Issue – There may be some IO accessing incorrect raid map, but driver has checks in IO path to handle those cases. It is always better to move to new raid map only once raid map is populated and validated. No functional defect. Fix is provided as part of review. Fix – Update instance->map_id after driver has populated new driver raid map from firmware raid map. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:17 -05:00
Shivasharan S	619831f23b	scsi: megaraid_sas: Use megasas_wait_for_adapter_operational to detect controller state in IOCTL path In IOCTL path, re-use megasas_wait_for_adapter_operational API to detect controller state. This will make driver to use this API uniformly in all cases where we need to wait for adapter to become operational. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:17 -05:00
Shivasharan S	149c5751e6	scsi: megaraid_sas: Avoid firing DCMDs while OCR is in progress Driver needs to avoid PCI writes while OCR is in progress. Use reset_mutex to synchronize between firing DCMDs MR_DCMD_PD_GET_INFO and MR_DCMD_DRV_GET_TARGET_PROP while OCR is triggered. Without this fix, if Device/VD add/creation is in progress and at the same time MR Firmware is going through OCR, user may see OCR never completed and it may need system reboot. This scenario is rare to occur. Fix is provided as part of review. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:16 -05:00
Shivasharan S	f3f7920b39	scsi: megaraid_sas: unload flag should be set after scsi_remove_host is called Issue - Driver returns DID_NO_CONNECT when unload is in progress, indicated using instance->unload flag. In case of dynamic unload of driver, this flag is set before calling scsi_remove_host(). While doing manual driver unload, user will see lots of prints for Sync Cache command with DID_NO_CONNECT status. Fix - Set the instance->unload flag after scsi_remove_host(). Allow device removal process to be completed and do not block any command before that. SCSI commands (like SYNC_CACHE) are received (as part of scsi_remove_host) by driver during unload will be submitted further down to the drives. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:16 -05:00
Shivasharan S	7ada701d0d	scsi: megaraid_sas: Error handling for invalid ldcount provided by firmware in RAID map Currently driver does not validate ldcount provided by firmware. If the value is invalid, fail RAID map validation accordingly. This issue is rare to hit in field and is fixed as part of code review. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:15 -05:00
Shivasharan S	41fae9a498	scsi: megaraid_sas: Reset ldio_outstanding in megasas_resume Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:15 -05:00
Shivasharan S	b051cc661c	scsi: megaraid_sas: Return the DCMD status from megasas_get_seq_num In megasas_get_seq_num, the status of the DCMD fired to FW is not returned, it always returns success. We could end up registering AEN request with incorrect sequence number if the DCMD failed. Return the DCMD status back to caller. This was discovered during code review and very rare to see issue in field to see AEN request failed bt FW. Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:14 -05:00
Shivasharan S	cb51efeb81	scsi: megaraid_sas: memset IOC INIT frame using correct size Commit `b9637d14dc` ("scsi: megaraid_sas: Resize MFA frame used for IOC INIT to 4k") increased the size of IOC INIT frame to 4k. Need to use updated size when memsetting init_frame. Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:14 -05:00
Shivasharan S	e05ee4e986	scsi: megaraid_sas: zero out IOC INIT and stream detection memory Memory allocated for IOC_INIT command and stream detection array are not zero'd before using. Use kzalloc instead of kmalloc to zero out the memory allocated. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:13 -05:00
Bart Van Assche	08640e81dc	scsi: core: Change third __scsi_queue_insert() argument from int to bool This patch does not change any functionality but makes the SCSI core source code slightly easier to read. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:13 -05:00
Raghava Aditya Renukunta	cfc350ab0e	scsi: aacraid: Delay for rescan worker needs to be 10 seconds The delay for the rescan worker needs to 10 seconds, missed the HZ in there. Fixes: `a1367e4ade` (scsi: aacraid: Reschedule host scan in case of failure) Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:12 -05:00
Raghava Aditya Renukunta	bbd16d96d1	scsi: aacraid: Get correct lun count The correct lun count needs to be divided by 24, missed it in the previous patch set. Fixes: `4b00022753` (scsi: aacraid: Create helper functions to get lun info) Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:11 -05:00
Hannes Reinecke	80c716fad8	scsi: scsi_dh_alua: skip RTPG for devices only supporting active/optimized For hardware only supporting active/optimized there's no point in ever re-issuing RTPG as the only new state we can possibly read is active/optimized. This avoid spurious errors during path failover on such arrays. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:11 -05:00
Colin Ian King	bcb872400b	scsi: bfa: use ARRAY_SIZE for array sizing calculation on array __pciids Use the ARRAY_SIZE macro on array __pciids to determine size of the array. Improvement suggested by coccinelle. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:10 -05:00
Colin Ian King	8b56918082	scsi: qla2xxx: remove redundant assignment of d The initialization of d is redundant as this value is never read and it is overwritten inside the subsequent for-loop. Remove this redundant assignment. Cleans up clang warning: drivers/scsi/qla2xxx/qla_gs.c:3985:29: warning: Value stored to 'd' during its initialization is never read Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:10 -05:00
Himanshu Jha	e9f31779a5	scsi: qedi: Use zeroing allocator instead of allocator/memset Use dma_zalloc_coherent instead of dma_alloc_coherent followed by memset 0. Generated-by: scripts/coccinelle/api/alloc/kzalloc-simple.cocci Suggested-by: Luis R. Rodriguez <mcgrof@kernel.org> Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com> Acked-by: Manish Rangankar <Manish.Rangankar@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:09 -05:00
Himanshu Jha	bb5420c39e	scsi: bnx2fc: Use zeroing allocator rather than allocator/memset Use dma_zalloc_coherent instead of dma_alloc_coherent followed by memset 0. Generated-by: scripts/coccinelle/api/alloc/kzalloc-simple.cocci Suggested-by: Luis R. Rodriguez <mcgrof@kernel.org> Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com> Acked-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:09 -05:00
chenxiang	468f4b8d07	scsi: hisi_sas: Change frame type for SET MAX commands According to ATA protocol, SET MAX commands belong to different frame types. So judge features field of SET MAX commands to decide which frame type they belongs to. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:08 -05:00
chenxiang	d5c15c2c22	scsi: ata: enhance the definition of SET MAX feature field value There are two other values for SET MAX feature field according to ata protocol. So definite them. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:08 -05:00
Steffen Weber	dc2db1dc5f	scsi: smartpqi: allow static build ("built-in") If CONFIG_SCSI_SMARTPQI=y then don't build this driver as a module. Signed-off-by: Steffen Weber <steffen.weber@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:07 -05:00
Douglas Gilbert	481b5e5c79	scsi: scsi_debug: add resp_write_scat function Add resp_write_scat() function to support decoding WRITE SCATTERED (16 and 32). Also weave resp_write_scat() into the cdb decoding logic. Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:07 -05:00
Douglas Gilbert	46f64e70b8	scsi: scsi_debug: ARRAY_SIZE and FF_MEDIA_IO Reviewer suggested using the ARRAY_SIZE macro. That reduced one of the subtle inter-dependencies in the parser's tables. It is important that commands which simulate media access, indicate this in the flags for that command. The flag to do that was FF_DIRECT_IO. On reflection FF_MEDIA_IO seems a more accurate description. Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:06 -05:00
Douglas Gilbert	0a7e69c7d4	scsi: scsi_debug: do_device_access add sg offset argument WRITE SCATTERED needs to take several "bites" out of the data-out buffer. Expand the do_device_access() function to take a sg_skip argument. Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:05 -05:00
Douglas Gilbert	b7e24581f3	scsi: scsi_debug: fix group_number mask Various cdb masks incorrectly assumed the GROUP NUMBER field was 5 bits long. It is actually 6 bits long. Correct. Also fix mask failure (in same byte) to allow DLD0 in READ(16) and WRITE(16). Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:05 -05:00
Douglas Gilbert	9a05101954	scsi: scsi_debug: tab, kstrto changes Some of my development tools tend to add spaces (my preference) rather than tabs (kernel convention). Running unexpand to clean these spaces up found more of them than checkpatch.pl did. Then checkpatch.pl complained about other style violations in those newly tabbed lines. Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:04 -05:00
Suganath Prabu Subramani	dbec4c9040	scsi: mpt3sas: lockless command submission Performance improvement using block layer tag. Curent driver gets scsiio tracker and free smid from link list and array based tracking managed by driver. Accessing list in main io path is performance pentaly because of protection using spinlock "scsi_lookup_lock". In this patch: 1. Driver removes all link list access from main io path and use scmd->request->tag to get free smid. 2. Instead of holding 'struct scsiio_tracker' in its own pool driver can embed it into the scsi command. Driver provides cmd_size in scsi_host_template, so that struct scsiio_tracker is preallocated by scsi mid layer for each scsi command. Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:04 -05:00
Hannes Reinecke	272e253c7b	scsi: mpt3sas: simplify _wait_for_commands_to_complete() Use 'host_busy' instead of counting outstanding commands by hand. Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:03 -05:00
Hannes Reinecke	6da999fe5a	scsi: mpt3sas: simplify mpt3sas_scsi_issue_tm() Move the check for outstanding commands out of the function allowing us to simplify the overall code. Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:03 -05:00
Hannes Reinecke	74fcfa5371	scsi: mpt3sas: simplify task management functions No functional change. Code optimization. One can simply check 'target_busy' or 'device_busy' when figuring out if there are outstanding commands; no need to painstakingly count them by hand. [mkp: tweaked patch description] Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:02 -05:00
Hannes Reinecke	b0cd285eb5	scsi: mpt3sas: always use first reserved smid for ioctl passthrough ioctl passthrough commands require a SCSIIO smid, but cannot easily integrate with the block layer. But the driver already has reserved some SCSIIO smids and we're only ever allowing one ioctl command at a time we can use the first reserved smid for ioctl commands. Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:02 -05:00
Hannes Reinecke	9961c9bbf2	scsi: mpt3sas: check command status before attempting abort When attempting a command abort we should check the command status prior to sending the abort; the command might've been completed already. Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:01 -05:00
Hannes Reinecke	12e7c6782b	scsi: mpt3sas: Introduce mpt3sas_get_st_from_smid() Abstract accesses to the scsi_lookup array by introducing mpt3sas_get_st_from_smid(). Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:01 -05:00
Hannes Reinecke	02a386df36	scsi: mpt3sas: open-code _scsih_scsi_lookup_get() Just a wrapper around the scsi lookup array and only used in one place, so open-code it. Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:25:00 -05:00
Hannes Reinecke	6a2d4618ae	scsi: mpt3sas: separate out _base_recovery_check() No functional change. Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:24:59 -05:00
Hannes Reinecke	05303dfb73	scsi: mpt3sas: use list_splice_init() Use 'list_splice_init()' instead of hand-crafted function. No functional change. Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:24:59 -05:00
Hannes Reinecke	ba4494d47b	scsi: mpt3sas: set default value for cb_idx No functional change Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:24:58 -05:00
Matthew R. Ochs	25b8e08e83	scsi: cxlflash: Staging to support future accelerators As staging to support future accelerator transports, add a shim layer such that the underlying services the cxlflash driver requires can be conditional upon the accelerator infrastructure. Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:24:58 -05:00
Uma Krishnan	0df69c6024	scsi: cxlflash: Adapter context init can return error Adapter context creation can return either NULL or an error pointer. Updating the check condition to reflect this. Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:24:57 -05:00
Matthew R. Ochs	8762353106	scsi: cxlflash: Remove embedded CXL work structures The CXL-specific work structure used to request the number of interrupts currently resides as a nested member of both the context information and hardware queue structures. It is used to cache values (specifically the number of interrupts) required by the CXL layer when starting a context. To facilitate staging that will ultimately allow the cxlflash core to become agnostic of the underlying accelerator transport, remove these embedded work structures. Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:24:57 -05:00
Matthew R. Ochs	af2047ec00	scsi: cxlflash: Explicitly cache number of interrupts per context The number of interrupts a user requests during a context attach is presently stored within the CXL work ioctl structure that is nested alongside the per context metadata. Keeping this data in a structure that is tied to a particular hardware implementation (CXL) will only complicate matters when supporting newer accelerator transports. Instead of relying upon the number of interrupts being cached within a CXL-specific structure, explicitly cache the value within the context information structure. Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:24:56 -05:00
Uma Krishnan	b070545db1	scsi: cxlflash: Update cxl-specific arguments to generic cookie Convert cxl-specific pointers to generic cookies to facilitate future enhancements. Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:24:56 -05:00
Uma Krishnan	96cf727fe8	scsi: cxlflash: Reset command ioasc In the event of a command failure, cxlflash returns the failure to the upper layers to process. After processing the error, when the command is queued again, the private command structure will not be zeroed and the ioasc could be stale. Per the SISLite specification, the AFU only sets the ioasc in the presence of a failure. Thus, even though the original command succeeds the second time, the command is considered a failure due to stale ioasc. This cycle repeats indefinitely and can cause a hang or IO failure. To fix the issue, clear the ioasc before queuing any command. [mkp: added Cc: stable per request] Fixes: `479ad8e9d4` ("scsi: cxlflash: Remove zeroing of private command data") Cc: <stable@vger.kernel.org> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:24:55 -05:00
Jason Yan	1689c9367b	scsi: libsas: notify event PORTE_BROADCAST_RCVD in sas_enable_revalidation() There are two places queuing the disco event DISCE_REVALIDATE_DOMAIN. One is in sas_porte_broadcast_rcvd() and uses sas_chain_event() to queue the event. The other is in sas_enable_revalidation() and uses sas_queue_event() to queue the event. We have diffrent work queues for event and discovery now, so the DISCE_REVALIDATE_DOMAIN event may be processed in both event queue and discovery queue. Now since we do synchronous event handling, we cannot do it in discovery queue, so have to trigger a fake broadcast event to re-trigger the revalidation from event queue. Signed-off-by: Jason Yan <yanaijie@huawei.com> CC: John Garry <john.garry@huawei.com> CC: Johannes Thumshirn <jthumshirn@suse.de> CC: Ewan Milne <emilne@redhat.com> CC: Christoph Hellwig <hch@lst.de> CC: Tomas Henzl <thenzl@redhat.com> CC: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:24:54 -05:00
Jason Yan	0558f33c06	scsi: libsas: direct call probe and destruct In commit `87c8331fcf` ("[SCSI] libsas: prevent domain rediscovery competing with ata error handling") introduced disco mutex to prevent rediscovery competing with ata error handling and put the whole revalidation in the mutex. But the rphy add/remove needs to wait for the error handling which also grabs the disco mutex. This may leads to dead lock.So the probe and destruct event were introduce to do the rphy add/remove asynchronously and out of the lock. The asynchronously processed workers makes the whole discovery process not atomic, the other events may interrupt the process. For example, if a loss of signal event inserted before the probe event, the sas_deform_port() is called and the port will be deleted. And sas_port_delete() may run before the destruct event, but the port-x:x is the top parent of end device or expander. This leads to a kernel WARNING such as: [ 82.042979] sysfs group 'power' not found for kobject 'phy-1:0:22' [ 82.042983] ------------[ cut here ]------------ [ 82.042986] WARNING: CPU: 54 PID: 1714 at fs/sysfs/group.c:237 sysfs_remove_group+0x94/0xa0 [ 82.043059] Call trace: [ 82.043082] [<ffff0000082e7624>] sysfs_remove_group+0x94/0xa0 [ 82.043085] [<ffff00000864e320>] dpm_sysfs_remove+0x60/0x70 [ 82.043086] [<ffff00000863ee10>] device_del+0x138/0x308 [ 82.043089] [<ffff00000869a2d0>] sas_phy_delete+0x38/0x60 [ 82.043091] [<ffff00000869a86c>] do_sas_phy_delete+0x6c/0x80 [ 82.043093] [<ffff00000863dc20>] device_for_each_child+0x58/0xa0 [ 82.043095] [<ffff000008696f80>] sas_remove_children+0x40/0x50 [ 82.043100] [<ffff00000869d1bc>] sas_destruct_devices+0x64/0xa0 [ 82.043102] [<ffff0000080e93bc>] process_one_work+0x1fc/0x4b0 [ 82.043104] [<ffff0000080e96c0>] worker_thread+0x50/0x490 [ 82.043105] [<ffff0000080f0364>] kthread+0xfc/0x128 [ 82.043107] [<ffff0000080836c0>] ret_from_fork+0x10/0x50 Make probe and destruct a direct call in the disco and revalidate function, but put them outside the lock. The whole discovery or revalidate won't be interrupted by other events. And the DISCE_PROBE and DISCE_DESTRUCT event are deleted as a result of the direct call. Introduce a new list to destruct the sas_port and put the port delete after the destruct. This makes sure the right order of destroying the sysfs kobject and fix the warning above. In sas_ex_revalidate_domain() have a loop to find all broadcasted device, and sometimes we have a chance to find the same expander twice. Because the sas_port will be deleted at the end of the whole revalidate process, sas_port with the same name cannot be added before this. Otherwise the sysfs will complain of creating duplicate filename. Since the LLDD will send broadcast for every device change, we can only process one expander's revalidation. [mkp: kbuild test robot warning] Signed-off-by: Jason Yan <yanaijie@huawei.com> CC: John Garry <john.garry@huawei.com> CC: Johannes Thumshirn <jthumshirn@suse.de> CC: Ewan Milne <emilne@redhat.com> CC: Christoph Hellwig <hch@lst.de> CC: Tomas Henzl <thenzl@redhat.com> CC: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-10 23:24:02 -05:00
Jason Yan	517e5153d2	scsi: libsas: use flush_workqueue to process disco events synchronously Now we are processing sas event and discover event in different workqueues. It's safe to wait the discover event done in the sas event work. Use flush_workqueue() to insure the disco and revalidate events processed synchronously so that the whole discover and revalidate process will not be interrupted by other events. Signed-off-by: Jason Yan <yanaijie@huawei.com> CC: John Garry <john.garry@huawei.com> CC: Johannes Thumshirn <jthumshirn@suse.de> CC: Ewan Milne <emilne@redhat.com> CC: Christoph Hellwig <hch@lst.de> CC: Tomas Henzl <thenzl@redhat.com> CC: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-08 21:59:28 -05:00
Jason Yan	93bdbd06b1	scsi: libsas: Use new workqueue to run sas event and disco event Now all libsas works are queued to scsi host workqueue, include sas event work post by LLDD and sas discovery work, and a sas hotplug flow may be divided into several works, e.g libsas receive a PORTE_BYTES_DMAED event, currently we process it as following steps: sas_form_port --- run in work in shost workq sas_discover_domain --- run in another work in shost workq ... sas_probe_devices --- run in new work in shost workq We found during hot-add a device, libsas may need run several works in same workqueue to add device in system, the process is not atomic, it may interrupt by other sas event works, like PHYE_LOSS_OF_SIGNAL. This patch is preparation of execute libsas sas event in sync. We need to use different workqueue to run sas event and disco event. Otherwise the work will be blocked for waiting another chained work in the same workqueue. Signed-off-by: Yijing Wang <wangyijing@huawei.com> CC: John Garry <john.garry@huawei.com> CC: Johannes Thumshirn <jthumshirn@suse.de> CC: Ewan Milne <emilne@redhat.com> CC: Christoph Hellwig <hch@lst.de> CC: Tomas Henzl <thenzl@redhat.com> CC: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-08 21:59:28 -05:00
Jason Yan	8eea9dd84e	scsi: libsas: make the event threshold configurable Add a sysfs attr that LLDD can configure it for every host. We made an example in hisi_sas. Other LLDDs using libsas can implement it if they want. Suggested-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> CC: John Garry <john.garry@huawei.com> CC: Johannes Thumshirn <jthumshirn@suse.de> CC: Ewan Milne <emilne@redhat.com> CC: Christoph Hellwig <hch@lst.de> CC: Tomas Henzl <thenzl@redhat.com> CC: Dan Williams <dan.j.williams@intel.com> Acked-by: John Garry <john.garry@huawei.com> #for hisi_sas part Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-08 21:59:28 -05:00
Jason Yan	f12486e06a	scsi: libsas: shut down the PHY if events reached the threshold If the PHY burst too many events, we will alloc a lot of events for the worker. This may leads to memory exhaustion. Dan Williams suggested to shut down the PHY if the events reached the threshold, because in this case the PHY may have gone into some erroneous state. Users can re-enable the PHY by sysfs if they want. We cannot use the fixed memory pool because if we run out of events, the shut down event and loss of signal event will lost too. The events still need to be allocated and processed in this case. Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> CC: John Garry <john.garry@huawei.com> CC: Johannes Thumshirn <jthumshirn@suse.de> CC: Ewan Milne <emilne@redhat.com> CC: Christoph Hellwig <hch@lst.de> CC: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-08 21:59:28 -05:00
Jason Yan	1c393b970e	scsi: libsas: Use dynamic alloced work to avoid sas event lost Now libsas hotplug work is static, every sas event type has its own static work, LLDD driver queues the hotplug work into shost->work_q. If LLDD driver burst posts lots hotplug events to libsas, the hotplug events may pending in the workqueue like shost->work_q new work[PORTE_BYTES_DMAED] --> \|[PHYE_LOSS_OF_SIGNAL][PORTE_BYTES_DMAED] -> processing \|<-------wait worker to process-------->\| In this case, a new PORTE_BYTES_DMAED event coming, libsas try to queue it to shost->work_q, but this work is already pending, so it would be lost. Finally, libsas delete the related sas port and sas devices, but LLDD driver expect libsas add the sas port and devices(last sas event). This patch use dynamic allocated work to avoid this issue. Signed-off-by: Yijing Wang <wangyijing@huawei.com> CC: John Garry <john.garry@huawei.com> CC: Johannes Thumshirn <jthumshirn@suse.de> CC: Ewan Milne <emilne@redhat.com> CC: Christoph Hellwig <hch@lst.de> CC: Tomas Henzl <thenzl@redhat.com> CC: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-08 21:59:28 -05:00

1 2 3 4 5 ...

721998 Commits