linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-08 13:11:45 +00:00

Author	SHA1	Message	Date
Brian King	0f33ece5bc	[SCSI] ibmvscsi: Fix softlockup on resume This fixes a softlockup seen on resume. During resume, the CRQ must be reenabled. However, the H_ENABLE_CRQ hcall used to do this may return H_BUSY or H_LONG_BUSY. When this happens, the caller is expected to retry later. This patch changes a simple loop, which was causing the softlockup, to a loop at task level which sleeps between retries rather than simply spinning. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:03:46 -05:00
Brian King	06395193b2	[SCSI] ibmvfc: Driver version 1.0.8 Bump driver version. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:42 -05:00
Brian King	3f01424c81	[SCSI] ibmvfc: Add support for fc_block_scsi_eh Adds support for fc_block_scsi_eh to block the EH handlers if the target device is in the blocked state to ensure we don't take devices offline. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:41 -05:00
Brian King	73ee5d8672	[SCSI] ibmvfc: Fix soft lockup on resume This fixes a softlockup seen on resume. During resume, the CRQ must be reenabled. However, the H_ENABLE_CRQ hcall used to do this may return H_BUSY or H_LONG_BUSY. When this happens, the caller is expected to retry later. Normally the H_ENABLE_CRQ succeeds relatively soon. However, we have seen cases where this can take long enough to see softlockup warnings. This patch changes a simple loop, which was causing the softlockup, to a loop at task level which sleeps between retries rather than simply spinning. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:39 -05:00
Bandan Das	15f7fc060a	[SCSI] mpt fusion: Cleanup some duplicate calls in mptbase.c In mpt_detach, call to pci_set_drvdata is redundant because it has already been called in mpt_adapter_disable. In mpt_attach, ioc->pcidev is set to pdev two times. Signed-off-by: Bandan Das <bandan.das@stratus.com> Acked-by: "Desai, Kashyap" <Kashyap.Desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:37 -05:00
Kashyap, Desai	c817ce842a	[SCSI] mptfusion: Bump version 03.04.16 Upgrade driver version to 3.4.16 Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:37 -05:00
Kashyap, Desai	b9a0f872a9	[SCSI] mptfusion: Added missing reset for ioc_reset_in_progress in SoftReset Added missing part which will reset ioc_reset_in_progress before returning from SoftResetHandler. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:35 -05:00
Kashyap, Desai	cc7e9f5f99	[SCSI] mptfusion: Added code for occationally SATA hotplug failure. Issue: SATA hotplug does not work sometimes. At the time of ADD device/ADD phys disk, drive may fail to add SATA device due to temporary SAS Address for SATA device generated by firmware. Final SAS address for SATA driver will be generated only after disk spinup is done. This may take some times for slow spining SATA drives. At phy link up driver gets attached device sas address and stores into phyinfo. At the time of ADD event driver will read sas device page0 using channel and FW ID provided in ADD Device event. Here in case of SATA drives, driver will see miss match in phyinfo->sas_address and latest sas address read from SAS DEVICE PAGE0 and eventually device won't be added to OS. Fix: When Driver read SAS DEVICE PAGE0, it can identify Device type looking at device_info. If device is SATA drive and sas address mismatch happens, Driver will do same stuffs which happened at the time of LINK UP to get correct piece of information from Pages. ( Find parent device and refresh parent device phys either HBA refresh/Exp refresh) Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:35 -05:00
Kashyap, Desai	b68bf096d4	[SCSI] mptfusion: schedule_target_reset from all Reset context Issue: target reset will be queued to driver's internal queue to get schedule later. When driver add target into internal target_reset queue we will block IOs on those target using scsi midlayer API. Now due to some cause driver is not executing those target_reset list and it is always in block state. Changes: now we are clearing target_reset queue from all other Callback context instead of only DeviceReset context.Now wherever driver is clearing taskmgmt_in_progress flag it is considering target_reset queue cleanup also. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:34 -05:00
Kashyap, Desai	51106ab530	[SCSI] mptfusion: Added sanity to check B_T mapping for device before adding to OS Added sanity check before treating any device is a valid device. It is possible that firmware can have device page0 in its table, but that devicemay not be available in topology. Device will be available in topology only if there is Bus Target mapping is done in firmware. Driver will always check B_T mapping of firmware before reporting device to upper layer. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:32 -05:00
Kashyap, Desai	aca794ddd6	[SCSI] mptfusion: Corrected declaration of device_missing_delay device missing delay is 8 bit value in io unit pg1. Making correct variable declaration for device_missing_delay. The driver is storing the calculated device missing delay in IOC structure as a u8 instead of a u16. It needs to be a u16 if the delay is > 255. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:28 -05:00
Kashyap, Desai	4d0695664e	[SCSI] mptfusion: Use DID_TRANSPORT_DISRUPTED instead of DID_BUS_BUSY Changed the return value for Nexus Loss IOs to be DID_TRANSPORT_DISRUPTED. What this will allow is the multi-path driver to delay the fail over process. They would like the path to keep up as long as the nexus loss Loginfo is return from firmware. With DID_BUS_BUSY the path fails over immediately. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:28 -05:00
Kashyap, Desai	8ce13de2ad	[SCSI] mptfusion: Set fw_events_off to 1 at driver load time. fw_events_off is flag checking for driver to do Event handling or not. Normally it should be OFF at the time of initialization. Only enable it at the time of INTR enable of device first time. This will always occur only after resource allocation. ioc->fw_events_off = 1 is set in mpt_attach() Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:27 -05:00
Kashyap, Desai	d4572c3dbb	[SCSI] mpt2sas: Bump version 06.100.00.00 Version upgrade patch Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:26 -05:00
Kashyap, Desai	1bbfa378af	[SCSI] mpt2sas: Copy message frame before releasing to free pool to have a local reference. Current driver is not clearing the per device tm_busy flag following the Task Management request completion from the IOCTL path. When this flag is set, the IO queues are frozen. The reason the flag didn't get cleared is because the driver is referencing memory associated to the mpi request following the completion, when the memory had been reallocated for a new request. When the memory was reallocated, the driver didn't clear the flag because it was expecting a task management request, and the reallocated request was for SCSI_IO. To fix the problem the driver needs to have a cached backup copy of the original request. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:25 -05:00
Kashyap, Desai	769578ff81	[SCSI] mpt2sas: Copy sense buffer instead of working on direct memory location (1) driver was not setting the sense data size prior to sending SCSI_IO, resulting in the 0x31190000 loginfo (2) The driver needs to copy the sense data to local buffer prior to releasing the request message frame. If not, the sense buffer gets overwritten by the next SCSI_IO request. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:24 -05:00
Kashyap, Desai	8e864a81e3	[SCSI] mpt2sas: Adding additional message to error escalation callback Adding additional messages to the error escallation callbacks which displays the wwid, sas address, handle, phy number, enclosure logical id, and slot. In the same eh callbacks, routines, the printks were converted to sdev_printks, which displays the bus target mapping. These additional modifications help better identify the device which is in recovery. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:23 -05:00
Kashyap, Desai	d417d1c3a3	[SCSI] mpt2sas: Add additional check for responding volumes after Host Reset ISSUE DESCRIPTION: This test case involves creating two RAID1 volumes, then simultaneously issuing host reset and pulling all the drives associated to the 1st raid volume. The observed behaviour is the physical drives are removed, however the volume remains. The expected behaviour is the volume as well as physical drives should be removed from OS. FIX: Add support in the post host reset device scan logic for raid volumes where the driver will have an additional check for responding raid volume where the status should be either online, optimal, or degraded. So for volumes that have a status of missing or failed, the driver will mark them for deletion. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:21 -05:00
Kashyap, Desai	3e2e833a54	[SCSI] mpt2sas: Added -ENOMEM return type when allocation fails In the driver mpt2sas_base_attach subroutine, we need to add support to return the proper error code when there are memory allocation failures, e.g. returning -ENOMEM. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:20 -05:00
Kashyap, Desai	f3eedd698e	[SCSI] mpt2sas: Redesign Raid devices event handling using pd_handles per HBA Actual problem: Driver may be receiving the top level expander removal event prior to all the individual PD removal events, hence the driver is breaking down all the PDs in advance of the actual PD UNHIDE event. Driver sends multiple Target Resets to the same volume handle for each individual PD removal. FIX DESCRIPTION: To fix this issue, the entire PD device handshake protocol has to be moved to interrupt context so the breakdown occurs immediately after the actual UNHIDE event arrives. The driver will only issue one Target Reset to the volume handle, occurring after the FAILED or MISSING volume status event arrives from interrupt context. For the PD UNHIDE event, the driver will issue target resets to the PD handles, followed by OP_REMOVE. The driver will set the "deteleted" flag during interrupt context. A "pd_handle" bitmask was introduced so the driver has a list of known pds during entire life of the PD; this replaces the "hidden_raid_component" flag handle in the sas_device object. Each bit in the bitmask represents a device handle. The bit in the bitmask would be toggled ON/OFF when the HIDE/UNHIDE events arrive; also this pd_handle bitmask would be refreshed across host resets. Here we kept older behavior of sending target reset to volume when there is a single drive pull, wait for the reply, then send target resets to the PDs. We kept this behavior so the driver will behave the same for older versions of firmware. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:19 -05:00
Kashyap, Desai	7fbae67a3f	[SCSI] mpt2sas: Tie a log info message to a specific PHY. Add support to display additional debug info for SCSI_IO and RAID_SCSI_IO_PASSTHROUGH sent from the normal entry queued entry point, as well as internal generated commands, and IOCTLS. The additional debug info included the phy number, as well as the sas address, enclosure logical id, and slot number. This debug info has to be enabled thru the logging_level command line option, by default this will not be displayed. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:17 -05:00
Kashyap, Desai	eabb08ad2d	[SCSI] mpt2sas: print level KERN_DEBUG is replaced by KERN_INFO Converting print level from MPT2SAS_DEBUG_FMT to MPT2SAS_INFO_FMT. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:16 -05:00
Kashyap, Desai	570c67ac44	[SCSI] mpt2sas: Added sysfs support for trace buffer Added support so the diag ring buffer can be pulled via sysfs. Added three new shost attributes: host_trace_buffer, host_trace_buffer_enable, and host_trace_buffer_size. The host_trace_buffer_enable attribute is used to either post or release the trace buffers. The host_trace_buffer_size attribute contains the size of the trace buffer. The host_trace_buffer attribute contains a maximum 4KB window of the buffer. In order to read the entire host buffer, you will need to write the offset to host_trace_buffer prior to reading it. release the host buffer, then write the entire host buffer contents to a file. In addition to this enhancement, we moved the automatic posting of host buffers at driver load time to be called prior to port_enable, instead of after. That way discovery is available in the host buffer. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:14 -05:00
Kashyap, Desai	203d65b16c	[SCSI] mpt2sas: MPI header version N is updated. Updating MPI header version N. Removed mpi_history.txt. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:13 -05:00
Kashyap, Desai	d32a8c15e1	[SCSI] mpt2sas: Added sysfs counter for ioc reset Added a new sysfs shost attribute called ioc_reset_count. This will keep count of host resets (both diagnostic and message unit). Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:11 -05:00
Kashyap, Desai	b8d7d7bb37	[SCSI] mpt2sas: Added expander phy control support Added support to send link resets, hard resets, enable/disable phys, and changing link rates for for expanders. This will be exported to attributes within the sas transport layer. A new wrapper function was added for sending SMP passthru to expanders for phy control. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:10 -05:00
Kashyap, Desai	d5f491e658	[SCSI] mpt2sas: Added expander phy counter support Added support to retrieve the invalid_dword_count, running_disparity_error_count, loss_of_dword_sync_count, and phy_reset_problem_count for expanders. This will be exported to attributes within the sas transport layer. A new wrapper function was added for sending SMP passthru to retrieve the expander phy error log. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:08 -05:00
Kashyap, Desai	dd5fd3323a	[SCSI] mpt2sas: staged device discovery. disable_discovery module parameter is added. Added command line option called disable_discovery. When enabled on the command line, the driver will not send a port_enable when loaded for the first time. If port_enable is not called, then there is no discovery of devices, as well as the sas topology. Then later if one desires to invoke discovery, then they will need to issue a diagnostic reset. A diagnostic reset can be issued various ways. One of the ways is through sysfs. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:07 -05:00
Kashyap, Desai	d274213a1a	[SCSI] mpt2sas: Hold Controller reset when another reset is in progress Driver should not allow multiple host reset when already host reset is in progress. It is possible that host reset was sent by scsi mid layer while there was already an host reset active, either issued via IOCTL interface or internaly, like a config page timeout. Since there was a host reset active, the driver would return a FAILED response to the scsi mid layer. The solution is make sure pending host resets will wait for the active host reset to complete before returning control back up the call stack. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:06 -05:00
Kashyap, Desai	ab6ce92541	[SCSI] mpt2sas: Fix to use sas device list instead of enclosure list for _transpor_get_enclosure_identifier. Enclosure_identifier not being returned by mpt2sas The driver exports callback function to the sas transport layer for obtaining the enclosure logical id. This function is called _transport_get_enclosure_identifier. The driver was searching the wrong list for the enclosure_identifier. The driver should be searching the sas device list instead of enclosure list. The sas address that is passed to the driver is for the end device, not enclosure. Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:06 -05:00
Karen Xie	593d572074	[SCSI] cxgb3i: zero out reserved or un-used fields. Zero out the reserved or un-used CPL message fields to prevent any garbage value. Signed-off-by: Karen Xie <kxie@chelsio.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:04 -05:00
Stephen M. Cameron	cba3d38b6c	[SCSI] hpsa: sanitize max commands Some controllers might try to tell us they support 0 commands in performant mode. This is a lie told by buggy firmware. We have to be wary of this lest we try to allocate a negative number of command blocks, which will be treated as unsigned, and get an out of memory condition. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:03 -05:00
Stephen M. Cameron	10f6601808	[SCSI] hpsa: separate intx and msi/msix interrupt handlers There are things which need to be done in the intx interrupt handler which do not need to be done in the msi/msix interrupt handler, like checking that the interrupt is actually for us, and checking that the interrupt pending bit on the hardware is set (which we weren't previously doing at all, which means old controllers wouldn't work), so it makes sense to separate these into two functions. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:03 -05:00
Stephen M. Cameron	1886765906	[SCSI] hpsa: forbid hard reset of 640x boards The 6402/6404 are two PCI devices -- two Smart Array controllers -- that fit into one slot. It is possible to reset them independently, however, they share a battery backed cache module. One of the pair controls the cache and the 2nd one access the cache through the first one. If you reset the one controlling the cache, the other one will not be a happy camper. So we just forbid resetting this conjoined mess. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:02 -05:00
Stephen M. Cameron	1df8552abf	[SCSI] hpsa: Fix hard reset code. Smart Array controllers newer than the P600 do not honor the PCI power state method of resetting the controllers. Instead, in these cases we can get them to reset via the "doorbell" register. This escaped notice until we began using "performant" mode because the fact that the controllers did not reset did not normally impede subsequent operation, and so things generally appeared to "work". Once the performant mode code was added, if the controller does not reset, it remains in performant mode. The code immediately after the reset presumes the controller is in "simple" mode (which previously, it had remained in simple mode the whole time). If the controller remains in performant mode any code which presumes it is in simple mode will not work. So the reset needs to be fixed. Unfortunately there are some controllers which cannot be reset by either method. (eg. p800). We detect these cases by noticing that the controller seems to remain in performant mode even after a reset has been attempted. In those case, we proceed anyway, as if the reset has happened (and skip the step of waiting for the controller to become ready -- which is expecting it to be in "simple" mode.) To sum up, we try to do a better job of resetting the controller if "reset_devices" is set, and if it doesn't work, we print a message and try to continue anyway. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:00 -05:00
Stephen M. Cameron	4c2a8c40d8	[SCSI] hpsa: factor out the code to reset controllers on driver load for kdump support Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:02:00 -05:00
Stephen M. Cameron	a51fd47f1b	[SCSI] hpsa: factor out hpsa_find_cfg_addrs. Rationale for this is that I will also need to use this code in fixing kdump host reset code prior to having the hba structure. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:01:58 -05:00
Stephen M. Cameron	12d2cd4711	[SCSI] hpsa: make hpsa_find_memory_BAR not require the per HBA structure. Rationale for this is that in order to fix the hard reset code used by kdump, we need to use this function before we even have the per HBA structure. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:01:57 -05:00
Stephen M. Cameron	6798cc0a49	[SCSI] hpsa: Make "hpsa_allow_any=1" boot param enable Compaq Smart Arrays. We were previously only accepting HP boards. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:01:55 -05:00
Stephen M. Cameron	2e931f3176	[SCSI] hpsa: add new controllers Add 5 CCISSE smart array controllers Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:01:54 -05:00
Nick Cheng	ae52e7f09f	[SCSI] arcmsr: Support 1024 scatter-gather list entries and improve AP while FW trapped and behaviors of EHs 1. To support 4M/1024 scatter-gather list entry, reorganize struct ARCMSR_CDB and struct CommandControlBlock 2. To modify arcmsr_probe 3. In order to help fix F/W issue, add the driver mode for type B card 4. To improve AP's behavior while F/W resets 5. To unify struct MessageUnit_B's members' naming in all OS drivers' 6. To improve error handlers, arcmsr_bus_reset(), arcmsr_abort() 7. To fix the arcmsr_queue_command() in bus reset stage, just let the commands pass down to FW, don't block Signed-off-by: Nick Cheng <nick.cheng@areca.com.tw> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:01:53 -05:00
Joe Eykholt	f034260db3	[SCSI] libfc: fix indefinite rport restart Remote ports were restarting indefinitely after getting rejects in PRLI. Fix by adding a counter of restarts and limiting that with the port login retry limit as well. Signed-off-by: Joe Eykholt <jeykholt@cisco.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:01:53 -05:00
Joe Eykholt	4b2164d4d2	[SCSI] libfc: Fix remote port restart problem This patch somewhat combines two fixes to remote port handing in libfc. The first problem was that rport work could be queued on a deleted and freed rport. This is handled by not resetting rdata->event ton NONE if the rdata is about to be deleted. However, that fix led to the second problem, described by Bhanu Gollapudi, as follows: > Here is the sequence of events. T1 is first LOGO receive thread, T2 is > fc_rport_work() scheduled by T1 and T3 is second LOGO receive thread and > T4 is fc_rport_work scheduled by T3. > > 1. (T1)Received 1st LOGO in state Ready > 2. (T1)Delete port & enter to RESTART state. > 3. (T1)schdule event_work, since event is RPORT_EV_NONE. > 4. (T1)set event = RPORT_EV_LOGO > 5. (T1)Enter RESTART state as disc_id is set. > 6. (T2)remember to PLOGI, and set event = RPORT_EV_NONE > 6. (T3)Received 2nd LOGO > 7. (T3)Delete Port & enter to RESTART state. > 8. (T3)schedule event_work, since event is RPORT_EV_NONE. > 9. (T3)Enter RESTART state as disc_id is set. > 9. (T3)set event = RPORT_EV_LOGO > 10.(T2)work restart, enter PLOGI state and issues PLOGI > 11.(T4)Since state is not RESTART anymore, restart is not set, and the > event is not reset to RPORT_EV_NONE. (current event is RPORT_EV_LOGO). > 12. Now, PLOGI succeeds and fc_rport_enter_ready() will not schedule > event_work, and hence the rport will never be created, eventually losing > the target after dev_loss_tmo. So, the problem here is that we were tracking the desire for the rport be restarted by state RESTART, which was otherwise equivalent to DELETE. A contributing factor is that we dropped the lock between steps 6 and 10 in thread T2, which allows the state to change, and we didn't completely re-evaluate then. This is hopefully corrected by the following minor redesign: Simplify the rport restart logic by making the decision to restart after deleting the transport rport. 
That decision is based on a new STARTED flag that indicates fc_rport_login() has been called and fc_rport_logoff() has not been called since then. This replaces the need for the RESTART state. Only restart if the rdata is still in DELETED state and only if it still has the STARTED flag set. Also now, since we clear the event code much later in the work thread, allow for the possibility that the rport may have become READY again via incoming PLOGI, and if so, queue another event to handle that. In the problem scenario, the second LOGO received will cause the LOGO event to occur again. Reported-by: Bhanu Gollapudi <bprakash@broadcom.com> Signed-off-by: Joe Eykholt <jeykholt@cisco.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:01:52 -05:00
Roel Kluin	0db6f4353d	[SCSI] fnic: fnic_scsi.c: clean up In fnic_abort_cmd() and fnic_device_reset() assign `rport' earlier to make FNIC_SCSI_DBG() calls cleaner. In fnic_clean_pending_aborts() `rport' is not used. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Acked-by: Abhijeet Joglekar <abjoglek@cisco.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:01:51 -05:00
Yi Zou	1c4bfe6305	[SCSI] libfc: lport state is enum not bit mask lport state is enum not bit mask. Signed-off-by: Yi Zou <yi.zou@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:01:51 -05:00
Bhanu Prakash Gollapudi	be61331d90	[SCSI] libfcoe: Check for order and missing critical descriptors for FIP ELS requests As per FC-BB-5 rev.2, section 7.8.7.1, strict ordering of FIP descriptors is required for ELS requests. Also, look for missing and duplicate critical descriptors. Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:01:50 -05:00
Bhanu Prakash Gollapudi	5550fda73d	[SCSI] libfcoe: Host doesnt handle CVL to NPIV ports Clear virtual link for NPIV ports is now handled by resetting the matching vnport. Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:01:49 -05:00
Bhanu Prakash Gollapudi	0a9c5d344d	[SCSI] libfcoe: Handle duplicate critical descriptors As per FC-BB-5 rev 2, section 7.8.6.2, malformed FIP frame shall be discarded. Drop discovery adv, ELS and CLV's with duplicate critical descriptors. [Resending after incorporating Joe's review comments] Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:01:49 -05:00
Joe Eykholt	c600fea2d8	[SCSI] libfcoe: update FIP FCF D flag from advertisments Allow the D flag (indicating that keep-alives are not needed) to be updated dynamically from received FIP advertisements. Signed-off-by: Joe Eykholt <jeykholt@cisco.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:01:48 -05:00
Joe Eykholt	d99ee45b7c	[SCSI] libfcoe: Use fka_period as periodic timeouts to age out fcf if keep alives are disabled due to fd_flags set and also stop updating keep alive values in that case. Update select fcf time only if fcf is not already selected or select time is not already determined from parse adv, and then have select time cleared only once after fcf is selected. Changed deadline check to time_after_eq() from time_after() since now next timeout will be on exact 2.5 times FKA followed by first advertisement. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Signed-off-by: Joe Eykholt <jeykholt@cisco.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-07-27 12:01:47 -05:00
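
The ibmvscsi and ibmvfc resume fixes listed above (0f33ece5bc, 73ee5d8672) describe replacing a spinning retry loop with one that sleeps between retries while the enable call reports busy. The following is a minimal userspace sketch of that pattern only, not the driver code: `sim_enable_crq`, `enable_with_sleep`, the `SIM_*` status codes, and the retry cap are all hypothetical stand-ins, and `nanosleep` takes the place of whatever sleep primitive the kernel patches actually use.

```c
#define _POSIX_C_SOURCE 199309L
#include <stddef.h>
#include <time.h>

/* Hypothetical status codes standing in for the hcall return values. */
enum sim_rc { SIM_SUCCESS = 0, SIM_BUSY = 1, SIM_LONG_BUSY = 2 };

/* Simulated enable call: reports busy until *busy_left reaches zero. */
static enum sim_rc sim_enable_crq(int *busy_left)
{
    if (*busy_left > 0) {
        (*busy_left)--;
        return (*busy_left % 2) ? SIM_LONG_BUSY : SIM_BUSY;
    }
    return SIM_SUCCESS;
}

/*
 * Retry the enable call, sleeping between attempts instead of spinning.
 * Returns the number of sleeps taken before success, or -1 if the call
 * was still busy after max_retries sleeps (the cap is an assumption of
 * this sketch; the commit messages do not mention one).
 */
static int enable_with_sleep(int *busy_left, int max_retries, long sleep_ms)
{
    struct timespec ts = { sleep_ms / 1000, (sleep_ms % 1000) * 1000000L };
    int sleeps = 0;

    for (;;) {
        enum sim_rc rc = sim_enable_crq(busy_left);

        if (rc == SIM_SUCCESS)
            return sleeps;
        if (sleeps >= max_retries)
            return -1;
        /* Yield the CPU between retries rather than busy-waiting. */
        nanosleep(&ts, NULL);
        sleeps++;
    }
}
```

Because the caller blocks in a sleep between attempts, a slow enable no longer monopolises the CPU, which is the condition the softlockup warnings in those commit messages point to.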


89630 Commits