Commit Graph

435 Commits

Author SHA1 Message Date
Johannes Thumshirn
7ac65007c2 scsi: libfc: don't set FC_RQST_STATE_DONE before calling fc_bsg_jobdone()
Don't set FC_RQST_STATE_DONE before calling fc_bsg_jobdone() as
fc_bsg_jobdone() calls blk_complete_requeust() which raises a soft-IRQ
that ends up in fc_bsg_sofirq_done() and fc_bsg_softirq_done() sets the
FC_RQST_STATE_DONE flag.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-17 20:15:25 -05:00
Johannes Thumshirn
1d69b1222a scsi: fc: provide fc_bsg_to_rport() helper
Provide fc_bsg_to_rport() helper that will become handy when we're
moving from struct fc_bsg_job to a plain struct bsg_job. Also move all
LLDDs to use the new helper.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Acked-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-17 20:15:25 -05:00
Johannes Thumshirn
cd21c605b2 scsi: fc: provide fc_bsg_to_shost() helper
Provide fc_bsg_to_shost() helper that will become handy when we're
moving from struct fc_bsg_job to a plain struct bsg_job. Also use this
little helper in the LLDDs.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Acked-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-17 20:15:25 -05:00
Johannes Thumshirn
1abaede715 scsi: fc: Export fc_bsg_jobdone and use it in FC drivers
Export fc_bsg_jobdone so drivers can use it directly instead of doing
the round-trip via struct fc_bsg_job::job_done() and use it in the
LLDDs.  That way we can also unify the interfaces of fc_bsg_jobdone and
bsg_job_done.

As we've converted all LLDDs over to use fc_bsg_jobdone() directly, we
can remove the function pointer from struct fc_bsg_job as well.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-17 20:15:25 -05:00
Johannes Thumshirn
01e0e15c8b scsi: don't use fc_bsg_job::request and fc_bsg_job::reply directly
Don't use fc_bsg_job::request and fc_bsg_job::reply directly, but use
helper variables bsg_request and bsg_reply. This will be helpful when
transitioning to bsg-lib.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-17 20:15:24 -05:00
Chris Leech
d5c3eb26d9 scsi: libfc: Don't have fc_exch_find log errors on a new exchange
With the error message I added in "libfc: sanity check cpu number
extracted from xid" I didn't account for the fact that fc_exch_find is
called with FC_XID_UNKNOWN at the start of a new exchange if we are the
responder.

It doesn't come up with the initiator much, but that's basically every
exchange for a target.  By checking the xid for FC_XID_UNKNOWN first, we
not only prevent the erroneous error message, but skip the unnecessary
lookup attempt as well.

[mkp: applied by hand due to conflict with Hannes' libfc patch series]

Signed-off-by: Chris Leech <cleech@redhat.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:58 -05:00
Hannes Reinecke
9625cc483b scsi: libfc: Replace ->seq_release callback with function call
The ->seq_release callback only ever had one implementation,
so call the function directly and drop the callback.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:56 -05:00
Hannes Reinecke
96d564e24a scsi: libfc: Replace ->seq_assign callback with function call
The ->seq_assign callback only ever had one implementation,
so call the function directly and drop the callback.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:56 -05:00
Hannes Reinecke
f1d61e6e68 scsi: libfc: Replace ->seq_set_resp callback with direct function call
The ->seq_set_resp callback only ever had one implementation,
so call it directly and drop the callback.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:56 -05:00
Hannes Reinecke
c6865b30be scsi: libfc: Replace ->seq_start_next callback with function call
The ->seq_start_next callback only ever had one implementation,
so call the function directly and drop the callback.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:56 -05:00
Hannes Reinecke
768c72cc34 scsi: libfc: Replace ->exch_done callback with function call
The ->exch_done callback only ever had one implementation,
so we can as well call it directly and drop the callback.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:56 -05:00
Hannes Reinecke
0ebaed17fe scsi: libfc: Replace ->seq_exch_abort callback with function call
The ->seq_exch_abort callback only ever had one implementation,
so we can as well call it directly and drop the callback.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:56 -05:00
Hannes Reinecke
0cac937da5 scsi: libfc: Replace ->seq_send callback with function call
The ->seq_send callback only ever had one implementation,
so we can as well call it directly and drop the callback.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:56 -05:00
Hannes Reinecke
a8220ded09 scsi: libfc: Remove fc_rport_init()
Function is empty now and can be removed.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Reviewed-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:56 -05:00
Hannes Reinecke
5922a95745 scsi: libfc: Replace ->rport_flush_queue callback with function call
The ->rport_flush_queue callback only ever had a single
implementation, so we can as well call it directly and
drop the callback.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:55 -05:00
Hannes Reinecke
e76ee65fa6 scsi: libfc: Replace ->rport_recv_req callback with function call
The ->rport_recv_req callback only ever had one implementation,
so we can as well call it directly and drop the callback.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:55 -05:00
Hannes Reinecke
c96c792aee scsi: libfc: Replace ->rport_logoff callback with function call
The ->rport_logoff callback only ever had one implementation,
so we can as well call it directly and drop the callback.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Reviewed-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:55 -05:00
Hannes Reinecke
05d7d3b0bd scsi: libfc: Replace ->rport_login callback with function call
The ->rport_login callback only ever had one implementation,
so we can as well call it directly and drop the callback.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:55 -05:00
Hannes Reinecke
2580064b5e scsi: libfc: Replace ->rport_create callback with function call
The ->rport_create callback only ever had a single implementation,
so we can as well call it directly and drop the callback.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:55 -05:00
Hannes Reinecke
e87b777793 scsi: libfc: Replace ->rport_lookup callback with function call
The ->rport_lookup callback only ever had a single implementation,
so we can as well call it directly and drop the callback.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:55 -05:00
Hannes Reinecke
944ef9689d scsi: libfc: Replace ->rport_destroy callback with function call
The ->rport_destroy callback only ever had one implementation,
so we can as well call it directly and drop the callback.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:55 -05:00
Hannes Reinecke
3afd2d1521 scsi: libfc: Replace ->exch_seq_send callback with function call
The ->exch_seq_send callback only ever had one implementation,
so we can call the function directly and drop the callback.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:55 -05:00
Hannes Reinecke
c5cb444c31 scsi: libfc: Replace ->lport_recv with function call
The ->lport_recv callback only ever had one implementation,
so call the function directly and remove the callback.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:54 -05:00
Hannes Reinecke
31c0a631a4 scsi: libfc: Replace ->lport_reset callback with function call
The ->lport_reset callback only ever had one implementation,
which already is exported. So remove it and use the function
directly.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:54 -05:00
Hannes Reinecke
7ab24dd165 scsi: libfc: Replace ->seq_els_rsp_send callback with function call
The 'seq_els_rsp_send' callback only ever had one implementation,
so we might as well drop it and use the function directly.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:54 -05:00
Hannes Reinecke
e0a25286d8 scsi: libfc: Check xid when looking up REC exchanges
We currently can only lookup the local xid, so we need
to reject REC with empty rxid.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:53 -05:00
Hannes Reinecke
87da3b832e scsi: libfc: wait for E_D_TOV when out-of-order sequence is received
When detecting an out-of-order sequence we should be waiting for
E_D_TOV before trying to abort the sequence.
The response might still be stuck in the queue somewhere.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:53 -05:00
Hannes Reinecke
ad3120cfe0 scsi: libfc: reset timeout on queue full
When we're receiving a timeout we should be checking for queue
full status; if there are still some packets pending we should
be resetting the counter to ensure we're not missing out any
packets which are still queued.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:53 -05:00
Hannes Reinecke
53db8fa8a3 scsi: libfc: Do not drop out-of-order frames
When receiving packets from the network we cannot guarantee any
frame ordering, so we should be receiving all valid frames and
let the upper layers deal with it.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:52 -05:00
Hannes Reinecke
d11b44eff1 scsi: libfc: don't fail sequence abort for completed exchanges
If a sequence should be aborted the exchange might already
be completed (eg if the response is still queued in the rx
queue), so this shouldn't considered as an error.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:52 -05:00
Hannes Reinecke
9ca1e182b9 scsi: libfc: quarantine timed out xids
When a sequence times out we have no idea what happened to the
frame. And we do not know if we will ever receive the frame.
Hence we cannot re-use the xid as we would risk data corruption
if the xid had been re-used and the timed out frame would be
received after that.
So we need to quarantine the xid until the lport is reset.
Yes, I know this will (eventually) deplete the xid pool.
But for now it's the safest method.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:52 -05:00
Hannes Reinecke
b73aa56ee9 scsi: libfc: safeguard against invalid exchange index
The cached exchange index might be invalid, in which case
we should drop down to allocate a new one.
And we should not try to access an invalid exchange when
responding to a BA_ABTS.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:52 -05:00
Hannes Reinecke
5d339d163a scsi: libfc: Clarify ramp-down messages
When the queue depth is reduced we should print out the reason
for this; it might be due to a queue full condition.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:52 -05:00
Hannes Reinecke
8acf1b50cf scsi: libfc: Return LS_RJT_BUSY for PRLI in status PLOGI
Occasionally it might happen that we receive a PRLI while we're still
waiting for our PLOGI response. In that case we should return
'busy' LS status instead of 'plogi required' LS status.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:52 -05:00
Hannes Reinecke
386b97b43c scsi: libfc: Rework PRLI handling
PRLI is only required if the port is acting as an initiator; ports
which support target functionality only do not need to send PRLI.
At the same time the PRLI state is only used if the port initiated
a PRLI transfer; if we received a PRLI request we should _not_
change the state as this would cause our PRLI response to be dropped.
And when we receive a PRLI response we need to check if an image
pair has been established; if not the remote port cannot act as a
target for us and we need to disable target functionality.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:52 -05:00
Hannes Reinecke
7c5a51b8f8 scsi: libfc: Implement RTV responder
The libfc stack generates an RTV request, so we should be implementing
an RTV responder, too.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:52 -05:00
Hannes Reinecke
9f9504a7cd scsi: libfc: use error code for fc_rport_error()
We only ever use the 'fp' argument for fc_rport_error() to
encapsulate the error code, so we can as well do away with that
and pass the error directly.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:52 -05:00
Hannes Reinecke
0f4c16a2f4 scsi: libfc: do not overwrite DID_TIME_OUT status
When a command is aborted it might already have the DID_TIME_OUT
status set, so we shouldn't be overwriting that.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:51 -05:00
Hannes Reinecke
76e72ad117 scsi: libfc: sanitize E_D_TOV and R_A_TOV setting
When setting the FCP timeout we need to ensure a lower boundary
for E_D_TOV and R_A_TOV, otherwise we'd be getting spurious I/O
issues due to the fcp timer firing too early.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:51 -05:00
Hannes Reinecke
a50cc9eccc scsi: libfc: use configured rport E_D_TOV
If fc_rport_error_retry() is attempting to retry the remote
port state we should be waiting for the configured e_d_tov
value rather than the default.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:51 -05:00
Hannes Reinecke
f7ce413cea scsi: libfc: use configured lport R_A_TOV
We should be using the configured R_A_TOV value when sending the
exchange.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:51 -05:00
Hannes Reinecke
a0452bb45c scsi: libfc: spurious I/O error under high load
If a command times out libfc is sending an REC, which also
might fail (due to frames being lost or something).
If no data has been transferred we can simply retry
the command, but the current code sets a state of FC_ERROR,
which then is being translated into DID_ERROR, resulting
in an I/O error.
So to handle this properly we need to set a separate
state FC_TRANS_RESET and mapping it onto DID_SOFT_RETRY.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:51 -05:00
Hannes Reinecke
57d3ec7e46 scsi: libfc: additional debugging messages
Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:51 -05:00
Martin K. Petersen
f8f91f3f31 scsi: libfc: Revert "[SCSI] libfc: use offload EM instance again instead jumping to next EM"
This reverts commit 3e22760d4d.

This revert came about because of efforts by Ewan Milne, Curtis Taylor
and I.  In researching this issue, significant performance issues were
seen on large CPU count systems using the software FCOE stack.  Hannes
also weighed in.

The same was not apparent on much smaller low count CPU systems.  The
behavior introduced by commit 3e22760d4d
lands sup with large count CPU systems seeing continual
blk_requeue_request() calls due to ML_QUEUE_HOST_BUSY.

fc_exch_alloc() used to try all the available exchange managers in the
list for an available exchange id, but this was changed in 2010 so that
if the first matched exchange manager couldn't allocate one, it fails
and we end up returning host busy.  This was due to commit:

Setting the ddp_min module parameter to fcoe to 128MB prevents the
->match function from permitting the use of the offload exchange manager
for the frame, and we no longer see the problem with host busy status,
since it uses the larger non-offloaded pool.

Reverting commit 3e22760d4d was tested to
also prevent the host busy issue due to failing allocations.

Suggested-by: Ewan Milne <emilne@redhat.com>
Suggested-by: Curtis Taylor <cjt@us.ibm.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Laurence Oberman <loberman@redhat.com>
2016-11-08 17:29:51 -05:00
Hannes Reinecke
f89b8d67db scsi: libfc: don't advance state machine for incoming FLOGI
When we receive an FLOGI but have already sent our own we should
not advance the state machine but rather wait for our FLOGI to
return before continuing with PLOGI.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:47 -05:00
Hannes Reinecke
06ee2571a4 scsi: libfc: Do not login if the port is already started
When the port is already started we don't need to login; that
will only confuse the state machine.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:47 -05:00
Hannes Reinecke
e5a20009da scsi: libfc: Do not drop down to FLOGI for fc_rport_login()
When fc_rport_login() is called while the rport is not
in RPORT_ST_INIT, RPORT_ST_READY, or RPORT_ST_DELETE
login is already in progress and there's no need to
drop down to FLOGI; doing so will only confuse the
other side.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:47 -05:00
Chad Dupuis
785141c62a scsi: libfc: Do not take rdata->rp_mutex when processing a -FC_EX_CLOSED ELS response.
When an ELS response handler receives a -FC_EX_CLOSED, the rdata->rp_mutex is
already held which can lead to a deadlock condition like the following stack trace:

[<ffffffffa04d8f18>] fc_rport_plogi_resp+0x28/0x200 [libfc]
[<ffffffffa04cfa1a>] fc_invoke_resp+0x6a/0xe0 [libfc]
[<ffffffffa04d0c08>] fc_exch_mgr_reset+0x1b8/0x280 [libfc]
[<ffffffffa04d87b3>] fc_rport_logoff+0x43/0xd0 [libfc]
[<ffffffffa04ce73d>] fc_disc_stop+0x6d/0xf0 [libfc]
[<ffffffffa04ce7ce>] fc_disc_stop_final+0xe/0x20 [libfc]
[<ffffffffa04d55f7>] fc_fabric_logoff+0x17/0x70 [libfc]

The other ELS handlers need to follow the FLOGI response handler and simply do
a kref_put against the fc_rport_priv struct and exit when receving a
-FC_EX_CLOSED response.

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:47 -05:00
Hannes Reinecke
a407c59339 scsi: libfc: Fixup disc_mutex handling
The list of attached 'rdata' remote port structures is RCU
protected, so there is no need to take the 'disc_mutex' when
traversing it.
Rather we should be using rcu_read_lock() and kref_get_unless_zero()
to validate the entries.
We need, however, take the disc_mutex when deleting an entry;
otherwise we risk clashes with list_add.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:46 -05:00
Hannes Reinecke
4d2095cc42 scsi: libfc: Revisit kref handling
The kref handling in fc_rport is a mess. This patch updates
the kref handling according to the following rules:

- Take a reference whenever scheduling a workqueue
- Take a reference whenever an ELS command is send
- Drop the reference at the end of the workqueue function
- Drop the reference at the end of handling ELS replies
- Take a reference when allocating an rport
- Drop the reference when removing an rport

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-08 17:29:46 -05:00
Hannes Reinecke
a850ced429 scsi: libfc: do not send ABTS when resetting exchanges
When all exchanges are reset the upper layers have already logged out of
the remote port, so the exchanges can be reset without sending any ABTS.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Chad Dupuis <chad.dupuis@qlogic.com>
Tested-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-08-18 22:35:17 -04:00
Hannes Reinecke
649eb86938 scsi: libfc: reset exchange manager during LOGO handling
FC-LS mandates that we should invalidate all sequences before sending a
LOGO. And we should set the event to RPORT_EV_STOP when a LOGO request
has been received to signal that all exchanges are terminated.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Chad Dupuis <chad.dupuis@qlogic.com>
Tested-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-08-18 22:34:40 -04:00
Hannes Reinecke
d391966a03 scsi: libfc: send LOGO for PLOGI failure
When running in point-to-multipoint mode PLOGI is done after FLOGI
completed. So when the PLOGI fails we should be sending a LOGO to the
remote port.

[mkp: Applied by hand]

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Chad Dupuis <chad.dupuis@qlogic.com>
Tested-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-08-18 22:34:05 -04:00
Hannes Reinecke
166f310b62 scsi: libfc: Issue PRLI after a PRLO has been received
When receiving a PRLO it just means that the operating parameters have
changed, it does _not_ mean that the port doesn't want to communicate
with us.  So instead of implicitly logging out we should be issueing a
PRLI to figure out the new operating parameters.  We can always recover
once PRLI fails.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Chad Dupuis <chad.dupuis@qlogic.com>
Tested-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-08-18 22:27:26 -04:00
Chris Leech
fa06883281 libfc: sanity check cpu number extracted from xid
In the receive path libfc extracts a cpu number from the ox_id in the
fiber channel header and uses that to do a per_cpu_ptr conversion.  If,
for some reason, a frame is received with an invalid ox_id, per_cpu_ptr
will return an invalid pointer and the libfc receive path will panic the
system trying to use it.

I'm currently looking at such a case, and I don't yet know why a cpu
number > nr_cpu_ids is appearing in an exchange id.  But adding a sanity
check in libfc prevents a system panic, and seems like good idea when
dealing with frames coming in from the network.

Signed-off-by: Chris Leech <cleech@redhat.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-07-13 21:49:57 -04:00
Arnd Bergmann
540eb1eef0 scsi: libfc: fix seconds_since_last_reset calculation
The fc_get_host_stats() function contains a complex conversion from
jiffies to timespec to seconds. As we try to get rid of uses of struct
timespec, we can clean this up and replace it with a simpler
computation.

Simply dividing the difference in jiffies by HZ is not only much more
efficient, it also avoids a problem that causes the
seconds_since_last_reset value to be incorrect if jiffies has overrun
since the 'boot_time' value was recorded.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-07-12 23:16:31 -04:00
Hannes Reinecke
baa6719f90 libfc: Update rport reference counting
Originally libfc would just be initializing the refcount to '1', and
using the disc_mutex to synchronize if and when the final put should be
happening.  This has a race condition as the mutex might be delayed,
causing other threads to access an invalid structure.  This patch
updates the rport reference counting to increase the reference every
time 'rport_lookup' is called, and decreases the reference
correspondingly.  This removes the need to hold 'disc_mutex' when
removing the structure, and avoids the above race condition.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Vasu Dev <vasu.dev@intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-07-12 23:16:31 -04:00
Sebastian Herbszt
c59ab4e5af libfc: Use the correct function name in kernel-doc comment.
Signed-off-by: Sebastian Herbszt <herbszt@gmx.de>
Acked-by: Vasu Dev <vasu.dev@intel.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-11-09 17:15:52 -08:00
Linus Torvalds
df910390e2 SCSI misc on 20150901
This includes one new driver: cxlflash plus the usual grab bag of updates for
 the major drivers: qla2xxx, ipr, storvsc, pm80xx, hptiop, plus a few assorted
 fixes.
 
 Signed-off-by: James Bottomley <JBottomley@Odin.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJV5eGiAAoJEDeqqVYsXL0MCpwH/AoneZOPeCx0YdTlyiojasi4
 Kc7ECmV9IJJoMoCbP8grwStvynYyCHDSphYmqopZPRlD021eG8ota2uRTHEGJI+q
 SoiZUlq8ti8xgnD55mubwO+UNF+zoELMyHUok2pGzBoZN5alA6nvKuNY7Hif3P3b
 YMT490oWQLjWmJkMW8TbpMn9nHpW0dfbP323uaggWsMy3CSI707+x36FLi1/ICg6
 MZRyv4aESAcauZGUI5EG+SrIl3OBQX7snsYXyuqD3biGqzbGc3p3L9uWG1qXHDbM
 OSGXhN+our0WYHCV1/UrGz7/IAWW1UU0W2qgCBwkXkDjkXJ4jqd36zLJxeuhSpE=
 =KOmP
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull first round of SCSI updates from James Bottomley:
 "This includes one new driver: cxlflash plus the usual grab bag of
  updates for the major drivers: qla2xxx, ipr, storvsc, pm80xx, hptiop,
  plus a few assorted fixes.

  There's another tranch coming, but I want to incubate it another few
  days in the checkers, plus it includes a mpt2sas separated lifetime
  fix, which Avago won't get done testing until Friday"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (85 commits)
  aic94xx: set an error code on failure
  storvsc: Set the error code correctly in failure conditions
  storvsc: Allow write_same when host is windows 10
  storvsc: use storage protocol version to determine storage capabilities
  storvsc: use correct defaults for values determined by protocol negotiation
  storvsc: Untangle the storage protocol negotiation from the vmbus protocol negotiation.
  storvsc: Use a single value to track protocol versions
  storvsc: Rather than look for sets of specific protocol versions, make decisions based on ranges.
  cxlflash: Remove unused variable from queuecommand
  cxlflash: shift wrapping bug in afu_link_reset()
  cxlflash: off by one bug in cxlflash_show_port_status()
  cxlflash: Virtual LUN support
  cxlflash: Superpipe support
  cxlflash: Base error recovery support
  qla2xxx: Update driver version to 8.07.00.26-k
  qla2xxx: Add pci device id 0x2261.
  qla2xxx: Fix missing device login retries.
  qla2xxx: do not clear slot in outstanding cmd array
  qla2xxx: Remove decrement of sp reference count in abort handler.
  qla2xxx: Add support to show MPI and PEP FW version for ISP27xx.
  ...
2015-09-02 12:22:54 -07:00
Bart Van Assche
8f2777f53e libfc: Fix fc_fcp_cleanup_each_cmd()
Since fc_fcp_cleanup_cmd() can sleep this function must not
be called while holding a spinlock. This patch avoids that
fc_fcp_cleanup_each_cmd() triggers the following bug:

BUG: scheduling while atomic: sg_reset/1512/0x00000202
1 lock held by sg_reset/1512:
 #0:  (&(&fsp->scsi_pkt_lock)->rlock){+.-...}, at: [<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
Preemption disabled at:[<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
Call Trace:
 [<ffffffff816c612c>] dump_stack+0x4f/0x7b
 [<ffffffff810828bc>] __schedule_bug+0x6c/0xd0
 [<ffffffff816c87aa>] __schedule+0x71a/0xa10
 [<ffffffff816c8ad2>] schedule+0x32/0x80
 [<ffffffffc0217eac>] fc_seq_set_resp+0xac/0x100 [libfc]
 [<ffffffffc0218b11>] fc_exch_done+0x41/0x60 [libfc]
 [<ffffffffc0225cff>] fc_fcp_cleanup_each_cmd.isra.21+0xcf/0x150 [libfc]
 [<ffffffffc0225f43>] fc_eh_device_reset+0x1c3/0x270 [libfc]
 [<ffffffff814a2cc9>] scsi_try_bus_device_reset+0x29/0x60
 [<ffffffff814a3908>] scsi_ioctl_reset+0x258/0x2d0
 [<ffffffff814a2650>] scsi_ioctl+0x150/0x440
 [<ffffffff814b3a9d>] sd_ioctl+0xad/0x120
 [<ffffffff8132f266>] blkdev_ioctl+0x1b6/0x810
 [<ffffffff811da608>] block_ioctl+0x38/0x40
 [<ffffffff811b4e08>] do_vfs_ioctl+0x2f8/0x530
 [<ffffffff811b50c1>] SyS_ioctl+0x81/0xa0
 [<ffffffff816cf8b2>] system_call_fastpath+0x16/0x7a

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-08-12 11:24:21 -07:00
Bart Van Assche
f6979adeaa libfc: Fix fc_exch_recv_req() error path
Due to patch "libfc: Do not invoke the response handler after
fc_exch_done()" (commit ID 7030fd62) the lport_recv() call
in fc_exch_recv_req() is passed a dangling pointer. Avoid this
by moving the fc_frame_free() call from fc_invoke_resp() to its
callers. This patch fixes the following crash:

general protection fault: 0000 [#3] PREEMPT SMP
RIP: fc_lport_recv_req+0x72/0x280 [libfc]
Call Trace:
 fc_exch_recv+0x642/0xde0 [libfc]
 fcoe_percpu_receive_thread+0x46a/0x5ed [fcoe]
 kthread+0x10a/0x120
 ret_from_fork+0x42/0x70

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-08-12 11:23:30 -07:00
Bart Van Assche
ce83a4ca18 libfc: Fix a typo in a source code comment
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-08-12 11:21:43 -07:00
Christoph Hellwig
db5ed4dfd5 scsi: drop reason argument from ->change_queue_depth
Drop the now unused reason argument from the ->change_queue_depth method.
Also add a return value to scsi_adjust_queue_depth, and rename it to
scsi_change_queue_depth now that it can be used as the default
->change_queue_depth implementation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-11-24 14:45:27 +01:00
Christoph Hellwig
c40ecc12cf scsi: avoid ->change_queue_depth indirection for queue full tracking
All drivers use the implementation for ramping the queue up and down, so
instead of overloading the change_queue_depth method call the
implementation diretly if the driver opts into it by setting the
track_queue_depth flag in the host template.

Note that a few drivers validated the new queue depth in their
change_queue_depth method, but as we never go over the queue depth
set during slave_configure or the sysfs file this isn't nessecary
and can safely be removed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>
2014-11-24 14:45:12 +01:00
Christoph Hellwig
c8b09f6fb6 scsi: don't set tagging state from scsi_adjust_queue_depth
Remove the tagged argument from scsi_adjust_queue_depth, and just let it
handle the queue depth.  For most drivers those two are fairly separate,
given that most modern drivers don't care about the SCSI "tagged" status
of a command at all, and many old drivers allow queuing of multiple
untagged commands in the driver.

Instead we start out with the ->simple_tags flag set before calling
->slave_configure, which is how all drivers actually looking at
->simple_tags except for one worke anyway.  The one other case looks
broken, but I've kept the behavior as-is for now.

Except for that we only change ->simple_tags from the ->change_queue_type,
and when rejecting a tag message in a single driver, so keeping this
churn out of scsi_adjust_queue_depth is a clear win.

Now that the usage of scsi_adjust_queue_depth is more obvious we can
also remove all the trivial instances in ->slave_alloc or ->slave_configure
that just set it to the cmd_per_lun default.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2014-11-12 11:19:43 +01:00
Christoph Hellwig
2ecb204d07 scsi: always assign block layer tags if enabled
Allow a driver to ask for block layer tags by setting .use_blk_tags in the
host template, in which case it will always see a valid value in
request->tag, similar to the behavior when using blk-mq.  This means even
SCSI "untagged" commands will now have a tag, which is especially useful
when using a host-wide tag map.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-11-12 11:19:43 +01:00
Christoph Hellwig
a62182f338 scsi: provide a generic change_queue_type method
Most drivers use exactly the same implementation, so provide it as a
library function.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-11-12 11:19:39 +01:00
Andreea-Cristina Bernat
f4303d8fa6 libfc: Replace rcu_assign_pointer() with RCU_INIT_POINTER()
The uses of "rcu_assign_pointer()" are NULLing out the pointers.
According to RCU_INIT_POINTER()'s block comment:
"1.   This use of RCU_INIT_POINTER() is NULLing out the pointer"
it is better to use it instead of rcu_assign_pointer() because it has a
smaller overhead.

The following Coccinelle semantic patch was used:
@@
@@

- rcu_assign_pointer
+ RCU_INIT_POINTER
  (..., NULL)

Signed-off-by: Andreea-Cristina Bernat <bernat.ada@gmail.com>
Acked-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-09-30 09:28:36 +02:00
Bart Van Assche
9de99010cb fcp: Do not interpret check condition as underrun
This patch avoids that the FCoE initiator sends a REC message after
having received a SCSI response with non-zero status and non-zero
DATA IN buffer length.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
2013-09-04 13:52:35 -07:00
Bart Van Assche
7030fd6261 libfc: Do not invoke the response handler after fc_exch_done()
While the FCoE initiator driver invokes fc_exch_done() from inside
the libfc response handler, FCoE target drivers typically invoke
fc_exch_done() from outside the libfc response handler. The object
fc_exch.arg points at may disappear as soon as fc_exch_done() has
finished. So it's important not to invoke the response handler
function after fc_exch_done() has finished. Modify libfc such that
this guarantee is provided if fc_exch_done() is invoked from
outside a response handler. This patch fixes a sporadic crash in
FCoE target implementations after a command has been aborted.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
2013-09-04 13:45:22 -07:00
Bart Van Assche
f95b35cfca libfc: Reduce exchange lock contention in fc_exch_recv_abts()
Reduce the time during which the exchange lock is held by allocating
a frame before obtaining the exchange lock.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
2013-09-04 13:37:53 -07:00
Bart Van Assche
cae7b6dd6c libfc: Avoid that sending after an abort triggers a kernel warning
Calling fc_seq_send() after an ABTS message has been received triggers
a kernel warning (WARN_ON(!(ep->esb_stat & ESB_ST_SEQ_INIT))). Avoid
this by returning -ENXIO to the caller if fc_seq_send() is invoked after
an ABTS message has been received.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
2013-09-04 13:30:43 -07:00
Bart Van Assche
5d73bea2d3 libfc: Protect ep->esb_stat changes via ex_lock
This patch avoids that the WARN_ON(!(ep->esb_stat & ESB_ST_SEQ_INIT))
statement in fc_seq_send_locked() gets triggered sporadically when
running FCoE target code due to concurrent ep->esb_stat modifications.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
2013-09-04 13:23:38 -07:00
Bart Van Assche
b86788658b libfc: Fix a race in fc_exch_timer_set_locked()
It is allowed to pass a zero timeout value to fc_seq_exch_abort().
Avoid that this can cause the timeout function to drop the exchange
reference before it has been increased by fc_exch_timer_set_locked().
This patch fixes a crash when running FCoE target code with poisoning
enabled in the memory allocator.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
2013-09-04 13:16:25 -07:00
Bart Van Assche
8d08023687 libfc: Clarify fc_exch_find()
The condition ep != NULL && ep->xid != xid can never be met. Make
this explicit.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
2013-09-04 13:09:15 -07:00
Bart Van Assche
a84ea8c7e8 libfc: Micro-optimize fc_setup_exch_mgr()
Convert a loop into an ilog2() call. Although this code is not performance
sensitive this conversion makes this code easier to read.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
2013-09-04 13:02:09 -07:00
Bart Van Assche
b20d9bfda7 libfc: Debug code fixes
The second argument of fc_lport_error() may be a valid frame pointer.
Hence only print it as an error code if it really is an error code.

Debug statements must end in a newline. Add one where it is missing.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
2013-09-04 12:55:02 -07:00
Bart Van Assche
c1d454246c libfc: Source code comment spelling fixes
Change 'initiaive' into 'initiative', 'remainig' into 'remaining'
and change 'exected' into 'expected'.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Robert Love <robert.w.love@intel.com>
2013-09-04 12:47:49 -07:00
James Bottomley
622f9a8e7b A short series of fixes to libfc, libfcoe and fcoe.
Most patches fix formatting problems, one changes
 the behavior of which discovered ports can/will be
 logged into and another fixes a memory leak.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJR3GU8AAoJEEajxTw9cn4Hfi0P/2xRiIY8XtePDjBh5AMoIEnA
 FvhdIoEzojmonEYry/6zemLv/6HJqOLdIxEfOc0UTN43NBcKLR2kqd5XjtA6Pu8z
 WHCoic/fhojMEs0MkwwAd8bYqOhUw3ENBwZgKxdmEu5zoHXyg28mURI/4/+lbGwj
 fWL/M7vFmRdyfh6tA7DaxhmhPdIPI8B545XMGK00guvnbb82riqBK6dosGki/tPg
 R3dsf2twyZFiSje/EH9x71KIuVNHwox0IenvXKTlIUaltHyR2TRPlH932z/ElFwh
 fd1yeXFG7B7FfNS7ssLvVBoa2Gyn2HZB1mQj7T2SqgATBZgUNpOoI4UPi9y3EUII
 DU7APOSrK1jKGxpnbvTUw0XsS0kyWb/o0OZfT/RQc6YzYIAA8EFh4GwdigvWK/1S
 kW/QC10+AN71vS5XaYerDs7Mq3UqkCUlADgiBJWryuujHghr8KK7sDAccI0IBxRU
 bHlkIR0uqX+QxVyrVlyLAPLBeuldjEVfimlcIlol+7DE6kcaOzwsaxIQR8NXgOGm
 gdgow4N+yzrivQh8CxG5gLJTBONAHAoLtTW1fIYGxLWF4pbxpdqQeb189NEbNSZD
 JclKNAu1QKGuz3VRFBeahMbdi6m2Lw7nY/pj0LVOlScb0yZzGXGxOG3f8x8R8fsv
 3pBgPW5iH5Limq/3RGkb
 =mrTt
 -----END PGP SIGNATURE-----

Merge tag 'fcoe' into for-linus

A short series of fixes to libfc, libfcoe and fcoe.
Most patches fix formatting problems, one changes
the behavior of which discovered ports can/will be
logged into and another fixes a memory leak.
2013-07-13 08:22:56 +04:00
Robert Love
3a2926054a libfc: Differentiate echange timer cancellation debug statements
There are two debug statements with the same output string regarding
echange timer cancellation. This patch simply changes the output of
one string so that they can be differentiated.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Jack Morgan <jack.morgan@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2013-07-09 11:18:41 -07:00
Robert Love
4a80f083dd libfc: Remove extra space in fc_exch_timer_cancel definition
Simply remove an extra space that violates coding style.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Jack Morgan <jack.morgan@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2013-07-09 11:18:33 -07:00
Mark Rustad
e0335f67a2 libfc: Reject PLOGI from nodes with incompatible role
Reject a PLOGI from a node with an incompatible role,
that is, initiator-to-initiator or target-to-target.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Reviewed-by: Yi Zou <yi.zou@intel.com>
Tested-by: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
2013-07-09 09:29:14 -07:00
Linus Torvalds
80cc38b163 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina:
 "The usual stuff from trivial tree"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits)
  treewide: relase -> release
  Documentation/cgroups/memory.txt: fix stat file documentation
  sysctl/net.txt: delete reference to obsolete 2.4.x kernel
  spinlock_api_smp.h: fix preprocessor comments
  treewide: Fix typo in printk
  doc: device tree: clarify stuff in usage-model.txt.
  open firmware: "/aliasas" -> "/aliases"
  md: bcache: Fixed a typo with the word 'arithmetic'
  irq/generic-chip: fix a few kernel-doc entries
  frv: Convert use of typedef ctl_table to struct ctl_table
  sgi: xpc: Convert use of typedef ctl_table to struct ctl_table
  doc: clk: Fix incorrect wording
  Documentation/arm/IXP4xx fix a typo
  Documentation/networking/ieee802154 fix a typo
  Documentation/DocBook/media/v4l fix a typo
  Documentation/video4linux/si476x.txt fix a typo
  Documentation/virtual/kvm/api.txt fix a typo
  Documentation/early-userspace/README fix a typo
  Documentation/video4linux/soc-camera.txt fix a typo
  lguest: fix CONFIG_PAE -> CONFIG_x86_PAE in comment
  ...
2013-07-04 11:40:58 -07:00
Geert Uytterhoeven
83a35e3604 treewide: relase -> release
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-06-28 14:34:33 +02:00
Neil Horman
fb00cc2353 libfc: extend ex_lock to protect all of fc_seq_send
This warning was reported recently:

WARNING: at drivers/scsi/libfc/fc_exch.c:478 fc_seq_send+0x14f/0x160 [libfc]()
(Not tainted)
Hardware name: ProLiant DL120 G7
Modules linked in: tcm_fc target_core_iblock target_core_file target_core_pscsi
target_core_mod configfs dm_round_robin dm_multipath 8021q garp stp llc bnx2fc
cnic uio fcoe libfcoe libfc scsi_transport_fc scsi_tgt autofs4 sunrpc
pcc_cpufreq ipv6 hpilo hpwdt e1000e microcode iTCO_wdt iTCO_vendor_support
serio_raw shpchp ixgbe dca mdio sg ext4 mbcache jbd2 sd_mod crc_t10dif pata_acpi
ata_generic ata_piix hpsa dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
scsi_wait_scan]
Pid: 5464, comm: target_completi Not tainted 2.6.32-272.el6.x86_64 #1
Call Trace:
 [<ffffffff8106b747>] ? warn_slowpath_common+0x87/0xc0
 [<ffffffff8106b79a>] ? warn_slowpath_null+0x1a/0x20
 [<ffffffffa025f7df>] ? fc_seq_send+0x14f/0x160 [libfc]
 [<ffffffffa035cbce>] ? ft_queue_status+0x16e/0x210 [tcm_fc]
 [<ffffffffa030a660>] ? target_complete_ok_work+0x0/0x4b0 [target_core_mod]
 [<ffffffffa030a766>] ? target_complete_ok_work+0x106/0x4b0 [target_core_mod]
 [<ffffffffa030a660>] ? target_complete_ok_work+0x0/0x4b0 [target_core_mod]
 [<ffffffff8108c760>] ? worker_thread+0x170/0x2a0
 [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8108c5f0>] ? worker_thread+0x0/0x2a0
 [<ffffffff81091d66>] ? kthread+0x96/0xa0
 [<ffffffff8100c14a>] ? child_rip+0xa/0x20
 [<ffffffff81091cd0>] ? kthread+0x0/0xa0
 [<ffffffff8100c140>] ? child_rip+0x0/0x20

It occurs because fc_seq_send can have multiple contexts executing within it at
the same time, and fc_seq_send doesn't consistently use the ep->ex_lock that
protects this structure.  Because of that, its possible for one context to clear
the INIT bit in the ep->esb_state field while another checks it, leading to the
above stack trace generated by the WARN_ON in the function.

We should probably undertake the effort to convert access to the fc_exch
structures to use rcu, but that a larger work item.  To just fix this specific
issue, we can just extend the ex_lock protection through the entire fc_seq_send
path

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Gris Ge <fge@redhat.com>
CC: Robert Love <robert.w.love@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
2013-05-10 10:19:23 -07:00
Mark Rustad
732bdb9d14 libfc: Correct check for initiator role
The service_params field is being checked against the symbol
FC_RPORT_ROLE_FCP_INITIATOR where it really should be checked
against FCP_SPPF_INIT_FCN.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
2013-05-10 10:19:22 -07:00
Robert Love
0807619d3c libfc, fcoe, bnx2fc: Split fc_disc_init into fc_disc_{init, config}
Split discovery initialization in code that is setup once (fcoe_disc_init)
and code that can be re-configured (fcoe_disc_config).

Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Jack Morgan <jack.morgan@intel.com>
Reviewed-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
2013-03-25 16:03:03 -07:00
Robert Love
8a9a713812 libfc, fcoe, bnx2fc: Always use fcoe_disc_init for discovery layer initialization
Currently libfcoe is doing some libfc discovery layer initialization outside of
libfc. This patch moves this code into libfc and sets up a split in discovery
(one time) initialization code and (re-configurable) settings that will come in
the next patch.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Jack Morgan <jack.morgan@intel.com>
Reviewed-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
2013-03-25 16:01:10 -07:00
Krishna Mohan
a586069b0f libfc: XenServer fails to mount root filesystem.
schedule_delayed_work() is using msec instead of jiffies. On PLOGI
reject from target, remote port retry is scheduled @ 20 sec instead
of 2sec(FC_DEF_E_D_TOV).
XenServer dom0 kernel is configured with CONFIG_HZ=100Hz

Signed-off-by: Krishna Mohan <krmohan@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
2013-02-15 09:33:27 -08:00
Robert Love
8e6c5363dc libfc, libfcoe, fcoe: Convert debug_logging macros to pr_info
Convert libfc, libfcoe and fcoe's debug_logging macros
to use pr_info() instead of printk(KERN_INFO, ...). checkpatch.pl
now complains about this, so convert libfcoe to preferred
method.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Marcus Dennis <marcusx.e.dennis@intel.com>
2012-12-14 10:38:55 -08:00
Vasu Dev
5b97fabdc8 libfc: fix REC handling
Currently fc_fcp_timeout doesn't check FC_RP_FLAGS_REC_SUPPORTED
flag first, this prevents REC request ever going out at all
to the target having REC support. So this patches fixes the
fc_fcp_timeout by checking FC_RP_FLAGS_REC_SUPPORTED flag first.

The changed order won't cause any issue during clearing
FC_RP_FLAGS_REC_SUPPORTED on failed IO with target not supporting
FC_RP_FLAGS_REC_SUPPORTED, since retry on failed IO would succeed.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
2012-12-04 10:16:38 -08:00
Yi Zou
3b64b18811 [SCSI] libfc: fix lun reset failure bugs in fc_fcp_resp handling of FCP_RSP_INFO
In LUN RESET testing involving NetApp targets, it is observed that LUN
RESET is failing. The fc_fcp_resp() is not completing the completion
for the LUN RESET task since fc_fcp_resp assumes that the FCP_RSP_INFO
is 8 bytes with the 4 byte reserved field, where in case of NetApp targets
the FCP_RSP to LUN RESET only has 4 bytes of FCP_RSP_INFO. This leads
fc_fcp_resp to error out w/o completing the task completion, eventually
causing LUN RESET to be escalated to host reset, which is not very nice.

Per FCP-3 r04, clause 9.5.15 and Table 23, the FCP_RSP_INFO field can be either
4 bytes or 8 bytes, with the last 4 bytes as "Reserved (if any)". Therefore it
is valid to have 4 bytes FCP_RSP_INFO like some of the NetApp targets behave.
Fixing this by validating the FCP_RSP_INFO against both the two spec allowed
length.

Reported-by: Frank Zhang <frank_1.zhang@intel.com>
Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-10-07 11:52:55 +01:00
Yi Zou
a752359f2b [SCSI] libfc: fix sending REC after FCP_RESP is received
This is exposed in the case the FCP_DATA frames somehow got lost and fc_fcp got
the FCP_RSP, in fc_fcp_recv_resp(), since xfer_len is less than the expected_len
it resets the the timer to wait to 2 more jiffies in case the data frames are
already queued locally. However, for target does not support REC, it would just
send RJT w/ ELS_RJT_UNSUP. The rec response handler thus only clears the rport
flag for not doing REC later, but does not do fcp_io_complete() on the
associated fsp.

The fix is just check status of FCP_RSP being received already, i.e. using the
FC_SRB_RCV_STATUS flag, in fc_fcp_timeout before start sending REC. We should
have waited long enough if there is truely data frames queued locally.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:56 +01:00
Vasu Dev
ac166d2fbd [SCSI] libfc: fix retries with FDMI lport states
The FC-GS-3 sepc requires to wait for least 3 times R_A_TOV per
sec 4.6.1 "If the Requesting_CT does not receive a Response
CT_IU from the Responding_CT within three times R_A_TOV,
it shall consider this to be a protocol error."

This means added four new states with management server
could add significant delay with multiple retries
on default 12 second timeout(3 * R_A_TOV), so instead
just skip these states on very first timeout on any of
these states to not stuck with states for such longer
period.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Marcus Dennis <marcusx.e.dennis@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:56 +01:00
Yi Zou
db95fc004e [SCSI] libfc: don't exch_done() on invalid sequence ptr
The lport_recv(), i.e., fc_lport_recv_req() may get called w/o the sequence ptr
being set in fr_seq(), particularly in the case of vn2vn mode, this may happen
if the passive fcp provider, e.g., tcm_fc, has not been registered yet.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:56 +01:00
Vasu Dev
b29a4f309f [SCSI] libfc: add exch timer debug info
Add exch timeout info to have debug log with exch timeout
value to match with retries, also add debug info
on exch timer cancel.

Added common fc_exch_timer_cancel() func and grouped this
along with fc_exch_timer_set() function, so that
added debug code is not repeated.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:55 +01:00
Vasu Dev
4e5fae7adb [SCSI] libfc: update fcp and exch stats
Updates newly added stats from fc_get_host_stats,
added new function fc_exch_update_stats to
update exches related stats from fc_exch.c
by going thru internal ema_list elements.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Acked-by : Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:31:48 +01:00
Vasu Dev
0f02a66528 [SCSI] libfc: adds FCP failures stats
Adds stats to track FCP pkt and frame alloc
failure.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Acked-by : Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:31:48 +01:00
Vasu Dev
1bd49b4820 [SCSI] libfc, fcoe, bnx2fc: cleanup fcoe_dev_stats
The libfc is used by fcoe but fcoe agnostic,
and therefore should not have any fcoe references.

So renaming fcoe_dev_stats from libfc as its for fc_stats.
After that libfc is fcoe string free except some strings for
Open-FCoE.org.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Acked-by : Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Acked-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:31:47 +01:00
James Bottomley
e346933365 isci update for 3.5
1/ Rework remote-node-context (RNC) handling for proper management of
    the silicon state machine in error handling and hot-plug conditions.
    Further details below, suffice to say if the RNC is mismanaged the
    silicon state machines may lock up.
 
 2/ Refactor the initialization code to be reused for suspend/resume support
 
 3/ Miscellaneous bug fixes to address discovery issues and hardware
    compatibility.
 
 RNC rework details from Jeff Skirvin:
 
 In the controller, devices as they appear on a SAS domain (or
 direct-attached SATA devices) are represented by memory structures known
 as "Remote Node Contexts" (RNCs).  These structures are transferred from
 main memory to the controller using a set of register commands; these
 commands include setting up the context ("posting"), removing the
 context ("invalidating"), and commands to control the scheduling of
 commands and connections to that remote device ("suspensions" and
 "resumptions").  There is a similar path to control RNC scheduling from
 the protocol engine, which interprets the results of command and data
 transmission and reception.
 
 In general, the controller chooses among non-suspended RNCs to find one
 that has work requiring scheduling the transmission of command and data
 frames to a target.  Likewise, when a target tries to return data back
 to the initiator, the state of the RNC is used by the controller to
 determine how to treat the incoming request. As an example, if the RNC
 is in the state "TX/RX Suspended", incoming SSP connection requests from
 the target will be rejected by the controller hardware.  When an RNC is
 "TX Suspended", it will not be selected by the controller hardware to
 start outgoing command or data operations (with certain priority-based
 exceptions).
 
 As mentioned above, there are two sources for management of the RNC
 states: commands from driver software, and the result of transmission
 and reception conditions of commands and data signaled by the controller
 hardware.  As an example of the latter, if an outgoing SSP command ends
 with a OPEN_REJECT(BAD_DESTINATION) status, the RNC state will
 transition to the "TX Suspended" state, and this is signaled by the
 controller hardware in the status to the completion of the pending
 command as well as signaled in a controller hardware event.  Examples of
 the former are included in the patch changelogs.
 
 Driver software is required to suspend the RNC in a "TX/RX Suspended"
 condition before any outstanding commands can be terminated.  Failure to
 guarantee this can lead to a complete hardware hang condition.  Earlier
 versions of the driver software did not guarantee that an RNC was
 correctly managed before I/O termination, and so operated in an unsafe
 way.
 
 Further, the driver performed unnecessary contortions to preserve the
 remote device command state and so was more complicated than it needed
 to be.  A simplifying driver assumption is that once an I/O has entered
 the error handler path without having completed in the target, the
 requirement on the driver is that all use of the sas_task must end.
 Beyond that, recovery of operation is dependent on libsas and other
 components to reset, rediscover and reconfigure the device before normal
 operation can restart.  In the driver, this simplifying assumption meant
 that the RNC management could be reduced to entry into the suspended
 state, terminating the targeted I/O request, and resuming the RNC as
 needed for device-specific management such as an SSP Abort Task or LUN
 Reset Management request.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJPtYFXAAoJEB7SkWpmfYgCJkcP/1VvsuuitNy/YM9P1tb/RvQ7
 ytJzjGtWiAABHVWwjgB+Ng7hUTaP2r6l8KeNfwxpwXyNdBAUNysYEUBHAfPsKzKz
 espTmw3wVCnREajgKXZwFp9aTj8DcYFB6vKcC/ddACt3uRNjjA9En36+6797r8Vg
 YdebyjFX2FxwoUj0icTUiV/OXgb8w723imnCl8bfOhhFRi4eFZ4EJ23AdMkUya1i
 uYePAPJPSQJuU/87gNIx4JcR0qHJ1ziGPEY+XC47CzEeXbBTSPgWOwanQ6KPoXRJ
 XVxamfcKAjRdtwQ4m1vYBSE32RTdrjhujVbkGiPi6QaEbCLjhLSCIYyuS3XMckV+
 TCZ16o5kd/I6ZtZOeP4zZRGnNBkOPzY44qiJeKffjWDhTrFacx4XWJB/ftWPgEA5
 N2zFH3RM4sY0FUJ3I/Qe5CERNdCXMtcj+UAf3nHpAIVcv46Lp+qoSkdEx05uuuiN
 +D/dSlubktuvuzmB5WisL3qrjNEkkLTAGQpZs1j0ojLEBm0XAgV5EzqmHiZ0GOPD
 OQNFxeei9SlqgtKIIP0bymRispPrG2HVCOvExYMxzKR6fjxofZLAs/aWOsdhxgMq
 TlAyZJ6OmGI+KX68HHzoMpT9iquvmP64WGkfHzCx296BfSKiruLh/Jzt5gGwv+Z1
 5tlpnUr9dUTxx7qkQXvj
 =HYvO
 -----END PGP SIGNATURE-----

Merge tag 'isci-for-3.5' into misc

isci update for 3.5

1/ Rework remote-node-context (RNC) handling for proper management of
   the silicon state machine in error handling and hot-plug conditions.
   Further details below, suffice to say if the RNC is mismanaged the
   silicon state machines may lock up.

2/ Refactor the initialization code to be reused for suspend/resume support

3/ Miscellaneous bug fixes to address discovery issues and hardware
   compatibility.

RNC rework details from Jeff Skirvin:

In the controller, devices as they appear on a SAS domain (or
direct-attached SATA devices) are represented by memory structures known
as "Remote Node Contexts" (RNCs).  These structures are transferred from
main memory to the controller using a set of register commands; these
commands include setting up the context ("posting"), removing the
context ("invalidating"), and commands to control the scheduling of
commands and connections to that remote device ("suspensions" and
"resumptions").  There is a similar path to control RNC scheduling from
the protocol engine, which interprets the results of command and data
transmission and reception.

In general, the controller chooses among non-suspended RNCs to find one
that has work requiring scheduling the transmission of command and data
frames to a target.  Likewise, when a target tries to return data back
to the initiator, the state of the RNC is used by the controller to
determine how to treat the incoming request. As an example, if the RNC
is in the state "TX/RX Suspended", incoming SSP connection requests from
the target will be rejected by the controller hardware.  When an RNC is
"TX Suspended", it will not be selected by the controller hardware to
start outgoing command or data operations (with certain priority-based
exceptions).

As mentioned above, there are two sources for management of the RNC
states: commands from driver software, and the result of transmission
and reception conditions of commands and data signaled by the controller
hardware.  As an example of the latter, if an outgoing SSP command ends
with a OPEN_REJECT(BAD_DESTINATION) status, the RNC state will
transition to the "TX Suspended" state, and this is signaled by the
controller hardware in the status to the completion of the pending
command as well as signaled in a controller hardware event.  Examples of
the former are included in the patch changelogs.

Driver software is required to suspend the RNC in a "TX/RX Suspended"
condition before any outstanding commands can be terminated.  Failure to
guarantee this can lead to a complete hardware hang condition.  Earlier
versions of the driver software did not guarantee that an RNC was
correctly managed before I/O termination, and so operated in an unsafe
way.

Further, the driver performed unnecessary contortions to preserve the
remote device command state and so was more complicated than it needed
to be.  A simplifying driver assumption is that once an I/O has entered
the error handler path without having completed in the target, the
requirement on the driver is that all use of the sas_task must end.
Beyond that, recovery of operation is dependent on libsas and other
components to reset, rediscover and reconfigure the device before normal
operation can restart.  In the driver, this simplifying assumption meant
that the RNC management could be reduced to entry into the suspended
state, terminating the targeted I/O request, and resuming the RNC as
needed for device-specific management such as an SSP Abort Task or LUN
Reset Management request.
2012-05-21 12:17:30 +01:00