We saw periodic messages like:
WARNING: at drivers/scsi/libfc/fc_exch.c:825 fc_seq_start_next+0x30/0x4b
This was due to trying to allocate a sequence in a request handler
when the exchange had been reset.
Delete the WARN_ON.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
During an fcoe module unload, we saw a problem where fc_rport_work()
finds the lport has been freed. The rdata points to an area
containing 0x6b6b6b6b... the pool poison value from kmem_free().
In fcoe_if_destroy() we call fc_fabric_logoff() then fc_lport_destroy().
fc_fabric_logoff() flushes the remote port work, but we're still receiving
requests, and an RSCN or PLOGI arrives which creates more rports.
Note that although the LLD also checks link_up, it doesn't do it
under the lport mutex, so it can deliver frames to
fc_lport_recv_req() even after link_up is cleared.
So, re-check link_up there.
We need to flush the rports by calling disc_stop_final()
after we clear link_up.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
When removing the fcoe module, several lports were being shut down
through fc_lport_fabric_logoff().
Occasionally, one would enter reset state before fc_lport_destroy()
was called, and since link_up was still true, it would log back in.
If we just clear link_up earlier, then we wouldn't be accepting LOGO
requests from other initiators while we are shutting down.
Fix by changing the LOGO response handler to enter DISABLED instead
of RESET. Add an fc_lport_enter_disabled() function which does
what fc_lport_enter_reset() did, except it doesn't proceed to FLOGI state.
Move the code that was common between fc_lport_enter_reset() and
fc_lport_enter_disabled() into a new fc_lport_reset_locked() function.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The state NONE was meant to be invalid, but has been used as
the initial state. Rename it to be DISABLED, as more descriptive.
Further patches will make it the like the RESET state, except
it won't transition to FLOGI until fc_lport_fabric_login() is called.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Currently the fc_exch_rrq is called with fc_exch's ex_lock held.
The fc_exch_rrq allocates new exch and that requires taking
ex_lock again after EM lock. This locking order causes warning,
see more details on this warning at :-
http://www.open-fcoe.org/pipermail/devel/2009-July/003251.html
This patch fixes this by dropping the ex_lock before calling
fc_exch_rrq().
The fc_exch_rrq needs to grab ex_lock lock again to schedule
RRQ retry and in the meanwhile fc_exch_reset could occur before
ex_lock is grabbed inside fc_exch_rrq. So to handle this case,
this patch adds additional check to detect fc_exch_reset after
ex_lock acquired and in case the fc_exch_reset occurred then
abandons the RRQ retry and releases the exch.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch adds the /sys/module/libfc/parameters/debug_logging
file to sysfs as a module parameter. It accepts an integer
bitmask for logging. Currently it supports:
bit
LSB 0 = general libfc debugging
1 = lport debugging
2 = disc debugging
3 = rport debugging
4 = fcp debugging
5 = EM debugging
6 = exch/seq debugging
7 = scsi logging (mostly error handling)
the other bits are not used at this time.
The patch converts all of the libfc source files to use
these new macros and removes the old FC_DBG macro.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
When a sequence is received in response to an exchange we issued previously,
we should check to see if the exchange has completed. If yes, the sequence
should be discarded. Since the exchange might be still in the completion
process, it should be untouched.
Signed-off-by: Steve Ma <steve.ma@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If we aborted a command, because it timed out we should not use
DID_ABORT. It will fail the command right away back to the upper
layer. We want to use something that indicated that the problem
did not complete normally, but it was not a fatal problem.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This allows fnic to configure number of retries for lport and rport
separately.
Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com>
Acked-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Fix function declarations:
drivers/scsi/fcoe/fcoe.c:1356:28: warning: non-ANSI function declaration of function 'fcoe_dev_setup'
drivers/scsi/libfc/fc_rport.c:1293:20: warning: non-ANSI function declaration of function 'fc_setup_rport'
drivers/scsi/libfc/fc_rport.c:1302:23: warning: non-ANSI function declaration of function 'fc_destroy_rport'
[jejb: fixed wrong doc in comment noticed during inspection]
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
When building with a .config generated from 'make allmodconfig'
some build warnings are generated. This patch corrects the warnings,
adds a FC_FID_NONE (= 0) enumeration for FC-IDs and cleans up one
variable naming to meet our variable naming conventions. For example,
fc_lport's should be named "lport," not "lp."
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
When a delete event is queued for an rport, set state to NONE so that no
other processing is done on the rport as it is being removed.
Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
After lport_destroy, the local port should not be used again. Transition
to state NONE, any incoming frames or link up should not transition out
of this state since we are deleting exchange table and cleaning up the
local port. Also, mark link as down.
Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
We want to generate the rport queue event (from the logoff)
before flushing the queue otherwise the event may still be
in the queue when we logoff.
Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Rogue ports are currently not tracked on any list. The only reference
to them is through any outstanding exchanges pending on the rogue ports.
If the module is removed while a retry is set on a rogue port
(say a Plogi retry for instance), this retry is not cancelled because there
is no reference to the rogue port in the discovery rports list. Thus the
local port can clean itself up, delete the exchange pool, and then the
rogue port timeout can fire and try to start up another exchange.
This patch tracks the rogue ports in a new list disc->rogue_rports. Creating
a new list instead of using the disc->rports list keeps remote port code
change to a minimum.
1) Whenever a rogue port is created, it is immediately added to the
disc->rogue_rports list.
2) When the rogues port goes to ready, it is removed from the rogue list
and the real remote port is added to the disc->rports list
3) The removal of the rogue from the disc->rogue_rports list is done in
the context of the fc_rport_work() workQ thread in discovery callback.
4) Real rports are removed from the disc->rports list like before. Lookup
is done only in the real rports list. This avoids making large changes
to the remote port code.
5) In fc_disc_stop_rports, the rogues list is traversed in addition to the
real list to stop the rogue ports and issue logoffs on them. This way, rogue
ports get cleaned up when the local port goes away.
6) rogue remote ports are not removed from the list right away, but
removed late in fc_rport_work() context, multiple threads can find the same
remote port in the list and call rport_logoff(). Rport_logoff() only
continues with the logoff if port is not in NONE state, thus preventing
multiple logoffs and multiple list deletions.
7) Since the rport is removed from the disc list at a later stage
(in the disc callback), incoming frames can find the rport even if
rport_logoff() has been called on the rport. When rport_logoff() is called,
the rport state is set to NONE, and we are trying to cancel all exchanges
and retries on that port. While in this state, if an incoming
Plogi/Prli/Logo or other frames match the rport, we should not reply
because the rport is in the NONE state. Just drop the frame, since the
rport will be deleted soon in the disc callback (fc_rport_work)
8) In fc_disc_single(), remove rport lookup and call to fc_disc_del_target.
fc_disc_single() is called from recv_rscn_req() where rport lookup
and rport_logoff is already done.
Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
For instance, if there is a Plogi pending (remote port is in Plogi state),
and the state changes to say NONE (because the port is being logged off),
then when the Plogi resp times out, do not start a retry.
This patch partially reverts an earlier patch (libfc: check for err when
recv and state is incorrect), by moving the state check back to before
checking for error. However, if the state does not match, then there is
an additional check to see if its an error ptr or a real frame before
jumping to err or out respectively.
Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
gpn_ft_resp processing currently does not hold the discovery lock.
disc_done() thus gets called from gpn_ft_resp or from gpn_ft_parse
without the lock held. This then sets disc->pending to zero or calls
gpn_ft_req() without disc_lock held.
- Hold disc mutex during gpn_ft resp processing
- In disc_done, release the disc mutex while calling lport callback
Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Just sets up build environment for libfcoe module towards a
libfcoe library for libfc LLDs using FCoE as libfc transport.
Common library code to libfcoe is added in next patch.
Also, updated MODULE_LICENSE from "GPL" string to "GPL v2" for
libfc, libfcoe and fcoe modules to accurately match the licenses.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Remove the hotplug creation of dev_stats, we allocate for all possible CPUs
now when we allocate the lport.
v2: Durring the 2.6.30 merge window, before these patches were comitted,
'percpu_ptr' was renamed 'per_cpu_ptr'. This latest update updates this
patch for the name change.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
When LLD supports direct data placement (ddp) for large receive of an scsi
i/o coming into fc_fcp, we call into libfc_function_template's ddp_setup()
to prepare for a ddp of large receive for this read I/O. When I/O is complete,
we call the corresponding ddp_done() to get the length of data ddped as well
as to let LLD do clean up.
fc_fcp_ddp_setup()/fc_fcp_ddp_done() are added to setup and complete a ddped
read I/O described by the given fc_fcp_pkt. They would call into corresponding
ddp_setup/ddp_done implemented by the fcoe layer. Eventually, fcoe layer calls
into LLD's ddp_setup/ddp_done provided through net_device
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Make sure for large send is supported by LLD in outgoing FCP data, we are only
sending the lso_max a time in one single large send, since that is what
supported by LLD.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
!ep->esb_stat is either 1 or 0, and the rightmost bit of ESB_ST_COMPLETE is
always 0, making the result of !ep->esb_stat & ESB_ST_COMPLETE always 0.
Thus parentheses around the argument to ! seem needed.
The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@@ expression E; constant C; @@
(
!E & !C
|
- !E & C
+ !(E & C)
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The SUGGEST_* flags in the SCSI command result have been out of fashion
for a while and we don't actually use them in the error handling.
Remove the remaining occurrences.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
I got the following warnings on IA64:
drivers/scsi/libfc/fc_lport.c: In function 'fc_lport_recv_flogi_req':
drivers/scsi/libfc/fc_lport.c:788: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'u64'
drivers/scsi/libfc/fc_lport.c:792: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'u64'
scsi/libfc/fc_rport.c: In function 'fc_rport_recv_plogi_req':
/home/fujita/git/linux-2.6/drivers/scsi/libfc/fc_rport.c:968: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
We shouldn't be altering inbound frames.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
1) There were a few functions with a strange layout, i.e. all
arguments on the second line, when not necessary.
Where ever possible I moved the return value to the same line
as the function name. However, when the line was too long
to have a single argument on the same line I moved the
return value to above line. For example:
<short return> <function name>(<arg 1>, <arg2>)
and
<very long return value>
<function name>(<arg1>,
<arg2>)
2) Removed one extra whitespace line
3) Fixed two typos
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
1) Added '()' for function names in kerneldoc comments
2) Changed comment bookends from '**/' to '*/'. The comment on the the
mailing list was that '**/' "is consistently unconventional. Not
wrong, just odd." The Documentation/kernel-doc-nano-HOWTO.txt
states that kerneldoc comment blocks should end with '**/' but most
(if not all) instance I found under drivers/scsi/ were only using
the '*/' so I converted to that style.
3) Removed incorrect linebreaks in kerneldoc comments where found
4) Removed a few unnecessary blank comment lines in kerneldoc comment
blocks
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If we've just created an interface and the an rport is
logging in we may have a request on the wire (say PRLI).
If we destroy the interface, we'll go through each rport
on the disc->rports list and set each rport's state to NONE.
Then the lport will reset the EM. The EM reset will send a
CLOSED event to the prli_resp() handler which will notice
that the state != PRLI. In this case it frees the frame
pointer, decrements the refcount and unlocks the rport.
The problem is that there isn't a frame in this case. It's
just a pointer with an embedded error code. The free causes
an Oops.
This patch moves the error checking to be before the state
checking.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Just rename the variable as per our naming convention.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
We only need to use this macro when assigning a value to
rport->dd_data. All other accesses should just use dd_data.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
When a sequence cannot be delivered to the target, the local
port will schedule retries, While this process is in progress,
if we destroy the FCoE interface, the fcoe_sw_destroy routine is
entered, and the fc_exch_mgr_free(lp->emp) is called. Thus
if fc_exch_alloc() is called when retrying the sequence,
the mempool_alloc() will fail to allocate the exchange because
the mempool of the exchange manager has already been released.
This patch is to cancel any pending retry work of the local
port before we start to destroy the interface.
Also, when resetting the local port, we should also stop the
scheduled pending retries.
Signed-off-by: Steve Ma <steve.ma@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The fc_fcp_complete_locked detected data underrun in this case and set
the FC_DATA_UNDRUN but that was ignored by fc_io_compl for all cases
including read underrun.
Added code to not to ignore FC_DATA_UNDRUN for read IO and instead
suggested scsi-ml to retry cmd to recover from lost data frame.
Not sure if it is okay to ignore FC_DATA_UNDRUN for other case, so let
code as is for other cases but removed or-ing with zero valued fsp->cdb_status
for those cases.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This allows any rport ELS to retry on LS_RJT.
The rport error handling would only retry on resource allocation failures
and exchange timeouts. I have a target that will occasionally reject PLOGI
when we do a quick LOGO/PLOGI. When a critical ELS was rejected, libfc would
fail silently leaving the rport in a dead state.
The retry count and delay are managed by fc_rport_error_retry. If the retry
count is exceeded fc_rport_error will be called. When retrying is not the
correct course of action, fc_rport_error can be called directly.
Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The fcoe_xmit could call fc_pause in case the pending skb queue len is larger
than FCOE_MAX_QUEUE_DEPTH, the fc_pause was trying to grab lport->lp_muex to
change lport->link_status and that had these issues :-
1. The fcoe_xmit was getting called with bh disabled, thus causing
"BUG: scheduling while atomic" when grabbing lport->lp_muex with bh disabled.
2. fc_linkup and fc_linkdown function calls lport_enter function with
lport->lp_mutex held and these enter function in turn calls fcoe_xmit to send
lport related FC frame, e.g. fc_linkup => fc_lport_enter_flogi to send flogi
req. In this case grabbing the same lport->lp_mutex again in fc_puase from
fcoe_xmit would cause deadlock.
The lport->lp_mutex was used for setting FC_PAUSE in fcoe_xmit path but
FC_PAUSE bit was not used anywhere beside just setting and clear this
bit in lport->link_status, instead used a separate field qfull in fc_lport
to eliminate need for lport->lp_mutex to track pending queue full condition
and in turn avoid above described two locking issues.
Also added check for lp->qfull in fc_fcp_lport_queue_ready to trigger
SCSI_MLQUEUE_HOST_BUSY when lp->qfull is set to prevent more scsi-ml cmds
while lp->qfull is set.
This patch eliminated FC_LINK_UP and FC_PAUSE and instead used dedicated
fields in fc_lport for this, this simplified all related conditional
code.
Also removed fc_pause and fc_unpause functions and instead used newly added
lport->qfull directly in fcoe.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The fc_seq_start_next grabs ep->ex_lock but this lock was already held here,
so instead called fc_seq_start_next_locked to avoid soft lockup.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Cleanup exchange held due to RRQ when RRQ exch times out, in this case the
ABTS is already done causing RRQ req therefore proceeding with cleanup in
fc_exch_rrq_resp should be okay to restore exch resource.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
When a rport goes away, libFC does a plogi which will reset exchanges
at the rport. Clean exchanges at our end, both in transport and libFC.
If transport hooks into exch_mgr_reset, it will call back into
fc_exch_mgr_reset() to clean up libFC exchanges.
Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
fc_exch_mgr structure is private to fc_exch.c. To export exch_mgr_reset to
transport, transport needs access to the exch manager. Change
exch_mgr_reset to use lport param which is the shared structure between
libFC and transport.
Alternatively, fc_exch_mgr definition can be moved to libfc.h so that lport
can be accessed from mp*.
Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
libFC is composed of 4 blocks supported by an exchange manager
and a framing library. The upper 4 layers are fc_lport, fc_disc,
fc_rport and fc_fcp. A LLD that uses libfc could choose to
either use libfc's block, or using the transport template
defined in libfc.h, override one or more blocks with its own
implementation.
The EM (Exchange Manager) manages exhcanges/sequences for all
commands- ELS, CT and FCP.
The framing library frames ELS and CT commands.
The fc_lport block manages the library's representation of the
host's FC enabled ports.
The fc_disc block manages discovery of targets as well as
handling changes that occur in the FC fabric (via. RSCN events).
The fc_rport block manages the library's representation of other
entities in the FC fabric. Currently the library uses this block
for targets, its peer when in point-to-point mode and the
directory server, but can be extended for other entities if
needed.
The fc_fcp block interacts with the scsi-ml and handles all
I/O.
Signed-off-by: Robert Love <robert.w.love@intel.com>
[jejb: added include of delay.h to fix ppc64 compile prob spotted by sfr]
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>