mirror of
https://github.com/torvalds/linux.git
synced 2024-11-08 21:21:47 +00:00
735f7d2fed
Arrange for the deallocation of a struct domain_device object when it no longer has: 1/ any children 2/ references by any scsi_targets 3/ references by a lldd The comment about domain_device lifetime in Documentation/scsi/libsas.txt is stale as it appears mainline never had a version of a struct domain_device that was registered as a kobject. We now manage domain_device reference counts on behalf of external agents. Reviewed-by: Jack Wang <jack_wang@usish.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
470 lines
16 KiB
Plaintext
470 lines
16 KiB
Plaintext
SAS Layer
|
|
---------
|
|
|
|
The SAS Layer is a management infrastructure which manages
|
|
SAS LLDDs. It sits between SCSI Core and SAS LLDDs. The
|
|
layout is as follows: while SCSI Core is concerned with
|
|
SAM/SPC issues, and a SAS LLDD+sequencer is concerned with
|
|
phy/OOB/link management, the SAS layer is concerned with:
|
|
|
|
* SAS Phy/Port/HA event management (LLDD generates,
|
|
SAS Layer processes),
|
|
* SAS Port management (creation/destruction),
|
|
* SAS Domain discovery and revalidation,
|
|
* SAS Domain device management,
|
|
* SCSI Host registration/unregistration,
|
|
* Device registration with SCSI Core (SAS) or libata
|
|
(SATA), and
|
|
* Expander management and exporting expander control
|
|
to user space.
|
|
|
|
A SAS LLDD is a PCI device driver. It is concerned with
|
|
phy/OOB management, and vendor specific tasks and generates
|
|
events to the SAS layer.
|
|
|
|
The SAS Layer does most SAS tasks as outlined in the SAS 1.1
|
|
spec.
|
|
|
|
The sas_ha_struct describes the SAS LLDD to the SAS layer.
|
|
Most of it is used by the SAS Layer but a few fields need to
|
|
be initialized by the LLDDs.
|
|
|
|
After initializing your hardware, from the probe() function
|
|
you call sas_register_ha(). It will register your LLDD with
|
|
the SCSI subsystem, creating a SCSI host and it will
|
|
register your SAS driver with the sysfs SAS tree it creates.
|
|
It will then return. Then you enable your phys to actually
|
|
start OOB (at which point your driver will start calling the
|
|
notify_* event callbacks).
|
|
|
|
Structure descriptions:
|
|
|
|
struct sas_phy --------------------
|
|
Normally this is statically embedded to your driver's
|
|
phy structure:
|
|
struct my_phy {
|
|
blah;
|
|
struct sas_phy sas_phy;
|
|
bleh;
|
|
};
|
|
And then all the phys are an array of my_phy in your HA
|
|
struct (shown below).
|
|
|
|
Then as you go along and initialize your phys you also
|
|
initialize the sas_phy struct, along with your own
|
|
phy structure.
|
|
|
|
In general, the phys are managed by the LLDD and the ports
|
|
are managed by the SAS layer. So the phys are initialized
|
|
and updated by the LLDD and the ports are initialized and
|
|
updated by the SAS layer.
|
|
|
|
There is a scheme where the LLDD can RW certain fields,
|
|
and the SAS layer can only read such ones, and vice versa.
|
|
The idea is to avoid unnecessary locking.
|
|
|
|
enabled -- must be set (0/1)
|
|
id -- must be set [0,MAX_PHYS)
|
|
class, proto, type, role, oob_mode, linkrate -- must be set
|
|
oob_mode -- you set this when OOB has finished and then notify
|
|
the SAS Layer.
|
|
|
|
sas_addr -- this normally points to an array holding the sas
|
|
address of the phy, possibly somewhere in your my_phy
|
|
struct.
|
|
|
|
attached_sas_addr -- set this when you (LLDD) receive an
|
|
IDENTIFY frame or a FIS frame, _before_ notifying the SAS
|
|
layer. The idea is that sometimes the LLDD may want to fake
|
|
or provide a different SAS address on that phy/port and this
|
|
allows it to do this. At best you should copy the sas
|
|
address from the IDENTIFY frame or maybe generate a SAS
|
|
address for SATA directly attached devices. The Discover
|
|
process may later change this.
|
|
|
|
frame_rcvd -- this is where you copy the IDENTIFY/FIS frame
|
|
when you get it; you lock, copy, set frame_rcvd_size and
|
|
unlock the lock, and then call the event. It is a pointer
|
|
since there's no way to know your hw frame size _exactly_,
|
|
so you define the actual array in your phy struct and let
|
|
this pointer point to it. You copy the frame from your
|
|
DMAable memory to that area holding the lock.
|
|
|
|
sas_prim -- this is where primitives go when they're
|
|
received. See sas.h. Grab the lock, set the primitive,
|
|
release the lock, notify.
|
|
|
|
port -- this points to the sas_port if the phy belongs
|
|
to a port -- the LLDD only reads this. It points to the
|
|
sas_port this phy is part of. Set by the SAS Layer.
|
|
|
|
ha -- may be set; the SAS layer sets it anyway.
|
|
|
|
lldd_phy -- you should set this to point to your phy so you
|
|
can find your way around faster when the SAS layer calls one
|
|
of your callbacks and passes you a phy. If the sas_phy is
|
|
embedded you can also use container_of -- whatever you
|
|
prefer.
|
|
|
|
|
|
struct sas_port --------------------
|
|
The LLDD doesn't set any fields of this struct -- it only
|
|
reads them. They should be self explanatory.
|
|
|
|
phy_mask is 32 bit, this should be enough for now, as I
|
|
haven't heard of a HA having more than 8 phys.
|
|
|
|
lldd_port -- I haven't found use for that -- maybe other
|
|
LLDD who wish to have internal port representation can make
|
|
use of this.
|
|
|
|
|
|
struct sas_ha_struct --------------------
|
|
It normally is statically declared in your own LLDD
|
|
structure describing your adapter:
|
|
struct my_sas_ha {
|
|
blah;
|
|
struct sas_ha_struct sas_ha;
|
|
struct my_phy phys[MAX_PHYS];
|
|
struct sas_port sas_ports[MAX_PHYS]; /* (1) */
|
|
bleh;
|
|
};
|
|
|
|
(1) If your LLDD doesn't have its own port representation.
|
|
|
|
What needs to be initialized (sample function given below).
|
|
|
|
pcidev
|
|
sas_addr -- since the SAS layer doesn't want to mess with
|
|
memory allocation, etc, this points to statically
|
|
allocated array somewhere (say in your host adapter
|
|
structure) and holds the SAS address of the host
|
|
adapter as given by you or the manufacturer, etc.
|
|
sas_port
|
|
sas_phy -- an array of pointers to structures. (see
|
|
note above on sas_addr).
|
|
These must be set. See more notes below.
|
|
num_phys -- the number of phys present in the sas_phy array,
|
|
and the number of ports present in the sas_port
|
|
array. There can be a maximum num_phys ports (one per
|
|
port) so we drop the num_ports, and only use
|
|
num_phys.
|
|
|
|
The event interface:
|
|
|
|
/* LLDD calls these to notify the class of an event. */
|
|
void (*notify_ha_event)(struct sas_ha_struct *, enum ha_event);
|
|
void (*notify_port_event)(struct sas_phy *, enum port_event);
|
|
void (*notify_phy_event)(struct sas_phy *, enum phy_event);
|
|
|
|
When sas_register_ha() returns, those are set and can be
|
|
called by the LLDD to notify the SAS layer of such events
|
|
the SAS layer.
|
|
|
|
The port notification:
|
|
|
|
/* The class calls these to notify the LLDD of an event. */
|
|
void (*lldd_port_formed)(struct sas_phy *);
|
|
void (*lldd_port_deformed)(struct sas_phy *);
|
|
|
|
If the LLDD wants notification when a port has been formed
|
|
or deformed it sets those to a function satisfying the type.
|
|
|
|
A SAS LLDD should also implement at least one of the Task
|
|
Management Functions (TMFs) described in SAM:
|
|
|
|
/* Task Management Functions. Must be called from process context. */
|
|
int (*lldd_abort_task)(struct sas_task *);
|
|
int (*lldd_abort_task_set)(struct domain_device *, u8 *lun);
|
|
int (*lldd_clear_aca)(struct domain_device *, u8 *lun);
|
|
int (*lldd_clear_task_set)(struct domain_device *, u8 *lun);
|
|
int (*lldd_I_T_nexus_reset)(struct domain_device *);
|
|
int (*lldd_lu_reset)(struct domain_device *, u8 *lun);
|
|
int (*lldd_query_task)(struct sas_task *);
|
|
|
|
For more information please read SAM from T10.org.
|
|
|
|
Port and Adapter management:
|
|
|
|
/* Port and Adapter management */
|
|
int (*lldd_clear_nexus_port)(struct sas_port *);
|
|
int (*lldd_clear_nexus_ha)(struct sas_ha_struct *);
|
|
|
|
A SAS LLDD should implement at least one of those.
|
|
|
|
Phy management:
|
|
|
|
/* Phy management */
|
|
int (*lldd_control_phy)(struct sas_phy *, enum phy_func);
|
|
|
|
lldd_ha -- set this to point to your HA struct. You can also
|
|
use container_of if you embedded it as shown above.
|
|
|
|
A sample initialization and registration function
|
|
can look like this (called last thing from probe())
|
|
*but* before you enable the phys to do OOB:
|
|
|
|
static int register_sas_ha(struct my_sas_ha *my_ha)
|
|
{
|
|
int i;
|
|
static struct sas_phy *sas_phys[MAX_PHYS];
|
|
static struct sas_port *sas_ports[MAX_PHYS];
|
|
|
|
my_ha->sas_ha.sas_addr = &my_ha->sas_addr[0];
|
|
|
|
for (i = 0; i < MAX_PHYS; i++) {
|
|
sas_phys[i] = &my_ha->phys[i].sas_phy;
|
|
sas_ports[i] = &my_ha->sas_ports[i];
|
|
}
|
|
|
|
my_ha->sas_ha.sas_phy = sas_phys;
|
|
my_ha->sas_ha.sas_port = sas_ports;
|
|
my_ha->sas_ha.num_phys = MAX_PHYS;
|
|
|
|
my_ha->sas_ha.lldd_port_formed = my_port_formed;
|
|
|
|
my_ha->sas_ha.lldd_dev_found = my_dev_found;
|
|
my_ha->sas_ha.lldd_dev_gone = my_dev_gone;
|
|
|
|
my_ha->sas_ha.lldd_max_execute_num = lldd_max_execute_num; (1)
|
|
|
|
my_ha->sas_ha.lldd_queue_size = ha_can_queue;
|
|
my_ha->sas_ha.lldd_execute_task = my_execute_task;
|
|
|
|
my_ha->sas_ha.lldd_abort_task = my_abort_task;
|
|
my_ha->sas_ha.lldd_abort_task_set = my_abort_task_set;
|
|
my_ha->sas_ha.lldd_clear_aca = my_clear_aca;
|
|
my_ha->sas_ha.lldd_clear_task_set = my_clear_task_set;
|
|
my_ha->sas_ha.lldd_I_T_nexus_reset= NULL; (2)
|
|
my_ha->sas_ha.lldd_lu_reset = my_lu_reset;
|
|
my_ha->sas_ha.lldd_query_task = my_query_task;
|
|
|
|
my_ha->sas_ha.lldd_clear_nexus_port = my_clear_nexus_port;
|
|
my_ha->sas_ha.lldd_clear_nexus_ha = my_clear_nexus_ha;
|
|
|
|
my_ha->sas_ha.lldd_control_phy = my_control_phy;
|
|
|
|
return sas_register_ha(&my_ha->sas_ha);
|
|
}
|
|
|
|
(1) This is normally a LLDD parameter, something of the
|
|
lines of a task collector. What it tells the SAS Layer is
|
|
whether the SAS layer should run in Direct Mode (default:
|
|
value 0 or 1) or Task Collector Mode (value greater than 1).
|
|
|
|
In Direct Mode, the SAS Layer calls Execute Task as soon as
|
|
it has a command to send to the SDS, _and_ this is a single
|
|
command, i.e. not linked.
|
|
|
|
Some hardware (e.g. aic94xx) has the capability to DMA more
|
|
than one task at a time (interrupt) from host memory. Task
|
|
Collector Mode is an optional feature for HAs which support
|
|
this in their hardware. (Again, it is completely optional
|
|
even if your hardware supports it.)
|
|
|
|
In Task Collector Mode, the SAS Layer would do _natural_
|
|
coalescing of tasks and at the appropriate moment it would
|
|
call your driver to DMA more than one task in a single HA
|
|
interrupt. DMBS may want to use this by insmod/modprobe
|
|
setting the lldd_max_execute_num to something greater than
|
|
1.
|
|
|
|
(2) SAS 1.1 does not define I_T Nexus Reset TMF.
|
|
|
|
Events
|
|
------
|
|
|
|
Events are _the only way_ a SAS LLDD notifies the SAS layer
|
|
of anything. There is no other method or way a LLDD to tell
|
|
the SAS layer of anything happening internally or in the SAS
|
|
domain.
|
|
|
|
Phy events:
|
|
PHYE_LOSS_OF_SIGNAL, (C)
|
|
PHYE_OOB_DONE,
|
|
PHYE_OOB_ERROR, (C)
|
|
PHYE_SPINUP_HOLD.
|
|
|
|
Port events, passed on a _phy_:
|
|
PORTE_BYTES_DMAED, (M)
|
|
PORTE_BROADCAST_RCVD, (E)
|
|
PORTE_LINK_RESET_ERR, (C)
|
|
PORTE_TIMER_EVENT, (C)
|
|
PORTE_HARD_RESET.
|
|
|
|
Host Adapter event:
|
|
HAE_RESET
|
|
|
|
A SAS LLDD should be able to generate
|
|
- at least one event from group C (choice),
|
|
- events marked M (mandatory) are mandatory (only one),
|
|
- events marked E (expander) if it wants the SAS layer
|
|
to handle domain revalidation (only one such).
|
|
- Unmarked events are optional.
|
|
|
|
Meaning:
|
|
|
|
HAE_RESET -- when your HA got internal error and was reset.
|
|
|
|
PORTE_BYTES_DMAED -- on receiving an IDENTIFY/FIS frame
|
|
PORTE_BROADCAST_RCVD -- on receiving a primitive
|
|
PORTE_LINK_RESET_ERR -- timer expired, loss of signal, loss
|
|
of DWS, etc. (*)
|
|
PORTE_TIMER_EVENT -- DWS reset timeout timer expired (*)
|
|
PORTE_HARD_RESET -- Hard Reset primitive received.
|
|
|
|
PHYE_LOSS_OF_SIGNAL -- the device is gone (*)
|
|
PHYE_OOB_DONE -- OOB went fine and oob_mode is valid
|
|
PHYE_OOB_ERROR -- Error while doing OOB, the device probably
|
|
got disconnected. (*)
|
|
PHYE_SPINUP_HOLD -- SATA is present, COMWAKE not sent.
|
|
|
|
(*) should set/clear the appropriate fields in the phy,
|
|
or alternatively call the inlined sas_phy_disconnected()
|
|
which is just a helper, from their tasklet.
|
|
|
|
The Execute Command SCSI RPC:
|
|
|
|
int (*lldd_execute_task)(struct sas_task *, int num,
|
|
unsigned long gfp_flags);
|
|
|
|
Used to queue a task to the SAS LLDD. @task is the tasks to
|
|
be executed. @num should be the number of tasks being
|
|
queued at this function call (they are linked listed via
|
|
task::list), @gfp_mask should be the gfp_mask defining the
|
|
context of the caller.
|
|
|
|
This function should implement the Execute Command SCSI RPC,
|
|
or if you're sending a SCSI Task as linked commands, you
|
|
should also use this function.
|
|
|
|
That is, when lldd_execute_task() is called, the command(s)
|
|
go out on the transport *immediately*. There is *no*
|
|
queuing of any sort and at any level in a SAS LLDD.
|
|
|
|
The use of task::list is two-fold, one for linked commands,
|
|
the other discussed below.
|
|
|
|
It is possible to queue up more than one task at a time, by
|
|
initializing the list element of struct sas_task, and
|
|
passing the number of tasks enlisted in this manner in num.
|
|
|
|
Returns: -SAS_QUEUE_FULL, -ENOMEM, nothing was queued;
|
|
0, the task(s) were queued.
|
|
|
|
If you want to pass num > 1, then either
|
|
A) you're the only caller of this function and keep track
|
|
of what you've queued to the LLDD, or
|
|
B) you know what you're doing and have a strategy of
|
|
retrying.
|
|
|
|
As opposed to queuing one task at a time (function call),
|
|
batch queuing of tasks, by having num > 1, greatly
|
|
simplifies LLDD code, sequencer code, and _hardware design_,
|
|
and has some performance advantages in certain situations
|
|
(DBMS).
|
|
|
|
The LLDD advertises if it can take more than one command at
|
|
a time at lldd_execute_task(), by setting the
|
|
lldd_max_execute_num parameter (controlled by "collector"
|
|
module parameter in aic94xx SAS LLDD).
|
|
|
|
You should leave this to the default 1, unless you know what
|
|
you're doing.
|
|
|
|
This is a function of the LLDD, to which the SAS layer can
|
|
cater to.
|
|
|
|
int lldd_queue_size
|
|
The host adapter's queue size. This is the maximum
|
|
number of commands the lldd can have pending to domain
|
|
devices on behalf of all upper layers submitting through
|
|
lldd_execute_task().
|
|
|
|
You really want to set this to something (much) larger than
|
|
1.
|
|
|
|
This _really_ has absolutely nothing to do with queuing.
|
|
There is no queuing in SAS LLDDs.
|
|
|
|
struct sas_task {
|
|
dev -- the device this task is destined to
|
|
list -- must be initialized (INIT_LIST_HEAD)
|
|
task_proto -- _one_ of enum sas_proto
|
|
scatter -- pointer to scatter gather list array
|
|
num_scatter -- number of elements in scatter
|
|
total_xfer_len -- total number of bytes expected to be transferred
|
|
data_dir -- PCI_DMA_...
|
|
task_done -- callback when the task has finished execution
|
|
};
|
|
|
|
DISCOVERY
|
|
---------
|
|
|
|
The sysfs tree has the following purposes:
|
|
a) It shows you the physical layout of the SAS domain at
|
|
the current time, i.e. how the domain looks in the
|
|
physical world right now.
|
|
b) Shows some device parameters _at_discovery_time_.
|
|
|
|
This is a link to the tree(1) program, very useful in
|
|
viewing the SAS domain:
|
|
ftp://mama.indstate.edu/linux/tree/
|
|
I expect user space applications to actually create a
|
|
graphical interface of this.
|
|
|
|
That is, the sysfs domain tree doesn't show or keep state if
|
|
you e.g., change the meaning of the READY LED MEANING
|
|
setting, but it does show you the current connection status
|
|
of the domain device.
|
|
|
|
Keeping internal device state changes is responsibility of
|
|
upper layers (Command set drivers) and user space.
|
|
|
|
When a device or devices are unplugged from the domain, this
|
|
is reflected in the sysfs tree immediately, and the device(s)
|
|
removed from the system.
|
|
|
|
The structure domain_device describes any device in the SAS
|
|
domain. It is completely managed by the SAS layer. A task
|
|
points to a domain device, this is how the SAS LLDD knows
|
|
where to send the task(s) to. A SAS LLDD only reads the
|
|
contents of the domain_device structure, but it never creates
|
|
or destroys one.
|
|
|
|
Expander management from User Space
|
|
-----------------------------------
|
|
|
|
In each expander directory in sysfs, there is a file called
|
|
"smp_portal". It is a binary sysfs attribute file, which
|
|
implements an SMP portal (Note: this is *NOT* an SMP port),
|
|
to which user space applications can send SMP requests and
|
|
receive SMP responses.
|
|
|
|
Functionality is deceptively simple:
|
|
|
|
1. Build the SMP frame you want to send. The format and layout
|
|
is described in the SAS spec. Leave the CRC field equal 0.
|
|
open(2)
|
|
2. Open the expander's SMP portal sysfs file in RW mode.
|
|
write(2)
|
|
3. Write the frame you built in 1.
|
|
read(2)
|
|
4. Read the amount of data you expect to receive for the frame you built.
|
|
If you receive different amount of data you expected to receive,
|
|
then there was some kind of error.
|
|
close(2)
|
|
All this process is shown in detail in the function do_smp_func()
|
|
and its callers, in the file "expander_conf.c".
|
|
|
|
The kernel functionality is implemented in the file
|
|
"sas_expander.c".
|
|
|
|
The program "expander_conf.c" implements this. It takes one
|
|
argument, the sysfs file name of the SMP portal to the
|
|
expander, and gives expander information, including routing
|
|
tables.
|
|
|
|
The SMP portal gives you complete control of the expander,
|
|
so please be careful.
|