This just adds a helper function to check if a device is configured and it
converts the target users to use it. The next patch will add a backend
module user so those types of modules do not have to know the lio core
details.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Use INIT_LIST_HEAD to initialize node list head.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The caller of queue_cmd_ring grabs and releases the lock, so the
tcmu_setup_cmd_timer failure handling inside queue_cmd_ring should not call
mutex_unlock.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
- rounddown CXGBIT_MAX_ISO_PAYLOAD by csk->emss before calculating
max_iso_npdu to get max TCP payload in multiple of mss.
- call cxgbit_set_digest() before cxgbit_set_iso_npdu() to set
csk->submode, it is used in calculating number of iso pdus.
Signed-off-by: Varun Prakash <varun@chelsio.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
se_dev_entry.ua_count is only used to check whether or not
se_dev_entry.ua_list is empty. Use list_empty_careful() instead. Checking
whether or not ua_list is empty without holding the lock that protects that
list is fine because the code that dequeues from that list will check again
whether or not that list is empty.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Send a valid ASC / ASCQ combination back to the initiator if a SCSI command
is received after a LUN has been removed. This patch fixes the following
call trace:
WARNING: CPU: 0 PID: 4 at drivers/target/target_core_transport.c:3131 translate_sense_reason+0x164/0x190 [target_core_mod]
Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
RIP: 0010:translate_sense_reason+0x164/0x190 [target_core_mod]
Call Trace:
transport_send_check_condition_and_sense+0x95/0x1c0 [target_core_mod]
transport_generic_request_failure+0x102/0x270 [target_core_mod]
transport_generic_new_cmd+0x138/0x340 [target_core_mod]
transport_handle_cdb_direct+0x2f/0x80 [target_core_mod]
target_submit_cmd_map_sgls+0x212/0x2a0 [target_core_mod]
srpt_handle_new_iu+0x244/0x680 [ib_srpt]
__ib_process_cq+0x6d/0xc0 [ib_core]
ib_cq_poll_work+0x18/0x50 [ib_core]
process_one_work+0x20b/0x6a0
worker_thread+0x35/0x380
kthread+0x117/0x130
ret_from_fork+0x24/0x30
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch does not change any functionality.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The code under the "release:" label can only be reached after se_cmd has
been set to a non-NULL value. Hence remove the if (se_cmd) test. Keep the
else-part since calling transport_generic_free_cmd() is not necessary for a
command that has not been submitted to the core.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 4d3895d5ea ("target/tcm_loop: Merge struct tcm_loop_cmd and struct tcm_loop_tmr")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Since most target drivers do not use the second fabric_make_tpg() argument
("group") and since it is trivial to derive the group pointer from the wwn
pointer, do not pass the group pointer to fabric_make_tpg().
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Cc: Felipe Balbi <felipe.balbi@linux.intel.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fixes: e48354ce07 ("iscsi-target: Add iSCSI fabric support for target v4.1")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Instead of calling __iscsit_free_cmd() from inside iscsit_aborted_task() if
a command has been aborted and from inside iscsit_free_cmd() if a command
has not been aborted, call __iscsit_free_cmd() from inside
lio_release_cmd(). The latter function is namely called for all commands
once the reference count has dropped to zero.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Cc: Varun Prakash <varun@chelsio.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Instead of embedding the completion that is used for waiting for command
completion in struct se_cmd, let the context that waits for command
completion allocate it. This makes it possible to have a single code path
for non-aborted and aborted commands in target_release_cmd_kref() and
avoids that transport_generic_free_cmd() has to call
cmd->se_tfo->release_cmd() directly. This patch does not change any
functionality. Note: transport_generic_free_cmd() only waits until the
se_cmd reference count has reached zero after it has set both
CMD_T_FABRIC_STOP and CMD_T_ABORTED.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Since target_wait_free_cmd() skips TMFs with no associated LUN, it is safe
to call that function for such commands. Use this to simplify
transport_generic_free_cmd(). The only functional change in this patch is
that CMD_T_FABRIC_STOP gets set for TMFs with no associated LUN by
transport_generic_free_cmd().
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Move identical code outside an if/else statement. This patch does not
change any functionality.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For the two calls to transport_cmd_finish_abort() outside
core_tmr_handle_tas_abort() it is guaranteed that CMD_T_TAS is not set. Use
this property to fold core_tmr_handle_tas_abort() into
transport_cmd_finish_abort(). This patch does not change any functionality.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The code that can set CMD_T_TAS is executed by the same thread as the
thread that executes core_tmr_handle_tas_abort(). That means that no
locking is needed to check CMD_T_TAS from inside
core_tmr_handle_tas_abort(). This patch does not change any functionality.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Document those aspects of transport_cmd_check_stop_to_fabric() and
transport_generic_free_cmd() of which it is nontrivial to derive these from
their implementation.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Target drivers must call target_sess_cmd_list_set_waiting() and
target_wait_for_sess_cmds() before freeing a session. Since freeing a
session is only safe after all commands that are associated with a session
have finished, make target_wait_for_sess_cmds() also wait for commands that
are being aborted. Instead of setting a flag in each pending command from
target_sess_cmd_list_set_waiting() and waiting in
target_wait_for_sess_cmds() on a per-command completion, only set a
per-session flag in the former function and wait on a per-session
completion in the latter function. This change is safe because once a SCSI
initiator system has submitted a command a target system is always allowed
to execute it to completion. See also commit 0f4a943168 ("target: Fix
remote-port TMR ABORT + se_cmd fabric stop").
This patch is based on the following two patches:
* Bart Van Assche, target: Simplify session shutdown code, February 19, 2015
(8df5463d7d).
* Christoph Hellwig, target: Rework session shutdown code, December 7, 2015
(http://thread.gmane.org/gmane.linux.scsi.target.devel/10695).
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Other than initializing xcopy_pt_sess.sess_wait_list, this patch does not
change any functionality.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch does not change any functionality but makes the next patch
easier to read.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The approach for adding a device to the devices_idr data structure and for
removing it is as follows:
* &dev->dev_group.cg_item is initialized before a device is added to
devices_idr.
* If the reference count of a device drops to zero then
target_free_device() removes the device from devices_idr.
* All devices_idr manipulations are protected by device_mutex.
This means that increasing the reference count of a device is sufficient to
prevent removal from devices_idr and also that it is safe access
dev_group.cg_item for any device that is referenced by devices_idr. Use
this to modify target_find_device() and target_for_each_device() such that
these functions no longer introduce a dependency between device_mutex and
the configfs root inode mutex.
Note: it is safe to pass a NULL pointer to config_item_put() and also to
config_item_get_unless_zero().
This patch prevents that lockdep reports the following complaint:
======================================================
WARNING: possible circular locking dependency detected
4.12.0-rc1-dbg+ #1 Not tainted
------------------------------------------------------
rmdir/12053 is trying to acquire lock:
(device_mutex#2){+.+.+.}, at: [<ffffffffa010afce>]
target_free_device+0xae/0xf0 [target_core_mod]
but task is already holding lock:
(&sb->s_type->i_mutex_key#14){++++++}, at: [<ffffffff811c5c30>]
vfs_rmdir+0x50/0x140
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&sb->s_type->i_mutex_key#14){++++++}:
lock_acquire+0x59/0x80
down_write+0x36/0x70
configfs_depend_item+0x3a/0xb0 [configfs]
target_depend_item+0x13/0x20 [target_core_mod]
target_xcopy_locate_se_dev_e4_iter+0x87/0x100 [target_core_mod]
target_devices_idr_iter+0x16/0x20 [target_core_mod]
idr_for_each+0x39/0xc0
target_for_each_device+0x36/0x50 [target_core_mod]
target_xcopy_locate_se_dev_e4+0x28/0x80 [target_core_mod]
target_xcopy_do_work+0x2e9/0xdd0 [target_core_mod]
process_one_work+0x1ca/0x3f0
worker_thread+0x49/0x3b0
kthread+0x109/0x140
ret_from_fork+0x31/0x40
-> #0 (device_mutex#2){+.+.+.}:
__lock_acquire+0x101f/0x11d0
lock_acquire+0x59/0x80
__mutex_lock+0x7e/0x950
mutex_lock_nested+0x16/0x20
target_free_device+0xae/0xf0 [target_core_mod]
target_core_dev_release+0x10/0x20 [target_core_mod]
config_item_put+0x6e/0xb0 [configfs]
configfs_rmdir+0x1a6/0x300 [configfs]
vfs_rmdir+0xb7/0x140
do_rmdir+0x1f4/0x200
SyS_rmdir+0x11/0x20
entry_SYSCALL_64_fastpath+0x23/0xc2
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&sb->s_type->i_mutex_key#14);
lock(device_mutex#2);
lock(&sb->s_type->i_mutex_key#14);
lock(device_mutex#2);
*** DEADLOCK ***
3 locks held by rmdir/12053:
#0: (sb_writers#10){.+.+.+}, at: [<ffffffff811e223f>]
mnt_want_write+0x1f/0x50
#1: (&sb->s_type->i_mutex_key#14/1){+.+.+.}, at: [<ffffffff811cb97e>]
do_rmdir+0x15e/0x200
#2: (&sb->s_type->i_mutex_key#14){++++++}, at: [<ffffffff811c5c30>]
vfs_rmdir+0x50/0x140
stack backtrace:
CPU: 3 PID: 12053 Comm: rmdir Not tainted 4.12.0-rc1-dbg+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.0.0-prebuilt.qemu-project.org 04/01/2014
Call Trace:
dump_stack+0x86/0xcf
print_circular_bug+0x1c7/0x220
__lock_acquire+0x101f/0x11d0
lock_acquire+0x59/0x80
__mutex_lock+0x7e/0x950
mutex_lock_nested+0x16/0x20
target_free_device+0xae/0xf0 [target_core_mod]
target_core_dev_release+0x10/0x20 [target_core_mod]
config_item_put+0x6e/0xb0 [configfs]
configfs_rmdir+0x1a6/0x300 [configfs]
vfs_rmdir+0xb7/0x140
do_rmdir+0x1f4/0x200
SyS_rmdir+0x11/0x20
entry_SYSCALL_64_fastpath+0x23/0xc2
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
[Rebased to handle conflict withe target_find_device removal]
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Some target code uses config_item_name() while other code accesses .ci_name
directly. Make the target code consistent by switching to
config_item_name().
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix warning:
smatch warnings:
drivers/target/target_core_user.c:301 tcmu_genl_cmd_done() warn: KERN_*
level not at start of string
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
target_find_device is no longer used, so remove it.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Tested-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch based on Xiubo's patches adds 2 tcmu attr to block and reset the
netlink interface. It's used during userspace daemon reinitialization after
the daemon has crashed while there is outstanding nl requests. The daemon
can block the nl interface, kill outstanding requests in the kernel and
then reopen the netlink socket and unblock it to allow new requests.
[mkp: typo]
Signed-off-by: Mike Christie <mchristi@redhat.com>
Tested-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Some misc cleanup of the nl rework patches.
1. Fix space instead of tabs use and extra newline
2. Drop initializing variables to 0 when not needed
3. Just pass the skb_buff and msg_header pointers to
tcmu_netlink_event_send.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Tested-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Just return EBUSY if a nl request comes in while processing one. The upper
layers do not support sending multiple create/remove requests at the same
time (you cannot have a create and remove at the same time or do multiple
creates or removes at the same time) and doing a reconfig while a
create/remove is still executing does not make sense.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Tested-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The next patch is going to fix the hung nl command issue so this adds a
list of outstanding nl commands that we can later abort when the daemon is
restarted.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When this code changed, this was never cleaned up.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Tested-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The sbitmap and the percpu_ida perform essentially the same task,
allocating tags for commands. The sbitmap outperforms the percpu_ida as
documented here: https://lkml.org/lkml/2014/4/22/553
The sbitmap interface is a little harder to use, but being able to remove
the percpu_ida code and getting better performance justifies the additional
complexity.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> # f_tcm
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Introduce target_free_tag() and convert all drivers to use it.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
SPC5r17 states that the contents of the ADDITIONAL LENGTH field are not
altered based on the allocation length, so always calculate and pack the
full key list length even if the list itself is truncated.
According to Maged:
Yes it fixes the "Storage Spaces Persistent Reservation" test in the
Windows 2016 Server Failover Cluster validation suites when having
many connections that result in more than 8 registrations. I tested
your patch on 4.17 with iblock.
This behaviour can be tested using the libiscsi PrinReadKeys.Truncate test.
Cc: stable@vger.kernel.org
Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Tested-by: Maged Mokhtar <mmokhtar@petasan.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Since the TCMU_RING_SIZE macro is not using here will discard it and at the
same time clean up the code style.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Generally target core and TCMUser seem to work fine for tape devices and
media changers. But there is at least one situation where TCMUser is not
able to support sequential access device emulation correctly.
The situation is when an initiator sends a SCSI READ CDB with a length that
is greater than the length of the tape block to read. We can distinguish
two subcases:
A) The initiator sent the READ CDB with the SILI bit being set.
In this case the sequential access device has to transfer the data from
the tape block (only the length of the tape block) and transmit a good
status. The current interface between TCMUser and the userspace does
not support reduction of the read data size by the userspace program.
The patch below fixes this subcase by allowing the userspace program to
specify a reduced data size in read direction.
B) The initiator sent the READ CDB with the SILI bit not being set.
In this case the sequential access device has to transfer the data from
the tape block as in A), but additionally has to transmit CHECK
CONDITION with the ILI bit set and NO SENSE in the sensebytes. The
information field in the sensebytes must contain the residual count.
With the below patch a user space program can specify the real read data
length and appropriate sensebytes. TCMUser then uses the se_cmd flag
SCF_TREAT_READ_AS_NORMAL, to force target core to transmit the real data
size and the sensebytes. Note: the flag SCF_TREAT_READ_AS_NORMAL is
introduced by Lee Duncan's patch "[PATCH v4] target: transport should
handle st FM/EOM/ILI reads" from Tue, 15 May 2018 18:25:24 -0700.
Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This is mostly updates to the usual drivers: ufs, qedf, mpt3sas, lpfc,
xfcp, hisi_sas, cxlflash, qla2xxx. In the absence of Nic, we're also
taking target updates which are mostly minor except for the tcmu
refactor. The only real core change to worry about is the removal of
high page bouncing (in sas, storvsc and iscsi). This has been well
tested and no problems have shown up so far.
Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCWx1pbCYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishUucAP42pccS
ziKyiOizuxv9fZ4Q+nXd1A9zhI5tqqpkHjcQegEA40qiZSi3EKGKR8W0UpX7Ntmo
tqrZJGojx9lnrAM2RbQ=
=NMXg
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This is mostly updates to the usual drivers: ufs, qedf, mpt3sas, lpfc,
xfcp, hisi_sas, cxlflash, qla2xxx.
In the absence of Nic, we're also taking target updates which are
mostly minor except for the tcmu refactor.
The only real core change to worry about is the removal of high page
bouncing (in sas, storvsc and iscsi). This has been well tested and no
problems have shown up so far"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (268 commits)
scsi: lpfc: update driver version to 12.0.0.4
scsi: lpfc: Fix port initialization failure.
scsi: lpfc: Fix 16gb hbas failing cq create.
scsi: lpfc: Fix crash in blk_mq layer when executing modprobe -r lpfc
scsi: lpfc: correct oversubscription of nvme io requests for an adapter
scsi: lpfc: Fix MDS diagnostics failure (Rx < Tx)
scsi: hisi_sas: Mark PHY as in reset for nexus reset
scsi: hisi_sas: Fix return value when get_free_slot() failed
scsi: hisi_sas: Terminate STP reject quickly for v2 hw
scsi: hisi_sas: Add v2 hw force PHY function for internal ATA command
scsi: hisi_sas: Include TMF elements in struct hisi_sas_slot
scsi: hisi_sas: Try wait commands before before controller reset
scsi: hisi_sas: Init disks after controller reset
scsi: hisi_sas: Create a scsi_host_template per HW module
scsi: hisi_sas: Reset disks when discovered
scsi: hisi_sas: Add LED feature for v3 hw
scsi: hisi_sas: Change common allocation mode of device id
scsi: hisi_sas: change slot index allocation mode
scsi: hisi_sas: Introduce hisi_sas_phy_set_linkrate()
scsi: hisi_sas: fix a typo in hisi_sas_task_prep()
...
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCAAGBQJbFIrHAAoJEPfTWPspceCm2+kQAKo7o7HL30aRxJYu+gYafkuW
PV47zr3e4vhMDEzDaMsh1+V7I7bm3uS+NZu6cFbcV+N9KXFpeb4V4Hvvm5cs+OC3
WCOBi4eC1h4qnDQ3ZyySrCMN+KHYJ16pZqddEjqw+fhVudx8i+F+jz3Y4ZMDDc3q
pArKZvjKh2wEuYXUMFTjaXY46IgPt+er94OwvrhyHk+4AcA+Q/oqSfSdDahUC8jb
BVR3FV4I3NOHUaru0RbrUko13sVZSboWPCIFrlTDz8xXcJOnVHzdVS1WLFDXLHnB
O8q9cADCfa4K08kz68RxykcJiNxNvz5ChDaG0KloCFO+q1tzYRoXLsfaxyuUDg57
Zd93OFZC6hAzXdhclDFIuPET9OQIjDzwphodfKKmDsm3wtyOtydpA0o7JUEongp0
O1gQsEfYOXmQsXlo8Ot+Z7Ne/HvtGZ91JahUa/59edxQbcKaMrktoyQsQ/d1nOEL
4kXID18wPcFHWRQHYXyVuw6kbpRtQnh/U2m1eenSZ7tVQHwoe6mF3cfSf5MMseak
k8nAnmsfEvOL4Ar9ftg61GOrImaQlidxOC2A8fmY5r0Sq/ZldvIFIZizsdTTCcni
8SOTxcQowyqPf5NvMNQ8cKqqCJap3ppj4m7anZNhbypDIF2TmOWsEcXcMDn4y9on
fax14DPLo59gBRiPCn5f
=nga/
-----END PGP SIGNATURE-----
Merge tag 'for-4.18/block-20180603' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:
- clean up how we pass around gfp_t and
blk_mq_req_flags_t (Christoph)
- prepare us to defer scheduler attach (Christoph)
- clean up drivers handling of bounce buffers (Christoph)
- fix timeout handling corner cases (Christoph/Bart/Keith)
- bcache fixes (Coly)
- prep work for bcachefs and some block layer optimizations (Kent).
- convert users of bio_sets to using embedded structs (Kent).
- fixes for the BFQ io scheduler (Paolo/Davide/Filippo)
- lightnvm fixes and improvements (Matias, with contributions from Hans
and Javier)
- adding discard throttling to blk-wbt (me)
- sbitmap blk-mq-tag handling (me/Omar/Ming).
- remove the sparc jsflash block driver, acked by DaveM.
- Kyber scheduler improvement from Jianchao, making it more friendly
wrt merging.
- conversion of symbolic proc permissions to octal, from Joe Perches.
Previously the block parts were a mix of both.
- nbd fixes (Josef and Kevin Vigor)
- unify how we handle the various kinds of timestamps that the block
core and utility code uses (Omar)
- three NVMe pull requests from Keith and Christoph, bringing AEN to
feature completeness, file backed namespaces, cq/sq lock split, and
various fixes
- various little fixes and improvements all over the map
* tag 'for-4.18/block-20180603' of git://git.kernel.dk/linux-block: (196 commits)
blk-mq: update nr_requests when switching to 'none' scheduler
block: don't use blocking queue entered for recursive bio submits
dm-crypt: fix warning in shutdown path
lightnvm: pblk: take bitmap alloc. out of critical section
lightnvm: pblk: kick writer on new flush points
lightnvm: pblk: only try to recover lines with written smeta
lightnvm: pblk: remove unnecessary bio_get/put
lightnvm: pblk: add possibility to set write buffer size manually
lightnvm: fix partial read error path
lightnvm: proper error handling for pblk_bio_add_pages
lightnvm: pblk: fix smeta write error path
lightnvm: pblk: garbage collect lines with failed writes
lightnvm: pblk: rework write error recovery path
lightnvm: pblk: remove dead function
lightnvm: pass flag on graceful teardown to targets
lightnvm: pblk: check for chunk size before allocating it
lightnvm: pblk: remove unnecessary argument
lightnvm: pblk: remove unnecessary indirection
lightnvm: pblk: return NVM_ error on failed submission
lightnvm: pblk: warn in case of corrupted write buffer
...
Convert the target code to embedded bio sets.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Trivial fix to spelling mistake in pr_err message text.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When a tape drive is exported via LIO using the pscsi module, a read
that requests more bytes per block than the tape can supply returns an
empty buffer. This is because the pscsi pass-through target module sees
the "ILI" illegal length bit set and thinks there is no reason to return
the data.
This is a long-standing transport issue, since it assumes that no data
need be returned under a check condition, which isn't always the case
for tape.
Add in a check for tape reads with the ILI, EOM, or FM bits set, with a
sense code of NO_SENSE, treating such cases as if the read
succeeded. The layered tape driver then "does the right thing" when it
gets such a response.
Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Signed-off-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Problem:
$ cat /sys/kernel/config/target/core/user_0/block/attrib/qfull_time_out
-1
$ echo "-1" > /sys/kernel/config/target/core/user_0/block/attrib/qfull_time_out
-bash: echo: write error: Invalid argument
Fix:
This patch will help reset qfull_time_out to its default
i.e. qfull_time_out=-1.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There are two advantages:
* Direct I/O allows to avoid the write-back cache, so it reduces affects
to other processes in the system.
* Async I/O allows to handle a few commands concurrently.
DIO + AIO shows a better perfomance for random write operations:
Mode: O_DSYNC Async: 1
$ ./fio --bs=4K --direct=1 --rw=randwrite --ioengine=libaio --iodepth=64 --name=/dev/sda --runtime=20 --numjobs=2
WRITE: bw=45.9MiB/s (48.1MB/s), 21.9MiB/s-23.0MiB/s (22.0MB/s-25.2MB/s), io=919MiB (963MB), run=20002-20020msec
Mode: O_DSYNC Async: 0
$ ./fio --bs=4K --direct=1 --rw=randwrite --ioengine=libaio --iodepth=64 --name=/dev/sdb --runtime=20 --numjobs=2
WRITE: bw=1607KiB/s (1645kB/s), 802KiB/s-805KiB/s (821kB/s-824kB/s), io=31.8MiB (33.4MB), run=20280-20295msec
Known issue:
DIF (PI) emulation doesn't work when a target uses async I/O, because
DIF metadata is saved in a separate file, and it is another non-trivial
task how to synchronize writing in two files, so that a following read
operation always returns a consisten metadata for a specified block.
Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Tested-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Minor optimization - remove a pointer indirection when using fs_bio_set.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Switch everyone to blk_get_request_flags, and then rename
blk_get_request_flags to blk_get_request.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
use new netlink events helpers tcmu_netlink_init() and
tcmu_netlink_send() to refactor netlink event attribute
TCMU_ATTR_WRITECACHE(belongs to TCMU_CMD_RECONFIG_DEVICE) which is also
emulate_write_cache in configFS.
Removed tcmu_netlink_event() since we have new netlink
events helpers now.
Signed-off-by: Zhu Lingshan <lszhu@suse.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
use new netlink events helpers tcmu_netlink_init() and
tcmu_netlink_send() to refactor netlink event attribute
TCMU_ATTR_DEV_SIZE(belongs to TCMU_CMD_RECONFIG_DEVICE) which is also
dev_size in configFS.
Signed-off-by: Zhu Lingshan <lszhu@suse.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
use new netlink events helpers tcmu_netlink_init() and
tcmu_netlink_send() to refactor netlink event attribute
TCMU_ATTR_DEV_CFG(belongs to TCMU_CMD_RECONFIG_DEVICE) which is also
dev_config in configFS.
Signed-off-by: Zhu Lingshan <lszhu@suse.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
use new netlink events helpers tcmu_netlink_init() and
tcmu_netlink_send() to refactor netlink event TCMU_CMD_REMOVED_DEVICE
Signed-off-by: Zhu Lingshan <lszhu@suse.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
use new netlink events helpers tcmu_netlink_init() and
tcmu_netlink_send() to refactor netlink event TCMU_CMD_ADDED_DEVICE
Signed-off-by: Zhu Lingshan <lszhu@suse.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add new netlink events helpers tcmu_netlink_event_init() and
tcmu_netlink_event_send(). These new functions intend to replace
existing netlink events helper function tcmu_netlink_event().
The existing function tcmu_netlink_event() works well for events like
TCMU_ADDED_DEVICE and TCMU_REMOVED_DEVICE which only has one netlink
attribute. But if there is a command requires more than one attributes
to send out, we have to use a struct to adapt the paremeter
reconfig_data, it is hard to use one struct or a union in one struct to
adapt every command with different attributes, it may get long and ugly.
With the new two functions, we can call tcmu_netlink_event_init() to
initialize a netlink event, then add all attributes we need by using
nla_put_xxx(), at last use tcmu_netlink_event_send() to send it out. So
that we don't need to use a long struct or union if we want to send
mulitple attributes for different commands.
[mkp: typos]
Signed-off-by: Zhu Lingshan <lszhu@suse.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Make documentation on target-supported userspace-I/O design be
usable by kernel-doc by using "DOC:". This is used in the driver-api
Documentation chapter.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
To: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: linux-scsi@vger.kernel.org
Cc: target-devel@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For exported functions that already have near-kernel-doc notation,
fix them to begin with "/**" and make a few corrections so that they
don't have any kernel-doc warnings.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
To: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: linux-scsi@vger.kernel.org
Cc: target-devel@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Correct a function parameter's name to eliminate kernel-doc warnings
in drivers/target/target_core_transport.c.
Fixes these kernel-doc warnings: (tested by adding these files to a new
target.rst documentation file)
../drivers/target/target_core_transport.c:1671: warning: No description found for parameter 'fabric_tmr_ptr'
../drivers/target/target_core_transport.c:1671: warning: Excess function parameter 'fabric_context' description in 'target_submit_tmr'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
To: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: linux-scsi@vger.kernel.org
Cc: target-devel@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Use new return type vm_fault_t for fault handler in struct
vm_operations_struct.
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The target database root directory, dbroot, has defaulted to /var/target
for a while, but its main client, targetcli-fb, has been moving it to
/etc/target for quite some time. With the plethora of target drivers now
appearing, it has become more difficult to initialize this attribute
before use by any child drivers.
If the directory /etc/target exists, use that as the DB root. Otherwise,
fall back to using /var/target.
The ability to override this dbroot attribute still exists via sysfs.
Signed-off-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When the current page can't be added to bio, one new bio should be
created for adding this page again, instead of ignoring this page.
This patch fixes kernel crash with iscsi target and dvd, as reported by
Wakko.
Cc: Wakko Warner <wakko@animx.eu.org>
Cc: Bart Van Assche <Bart.VanAssche@wdc.com>
Cc: target-devel@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: Christoph Hellwig <hch@lst.de>
Fixes: 84c8590646 ("target: avoid accessing .bi_vcnt directly")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCAAGBQJawr05AAoJEPfTWPspceCmT2UP/1uuaqwzyl4VjFNb/k7KS7UM
+Cs/1HBlGomgMA8orDTGqtWqLRdR3z4RSh0+MvXTzQ78HpFVYz7CbDc9itHm+G9M
X0ypD4kF/JGCFb5cxk+x6qv28uO2nv4DP3+0hHqJWLH4UVJBWDY6bs4BPShsf9QB
I6XjioNMhoqylXgdOITLODJZz+TcChlJMDAqwhpJwh9TH1wjobleAZ6AdmCPfgi5
h0UCKMUKzcVJlNZwQUrzrs2cxcx9Uhunnbz7HK0ZV4n/FKFtDpGynFpQQ71pZxKe
Be0ZOBPCQvC3ykOM/egCIvC/e5y7FgrjORD6jxyu1PTwAugI5E1VYSMxHkXvgPAx
zOo9A7RT4GPO2tDQv+DbzNFpqeSAclTgSmr+/y1wmheBs8DiSt7MPVBiNM4zdCNv
NLk9z7IEjFhdmluSB/LbTb1aokypMb/q7QTLouPHdwGn80k7yrhFyLHgdjpNTQ2K
UHfHZvGxkOX6SmFhBNOtIFUkuSceenh64a0RkRle7filx+ImpbCVm2/GYi9zZNCu
EtctgzLbLmz40zMiyDaZS2bxBgGzfn6yf4xd9LsaAJPMhvZnmXogT0D9ctWXB0WU
mMaS7sOkLnNjnGkzF1fHkeiZ/oigrstJbe+CA7BtOdwxpWn6MZBgKEoFQ6iA2b3X
5J1axMgVH5LAsIEcEQVq
=RVhK
-----END PGP SIGNATURE-----
Merge tag 'for-4.17/block-20180402' of git://git.kernel.dk/linux-block
Pull block layer updates from Jens Axboe:
"It's a pretty quiet round this time, which is nice. This contains:
- series from Bart, cleaning up the way we set/test/clear atomic
queue flags.
- series from Bart, fixing races between gendisk and queue
registration and removal.
- set of bcache fixes and improvements from various folks, by way of
Michael Lyle.
- set of lightnvm updates from Matias, most of it being the 1.2 to
2.0 transition.
- removal of unused DIO flags from Nikolay.
- blk-mq/sbitmap memory ordering fixes from Omar.
- divide-by-zero fix for BFQ from Paolo.
- minor documentation patches from Randy.
- timeout fix from Tejun.
- Alpha "can't write a char atomically" fix from Mikulas.
- set of NVMe fixes by way of Keith.
- bsg and bsg-lib improvements from Christoph.
- a few sed-opal fixes from Jonas.
- cdrom check-disk-change deadlock fix from Maurizio.
- various little fixes, comment fixes, etc from various folks"
* tag 'for-4.17/block-20180402' of git://git.kernel.dk/linux-block: (139 commits)
blk-mq: Directly schedule q->timeout_work when aborting a request
blktrace: fix comment in blktrace_api.h
lightnvm: remove function name in strings
lightnvm: pblk: remove some unnecessary NULL checks
lightnvm: pblk: don't recover unwritten lines
lightnvm: pblk: implement 2.0 support
lightnvm: pblk: implement get log report chunk
lightnvm: pblk: rename ppaf* to addrf*
lightnvm: pblk: check for supported version
lightnvm: implement get log report chunk helpers
lightnvm: make address conversions depend on generic device
lightnvm: add support for 2.0 address format
lightnvm: normalize geometry nomenclature
lightnvm: complete geo structure with maxoc*
lightnvm: add shorten OCSSD version in geo
lightnvm: add minor version to generic geometry
lightnvm: simplify geometry structure
lightnvm: pblk: refactor init/exit sequences
lightnvm: Avoid validation of default op value
lightnvm: centralize permission check for lightnvm ioctl
...
Use blk_queue_flag_set() instead of open-coding this function.
Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Changes since v1:
Added changes in these files:
drivers/infiniband/hw/usnic/usnic_transport.c
drivers/staging/lustre/lnet/lnet/lib-socket.c
drivers/target/iscsi/iscsi_target_login.c
drivers/vhost/net.c
fs/dlm/lowcomms.c
fs/ocfs2/cluster/tcp.c
security/tomoyo/network.c
Before:
All these functions either return a negative error indicator,
or store length of sockaddr into "int *socklen" parameter
and return zero on success.
"int *socklen" parameter is awkward. For example, if caller does not
care, it still needs to provide on-stack storage for the value
it does not need.
None of the many FOO_getname() functions of various protocols
ever used old value of *socklen. They always just overwrite it.
This change drops this parameter, and makes all these functions, on success,
return length of sockaddr. It's always >= 0 and can be differentiated
from an error.
Tests in callers are changed from "if (err)" to "if (err < 0)", where needed.
rpc_sockname() lost "int buflen" parameter, since its only use was
to be passed to kernel_getsockname() as &buflen and subsequently
not used in any way.
Userspace API is not changed.
text data bss dec hex filename
30108430 2633624 873672 33615726 200ef6e vmlinux.before.o
30108109 2633612 873672 33615393 200ee21 vmlinux.o
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: David S. Miller <davem@davemloft.net>
CC: linux-kernel@vger.kernel.org
CC: netdev@vger.kernel.org
CC: linux-bluetooth@vger.kernel.org
CC: linux-decnet-user@lists.sourceforge.net
CC: linux-wireless@vger.kernel.org
CC: linux-rdma@vger.kernel.org
CC: linux-sctp@vger.kernel.org
CC: linux-nfs@vger.kernel.org
CC: linux-x25@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull SCSI target updates from Nicholas Bellinger:
"The highlights include:
- numerous target-core-user improvements related to queue full and
timeout handling. (MNC)
- prevent target-core-user corruption when invalid data page is
requested. (MNC)
- add target-core device action configfs attributes to allow
user-space to trigger events separate from existing attributes
exposed to end-users. (MNC)
- fix iscsi-target NULL pointer dereference 4.6+ regression in CHAP
error path. (David Disseldorp)
- avoid target-core backend UNMAP callbacks if range is zero. (Andrei
Vagin)
- fix a iscsi-target 4.14+ regression related multiple PDU logins,
that was exposed due to removal of TCP prequeue support. (Florian
Westphal + MNC)
Also, there is a iser-target bug still being worked on for post -rc1
code to address a long standing issue resulting in persistent
ib_post_send() failures, for RNICs with small max_send_sge"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (36 commits)
iscsi-target: make sure to wake up sleeping login worker
tcmu: Fix trailing semicolon
tcmu: fix cmd user after free
target: fix destroy device in target_configure_device
tcmu: allow userspace to reset ring
target core: add device action configfs files
tcmu: fix error return code in tcmu_configure_device()
target_core_user: add cmd id to broken ring message
target: add SAM_STAT_BUSY sense reason
tcmu: prevent corruption when invalid data page requested
target: don't call an unmap callback if a range length is zero
target/iscsi: avoid NULL dereference in CHAP auth error path
cxgbit: call neigh_event_send() to update MAC address
target: tcm_loop: Use seq_puts() in tcm_loop_show_info()
target: tcm_loop: Delete an unnecessary return statement in tcm_loop_submission_work()
target: tcm_loop: Delete two unnecessary variable initialisations in tcm_loop_issue_tmr()
target: tcm_loop: Combine substrings for 26 messages
target: tcm_loop: Improve a size determination in two functions
target: tcm_loop: Delete an error message for a failed memory allocation in four functions
sbp-target: Delete an error message for a failed memory allocation in three functions
...
Pull block updates from Jens Axboe:
"This is the main pull request for block IO related changes for the
4.16 kernel. Nothing major in this pull request, but a good amount of
improvements and fixes all over the map. This contains:
- BFQ improvements, fixes, and cleanups from Angelo, Chiara, and
Paolo.
- Support for SMR zones for deadline and mq-deadline from Damien and
Christoph.
- Set of fixes for bcache by way of Michael Lyle, including fixes
from himself, Kent, Rui, Tang, and Coly.
- Series from Matias for lightnvm with fixes from Hans Holmberg,
Javier, and Matias. Mostly centered around pblk, and the removing
rrpc 1.2 in preparation for supporting 2.0.
- A couple of NVMe pull requests from Christoph. Nothing major in
here, just fixes and cleanups, and support for command tracing from
Johannes.
- Support for blk-throttle for tracking reads and writes separately.
From Joseph Qi. A few cleanups/fixes also for blk-throttle from
Weiping.
- Series from Mike Snitzer that enables dm to register its queue more
logically, something that's alwways been problematic on dm since
it's a stacked device.
- Series from Ming cleaning up some of the bio accessor use, in
preparation for supporting multipage bvecs.
- Various fixes from Ming closing up holes around queue mapping and
quiescing.
- BSD partition fix from Richard Narron, fixing a problem where we
can't mount newer (10/11) FreeBSD partitions.
- Series from Tejun reworking blk-mq timeout handling. The previous
scheme relied on atomic bits, but it had races where we would think
a request had timed out if it to reused at the wrong time.
- null_blk now supports faking timeouts, to enable us to better
exercise and test that functionality separately. From me.
- Kill the separate atomic poll bit in the request struct. After
this, we don't use the atomic bits on blk-mq anymore at all. From
me.
- sgl_alloc/free helpers from Bart.
- Heavily contended tag case scalability improvement from me.
- Various little fixes and cleanups from Arnd, Bart, Corentin,
Douglas, Eryu, Goldwyn, and myself"
* 'for-4.16/block' of git://git.kernel.dk/linux-block: (186 commits)
block: remove smart1,2.h
nvme: add tracepoint for nvme_complete_rq
nvme: add tracepoint for nvme_setup_cmd
nvme-pci: introduce RECONNECTING state to mark initializing procedure
nvme-rdma: remove redundant boolean for inline_data
nvme: don't free uuid pointer before printing it
nvme-pci: Suspend queues after deleting them
bsg: use pr_debug instead of hand crafted macros
blk-mq-debugfs: don't allow write on attributes with seq_operations set
nvme-pci: Fix queue double allocations
block: Set BIO_TRACE_COMPLETION on new bio during split
blk-throttle: use queue_is_rq_based
block: Remove kblockd_schedule_delayed_work{,_on}()
blk-mq: Avoid that blk_mq_delay_run_hw_queue() introduces unintended delays
blk-mq: Rename blk_mq_request_direct_issue() into blk_mq_request_issue_directly()
lib/scatterlist: Fix chaining support in sgl_alloc_order()
blk-throttle: track read and write request individually
block: add bdev_read_only() checks to common helpers
block: fail op_is_write() requests to read-only partitions
blk-throttle: export io_serviced_recursive, io_service_bytes_recursive
...
Mike Christie reports:
Starting in 4.14 iscsi logins will fail around 50% of the time.
Problem appears to be that iscsi_target_sk_data_ready() callback may
return without doing anything in case it finds the login work queue
is still blocked in sock_recvmsg().
Nicholas Bellinger says:
It would indicate users providing their own ->sk_data_ready() callback
must be responsible for waking up a kthread context blocked on
sock_recvmsg(..., MSG_WAITALL), when a second ->sk_data_ready() is
received before the first sock_recvmsg(..., MSG_WAITALL) completes.
So, do this and invoke the original data_ready() callback -- in
case of tcp sockets this takes care of waking the thread.
Disclaimer: I do not understand why this problem did not show up before
tcp prequeue removal.
(Drop WARN_ON usage - nab)
Reported-by: Mike Christie <mchristi@redhat.com>
Bisected-by: Mike Christie <mchristi@redhat.com>
Tested-by: Mike Christie <mchristi@redhat.com>
Diagnosed-by: Nicholas Bellinger <nab@linux-iscsi.org>
Fixes: e7942d0633 ("tcp: remove prequeue support")
Signed-off-by: Florian Westphal <fw@strlen.de>
Cc: stable@vger.kernel.org # 4.14+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The trailing semicolon is an empty statement that does no operation.
It is completely stripped out by the compiler. Removing it since it doesn't do
anything.
Signed-off-by: Luis de Bethencourt <luisbg@kernel.org>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
If we are failing the command due to a qfull timeout we are
also freeing the tcmu command, so we cannot access it later
to get the se_cmd.
Note: The clearing of cmd->se_cmd is not needed. We do not check
it later for something like determining if the command was failed
due to a timeout. As a result I am dropping it.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
After dev->transport->configure_device succeeds, target_configure_device
exits abnormally, dev_flags has not set DF_CONFIGURED yet, does not call
destroy_device function in free_device.
Signed-off-by: tangwenji <tang.wenji@zte.com.cn>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds 2 tcmu attrs to block/unblock a device and
reset the ring buffer. They are used when the userspace
daemon has crashed or forced to shutdown while IO is executing.
On restart, the daemon can block the device so new IO is not
sent to userspace while it puts the ring in a clean state.
Notes: The reset ring opreation is specific to tcmu, but the
block one could be generic. I kept it tcmu specific, because
it requires some extra locking/state checks in the main IO
path and since other backend modules did not need this
functionality I thought only tcmu should take the perf hit.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds a new group of files that are to be used to
have the kernel module execution some action. The next patch
will have target_core_user use the group/files to be able to block
a device and to reset its memory buffer used to pass commands
between user/kernel space.
This type of file is different from the existing device attributes
in that they may be write only and when written to they result in
the kernel module executing some function. These need to be
separate from the normal device attributes which get/set device
values so userspace can continue to loop over all the attribs and
get/set them during initialization.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Fix to return error code -ENOMEM from the kzalloc() error handling
case instead of 0, as done elsewhere in this function.
Fixes: 80eb876 ("tcmu: allow max block and global max blocks to be settable")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Log cmd id that was not found in the tcmu_handle_completions
lookup failure path.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Add SAM_STAT_BUSY sense_reason. The next patch will have
target_core_user return this value while it is temporarily
blocked and restarting.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
We will always have a page mapped for cmd data if it is
valid command. If the mapping does not exist then something
bad happened in userspace and it should not proceed. This
has us return VM_FAULT_SIGBUS when this happens instead of
returning a freshly allocated paged. The latter can cause
corruption because userspace might write the pages data
overwriting valid data or return it to the initiator.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
If a length of a range is zero, it means there is nothing to unmap
and we can skip this range.
Here is one more reason, why we have to skip such ranges. An unmap
callback calls file_operations->fallocate(), but the man page for the
fallocate syscall says that fallocate(fd, mode, offset, let) returns
EINVAL, if len is zero. It means that file_operations->fallocate() isn't
obligated to handle zero ranges too.
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
If chap_server_compute_md5() fails early, e.g. via CHAP_N mismatch, then
crypto_free_shash() is called with a NULL pointer which gets
dereferenced in crypto_shash_tfm().
Fixes: 69110e3ced ("iscsi-target: Use shash and ahash")
Suggested-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Disseldorp <ddiss@suse.de>
Cc: stable@vger.kernel.org # 4.6+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
If nud_state is not valid then call neigh_event_send() to update MAC
address.
Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The script "checkpatch.pl" pointed information out like the following.
WARNING: Prefer seq_puts to seq_printf
Thus fix the affected source code place.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The script "checkpatch.pl" pointed information out like the following.
WARNING: void function return statements are not generally useful
Thus remove such a statement in the affected function.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The variables "se_cmd" and "tl_cmd" will eventually be set to appropriate
pointers a bit later.
Thus omit the explicit initialisation at the beginning.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The script "checkpatch.pl" pointed information out like the following.
WARNING: quoted string split across lines
Thus fix the affected source code places.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Replace the specification of data structures by pointer dereferences
as the parameter for the operator "sizeof" to make the corresponding size
determination a bit safer according to the Linux coding style convention.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Omit an extra message for a memory allocation failure in these functions.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Omit an extra message for a memory allocation failure in these functions.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Chris Boot <bootc@boo.tc>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Users might have a physical system to a target so they could
have a lot more than 2 gigs of memory they want to devote to
tcmu. OTOH, we could be running in a vm and so a 2 gig
global and 1 gig per dev limit might be too high. This patch
allows the user to specify the limits.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This adds a timer, qfull_time_out, that controls how long a
device will wait for ring buffer space to open before
failing the commands in the queue. It is useful to separate
this timer from the cmd_time_out and default 30 sec one,
because for HA setups cmd_time_out may be disbled and 30
seconds is too long to wait when some OSs like ESX will
timeout commands after as little as 8 - 15 seconds.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch has tcmu internally queue cmds if its ring buffer
is full. It also makes the TCMU_GLOBAL_MAX_BLOCKS limit a
hint instead of a hard limit, so we do not have to add any
new locks/atomics in the main IO path except when IO is not
running.
This fixes the following bugs:
1. We cannot sleep from the submitting context because it might be
called from a target recv context. This results in transport level
commands timing out. For example if the ring is full, we would
sleep, and a iscsi initiator would send a iscsi ping/nop which
times out because the target's recv thread is sleeping here.
2. Devices were not fairly scheduled to run when they hit the global
limit so they could time out waiting for ring space while others
got run.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
We do not really save a lot by trying to increase thresh
a multiple of the existing value. This just simplifies the
code by increasing it to whatever is needed for the command
being executed.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
In the next patches we will call queue_cmd_ring from the submitting
context and also the completion path. This changes the queue_cmd_ring
return code so in the next patches we can return a sense_reason_t
and also signal if a command was requeued.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Add some comments to make the scatter code to be more readable,
and drop unused arg to new_iov.
Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The blocks_left calculation does not account for free blocks
between 0 and thresh, so we could be queueing/waiting when
there are enough blocks free.
This has us add in the blocks between 0 and thresh as well as
at the end from thresh to DATA_BLOCK_BITS.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
scatter_data_area always returns 0, so stop checking
for errors.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
If we cannot setup a cmd because we run out of ring space
or global pages release the blocks before sleeping. This
prevents a deadlock where dev0 has waiting_blocks set and
needs N blocks, but dev1 to devX have each allocated N / X blocks
and also hit the global block limit so they went to sleep.
find_free_blocks is not able to take the sleeping dev's
blocks becaause their waiting_blocks is set and even
if it was not the block returned by find_last_bit could equal
dbi_max. The latter will probably never happen because
DATA_BLOCK_BITS is so high but in the next patches
DATA_BLOCK_BITS and TCMU_GLOBAL_MAX_BLOCKS will be settable so
it might be lower and could happen.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
No need for the commands_lock. The cmdr_lock is already held during
idr addition and deletion, so just grab it during traversal.
Note: This also fixes a issue where we should have been using at
least _bh locking in tcmu_handle_completions when taking the commands
lock to prevent the case where tcmu_handle_completions could be
interrupted by a timer softirq while the commands_lock is held.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This moves the expired command completion handling to
the unmap wq, so the next patch can use a mutex
in tcmu_check_expired_cmd.
Note:
tcmu_device_timedout's use of spin_lock_irq was not needed.
The commands_lock is used between thread context (tcmu_queue_cmd_ring
and tcmu_irqcontrol (even though this is named irqcontrol it is not
run in irq context)) and timer/bh context. In the timer/bh context
bhs are disabled, so you need to use the _bh lock calls from the
thread context callers.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
If the unmap thread has already run find_free_blocks
but not yet run prepare_to_wait when a wake_up(&unmap_wait)
call is done, the unmap thread is going to miss the wake
call. Instead of adding checks for if new waiters were added
this just has us use a work queue which will run us again
in this type of case.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Separate unmap_thread_fn to make it easier to read.
Note: this patch does not fix the bug where we might
miss a wake up call. The next patch will fix that.
This patch only separates the code into functions.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>