Commit Graph

288 Commits

Author SHA1 Message Date
Michael S. Tsirkin
4c5a84421c vhost: cleanup iterator update logic
Recent iterator-related changes in vhost made it
harder to follow the logic fixing up the header.
In fact, the fixup always happens at the same
offset: sizeof(virtio_net_hdr): sometimes the
fixup iterator is updated by copy_to_iter,
sometimes-by iov_iter_advance.

Rearrange code to make this obvious.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27 15:53:44 -05:00
Jason Wang
0960b6417e vhost_net: fix wrong iter offset when setting number of buffers
In commit ba7438aed9 ("vhost: don't bother copying iovecs in
handle_rx(), kill memcpy_toiovecend()"), we advance iov iter fixup
sizeof(struct virtio_net_hdr) bytes and fill the number of buffers
after doing the socket recvmsg(). This work well but was broken after
commit 6e03f896b5 ("Merge
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net") which tries
to advance sizeof(struct virtio_net_hdr_mrg_rxbuf). It will fill the
number of buffers at the wrong place. This patch fixes this.

Fixes 6e03f896b5
("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
Cc: David S. Miller <davem@davemloft.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-15 08:17:15 -08:00
David S. Miller
6e03f896b5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/vxlan.c
	drivers/vhost/net.c
	include/linux/if_vlan.h
	net/core/dev.c

The net/core/dev.c conflict was the overlap of one commit marking an
existing function static whilst another was adding a new function.

In the include/linux/if_vlan.h case, the type used for a local
variable was changed in 'net', whereas the function got rewritten
to fix a stacked vlan bug in 'net-next'.

In drivers/vhost/net.c, Al Viro's iov_iter conversions in 'net-next'
overlapped with an endainness fix for VHOST 1.0 in 'net'.

In drivers/net/vxlan.c, vxlan_find_vni() added a 'flags' parameter
in 'net-next' whereas in 'net' there was a bug fix to pass in the
correct network namespace pointer in calls to this function.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-05 14:33:28 -08:00
Michael S. Tsirkin
5201aa49b0 vhost/net: fix up num_buffers endian-ness
In virtio 1.0 mode, when mergeable buffers are enabled on a big-endian
host, num_buffers wasn't byte-swapped correctly, so large incoming
packets got corrupted.

To fix, fill it in within hdr - this also makes sure it gets
the correct type.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04 13:59:31 -08:00
Al Viro
57dd8a0735 vhost: vhost_scsi_handle_vq() should just use copy_from_user()
it has just verified that it asks no more than the length of the
first segment of iovec.

And with that the last user of stuff in lib/iovec.c is gone.
RIP.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04 01:34:16 -05:00
Al Viro
ba7438aed9 vhost: don't bother copying iovecs in handle_rx(), kill memcpy_toiovecend()
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04 01:34:16 -05:00
Al Viro
98a527aac1 vhost: don't bother with copying iovec in handle_tx()
just advance the msg.msg_iter and be done with that.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04 01:34:16 -05:00
Al Viro
aad9a1cec7 vhost: switch vhost get_indirect() to iov_iter, kill memcpy_fromiovec()
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04 01:34:15 -05:00
David S. Miller
3f3558bb51 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/xen-netfront.c

Minor overlapping changes in xen-netfront.c, mostly to do
with some buffer management changes alongside the split
of stats into TX and RX.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-15 00:53:17 -05:00
Jiri Pirko
df8a39defa net: rename vlan_tx_* helpers since "tx" is misleading there
The same macros are used for rx as well. So rename it.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-13 17:51:08 -05:00
Linus Torvalds
fb43bd08af Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull scsi target fixes from Nicholas Bellinger:
 "Mostly minor fixes this time, including:

   - Add missing virtio-scsi -> TCM attribute conversion in vhost-scsi.
   - Fix persistent reservations write exclusive handling to allow
     readers for all registered I_T nexuses.
   - Drop arbitrary maximum I/O size limit in order to process I/Os
     larger than 4 MB, required for initiators that don't honor block
     limits EVPD.
   - Drop the now left-over fabric_max_sectors attribute"

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  iscsi-target: Fix typos in enum cmd_flags_table
  MAINTAINERS: Add entry for iSER target driver
  target: Allow Write Exclusive non-reservation holders to READ
  target: Drop left-over fabric_max_sectors attribute
  target: Drop arbitrary maximum I/O size limit
  Documentation/target: Update fabric_ops to latest code
  vhost-scsi: Add missing virtio-scsi -> TCM attribute conversion
2015-01-13 15:23:26 +13:00
Michael S. Tsirkin
99975cc6ad vhost/net: length miscalculation
commit 8b38694a2d
    vhost/net: virtio 1.0 byte swap
had this chunk:
-       heads[headcount - 1].len += datalen;
+       heads[headcount - 1].len = cpu_to_vhost32(vq, len - datalen);

This adds datalen with the wrong sign, causing guest panics.

Fixes: 8b38694a2d
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Suggested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-01-07 12:22:00 +02:00
Nicholas Bellinger
4624386080 vhost-scsi: Add missing virtio-scsi -> TCM attribute conversion
While looking at hch's recent conversion to drop the MSG_*_TAG
definitions, I noticed a long standing bug in vhost-scsi where
the VIRTIO_SCSI_S_* attribute definitions where incorrectly
being passed directly into target_submit_cmd_map_sgls().

This patch adds the missing virtio-scsi to TCM/SAM task attribute
conversion.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org # 3.5
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-01-06 13:46:29 -08:00
Michael S. Tsirkin
5d9a07b0de vhost: relax used address alignment
virtio 1.0 only requires used address to be 4 byte aligned,
vhost required 8 bytes (size of vring_used_elem).
Fix up vhost to match that.

Additionally, while vhost correctly requires 8 byte
alignment for log, it's unconnected to used ring:
it's a consequence that log has u64 entries.
Tweak code to make that clearer.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-29 10:55:06 +02:00
Linus Torvalds
64ec45bff6 vhost/virtio: virtio 1.0 related fixes
Most importantly, this fixes using virtio_pci as a module.
 
 Further, the big virtio 1.0 conversion missed a couple of places. This fixes
 them up.
 
 This isn't 100% sparse-clean yet because on many architectures get_user
 triggers sparse warnings when used with __bitwise tag (when same tag is on both
 pointer and value read).
 
 I posted a patchset to fix it up by adding __force on all
 arches that don't already have it (many do), when that's
 merged these warnings will go away.
 
 Cc: Rusty Russell <rusty@rustcorp.com.au>
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJUkSztAAoJECgfDbjSjVRp5DkH/ibw+0ZaEFP/SXWnw6WONpaG
 pzMsrfMG/vxlOfutSUdDqG+oqqU2fSLvFq5qDK6Xk9/emRSwGduz29ZaxGh8J1MZ
 /Ojqtu/HSLl+UASTs+fMw49itghoIjmAPBwwMkQjvanfqLclgdz9UxzoCOc4YkO0
 PJQA/Vw6blVSP1m0p97PvzZkAiIetI2ixZn2vPJZc8vSkOHtygM9HdXKTv785HbG
 ycRbR9B3OBMvq26FIuWeyuY93FnyX2Qtf2bzwSSRdzo7qlsNhVVG7sKyWOOR+1xG
 TLmjhyTF57oA2GgZCVfgnFxsfiuIKMumfG0jbABTXmBGgA/ULef7HcF/lzLgdq8=
 =32cU
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fixes from Michael S Tsirkin:
 "virtio 1.0 related fixes

  Most importantly, this fixes using virtio_pci as a module.

  Further, the big virtio 1.0 conversion missed a couple of places.
  This fixes them up.

  This isn't 100% sparse-clean yet because on many architectures
  get_user triggers sparse warnings when used with __bitwise tag (when
  same tag is on both pointer and value read).

  I posted a patchset to fix it up by adding __force on all arches that
  don't already have it (many do), when that's merged these warnings
  will go away"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio_pci: restore module attributes
  mic/host: fix up virtio 1.0 APIs
  vringh: update for virtio 1.0 APIs
  vringh: 64 bit features
  tools/virtio: add virtio 1.0 in vringh_test
  tools/virtio: add virtio 1.0 in virtio_test
  tools/virtio: enable -Werror
  tools/virtio: 64 bit features
  tools/virtio: fix vringh test
  tools/virtio: more stubs
  virtio: core support for config generation
  virtio_pci: add VIRTIO_PCI_NO_LEGACY
  virtio_pci: move probe to common file
  virtio_pci_common.h: drop VIRTIO_PCI_NO_LEGACY
  virtio_config: fix virtio_cread_bytes
  virtio: set VIRTIO_CONFIG_S_FEATURES_OK on restore
2014-12-18 20:50:30 -08:00
Michael S. Tsirkin
b9f7ac8c72 vringh: update for virtio 1.0 APIs
When switching everything over to virtio 1.0 memory access APIs,
I missed converting vringh.
Fortunately, it's straight-forward.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-15 23:49:28 +02:00
Michael S. Tsirkin
b97a8a9006 vringh: 64 bit features
Pass u64 everywhere.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-15 23:49:23 +02:00
Linus Torvalds
70e71ca0af Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) New offloading infrastructure and example 'rocker' driver for
    offloading of switching and routing to hardware.

    This work was done by a large group of dedicated individuals, not
    limited to: Scott Feldman, Jiri Pirko, Thomas Graf, John Fastabend,
    Jamal Hadi Salim, Andy Gospodarek, Florian Fainelli, Roopa Prabhu

 2) Start making the networking operate on IOV iterators instead of
    modifying iov objects in-situ during transfers.  Thanks to Al Viro
    and Herbert Xu.

 3) A set of new netlink interfaces for the TIPC stack, from Richard
    Alpe.

 4) Remove unnecessary looping during ipv6 routing lookups, from Martin
    KaFai Lau.

 5) Add PAUSE frame generation support to gianfar driver, from Matei
    Pavaluca.

 6) Allow for larger reordering levels in TCP, which are easily
    achievable in the real world right now, from Eric Dumazet.

 7) Add a variable of napi_schedule that doesn't need to disable cpu
    interrupts, from Eric Dumazet.

 8) Use a doubly linked list to optimize neigh_parms_release(), from
    Nicolas Dichtel.

 9) Various enhancements to the kernel BPF verifier, and allow eBPF
    programs to actually be attached to sockets.  From Alexei
    Starovoitov.

10) Support TSO/LSO in sunvnet driver, from David L Stevens.

11) Allow controlling ECN usage via routing metrics, from Florian
    Westphal.

12) Remote checksum offload, from Tom Herbert.

13) Add split-header receive, BQL, and xmit_more support to amd-xgbe
    driver, from Thomas Lendacky.

14) Add MPLS support to openvswitch, from Simon Horman.

15) Support wildcard tunnel endpoints in ipv6 tunnels, from Steffen
    Klassert.

16) Do gro flushes on a per-device basis using a timer, from Eric
    Dumazet.  This tries to resolve the conflicting goals between the
    desired handling of bulk vs.  RPC-like traffic.

17) Allow userspace to ask for the CPU upon what a packet was
    received/steered, via SO_INCOMING_CPU.  From Eric Dumazet.

18) Limit GSO packets to half the current congestion window, from Eric
    Dumazet.

19) Add a generic helper so that all drivers set their RSS keys in a
    consistent way, from Eric Dumazet.

20) Add xmit_more support to enic driver, from Govindarajulu
    Varadarajan.

21) Add VLAN packet scheduler action, from Jiri Pirko.

22) Support configurable RSS hash functions via ethtool, from Eyal
    Perry.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1820 commits)
  Fix race condition between vxlan_sock_add and vxlan_sock_release
  net/macb: fix compilation warning for print_hex_dump() called with skb->mac_header
  net/mlx4: Add support for A0 steering
  net/mlx4: Refactor QUERY_PORT
  net/mlx4_core: Add explicit error message when rule doesn't meet configuration
  net/mlx4: Add A0 hybrid steering
  net/mlx4: Add mlx4_bitmap zone allocator
  net/mlx4: Add a check if there are too many reserved QPs
  net/mlx4: Change QP allocation scheme
  net/mlx4_core: Use tasklet for user-space CQ completion events
  net/mlx4_core: Mask out host side virtualization features for guests
  net/mlx4_en: Set csum level for encapsulated packets
  be2net: Export tunnel offloads only when a VxLAN tunnel is created
  gianfar: Fix dma check map error when DMA_API_DEBUG is enabled
  cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call
  net: fec: only enable mdio interrupt before phy device link up
  net: fec: clear all interrupt events to support i.MX6SX
  net: fec: reset fep link status in suspend function
  net: sock: fix access via invalid file descriptor
  net: introduce helper macro for_each_cmsghdr
  ...
2014-12-11 14:27:06 -08:00
Al Viro
c0371da604 put iov_iter into msghdr
Note that the code _using_ ->msg_iter at that point will be very
unhappy with anything other than unshifted iovec-backed iov_iter.
We still need to convert users to proper primitives.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-09 16:29:03 -05:00
Jason Wang
2e73c716ba vhost: remove unnecessary forward declarations in vhost.h
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 12:06:33 +02:00
Michael S. Tsirkin
e72fd72e26 vhost/scsi: partial virtio 1.0 support
Include all endian conversions as required by virtio 1.0.
Don't set virtio 1.0 yet, since that requires ANY_LAYOUT
which we don't yet support.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-09 12:06:31 +02:00
Michael S. Tsirkin
41e3e42108 vhost/net: enable virtio 1.0
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 12:05:30 +02:00
Michael S. Tsirkin
e4fca7d6ff vhost/net: larger header for virtio 1.0
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-09 12:05:30 +02:00
Michael S. Tsirkin
8b38694a2d vhost/net: virtio 1.0 byte swap
I had to add an explicit tag to suppress compiler warning:
gcc isn't smart enough to notice that
len is always initialized since function is called with size > 0.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-09 12:05:29 +02:00
Michael S. Tsirkin
3b1bbe8935 vhost: virtio 1.0 endian-ness support
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 12:05:29 +02:00
Michael S. Tsirkin
64f7f0510c vhost: switch to __get/__put_user exclusively
Most places in vhost can use __get/__put_user rather than
get/put_user since addresses are pre-validated.
This should be good for performance, but this also
will help make code sparse-clean: get/put_user macros
don't play well with __virtioXX bitwise tags.
Switch to get/put_user to __ variants everywhere in vhost.
There's one exception - for consistency switch that
as well, and add an explicit access_ok check.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 12:05:29 +02:00
Michael S. Tsirkin
bf99573496 vhost/net: force len for TX to host endian
vhost/net keeps a copy of the used ring in host memory but (ab)uses
the length field for internal house-keeping. This works because the
length in the used ring for tx is always 0. In order to suppress sparse
warnings, we force native endianness here.
Note that these values are never exposed to guests.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2014-12-09 12:05:29 +02:00
Michael S. Tsirkin
e05fd12b38 vhost: add memory access wrappers
Add guest memory access wrappers to handle virtio endianness
conversions.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-09 12:05:29 +02:00
Michael S. Tsirkin
bd82752a83 vhost: make features 64 bit
We need to use bit 32 for virtio 1.0.
Make vhost_has_feature bool to avoid discarding high bits.

Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2014-12-09 12:05:28 +02:00
Nicholas Bellinger
ab8edab132 vhost-scsi: Take configfs group dependency during VHOST_SCSI_SET_ENDPOINT
This patch addresses a bug where individual vhost-scsi configfs endpoint
groups can be removed from below while active exports to QEMU userspace
still exist, resulting in an OOPs.

It adds a configfs_depend_item() in vhost_scsi_set_endpoint() to obtain
an explicit dependency on se_tpg->tpg_group in order to prevent individual
vhost-scsi WWPN endpoints from being released via normal configfs methods
while an QEMU ioctl reference still exists.

Also, add matching configfs_undepend_item() in vhost_scsi_clear_endpoint()
to release the dependency, once QEMU's reference to the individual group
at /sys/kernel/config/target/vhost/$WWPN/$TPGT is released.

(Fix up vhost_scsi_clear_endpoint() error path - DanC)

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: <stable@vger.kernel.org> # 3.6+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-10-28 13:54:16 -07:00
Michael S. Tsirkin
6840444155 vhost-scsi: don't open-code kvfree
Now that we have kvfree, use it in vhost-scsi instead of
the open-coded version.

Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 09:22:48 +03:00
Romain Francoise
d04257b07f vhost-net: don't open-code kvfree
Commit 23cc5a991c ("vhost-net: extend device allocation to vmalloc")
added another open-coded version of kvfree (which is available since
v3.15-rc5), nuke it.

Signed-off-by: Romain Francoise <romain@orebokech.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 09:22:48 +03:00
Linus Torvalds
ed9ea4ed3a Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "The highlights this round include:

   - Add support for T10 PI pass-through between vhost-scsi +
     virtio-scsi (MST + Paolo + MKP + nab)
   - Add support for T10 PI in qla2xxx target mode (Quinn + MKP + hch +
     nab, merged through scsi.git)
   - Add support for percpu-ida pre-allocation in qla2xxx target code
     (Quinn + nab)
   - A number of iser-target fixes related to hardening the network
     portal shutdown path (Sagi + Slava)
   - Fix response length residual handling for a number of control CDBs
     (Roland + Christophe V.)
   - Various iscsi RFC conformance fixes in the CHAP authentication path
     (Tejas and Calsoft folks + nab)
   - Return TASK_SET_FULL status for tcm_fc(FCoE) DataIn + Response
     failures (Vasu + Jun + nab)
   - Fix long-standing ABORT_TASK + session reset hang (nab)
   - Convert iser-initiator + iser-target to include T10 bytes into EDTL
     (Sagi + Or + MKP + Mike Christie)
   - Fix NULL pointer dereference regression related to XCOPY introduced
     in v3.15 + CC'ed to v3.12.y (nab)"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (34 commits)
  target: Fix NULL pointer dereference for XCOPY in target_put_sess_cmd
  vhost-scsi: Include prot_bytes into expected data transfer length
  TARGET/sbc,loopback: Adjust command data length in case pi exists on the wire
  libiscsi, iser: Adjust data_length to include protection information
  scsi_cmnd: Introduce scsi_transfer_length helper
  target: Report correct response length for some commands
  target/sbc: Check that the LBA and number of blocks are correct in VERIFY
  target/sbc: Remove sbc_check_valid_sectors()
  Target/iscsi: Fix sendtargets response pdu for iser transport
  Target/iser: Fix a wrong dereference in case discovery session is over iser
  iscsi-target: Fix ABORT_TASK + connection reset iscsi_queue_req memory leak
  target: Use complete_all for se_cmd->t_transport_stop_comp
  target: Set CMD_T_ACTIVE bit for Task Management Requests
  target: cleanup some boolean tests
  target/spc: Simplify INQUIRY EVPD=0x80
  tcm_fc: Generate TASK_SET_FULL status for response failures
  tcm_fc: Generate TASK_SET_FULL status for DataIN failures
  iscsi-target: Reject mutual authentication with reflected CHAP_C
  iscsi-target: Remove no-op from iscsit_tpg_del_portal_group
  iscsi-target: Fix CHAP_A parameter list handling
  ...
2014-06-12 22:38:32 -07:00
Linus Torvalds
3c81bdd9e7 vhost: infrastructure changes for 3.16
This reworks vhost core dropping unnecessary RCU uses in favor of VQ mutexes
 which are used on fast path anyway.  This fixes worst-case latency for users
 which change the memory mappings a lot.
 Memory allocation for vhost-net now supports fallback on vmalloc (same as for
 vhost-scsi) this makes it possible to create the device on systems where memory
 is very fragmented, with slightly lower performance.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJTmFpNAAoJECgfDbjSjVRpD5oH/225t/P6h6kZWxzeZP+P+/Rj
 cEonwdwYRg+OOLWPnmJptK1zeWZPQhVoNxxREn3S8Zx3BN7QiBidegrMErM2x+uQ
 ZDcF0WC1mNc6rHPQ5n4N0ZItVrG8KQz5r2EYe0eKwMoy16C4rhZBY1Zj16VePMDK
 A00AqPDiUH1M7XPUQabXdRqNlxLXFoQYbrkAY+bgBeSl2qAfEbdMLW4ty7vqLOyT
 W5ZyJSBuv4BYmYh1KhmLW0WZBCleSFKkoGjHy5EZZ8fqZ63hpbvpqswl6PJeT4qW
 sq1yxyt8CBTi5lJwW3hIiiJwzDNsIc4zDhbyEfkQ3uHtOJ/SBzFptEWHun1hbRI=
 =DYAy
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull vhost infrastructure updates from Michael S. Tsirkin:
 "This reworks vhost core dropping unnecessary RCU uses in favor of VQ
  mutexes which are used on fast path anyway.  This fixes worst-case
  latency for users which change the memory mappings a lot.  Memory
  allocation for vhost-net now supports fallback on vmalloc (same as for
  vhost-scsi) this makes it possible to create the device on systems
  where memory is very fragmented, with slightly lower performance"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vhost: move memory pointer to VQs
  vhost: move acked_features to VQs
  vhost: replace rcu with mutex
  vhost-net: extend device allocation to vmalloc
2014-06-11 17:08:16 -07:00
Nicholas Bellinger
9f977ef7b6 vhost-scsi: Include prot_bytes into expected data transfer length
This patch updates vhost_scsi_get_tag() to accept the combined
expected data transfer length + T10 PI bytes as the value passed
into target_submit_cmd().

This is required now that target-core logic in commit 14ef9200
expects to subtract se_cmd->prot_length from se_cmd->data_length.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-06-11 13:06:50 -07:00
Michael S. Tsirkin
47283bef7e vhost: move memory pointer to VQs
commit 2ae76693b8bcabf370b981cd00c36cd41d33fabc
    vhost: replace rcu with mutex
replaced rcu sync for memory accesses with VQ mutex locl/unlock.
This is correct since all accesses are under VQ mutex, but incomplete:
we still do useless rcu lock/unlock operations, someone might copy this
code into some other context where this won't be right.
This use of RCU is also non standard and hard to understand.
Let's copy the pointer to each VQ structure, this way
the access rules become straight-forward, and there's
no need for RCU anymore.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-09 16:21:07 +03:00
Michael S. Tsirkin
ea16c51433 vhost: move acked_features to VQs
Refactor code to make sure features are only accessed
under VQ mutex. This makes everything simpler, no need
for RCU here anymore.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-09 16:21:06 +03:00
Michael S. Tsirkin
98f9ca0a3f vhost: replace rcu with mutex
All memory accesses are done under some VQ mutex.
So lock/unlock all VQs is a faster equivalent of synchronize_rcu()
for memory access changes.
Some guests cause a lot of these changes, so it's helpful
to make them faster.

Reported-by: "Gonglei (Arei)" <arei.gonglei@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-09 16:21:06 +03:00
Michael S. Tsirkin
23cc5a991c vhost-net: extend device allocation to vmalloc
Michael Mueller provided a patch to reduce the size of
vhost-net structure as some allocations could fail under
memory pressure/fragmentation. We are still left with
high order allocations though.

This patch is handling the problem at the core level, allowing
vhost structures to use vmalloc() if kmalloc() failed.

As vmalloc() adds overhead on a critical network path, add __GFP_REPEAT
to kzalloc() flags to do this fallback only when really needed.

People are still looking at cleaner ways to handle the problem
at the API level, probably passing in multiple iovecs.
This hack seems consistent with approaches
taken since then by drivers/vhost/scsi.c and net/core/dev.c

Based on patch by Romain Francoise.

Cc: Michael Mueller <mimu@linux.vnet.ibm.com>
Signed-off-by: Romain Francoise <romain@orebokech.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-09 16:21:05 +03:00
Nicholas Bellinger
95e7c4341b vhost/scsi: Enable T10 PI IOV -> SGL memory mapping
This patch updates vhost_scsi_handle_vq() to check for the existance
of virtio_scsi_cmd_req_pi comparing vq->iov[0].iov_len in order to
calculate seperate data + protection SGLs from data_num.

Also update tcm_vhost_submission_work() to pass the pre-allocated
cmd->tvc_prot_sgl[] memory into target_submit_cmd_map_sgls(), and
update vhost_scsi_get_tag() parameters to accept scsi_tag, lun, and
task_attr.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Sagi Grimberg <sagig@dev.mellanox.co.il>
Cc: H. Peter Anvin <hpa@zytor.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-06-02 12:42:14 -07:00
Nicholas Bellinger
e31885dd90 vhost/scsi: Add T10 PI IOV -> SGL memory mapping logic
This patch adds vhost_scsi_map_iov_to_prot() to perform the mapping of
T10 data integrity memory between virtio iov + struct scatterlist using
get_user_pages_fast() following existing code.

As with vhost_scsi_map_iov_to_sgl(), this does sanity checks against the
total prot_sgl_count vs. pre-allocated SGLs, and loops across protection
iovs using vhost_scsi_map_to_sgl() to perform the actual memory mapping.

Also update tcm_vhost_release_cmd() to release associated tvc_prot_sgl[]
struct page.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Sagi Grimberg <sagig@dev.mellanox.co.il>
Cc: H. Peter Anvin <hpa@zytor.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-06-02 12:42:07 -07:00
Nicholas Bellinger
b1935f687b vhost/scsi: Add preallocation of protection SGLs
This patch updates tcm_vhost_make_nexus() to pre-allocate per descriptor
tcm_vhost_cmd->tvc_prot_sgl[] used to expose protection SGLs from within
virtio-scsi guest memory to vhost-scsi.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-06-02 12:42:01 -07:00
Nicholas Bellinger
5a01d08217 vhost/scsi: Move sanity check into vhost_scsi_map_iov_to_sgl
Move the overflow check for sgl_count > TCM_VHOST_PREALLOC_SGLS into
vhost_scsi_map_iov_to_sgl() so that it's based on the total number
of SGLs for all IOVs, instead of single IOVs.

Also, rename TCM_VHOST_PREALLOC_PAGES -> TCM_VHOST_PREALLOC_UPAGES
to better describe pointers to user-space pages.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-06-02 12:41:54 -07:00
Peter Zijlstra
4e857c58ef arch: Mass conversion of smp_mb__*()
Mostly scripted conversion of the smp_mb__* barriers.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-04-18 14:20:48 +02:00
Linus Torvalds
141eaccd01 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "Here are the target pending updates for v3.15-rc1.  Apologies in
  advance for waiting until the second to last day of the merge window
  to send these out.

  The highlights this round include:

   - iser-target support for T10 PI (DIF) offloads (Sagi + Or)
   - Fix Task Aborted Status (TAS) handling in target-core (Alex Leung)
   - Pass in transport supported PI at session initialization (Sagi + MKP + nab)
   - Add WRITE_INSERT + READ_STRIP T10 PI support in target-core (nab + Sagi)
   - Fix iscsi-target ERL=2 ASYNC_EVENT connection pointer bug (nab)
   - Fix tcm_fc use-after-free of ft_tpg (Andy Grover)
   - Use correct ib_sg_dma primitives in ib_isert (Mike Marciniszyn)

  Also, note the virtio-scsi + vhost-scsi changes to expose T10 PI
  metadata into KVM guest have been left-out for now, as there where a
  few comments from MST + Paolo that where not able to be addressed in
  time for v3.15.  Please expect this feature for v3.16-rc1"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (43 commits)
  ib_srpt: Use correct ib_sg_dma primitives
  target/tcm_fc: Rename ft_tport_create to ft_tport_get
  target/tcm_fc: Rename ft_{add,del}_lport to {add,del}_wwn
  target/tcm_fc: Rename structs and list members for clarity
  target/tcm_fc: Limit to 1 TPG per wwn
  target/tcm_fc: Don't export ft_lport_list
  target/tcm_fc: Fix use-after-free of ft_tpg
  target: Add check to prevent Abort Task from aborting itself
  target: Enable READ_STRIP emulation in target_complete_ok_work
  target/sbc: Add sbc_dif_read_strip software emulation
  target: Enable WRITE_INSERT emulation in target_execute_cmd
  target/sbc: Add sbc_dif_generate software emulation
  target/sbc: Only expose PI read_cap16 bits when supported by fabric
  target/spc: Only expose PI mode page bits when supported by fabric
  target/spc: Only expose PI inquiry bits when supported by fabric
  target: Pass in transport supported PI at session initialization
  target/iblock: Fix double bioset_integrity_free bug
  Target/sbc: Initialize COMPARE_AND_WRITE write_sg scatterlist
  target/rd: T10-Dif: RAM disk is allocating more space than required.
  iscsi-target: Fix ERL=2 ASYNC_EVENT connection pointer bug
  ...
2014-04-12 16:51:08 -07:00
Nicholas Bellinger
e70beee783 target: Pass in transport supported PI at session initialization
In order to support local WRITE_INSERT + READ_STRIP operations for
non PI enabled fabrics, the fabric driver needs to be able signal
what protection offload operations are supported.

This is done at session initialization time so the modes can be
signaled by individual se_wwn + se_portal_group endpoints, as well
as optionally across different transports on the same endpoint.

For iser-target, set TARGET_PROT_ALL if the underlying ib_device
has already signaled PI offload support, and allow this to be
exposed via a new iscsit_transport->iscsit_get_sup_prot_ops()
callback.

For loopback, set TARGET_PROT_ALL to signal SCSI initiator mode
operation.

For all other drivers, set TARGET_PROT_NORMAL to disable fabric
level PI.

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Quinn Tran <quinn.tran@qlogic.com>
Cc: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-04-07 01:48:54 -07:00
Nicholas Bellinger
131e6abc67 target: Add TFO->abort_task for aborted task resources release
Now that TASK_ABORTED status is not generated for all cases by
TMR ABORT_TASK + LUN_RESET, a new TFO->abort_task() caller is
necessary in order to give fabric drivers a chance to unmap
hardware / software resources before the se_cmd descriptor is
released via the normal TFO->release_cmd() codepath.

This patch adds TFO->aborted_task() in core_tmr_abort_task()
in place of the original transport_send_task_abort(), and
also updates all fabric drivers to implement this caller.

The fabric drivers that include changes to perform cleanup
via ->aborted_task() are:

  - iscsi-target
  - iser-target
  - srpt
  - tcm_qla2xxx

The fabric drivers that currently set ->aborted_task() to
NOPs are:

  - loopback
  - tcm_fc
  - usb-gadget
  - sbp-target
  - vhost-scsi

For the latter five, there appears to be no additional cleanup
required before invoking TFO->release_cmd() to release the
se_cmd descriptor.

v2 changes:
  - Move ->aborted_task() call into transport_cmd_finish_abort (Alex)

Cc: Alex Leung <amleung21@yahoo.com>
Cc: Mark Rustad <mark.d.rustad@intel.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Vu Pham <vu@mellanox.com>
Cc: Chris Boot <bootc@bootc.net>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Giridhar Malavali <giridhar.malavali@qlogic.com>
Cc: Saurav Kashyap <saurav.kashyap@qlogic.com>
Cc: Quinn Tran <quinn.tran@qlogic.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-04-07 01:48:51 -07:00
Al Viro
09aaacf02a vhost: don't open-code sockfd_put()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01 23:19:10 -04:00
Michael S. Tsirkin
a39ee449f9 vhost: validate vhost_get_vq_desc return value
vhost fails to validate negative error code
from vhost_get_vq_desc causing
a crash: we are using -EFAULT which is 0xfffffff2
as vector size, which exceeds the allocated size.

The code in question was introduced in commit
8dd014adfe
    vhost-net: mergeable buffers support

CVE-2014-0055

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-28 16:10:35 -04:00
Michael S. Tsirkin
d8316f3991 vhost: fix total length when packets are too short
When mergeable buffers are disabled, and the
incoming packet is too large for the rx buffer,
get_rx_bufs returns success.

This was intentional in order for make recvmsg
truncate the packet and then handle_rx would
detect err != sock_len and drop it.

Unfortunately we pass the original sock_len to
recvmsg - which means we use parts of iov not fully
validated.

Fix this up by detecting this overrun and doing packet drop
immediately.

CVE-2014-0077

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-28 16:07:31 -04:00