Commit Graph

726583 Commits

Author SHA1 Message Date
Jakub Kicinski
a0d8637f0f i40e: use tc_cls_can_offload_and_chain0()
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 21:23:09 -05:00
Jakub Kicinski
a60c3fd64f ixgbe: use tc_cls_can_offload_and_chain0()
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 21:23:08 -05:00
Jakub Kicinski
312324f124 bnxt: use tc_cls_can_offload_and_chain0()
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 21:23:08 -05:00
Jakub Kicinski
9ab88e83fd mlx5: use tc_cls_can_offload_and_chain0()
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 21:23:08 -05:00
Jakub Kicinski
2a84bbafc0 cxgb4: use tc_cls_can_offload_and_chain0()
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 21:23:08 -05:00
Jakub Kicinski
3107fdc8b2 nfp: use tc_cls_can_offload_and_chain0()
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 21:23:08 -05:00
Jakub Kicinski
a2b212a507 netdevsim: use tc_cls_can_offload_and_chain0()
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 21:23:08 -05:00
Jakub Kicinski
878db9f0f2 pkt_cls: add new tc cls helper to check offload flag and chain index
Very few (mlxsw) upstream drivers seem to allow offload of chains
other than 0.  Save driver developers typing and add a helper for
checking both if ethtool's TC offload flag is on and if chain is 0.
This helper will set the extack appropriately in both error cases.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 21:23:07 -05:00
Mickaël Salaün
9c147b56fc bpf: Use the IS_FD_ARRAY() macro in map_update_elem()
Make the code more readable.

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25 18:05:24 -08:00
Linus Torvalds
993ca2068b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
 "The main item is that we try to better handle the newer trackpoints on
  Lenovo devices that are now being produced by Elan/ALPS/NXP and only
  implement a small subset of the original IBM trackpoint controls"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Revert "Input: synaptics_rmi4 - use devm_device_add_group() for attributes in F01"
  Input: trackpoint - only expose supported controls for Elan, ALPS and NXP
  Input: trackpoint - force 3 buttons if 0 button is reported
  Input: xpad - add support for PDP Xbox One controllers
  Input: stmfts,s6sy671 - add SPDX identifier
2018-01-25 17:30:47 -08:00
Martin Brandenburg
6793f1c450 orangefs: fix deadlock; do not write i_size in read_iter
After do_readv_writev, the inode cache is invalidated anyway, so i_size
will never be read.  It will be fetched from the server which will also
know about updates from other machines.

Fixes deadlock on 32-bit SMP.

See https://marc.info/?l=linux-fsdevel&m=151268557427760&w=2

Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Mike Marshall <hubcap@omnibond.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-25 17:26:24 -08:00
Alexei Starovoitov
82f1e0f3ac Merge branch 'bpf-more-sock_ops-callbacks'
Lawrence Brakmo says:

====================
This patchset adds support for:

- direct R or R/W access to many tcp_sock fields
- passing up to 4 arguments to sock_ops BPF functions
- tcp_sock field bpf_sock_ops_cb_flags for controlling callbacks
- optionally calling sock_ops BPF program when RTO fires
- optionally calling sock_ops BPF program when packet is retransmitted
- optionally calling sock_ops BPF program when TCP state changes
- access to tclass and sk_txhash
- new selftest

v2: Fixed commit message 0/11. The commit is to "bpf-next" but the patch
    below used "bpf" and Patchwork didn't work correctly.
v3: Cleaned RTO callback as per  Yuchung's comment
    Added BPF enum for TCP states as per  Alexei's comment
v4: Fixed compile warnings related to detecting changes between TCP
    internal states and the BPF defined states.
v5: Fixed comment issues in some selftest files
    Fixed accesss issue with u64 fields in bpf_sock_ops struct
v6: Made fixes based on comments form Eric Dumazet:
    The field bpf_sock_ops_cb_flags was addded in a hole on 64bit kernels
    Field bpf_sock_ops_cb_flags is now set through a helper function
    which returns an error when a BPF program tries to set bits for
    callbacks that are not supported in the current kernel.
    Added a comment indicating that when adding fields to bpf_sock_ops_kern
    they should be added before the field named "temp" if they need to be
    cleared before calling the BPF function.
v7: Enfornced fields "op" and "replylong[1] .. replylong[3]" not be writable
    based on comments form Eric Dumazet and Alexei Starovoitov.
    Filled 32 bit hole in bpf_sock_ops struct with sk_txhash based on
    comments from Daniel Borkmann.
    Removed unused functions (tcp_call_bpf_1arg, tcp_call_bpf_4arg) based
    on comments from Daniel Borkmann.
v8: Add commit message 00/12
    Add Acked-by as appropriate
v9: Moved the bug fix to the front of the patchset
    Changed RETRANS_CB so it is always called (before it was only called if
    the retransmit succeeded). It is now called with an extra argument, the
    return value of tcp_transmit_skb (0 => success). Based on comments
    from Yuchung Cheng.
    Added support for reading 2 new fields, sacked_out and lost_out, based on
    comments from Yuchung Cheng.
v10: Moved the callback flags from include/uapi/linux/tcp.h to
     include/uapi/linux/bpf.h
     Cleaned up the test in selftest. Added a timeout so it always completes,
     even if the client is not communicating with the server. Made it faster
     by removing the sleeps. Made sure it works even when called back-to-back
     20 times.

Consists of the following patches:
[PATCH bpf-next v10 01/12] bpf: Only reply field should be writeable
[PATCH bpf-next v10 02/12] bpf: Make SOCK_OPS_GET_TCP size
[PATCH bpf-next v10 03/12] bpf: Make SOCK_OPS_GET_TCP struct
[PATCH bpf-next v10 04/12] bpf: Add write access to tcp_sock and sock
[PATCH bpf-next v10 05/12] bpf: Support passing args to sock_ops bpf
[PATCH bpf-next v10 06/12] bpf: Adds field bpf_sock_ops_cb_flags to
[PATCH bpf-next v10 07/12] bpf: Add sock_ops RTO callback
[PATCH bpf-next v10 08/12] bpf: Add support for reading sk_state and
[PATCH bpf-next v10 09/12] bpf: Add sock_ops R/W access to tclass
[PATCH bpf-next v10 10/12] bpf: Add BPF_SOCK_OPS_RETRANS_CB
[PATCH bpf-next v10 11/12] bpf: Add BPF_SOCK_OPS_STATE_CB
[PATCH bpf-next v10 12/12] bpf: add selftest for tcpbpf
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25 16:41:16 -08:00
Lawrence Brakmo
d6d4f60c3a bpf: add selftest for tcpbpf
Added a selftest for tcpbpf (sock_ops) that checks that the appropriate
callbacks occured and that it can access tcp_sock fields and that their
values are correct.

Run with command: ./test_tcpbpf_user
Adding the flag "-d" will show why it did not pass.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25 16:41:15 -08:00
Lawrence Brakmo
d44874910a bpf: Add BPF_SOCK_OPS_STATE_CB
Adds support for calling sock_ops BPF program when there is a TCP state
change. Two arguments are used; one for the old state and another for
the new state.

There is a new enum in include/uapi/linux/bpf.h that exports the TCP
states that prepends BPF_ to the current TCP state names. If it is ever
necessary to change the internal TCP state values (other than adding
more to the end), then it will become necessary to convert from the
internal TCP state value to the BPF value before calling the BPF
sock_ops function. There are a set of compile checks added in tcp.c
to detect if the internal and BPF values differ so we can make the
necessary fixes.

New op: BPF_SOCK_OPS_STATE_CB.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25 16:41:15 -08:00
Lawrence Brakmo
a31ad29e6a bpf: Add BPF_SOCK_OPS_RETRANS_CB
Adds support for calling sock_ops BPF program when there is a
retransmission. Three arguments are used; one for the sequence number,
another for the number of segments retransmitted, and the last one for
the return value of tcp_transmit_skb (0 => success).
Does not include syn-ack retransmissions.

New op: BPF_SOCK_OPS_RETRANS_CB.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25 16:41:14 -08:00
Lawrence Brakmo
6f9bd3d731 bpf: Add sock_ops R/W access to tclass
Adds direct write access to sk_txhash and access to tclass for ipv6
flows through getsockopt and setsockopt. Sample usage for tclass:

  bpf_getsockopt(skops, SOL_IPV6, IPV6_TCLASS, &v, sizeof(v))

where skops is a pointer to the ctx (struct bpf_sock_ops).

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25 16:41:14 -08:00
Lawrence Brakmo
44f0e43037 bpf: Add support for reading sk_state and more
Add support for reading many more tcp_sock fields

  state,	same as sk->sk_state
  rtt_min	same as sk->rtt_min.s[0].v (current rtt_min)
  snd_ssthresh
  rcv_nxt
  snd_nxt
  snd_una
  mss_cache
  ecn_flags
  rate_delivered
  rate_interval_us
  packets_out
  retrans_out
  total_retrans
  segs_in
  data_segs_in
  segs_out
  data_segs_out
  lost_out
  sacked_out
  sk_txhash
  bytes_received (__u64)
  bytes_acked    (__u64)

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25 16:41:14 -08:00
Lawrence Brakmo
f89013f66d bpf: Add sock_ops RTO callback
Adds an optional call to sock_ops BPF program based on whether the
BPF_SOCK_OPS_RTO_CB_FLAG is set in bpf_sock_ops_flags.
The BPF program is passed 2 arguments: icsk_retransmits and whether the
RTO has expired.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25 16:41:14 -08:00
Lawrence Brakmo
b13d880721 bpf: Adds field bpf_sock_ops_cb_flags to tcp_sock
Adds field bpf_sock_ops_cb_flags to tcp_sock and bpf_sock_ops. Its primary
use is to determine if there should be calls to sock_ops bpf program at
various points in the TCP code. The field is initialized to zero,
disabling the calls. A sock_ops BPF program can set it, per connection and
as necessary, when the connection is established.

It also adds support for reading and writting the field within a
sock_ops BPF program. Reading is done by accessing the field directly.
However, writing is done through the helper function
bpf_sock_ops_cb_flags_set, in order to return an error if a BPF program
is trying to set a callback that is not supported in the current kernel
(i.e. running an older kernel). The helper function returns 0 if it was
able to set all of the bits set in the argument, a positive number
containing the bits that could not be set, or -EINVAL if the socket is
not a full TCP socket.

Examples of where one could call the bpf program:

1) When RTO fires
2) When a packet is retransmitted
3) When the connection terminates
4) When a packet is sent
5) When a packet is received

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25 16:41:14 -08:00
Lawrence Brakmo
de525be2ca bpf: Support passing args to sock_ops bpf function
Adds support for passing up to 4 arguments to sock_ops bpf functions. It
reusues the reply union, so the bpf_sock_ops structures are not
increased in size.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25 16:41:14 -08:00
Lawrence Brakmo
b73042b8a2 bpf: Add write access to tcp_sock and sock fields
This patch adds a macro, SOCK_OPS_SET_FIELD, for writing to
struct tcp_sock or struct sock fields. This required adding a new
field "temp" to struct bpf_sock_ops_kern for temporary storage that
is used by sock_ops_convert_ctx_access. It is used to store and recover
the contents of a register, so the register can be used to store the
address of the sk. Since we cannot overwrite the dst_reg because it
contains the pointer to ctx, nor the src_reg since it contains the value
we want to store, we need an extra register to contain the address
of the sk.

Also adds the macro SOCK_OPS_GET_OR_SET_FIELD that calls one of the
GET or SET macros depending on the value of the TYPE field.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25 16:41:14 -08:00
Lawrence Brakmo
34d367c592 bpf: Make SOCK_OPS_GET_TCP struct independent
Changed SOCK_OPS_GET_TCP to SOCK_OPS_GET_FIELD and added 2
arguments so now it can also work with struct sock fields.
The first argument is the name of the field in the bpf_sock_ops
struct, the 2nd argument is the name of the field in the OBJ struct.

Previous: SOCK_OPS_GET_TCP(FIELD_NAME)
New:      SOCK_OPS_GET_FIELD(BPF_FIELD, OBJ_FIELD, OBJ)

Where OBJ is either "struct tcp_sock" or "struct sock" (without
quotation). BPF_FIELD is the name of the field in the bpf_sock_ops
struct and OBJ_FIELD is the name of the field in the OBJ struct.

Although the field names are currently the same, the kernel struct names
could change in the future and this change makes it easier to support
that.

Note that adding access to tcp_sock fields in sock_ops programs does
not preclude the tcp_sock fields from being removed as long as we are
willing to do one of the following:

  1) Return a fixed value (e.x. 0 or 0xffffffff), or
  2) Make the verifier fail if that field is accessed (i.e. program
    fails to load) so the user will know that field is no longer
    supported.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25 16:41:13 -08:00
Lawrence Brakmo
a33de39734 bpf: Make SOCK_OPS_GET_TCP size independent
Make SOCK_OPS_GET_TCP helper macro size independent (before only worked
with 4-byte fields.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25 16:41:13 -08:00
Lawrence Brakmo
2585cd62f0 bpf: Only reply field should be writeable
Currently, a sock_ops BPF program can write the op field and all the
reply fields (reply and replylong). This is a bug. The op field should
not have been writeable and there is currently no way to use replylong
field for indices >= 1. This patch enforces that only the reply field
(which equals replylong[0]) is writeable.

Fixes: 40304b2a15 ("bpf: BPF support for sock_ops")
Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25 16:41:13 -08:00
Lyude Paul
0fd189a95f drm/nouveau: Move irq setup/teardown to pci ctor/dtor
For a while we've been having issues with seemingly random interrupts
coming from nvidia cards when resuming them. Originally the fix for this
was thought to be just re-arming the MSI interrupt registers right after
re-allocating our IRQs, however it seems a lot of what we do is both
wrong and not even nessecary.

This was made apparent by what appeared to be a regression in the
mainline kernel that started introducing suspend/resume issues for
nouveau:

        a0c9259dc4 (irq/matrix: Spread interrupts on allocation)

After this commit was introduced, we started getting interrupts from the
GPU before we actually re-allocated our own IRQ (see references below)
and assigned the IRQ handler. Investigating this turned out that the
problem was not with the commit, but the fact that nouveau even
free/allocates it's irqs before and after suspend/resume.

For starters: drivers in the linux kernel haven't had to handle
freeing/re-allocating their IRQs during suspend/resume cycles for quite
a while now. Nouveau seems to be one of the few drivers left that still
does this, despite the fact there's no reason we actually need to since
disabling interrupts from the device side should be enough, as the
kernel is already smart enough to know to disable host-side interrupts
for us before going into suspend. Since we were tearing down our IRQs by
hand however, that means there was a short period during resume where
interrupts could be received before we re-allocated our IRQ which would
lead to us getting an unhandled IRQ. Since we never handle said IRQ and
re-arm the interrupt registers, this would cause us to miss all of the
interrupts from the GPU and cause our init process to start timing out
on anything requiring interrupts.

So, since this whole setup/teardown every suspend/resume cycle is
useless anyway, move irq setup/teardown into the pci subdev's ctor/dtor
functions instead so they're only called at driver load and driver
unload. This should fix most of the issues with pending interrupts on
resume, along with getting suspend/resume for nouveau to work again.

As well, this probably means we can also just remove the msi rearm call
inside nvkm_pci_init(). But since our main focus here is to fix
suspend/resume before 4.15, we'll save that for a later patch.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Mike Galbraith <efault@gmx.de>
Cc: stable@vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-01-26 09:44:39 +10:00
Rohit Visavalia
fdd6d771c7 qed: code indent should use tabs where possible
Issue found by checkpatch.

Signed-off-by: Rohit Visavalia <rohit.visavalia@softnautics.com>
Acked-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:36:54 -05:00
Rohit Visavalia
5f834cf4b7 be2net: networking block comments don't use an empty /* line
Resolved Warning: networking block comments don't use an empty /* line,
use /* Comment...
Issue found by checkpatch.

Signed-off-by: Rohit Visavalia <rohit.visavalia@softnautics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:34:27 -05:00
David S. Miller
525d0ae7a2 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2018-01-25

Here's one last bluetooth-next pull request for the 4.16 kernel:

 - Improved support for Intel controllers
 - New set_parity method to serdev (agreed with maintainers to be taken
   through bluetooth-next)
 - Fix error path in hci_bcm (missing call to serdev close)
 - New ID for BCM4343A0 UART controller

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:32:28 -05:00
Ganesh Goudar
d9ac2d9978 cxgb4: fix possible deadlock
t4_wr_mbox_meat_timeout() can be called from both softirq
context and process context, hence protect the mbox with
spin_lock_bh() instead of simple spin_lock()

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:30:54 -05:00
Nicolas Dichtel
f15ca723c1 net: don't call update_pmtu unconditionally
Some dst_ops (e.g. md_dst_ops)) doesn't set this handler. It may result to:
"BUG: unable to handle kernel NULL pointer dereference at           (null)"

Let's add a helper to check if update_pmtu is available before calling it.

Fixes: 52a589d51f ("geneve: update skb dst pmtu on tx path")
Fixes: a93bf0ff44 ("vxlan: update skb dst pmtu on tx path")
CC: Roman Kapl <code@rkapl.cz>
CC: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:27:34 -05:00
David Ahern
955ec4cb3b net/ipv6: Do not allow route add with a device that is down
IPv6 allows routes to be installed when the device is not up (admin up).
Worse, it does not mark it as LINKDOWN. IPv4 does not allow it and really
there is no reason for IPv6 to allow it, so check the flags and deny if
device is admin down.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:22:02 -05:00
David S. Miller
bfd4b329ac Merge branch 'net-smc-more-socket-closing-improvements'
Ursula Braun says:

====================
net/smc: more socket closing improvements

these patches improve the smc behavior for abnormal socket closing.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:10:43 -05:00
Ursula Braun
1a0a04c7a8 net/smc: check for healthy link group resp. connections
If a problem for at least one connection of a link group is detected,
the whole link group and all its connections are terminated.
This patch adds a check for healthy link group when trying to reserve
a work request, and checks for healthy connections before starting
a tx worker.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:10:42 -05:00
Ursula Braun
732720fafd net/smc: wake up wr_reg_wait when terminating a link group
If a new connection with a new rmb is added to a link group, its
memory region is registered. If a link group is terminated, a pending
registration requires a wake up.

And consolidate setting of tx_flag peer_conn_abort in smc_lgr_terminate().

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:10:42 -05:00
Ursula Braun
610db66f37 net/smc: do not reuse a linkgroup with setup problems
Once a linkgroup is created successfully, it stays alive for a
certain time to service more connections potentially created.
If one of the initialization steps for a new linkgroup fails,
the linkgroup should not be reused by other connections following.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:10:42 -05:00
Ursula Braun
b4772b3a87 net/smc: terminate link group for ib_post_send problems
If ib_post_send() fails, terminate all connections of this
link group.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:10:42 -05:00
Ursula Braun
5ac92a00aa net/smc: handle state SMC_PEERFINCLOSEWAIT correctly
A state transition from closing state SMC_PEERFINCLOSEWAIT to closing
state SMC_APPFINCLOSEWAIT is not allowed. Once a closing indication
from the peer has been received, the socket reaches state SMC_CLOSED.

And receiving a peer_conn_abort just changes the state of the socket
into one of the states SMC_PROCESSABORT or SMC_CLOSED;
sending a peer_conn_abort occurs in smc_close_active() for state
SMC_PROCESSABORT only.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:10:42 -05:00
Ursula Braun
611b63a127 net/smc: cancel tx worker in case of socket aborts
If an SMC socket is aborted, the tx worker should be cancelled.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:10:42 -05:00
David S. Miller
2611df7a79 Merge branch 'sfc-support-PTP-on-8000-and-X2000-series-NICs'
Edward Cree says:

====================
sfc: support PTP on 8000 and X2000 series NICs

Starting from the 8000-series (Medford 1), SFC NICs can timestamp TX packets
 sent through an ordinary DMA queue, rather than a special control-plane
 operation as in the 7000-series.  Patches 2-8 implement support for this.
The X2000-series (Medford 2) changes the format of timestamps, from seconds+
 (2^27)ths to seconds + quarter nanoseconds, as well as changing the shift
 of the frequency adjustment for increased precision.  Patches 9-12
 implement support for these changes.
Patch #1 is an unrelated fix for NAPI budget handling, needed in order for
 TX completion changes in the later patches to apply cleanly.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:05:15 -05:00
Laurence Evans
88a4fb5fce sfc: support Medford2 frequency adjustment format
Support increased precision frequency adjustment format (FP44) used
 by Medford2 adapters.

Signed-off-by: Laurence Evans <levans@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:05:15 -05:00
Edward Cree
1280c0f8aa sfc: support second + quarter ns time format for receive datapath
The time_format that we stash in the PTP data structure is never
 referenced, so we can remove it.  Instead, store the information needed
 to interpret sync event timestamps.
Also rolls in a couple of other related minor PTP fixes.

Based on patches by Bert Kenward <bkenward@solarflare.com> and Laurence
 Evans <levans@solarflare.com>.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:05:15 -05:00
Laurence Evans
04796f4c4d sfc: support separate PTP and general timestamping
Support MC_CMD_PTP_OUT_GET_TIMESTAMP_CORRECTIONS_V2.  Extract general
 timestamp corrections in addition to PTP corrections.  Apply receive
 timestamp corrections for general datapath receive timestamping, and
 correspondingly for transmit.

Signed-off-by: Laurence Evans <levans@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:05:14 -05:00
Laurence Evans
c4f64fcc4d sfc: simplify RX datapath timestamping
Use timestamp conversion function with correction to avoid duplicate
 correction handling.

Signed-off-by: Laurence Evans <levans@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:05:14 -05:00
Martin Habets
6aa47c87cb sfc: only advertise TX timestamping if we have the license for it
We check the license for TX hardware timestamping capability.
The PTP probe will have enabled PTP sync events from the adapter.  If
 later, at TX queue init, it turns out we do not have the license, we
 don't need the sync events either.

Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:05:14 -05:00
Edward Cree
2935e3c382 sfc: on 8000 series use TX queues for TX timestamps
For this we create and use one or more new TX queues on the PTP channel,
 and enable sync events for it.
Based on a patch by Martin Habets <mhabets@solarflare.com>.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:05:14 -05:00
Martin Habets
c1d0d33946 sfc: MAC TX timestamp handling on the 8000 series
TX timestamps on 8000 series are supplied from the MAC. This timestamp is
 only 48 bits long. The high order bits from the last time sync event are
 used for the top 16 bits.

Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:05:14 -05:00
Martin Habets
50663fe180 sfc: only enable TX timestamping if the adapter is licensed for it
If we try to enable the feature and do not have the license for it, the
 MCPU will refuse and fail our TX queue init.

Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:05:14 -05:00
Martin Habets
23418dc131 sfc: use main datapath for HW timestamps if available
We can now transmit SKBs in 2 ways:
1. Via the MC (for the 7XXX series and earlier), using
   efx_ptp_xmit_skb_mc().
2. Via the TX queues on the dedicated PTP channel (8XXX series and later),
   using efx_ptp_xmit_skb_queue().
The PTP worker thread uses the method set up at probe time. It never
 checked the return code from the old efx_ptp_xmit_skb(), so it now
 returns void.
We increment the TX dropped counter of the device if the transmit fails.

As a result of the probe per channel the remove gets called multiple times.
 Clean up efx->ptp_data properly to avoid the 2nd call blowing up.

Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:05:14 -05:00
Martin Habets
9c3afb33ae sfc: add function to determine which TX timestamping method to use
Use MC capability MC_CMD_GET_CAPABILITIES_V2_OUT_TX_MAC_TIMESTAMPING to
 detect whether the NIC supports timestamping packets sent out the main
 datapath.

Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:05:14 -05:00
Martin Habets
b9b603d46d sfc: handle TX timestamps in the normal data path
Before this work, TX timestamping is done by sending each SKB to the MC.
On the 8000 series (Medford1) we have high speed timestamping via the
 MAC, which means we can use normal TX queues for this without a
 significant drop in bandwidth.  On the X2000 series (Medford2) support
 for transmitting via the MC is removed, so the new way must be used.

This patch enables timestamping on a TX queue, if requested.
It also enhances TX event handling to process the extra completion events,
 and puts the time in the SKB.

Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:05:13 -05:00