Commit Graph

581 Commits

Author SHA1 Message Date
Bert Kenward
09a04204f0 sfc: Downgrade or remove some error messages
Depending on configuration the NIC may return errors for unprivileged
functions and/or VFs. Where these are expected and handled, reduce the
level of any output.

Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-23 22:06:39 -05:00
Tomáš Pilař
8c578368e8 sfc: Downgrade EPERM messages from MCDI to debug
When running in an unprivileged function we expect some MC commands
to fail with permission errors. To avoid log spew downgrade these to
debug only.

Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-23 22:06:39 -05:00
Bert Kenward
e65a510918 sfc: Make failed filter removal less noisy
There are situations - mostly reset related - where our view of the
filter table differs from the hardware. In this case we may try and
remove filters that aren't actually installed. This isn't that
interesting in most situations, so downgrade the logging.

Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-23 22:06:39 -05:00
Bert Kenward
acd43a9097 sfc: Handle MCDI proxy authorisation
For unprivileged functions operations can be authorised by an admin
function. Extra steps are introduced to the MCDI protocol in this
situation - the initial response from the MCDI tells us that the
operation has been deferred, and we must retry when told. We then
receive an event telling us to retry.

Note that this provides only the functionality for the unprivileged
functions, not the handling of the administrative side.

Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-23 22:06:39 -05:00
Bert Kenward
ac28d179b8 sfc: Retry MCDI after NO_EVB_PORT error on a VF
After reboot the vswitch configuration from the PF may not be
complete before the VF attempts to restore filters. In that
case we see NO_EVB_PORT errors from the MC. Retry up to a time
limit or until a different result is seen.

Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-23 22:06:39 -05:00
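The retry policy described in ac28d179b8 ("retry up to a time limit or until a different result is seen") can be sketched in plain C. This is only an illustration under stated assumptions: `ERR_NO_EVB_PORT`, the callback types, and the `fake_mc` test double are hypothetical names, not the driver's MCDI code.

```c
#include <assert.h>
#include <stddef.h>

#define ERR_NO_EVB_PORT 1000	/* hypothetical stand-in for the MC error code */

typedef int (*cmd_fn)(void *ctx);		/* issues one command, returns 0 or -err */
typedef unsigned long (*clock_fn)(void *ctx);	/* returns the current time in ticks */

/* Reissue the command while it keeps failing with NO_EVB_PORT, and stop
 * once the time budget is spent or any other result is seen. */
int retry_while_no_evb_port(cmd_fn cmd, clock_fn now, void *ctx,
			    unsigned long timeout)
{
	unsigned long start;
	int rc;

	assert(cmd != NULL && now != NULL);
	start = now(ctx);
	do {
		rc = cmd(ctx);
	} while (rc == -ERR_NO_EVB_PORT && now(ctx) - start < timeout);
	return rc;
}

/* Test double: fails a fixed number of times; time advances by one tick
 * on every clock read. */
struct fake_mc {
	int failures_left;
	unsigned long ticks;
};

int fake_cmd(void *ctx)
{
	struct fake_mc *mc = ctx;

	if (mc->failures_left > 0) {
		mc->failures_left--;
		return -ERR_NO_EVB_PORT;
	}
	return 0;
}

unsigned long fake_now(void *ctx)
{
	struct fake_mc *mc = ctx;

	return mc->ticks++;
}
```

The last error is returned unchanged on timeout, so a caller can still tell that the port never appeared.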
David S. Miller
b3e0d3d7ba Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/geneve.c

Here we had an overlapping change, where in 'net' the extraneous stats
bump was being removed whilst in 'net-next' the final argument to
udp_tunnel6_xmit_skb() was being changed.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-17 22:08:28 -05:00
Tom Herbert
c8cd0989bd net: Eliminate NETIF_F_GEN_CSUM and NETIF_F_V[46]_CSUM
These netif flags are unnecessary convolutions. It is more
straightforward to just use NETIF_F_HW_CSUM, NETIF_F_IP_CSUM,
and NETIF_F_IPV6_CSUM directly.

This patch also:
    - Cleans up can_checksum_protocol
    - Simplifies netdev_intersect_features

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 16:50:20 -05:00
Tom Herbert
a188222b6e net: Rename NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK
The name NETIF_F_ALL_CSUM is a misnomer. This does not correspond to the
set of features for offloading all checksums. This is a mask of the
checksum offload related features bits. It is incorrect to set both
NETIF_F_HW_CSUM and NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM at the same time for
features of a device.

This patch:
  - Changes instances of NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK (where
    NETIF_F_ALL_CSUM is being used as a mask).
  - Changes bonding, sfc/efx, ipvlan, macvlan, vlan, and team drivers to
    use NETIF_F_HW_CSUM in features list instead of NETIF_F_ALL_CSUM.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 16:50:08 -05:00
Dan Carpenter
fe0be35e2c sfc: fix a timeout loop
We test whether "tries" is zero at the end, but "tries--" is a post-op so
it will end with "tries" set to -1.  I have changed it to a pre-op
instead.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 12:46:26 -05:00
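The off-by-one described in fe0be35e2c can be reproduced with a minimal sketch. The function names and the `ready` callback are hypothetical and are not taken from the driver; only the post-decrement versus pre-decrement behaviour is the point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical poll callback: returns true once the hardware is ready. */
typedef bool (*ready_fn)(void);

/* Post-decrement: on timeout the loop exits with tries == -1, so a later
 * "tries == 0" check never detects the timeout. */
int wait_post_decrement(ready_fn ready, int tries)
{
	assert(ready != NULL && tries > 0);
	while (tries-- && !ready())
		;
	return tries;
}

/* Pre-decrement: on timeout the loop exits with tries == 0. */
int wait_pre_decrement(ready_fn ready, int tries)
{
	assert(ready != NULL && tries > 0);
	while (--tries && !ready())
		;
	return tries;
}

/* Stub that never becomes ready, to exercise the timeout path. */
bool never_ready(void)
{
	return false;
}
```

Note that the pre-decrement form polls one time fewer for the same starting value of `tries`.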
Bert Kenward
f1c2ef40c6 sfc: only use RSS filters if we're using RSS
Without this, filter insertion on a VF would fail if only one channel
was in use. This would include the unicast station filter and therefore
no traffic would be received.

Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12 00:26:18 -05:00
Daniel Pieczko
abd86a55f4 sfc: check warm_boot_count after other functions have been reset
A change in MCFW behaviour means that the net driver must update its record
of the warm_boot_count by reading it from the ER_DZ_BIU_MC_SFT_STATUS
register.

On v4.6.x MCFW the global boot count was incremented when some functions
needed to be reset to enable multicast chaining, so all functions saw the
same value.  In that case, the driver needed to increment its
warm_boot_count when other functions were reset, to avoid noticing it later
and then trying to reset itself to recover unnecessarily.

With v4.7+ MCFW, the boot count in firmware doesn't change as that is
unnecessary since the PFs that have been reset will each receive an MC
reboot notification.  In that case, the driver re-reads the unchanged
value.

Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-05 17:59:10 -05:00
Jarod Wilson
6f24e5d599 sfc: use ALIGN macro for aligning frame sizes
Don't open-code it.

CC: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
CC: Shradha Shah <sshah@solarflare.com>
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-02 23:56:37 -05:00
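For readers unfamiliar with the idiom in 6f24e5d599, the following userspace sketch contrasts an open-coded round-up with a macro. `ALIGN_UP` is a stand-in written for this example, assumed to round the same way as the kernel's `ALIGN()` for power-of-two alignments; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in: round x up to a multiple of a, where a is a power
 * of two. */
#define ALIGN_UP(x, a) (((x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))

/* Open-coded round-up of the kind the commit replaces (illustrative). */
size_t frame_len_open_coded(size_t len, size_t align)
{
	return (len + align - 1) & ~(align - 1);
}

/* Same result, with the intent stated by the macro name. */
size_t frame_len_aligned(size_t len, size_t align)
{
	/* The mask trick is only valid for power-of-two alignments. */
	assert(align != 0 && (align & (align - 1)) == 0);
	return ALIGN_UP(len, align);
}
```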
Bert Kenward
dd248f1bc6 sfc: Add PCI ID for Solarflare 8000 series 10/40G NIC
Also add support for 7000 series 40G NIC VF.

Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-01 15:46:39 -05:00
Bert Kenward
93171b14a5 sfc: make TSO version a per-queue parameter
The Solarflare 8000 series NIC will use a new TSO scheme. The current
driver refuses to load if the current TSO scheme is not found. Remove
that check and instead make the TSO version a per-queue parameter.

Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-01 15:46:39 -05:00
Eric Dumazet
93d05d4a32 net: provide generic busy polling to all NAPI drivers
NAPI drivers no longer need to observe a particular protocol
to benefit from busy polling (CONFIG_NET_RX_BUSY_POLL=y)

napi_hash_add() and napi_hash_del() are automatically called
from core networking stack, respectively from
netif_napi_add() and netif_napi_del()

This patch depends on free_netdev() and netif_napi_del() being
called from process context, which seems to be the norm.

Drivers might still prefer to call napi_hash_del() on their
own, since they might combine all the rcu grace periods into
a single one, knowing their NAPI structures lifetime, while
core networking stack has no idea of a possible combining.

Once this patch proves to not bring serious regressions,
we will cleanup drivers to either remove napi_hash_del()
or provide appropriate rcu grace periods combining.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-18 16:17:42 -05:00
Eric Dumazet
93f93a4404 net: move skb_mark_napi_id() into core networking stack
We would like to automatically provide busy polling support
to all NAPI drivers, without them having to implement anything.

skb_mark_napi_id() can be called from napi_gro_receive() and
napi_get_frags().

A few drivers are still calling skb_mark_napi_id() because
they use netif_receive_skb(). They should eventually call
napi_gro_receive() instead. I will leave this to drivers
maintainers.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-18 16:17:41 -05:00
Julia Lawall
c300366b6b sfc: constify pci_error_handlers structures
This pci_error_handlers structure is never modified, like all the other
pci_error_handlers structures, so declare it as const.

Done with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-16 15:07:29 -05:00
Christoph Hellwig
8722b8fbce sfc: don't call dma_supported
dma_set_mask already checks for a supported DMA mask before updating it,
the call to dma_supported is redundant.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
Cc: Shradha Shah <sshah@solarflare.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-10 16:32:11 -08:00
Linus Torvalds
b0f85fa11a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

Changes of note:

 1) Allow to schedule ICMP packets in IPVS, from Alex Gartrell.

 2) Provide FIB table ID in ipv4 route dumps just as ipv6 does, from
    David Ahern.

 3) Allow the user to ask for the statistics to be filtered out of
    ipv4/ipv6 address netlink dumps.  From Sowmini Varadhan.

 4) More work to pass the network namespace context around deep into
    various packet path APIs, starting with the netfilter hooks.  From
    Eric W Biederman.

 5) Add layer 2 TX/RX checksum offloading to qeth driver, from Thomas
    Richter.

 6) Use usec resolution for SYN/ACK RTTs in TCP, from Yuchung Cheng.

 7) Support Very High Throughput in wireless MESH code, from Bob
    Copeland.

 8) Allow setting the ageing_time in switchdev/rocker.  From Scott
    Feldman.

 9) Properly autoload L2TP type modules, from Stephen Hemminger.

10) Fix and enable offload features by default in 8139cp driver, from
    David Woodhouse.

11) Support both ipv4 and ipv6 sockets in a single vxlan device, from
    Jiri Benc.

12) Fix CWND limiting of thin streams in TCP, from Bendik Rønning
    Opstad.

13) Fix IPSEC flowcache overflows on large systems, from Steffen
    Klassert.

14) Convert bridging to track VLANs using rhashtable entries rather than
    a bitmap.  From Nikolay Aleksandrov.

15) Make TCP listener handling completely lockless, this is a major
    accomplishment.  Incoming request sockets now live in the
    established hash table just like any other socket too.

    From Eric Dumazet.

15) Provide more bridging attributes to netlink, from Nikolay
    Aleksandrov.

16) Use hash based algorithm for ipv4 multipath routing, this was very
    long overdue.  From Peter Nørlund.

17) Several y2038 cures, mostly avoiding timespec.  From Arnd Bergmann.

18) Allow non-root execution of EBPF programs, from Alexei Starovoitov.

19) Support SO_INCOMING_CPU as setsockopt, from Eric Dumazet.  This
    influences the port binding selection logic used by SO_REUSEPORT.

20) Add ipv6 support to VRF, from David Ahern.

21) Add support for Mellanox Spectrum switch ASIC, from Jiri Pirko.

22) Add rtl8xxxu Realtek wireless driver, from Jes Sorensen.

23) Implement RACK loss recovery in TCP, from Yuchung Cheng.

24) Support multipath routes in MPLS, from Roopa Prabhu.

25) Fix POLLOUT notification for listening sockets in AF_UNIX, from Eric
    Dumazet.

26) Add new QED Qlogic driver, from Yuval Mintz, Manish Chopra, and
    Sudarsana Kalluru.

27) Don't fetch timestamps on AF_UNIX sockets, from Hannes Frederic
    Sowa.

28) Support ipv6 geneve tunnels, from John W Linville.

29) Add flood control support to switchdev layer, from Ido Schimmel.

30) Fix CHECKSUM_PARTIAL handling of potentially fragmented frames, from
    Hannes Frederic Sowa.

31) Support persistent maps and progs in bpf, from Daniel Borkmann.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1790 commits)
  sh_eth: use DMA barriers
  switchdev: respect SKIP_EOPNOTSUPP flag in case there is no recursion
  net: sched: kill dead code in sch_choke.c
  irda: Delete an unnecessary check before the function call "irlmp_unregister_service"
  net: dsa: mv88e6xxx: include DSA ports in VLANs
  net: dsa: mv88e6xxx: disable SA learning for DSA and CPU ports
  net/core: fix for_each_netdev_feature
  vlan: Invoke driver vlan hooks only if device is present
  arcnet/com20020: add LEDS_CLASS dependency
  bpf, verifier: annotate verbose printer with __printf
  dp83640: Only wait for timestamps for packets with timestamping enabled.
  ptp: Change ptp_class to a proper bitmask
  dp83640: Prune rx timestamp list before reading from it
  dp83640: Delay scheduled work.
  dp83640: Include hash in timestamp/packet matching
  ipv6: fix tunnel error handling
  net/mlx5e: Fix LSO vlan insertion
  net/mlx5e: Re-eanble client vlan TX acceleration
  net/mlx5e: Return error in case mlx5e_set_features() fails
  net/mlx5e: Don't allow more than max supported channels
  ...
2015-11-04 09:41:05 -08:00
Linus Torvalds
d63a978865 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking changes from Ingo Molnar:
 "The main changes in this cycle were:

   - More gradual enhancements to atomic ops: new atomic*_read_ctrl()
     ops, synchronize atomic_{read,set}() ordering requirements between
     architectures, add atomic_long_t bitops.  (Peter Zijlstra)

   - Add _{relaxed|acquire|release}() variants for inc/dec atomics and
     use them in various locking primitives: mutex, rtmutex, mcs, rwsem.
     This enables weakly ordered architectures (such as arm64) to make
     use of more locking related optimizations.  (Davidlohr Bueso)

   - Implement atomic[64]_{inc,dec}_relaxed() on ARM.  (Will Deacon)

   - Futex kernel data cache footprint micro-optimization.  (Rasmus
     Villemoes)

   - pvqspinlock runtime overhead micro-optimization.  (Waiman Long)

   - misc smaller fixlets"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ARM, locking/atomics: Implement _relaxed variants of atomic[64]_{inc,dec}
  locking/rwsem: Use acquire/release semantics
  locking/mcs: Use acquire/release semantics
  locking/rtmutex: Use acquire/release semantics
  locking/mutex: Use acquire/release semantics
  locking/asm-generic: Add _{relaxed|acquire|release}() variants for inc/dec atomics
  atomic: Implement atomic_read_ctrl()
  atomic, arch: Audit atomic_{read,set}()
  atomic: Add atomic_long_t bitops
  futex: Force hot variables into a single cache line
  locking/pvqspinlock: Kick the PV CPU unconditionally when _Q_SLOW_VAL
  locking/osq: Relax atomic semantics
  locking/qrwlock: Rename ->lock to ->wait_lock
  locking/Documentation/lockstat: Fix typo - lokcing -> locking
  locking/atomics, cmpxchg: Privatize the inclusion of asm/cmpxchg.h
2015-11-03 16:10:43 -08:00
David S. Miller
73186df8d7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor overlapping changes in net/ipv4/ipmr.c, in 'net' we were
fixing the "BH-ness" of the counter bumps whilst in 'net-next'
the functions were modified to take an explicit 'net' parameter.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03 13:41:45 -05:00
Martin Habets
b2663a4f30 sfc: push partner queue for skb->xmit_more
When the IP stack passes SKBs the sfc driver puts them in 2 different TX
queues (called partners), one for checksummed and one for not checksummed.
If the SKB has xmit_more set the driver will delay pushing the work to the
NIC.

When later it does decide to push the buffers this patch ensures it also
pushes the partner queue, if that also has any delayed work. Before this
fix the work in the partner queue would be left for a long time and cause
a netdev watchdog.

Fixes: 70b33fb ("sfc: add support for skb->xmit_more")
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-02 23:02:58 -05:00
Bert Kenward
c0f9c7e45d sfc: replace spinlocks with bit ops for busy poll locking
This patch reduces the overhead of locking for busy poll.
Previously the state was protected by a lock, whereas now
it's manipulated solely with atomic operations.

Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 19:40:33 -07:00
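The idea in c0f9c7e45d, replacing a lock-protected state word with atomic operations, can be sketched in userspace with C11 atomics. This is an analogue under stated assumptions, not the driver's code: the `channel` struct, the `STATE_*` flags, and both functions are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical owner flags; the driver's real bit names are not shown. */
enum {
	STATE_NAPI = 1u << 0,	/* interrupt-driven NAPI poll owns the channel */
	STATE_POLL = 1u << 1,	/* a busy-polling socket owns the channel */
};

struct channel {
	atomic_uint busy_state;	/* 0 means idle */
};

/* Claim the channel for one owner; fails if anyone already holds it. */
bool channel_try_lock(struct channel *ch, unsigned int owner)
{
	unsigned int idle = 0;

	assert(ch != NULL);
	return atomic_compare_exchange_strong(&ch->busy_state, &idle, owner);
}

/* Release the channel by clearing the owner's bit. */
void channel_unlock(struct channel *ch, unsigned int owner)
{
	assert(ch != NULL);
	atomic_fetch_and(&ch->busy_state, ~owner);
}
```

A single compare-and-swap both tests and takes ownership, so no separate spinlock is needed around the state word.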
Daniel Pieczko
c577e59ed7 sfc: fully reset if MC_REBOOT event received without warm_boot_count increment
On EF10, MC_CMD_VPORT_RECONFIGURE can cause a CODE_MC_REBOOT event
to be sent to a function without incrementing the (adapter-wide)
warm_boot_count.  In this case, the reboot is not detected by the
loop on efx_mcdi_poll_reboot(), so prepare for recovery from an MC
reboot anyway.  When this codepath is run, the MC has always just
rebooted, so this recovery is valid.

The loop on efx_mcdi_poll_reboot() is still required for other MC
reboot cases, so that actions in response to an MC reboot are
performed, such as clearing locally calculated statistics.
Siena NICs are unaffected by this change as the above scenario
does not apply.

Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-12 05:35:25 -07:00
Arnd Bergmann
090e2edb41 net: sfc: avoid using timespec
The sfc driver internally uses a time format based on 32-bit (unsigned)
seconds and 32-bit nanoseconds. This means it will overflow in 2106,
but the value we pass into it is a signed 32-bit tv_sec that already
overflows in 2038 to a negative value.

This patch changes the logic to use the lower 32 bits of the timespec64
tv_sec in efx_ptp_ns_to_s_ns, which will have the correct value beyond the overflow.
While this does not change any of the register values, it lets us
keep using the driver after we deprecate the use of the timespec type
in the kernel.

In the efx_ptp_process_times function, the change to use timespec64
is similar, in that the tv_sec portion is ignored anyway and we only
care about the nanosecond portion that remains unchanged.

Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2015-10-01 09:59:24 -07:00
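The arithmetic behind 090e2edb41 can be checked with a small sketch: a 64-bit seconds count stops fitting a signed 32-bit field in 2038, while its lower 32 bits, read as unsigned, remain correct until 2106. The helper names are hypothetical and do not appear in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the seconds count still fits a signed 32-bit tv_sec (until 2038). */
bool fits_signed_32(int64_t tv_sec)
{
	return tv_sec >= INT32_MIN && tv_sec <= INT32_MAX;
}

/* Lower 32 bits as an unsigned count: correct until the wrap in 2106. */
uint32_t secs_low32(int64_t tv_sec)
{
	assert(tv_sec >= 0);
	return (uint32_t)tv_sec;	/* well-defined modulo 2^32 conversion */
}
```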
Arnd Bergmann
ade1bdffe9 ntp/pps: use y2038 safe types in pps_event_time
The pps_event_time uses two 'timespec' structures internally, which
suffer from the y2038 problem. The uses of this structure are
fairly self-contained in the pps code, so this replaces them all at
once.

Unfortunately, this includes the sfc ethernet driver aside from the
pps subsystem, so we change that one as well. Both touch the
same data structure, and there probably is no good way to split
the patch into smaller units.

Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2015-10-01 09:59:16 -07:00
Boqun Feng
8456799561 locking/atomics, cmpxchg: Privatize the inclusion of asm/cmpxchg.h
After commit:

  654672d4ba ("locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations")

Architectures may only provide {cmp,}xchg_relaxed definitions in
asm/cmpxchg.h. Other variants, such as {cmp,}xchg, may be built in
linux/atomic.h, which means simply including asm/cmpxchg.h may not get
the definitions of all the {cmp,}xchg variants.

Therefore, we should privatize the inclusions of asm/cmpxchg.h to
keep it only included in arch/* and replace the inclusions outside
with linux/atomic.h

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Aybuke Ozdemir <aybuke.147@gmail.com>
Cc: Chris Brannon <chris@the-brannons.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kirk Reiser <kirk@reisers.ca>
Cc: Kishon Vijay Abraham I <kishon@ti.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Samuel Thibault <samuel.thibault@ens-lyon.org>
Cc: Shradha Shah <sshah@solarflare.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: William Hubbs <w.d.hubbs@gmail.com>
Cc: devel@driverdev.osuosl.org
Cc: linux-net-drivers@solarflare.com
Cc: speakup@linux-speakup.org
Link: http://lkml.kernel.org/r/1440589966-26280-1-git-send-email-boqun.feng@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-13 10:35:46 +02:00
Shradha Shah
b0fbdae127 sfc: Allow driver to cope with a lower number of VIs than it needs for RSS
Previously, the driver would refuse to load if it couldn't secure
enough VIs from the MC to fulfill its RSS requirements.
This was causing probe to fail on later functions in
configurations where we'd run out of VIs, such as having many
VFs.

This change allows the driver to load with fewer VIs, down to a
minimum of 2. A warning will be printed saying that RSS
requirements were not met, possibly affecting performance.

efx->max_tx_channels needs to be set to avoid going down the
failure path in efx_probe_nic() immediately in the loop after the
probe() NIC-type function.
Also, set rc=ENOSPC when bombing out of efx_probe_nic due to lack
of VIs.

Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-28 13:53:47 -07:00
David S. Miller
0d36938bb8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
2015-08-27 21:45:31 -07:00
Bert Kenward
fbe4307e9f sfc: only use vadaptor stats if firmware is capable
Some of the stats handling code differs based on SR-IOV support,
and SRIOV support is only available if full-featured firmware is
used.
Do not use vadaptor stats if firmware mode is not set to
full-featured.

Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27 11:27:01 -07:00
Daniel Pieczko
774ad031dd sfc: MC allocations must be restored following an entity reset
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-31 15:32:05 -07:00
Daniel Pieczko
2732482020 sfc: allow ethtool selftest and MC reboot to complete on an unprivileged function
The policy in the net driver is to attempt MCDI commands and
then handle any EPERM error codes appropriately when returned
by unprivileged functions.
The ethtool selftest contains some tests which are useful on
an unprivileged function, such as the event queue interrupt
tests, but other tests cannot be performed as the function
does not have the required permissions.

If a test returns -EPERM, act as though the test was not run
and continue.

Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-31 15:32:05 -07:00
Edward Cree
12fb0da45c sfc: clean fallbacks between promisc/normal in efx_ef10_filter_sync_rx_mode
Separate functions for inserting individual and promisc filters; explicit
 fallback logic in efx_ef10_filter_sync_rx_mode(), in order not to overload
 the 'promisc' flag as also meaning "fall back to promisc".

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 22:21:32 -07:00
Daniel Pieczko
ab8b1f7cf8 sfc: support cascaded multicast filters
If the workaround to support cascaded multicast filters ("workaround_26807") is
enabled, the broadcast filter and individual multicast filters are not inserted
when in promiscuous or allmulti mode.

There is a race while inserting and removing filters when entering and leaving
promiscuous mode.  When changing promiscuous state with cascaded multicast
filters, the old multicast filters are removed before inserting the new filters
to avoid duplicating packets; this can lead to dropped packets until all
filters have been inserted.

The efx_nic:mc_promisc flag is added to record the presence of a multicast
promiscuous filter; this gives a simple way to tell if the promiscuous state is
changing.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 22:21:32 -07:00
Daniel Pieczko
822b96f87f sfc: re-factor efx_ef10_filter_sync_rx_mode()
This change is only re-factoring; there are no changes to functionality
 except for a slight elaboration of an error message (on mismatch filter
 insertion failure).

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 22:21:32 -07:00
Jon Cooper
b6f568e27b sfc: Insert multicast filters as well as mismatch filters in promiscuous mode
If a function is in promiscuous mode and another function has a broadcast or
 multicast filter inserted, the function in promiscuous mode won't see that
 broadcast or multicast traffic.
Most notably this breaks broadcast, which means ARP doesn't work. Less
 show-stoppingly, a function listening on a multicast address that's also in
 promiscuous mode will not see that multicast traffic if another function is
 also listening on that multicast address.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 22:21:32 -07:00
Daniel Pieczko
5a55a72abe sfc: warn if other functions have been reset by MCFW
When enabling the workaround for cascaded multicast filters, the MC
 can reset other functions if they have already inserted filters.
 In that case, the workaround has been enabled, but print an info
 message in the log recording that other functions had to be reset.

As other functions were reset, the MC will have incremented its boot
 count, so also increment the warm_boot_count on the function which
 enabled the workaround, as that function won't have received an MC
 reboot event and does not need to reset.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 22:21:32 -07:00
Daniel Pieczko
34ccfe6f8a sfc: add output flag decoding to efx_mcdi_set_workaround
The initial use of this will be to check a flag reporting if an FLR was
performed on other functions when enabling cascaded multicast filters.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 22:21:32 -07:00
Edward Cree
832dc9ed43 sfc: cope with ENOSYS from efx_mcdi_get_workarounds()
GET_WORKAROUNDS was only introduced in May 2014, not all firmware
 will have it.  So call sites need to handle ENOSYS.
In this case we're probing the bug26807 workaround, which is not
 implemented in any firmware that doesn't have GET_WORKAROUNDS.
 So interpret ENOSYS as 'false'.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 22:21:31 -07:00
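The call-site handling described in 832dc9ed43 (treat ENOSYS from the query as "workaround not enabled") might look like the sketch below. The callback type, `WORKAROUND_BIT`, and the two firmware stubs are hypothetical and stand in for the real MCDI call.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical query: fills *enabled_mask and returns 0, or a negative errno. */
typedef int (*get_workarounds_fn)(unsigned int *enabled_mask);

#define WORKAROUND_BIT 0x1u	/* hypothetical flag for the probed workaround */

/* Returns 0 and sets *enabled on success. -ENOSYS from firmware that lacks
 * the query is reported as "not enabled" rather than as a failure. */
int probe_workaround(get_workarounds_fn get, bool *enabled)
{
	unsigned int mask = 0;
	int rc;

	assert(get != NULL && enabled != NULL);
	rc = get(&mask);
	if (rc == -ENOSYS) {
		*enabled = false;
		return 0;
	}
	if (rc)
		return rc;
	*enabled = (mask & WORKAROUND_BIT) != 0;
	return 0;
}

/* Stand-ins for firmware without and with the query command. */
int old_firmware(unsigned int *mask)
{
	(void)mask;
	return -ENOSYS;
}

int new_firmware(unsigned int *mask)
{
	*mask = WORKAROUND_BIT;
	return 0;
}
```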
Daniel Pieczko
46e612b0fc sfc: enable cascaded multicast filters in MCFW
After creating event queue 0, check to see if the workaround is enabled,
 and enable it if necessary.  This will be called during PCI probe and
 also when coming back up after a reset.  The nic_data->workaround_26807
 will be used in the future to control the filter insertion behaviour
 based on this workaround.

Only the primary PF can enable this workaround, so tolerate an EPERM
 error and continue.  Otherwise, if any step in the checking and enabling
 of the workaround fails, the event queue must be removed.

We check that the workaround is implemented before trying to enable it,
 and store the current workaround setting before trying to change it.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 22:21:31 -07:00
Edward Cree
a9196bb048 sfc: update MCDI protocol definitions
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 22:21:31 -07:00
Jacob Keller
7415991ead siena: only report generic filters in get_ts_info
CC: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
CC: Shradha Shah <sshah@solarflare.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-07-17 19:59:06 -07:00
Peter Dunning
c936835c1e sfc: Report TX completions to BQL after all TX events in interrupt
The limit for BQL is updated each time we call
netdev_tx_completed_queue.
Without this patch the BQL limit was updated for every TX event we
see.
The issue was that this only updated the limit to handle the data
we complete in two events as the first event wouldn't show that
enough traffic had been processed between them.

This was OK when interrupt moderation was off but not when it was
on as more data had to be completed in a single interrupt.

The patch changes this so that we do report the completion to BQL
only when all the TX events in the interrupt have been processed.

Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-09 00:00:40 -07:00
Shradha Shah
671b53eec2 sfc: Ensure down_write(&filter_sem) and up_write() are matched before calling efx_net_open()
This patch avoids the double up_write to filter_sem if
efx_net_open() fails.

Resolves: 2d432f20d2

Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08 16:18:52 -07:00
Daniel Pieczko
535a61777f sfc: suppress handled MCDI failures when changing the MAC address
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08 16:07:33 -07:00
Daniel Pieczko
7a186f4703 sfc: add legacy method for changing a PF's MAC address
Some versions of MCFW do not support the MC_CMD_VADAPTOR_SET_MAC
command, and ENOSYS will be returned.

If the PF created its own vport, the function's datapath must be
stopped and the vport can be reconfigured to reflect the new MAC
address.

If the MCFW created the vport for the PF (which is the case when
the nic_data->vport_mac is blank), nothing further needs to be
done as the vport is not under the control of the PF.

This only applies to PFs because the MCFW in question does not
support VFs.

Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08 16:07:33 -07:00
Daniel Pieczko
9e9f665a18 sfc: refactor code in efx_ef10_set_mac_address()
Re-organize the structure of error handling to avoid having
to duplicate the netif_err() around the ifdefs.

The only change to the behaviour of the error-handling is that
the PF's data structure to record VF details should only be
updated if the original command succeeded.

Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08 16:07:33 -07:00
Linus Torvalds
e0456717e4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Add TX fast path in mac80211, from Johannes Berg.

 2) Add TSO/GRO support to ibmveth, from Thomas Falcon

 3) Move away from cached routes in ipv6, just like ipv4, from Martin
    KaFai Lau.

 4) Lots of new rhashtable tests, from Thomas Graf.

 5) Run ingress qdisc lockless, from Alexei Starovoitov.

 6) Allow servers to fetch TCP packet headers for SYN packets of new
    connections, for fingerprinting.  From Eric Dumazet.

 7) Add mode parameter to pktgen, for testing receive.  From Alexei
    Starovoitov.

 8) Cache access optimizations via simplifications of build_skb(), from
    Alexander Duyck.

 9) Move page frag allocator under mm/, also from Alexander.

10) Add xmit_more support to hv_netvsc, from KY Srinivasan.

11) Add a counter guard in case we try to perform endless reclassify
    loops in the packet scheduler.

12) Extend flow dissector to be programmable and use it in new "Flower"
    classifier.  From Jiri Pirko.

13) AF_PACKET fanout rollover fixes, performance improvements, and new
    statistics.  From Willem de Bruijn.

14) Add netdev driver for GENEVE tunnels, from John W Linville.

15) Add ingress netfilter hooks and filtering, from Pablo Neira Ayuso.

16) Fix handling of epoll edge triggers in TCP, from Eric Dumazet.

17) Add an ECN retry fallback for the initial TCP handshake, from Daniel
    Borkmann.

18) Add tail call support to BPF, from Alexei Starovoitov.

19) Add several pktgen helper scripts, from Jesper Dangaard Brouer.

20) Add zerocopy support to AF_UNIX, from Hannes Frederic Sowa.

21) Favor even port numbers for allocation to connect() requests, and
    odd port numbers for bind(0), in an effort to help avoid
    ip_local_port_range exhaustion.  From Eric Dumazet.

22) Add Cavium ThunderX driver, from Sunil Goutham.

23) Allow bpf programs to access skb_iif and dev->ifindex SKB metadata,
    from Alexei Starovoitov.

24) Add support for T6 chips in cxgb4vf driver, from Hariprasad Shenai.

25) Double TCP Small Queues default to 256K to accommodate situations
    like the XEN driver and wireless aggregation.  From Wei Liu.

26) Add more entropy inputs to flow dissector, from Tom Herbert.

27) Add CDG congestion control algorithm to TCP, from Kenneth Klette
    Jonassen.

28) Convert ipset over to RCU locking, from Jozsef Kadlecsik.

29) Track and act upon link status of ipv4 route nexthops, from Andy
    Gospodarek.
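
As a rough illustration of item 21, the parity preference can be
sketched as below. This is not the kernel's implementation: pick_port()
and demo_port_in_use() are hypothetical, the scan is a simple linear
one, and falling back to the other parity when the preferred one is
exhausted is an assumption based on the word "favor".

```c
#include <stdbool.h>

/* Hypothetical callback: returns true if the port is already taken. */
typedef bool (*port_in_use_fn)(int port);

/*
 * Return the first free port in [low, high] with the preferred parity.
 * If none is free, try the other parity; return -1 if the whole range
 * is exhausted.
 */
int pick_port(int low, int high, bool prefer_even, port_in_use_fn in_use)
{
	int want = prefer_even ? 0 : 1;

	for (int pass = 0; pass < 2; pass++) {
		int parity = (pass == 0) ? want : 1 - want;

		for (int port = low; port <= high; port++) {
			if ((port & 1) == parity && !in_use(port))
				return port;
		}
	}
	return -1;
}

/* Demo predicate: pretend every port below 32770 is taken. */
bool demo_port_in_use(int port)
{
	return port < 32770;
}
```

In this sketch a connect()-style caller would pass prefer_even = true
and a bind(0)-style caller prefer_even = false, so the two kinds of
request draw from different halves of the range until one half runs
out.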

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1670 commits)
  bridge: vlan: flush the dynamically learned entries on port vlan delete
  bridge: multicast: add a comment to br_port_state_selection about blocking state
  net: inet_diag: export IPV6_V6ONLY sockopt
  stmmac: troubleshoot unexpected bits in des0 & des1
  net: ipv4 sysctl option to ignore routes when nexthop link is down
  net: track link-status of ipv4 nexthops
  net: switchdev: ignore unsupported bridge flags
  net: Cavium: Fix MAC address setting in shutdown state
  drivers: net: xgene: fix for ACPI support without ACPI
  ip: report the original address of ICMP messages
  net/mlx5e: Prefetch skb data on RX
  net/mlx5e: Pop cq outside mlx5e_get_cqe
  net/mlx5e: Remove mlx5e_cq.sqrq back-pointer
  net/mlx5e: Remove extra spaces
  net/mlx5e: Avoid TX CQE generation if more xmit packets expected
  net/mlx5e: Avoid redundant dev_kfree_skb() upon NOP completion
  net/mlx5e: Remove re-assignment of wq type in mlx5e_enable_rq()
  net/mlx5e: Use skb_shinfo(skb)->gso_segs rather than counting them
  net/mlx5e: Static mapping of netdev priv resources to/from netdev TX queues
  net/mlx4_en: Use HW counters for rx/tx bytes/packets in PF device
  ...
2015-06-24 16:49:49 -07:00
Edward Cree
ea6bb99ed5 sfc: mark state UNINIT after unregister
Without this change, modprobe -r sfc hits the BUG_ON() in
efx_pci_remove_main().

Fixes: e7fef9b45a ("sfc: add sysfs entry to control MCDI tracing")
Reported-by: Jarod Wilson <jarod@redhat.com>
Reviewed-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-15 19:55:01 -07:00
Daniel Pieczko
6598dad26b sfc: leak vports if a VF is assigned during PF unload
If any VF is assigned as the PF is unloaded, do not attempt to
remove its vport or the vswitch.  These will be removed if the
driver binds to the PF again, as an entity reset occurs during
probe.

A 'force' flag is added to efx_ef10_pci_sriov_disable() to
distinguish between disabling SR-IOV and driver unload.
SR-IOV cannot be disabled if VFs are assigned to guests.

If the PF driver is unloaded while VFs are assigned, the driver
may try to bind to the VF again at a later point if the driver
has been reloaded and the VF returns to the same domain as the PF.
In this case, the PF will not have a VF data structure, so the VF
can check this and drop out of probe early.

In this case, efx->vf_count will be zero but VFs will be present.
The user is advised to remove the VF and re-create it. The check
at the beginning of efx_ef10_pci_sriov_disable() that
efx->vf_count is non-zero is removed to allow SR-IOV to be
disabled in this case. Also, if the PF driver is unloaded, it
will disable SR-IOV to remove these unknown VFs.

By not disabling bus-mastering if VFs are still assigned, the VF
will continue to pass traffic after the PF has been removed.

When using the max_vfs module parameter, if VFs are already
present do not try to initialise any more.
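
The rules above can be summarised in a small sketch. It is not the
driver's code: struct sriov_plan and plan_sriov_disable() are
hypothetical, and returning -EBUSY when VFs are assigned and the call
is not forced is an assumption about the error code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical summary of what a disable request should do. */
struct sriov_plan {
	int rc;              /* 0 on success, negative errno on refusal */
	bool disable_sriov;  /* tear down the VFs */
	bool remove_vports;  /* remove VF vports and the vswitch */
};

/*
 * vfs_assigned: number of VFs currently assigned to guests.
 * force: true on driver unload, false for an explicit disable request.
 */
struct sriov_plan plan_sriov_disable(unsigned int vfs_assigned, bool force)
{
	struct sriov_plan plan = { 0, false, false };

	if (vfs_assigned && !force) {
		/* SR-IOV cannot be disabled while guests hold VFs. */
		plan.rc = -EBUSY;
		return plan;
	}
	if (vfs_assigned) {
		/* Unload with assigned VFs: leak vports and the vswitch. */
		return plan;
	}
	/* No VF is assigned: safe to disable SR-IOV and clean up. */
	plan.disable_sriov = true;
	plan.remove_vports = true;
	return plan;
}
```

The function has no check on a VF count, mirroring the removal of the
efx->vf_count test so that unknown VFs can still be disabled.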

Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-02 12:57:32 -07:00