Commit Graph

618886 Commits

Author SHA1 Message Date
John Crispin
092183df0f net-next: dsa: make the set_addr() operation optional
Only 1 of the 3 drivers currently has a set_addr() operation. Make the
set_addr() callback optional to reduce the amount of empty stubs inside
the drivers.

Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-20 04:47:44 -04:00
John Crispin
06f8ec9041 net-next: dsa: fix duplicate invocation of set_addr()
commit 83c0afaec7 ("net: dsa: Add new binding implementation")
has a duplicate invocation of the set_addr() operation callback. Remove one
of them.

Signed-off-by: John Crispin <john@phrozen.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-20 04:47:44 -04:00
David S. Miller
f361bdde61 Merge branch 'rhashtable-dups'
Herbert Xu says:

====================
rhashtable: rhashtable with duplicate objects

v3 fixes a bug in the remove path that causes the element count
to decrease when it shouldn't, leading to a gigantic hash table
when it underflows.

v2 contains a reworked insertion slowpath to ensure that the
spinlock for the table we're inserting into is taken.

This series contains two patches.  The first adds the rhlist
interface and the second converts mac80211 to use it.  If this
works out I'll then proceed to convert the other insecure_elasticity
users over to this.

I've tested the rhlist code with test_rhashtable but I haven't
tested the mac80211 conversion.  So please give it a go and see
if it still works.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-20 04:43:43 -04:00
Herbert Xu
83e7e4ce9e mac80211: Use rhltable instead of rhashtable
mac80211 currently uses rhashtable with insecure_elasticity set
to true.  The latter is because of duplicate objects.  What's
more, mac80211 walks the rhashtable chains by hand which is broken
as rhashtable may contain multiple tables due to resizing or
rehashing.

This patch fixes it by converting it to the newly added rhltable
interface which is designed for use with duplicate objects.

With rhltable a lookup returns a list of objects instead of a
single one.  This is then fed into the existing for_each_sta_info
macro.

This patch also deletes the sta_addr_hash function since rhashtable
defaults to jhash.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-20 04:43:36 -04:00
Herbert Xu
ca26893f05 rhashtable: Add rhlist interface
The insecure_elasticity setting is an ugly wart brought out by
users who need to insert duplicate objects (that is, distinct
objects with identical keys) into the same table.

In fact, those users have a much bigger problem.  Once those
duplicate objects are inserted, they don't have an interface to
find them (unless you count the walker interface which walks
over the entire table).

Some users have resorted to doing a manual walk over the hash
table which is of course broken because they don't handle the
potential existence of multiple hash tables.  The result is that
they will break sporadically when they encounter a hash table
resize/rehash.

This patch provides a way out for those users, at the expense
of an extra pointer per object.  Essentially each object is now
a list of objects carrying the same key.  The hash table will
only see the lists so nothing changes as far as rhashtable is
concerned.

To use this new interface, you need to insert a struct rhlist_head
into your objects instead of struct rhash_head.  While the hash
table is unchanged, for type-safety you'll need to use struct
rhltable instead of struct rhashtable.  All the existing interfaces
have been duplicated for rhlist, including the hash table walker.

One missing feature is nulls marking because AFAIK the only potential
user of it does not need duplicate objects.  Should anyone need
this it shouldn't be too hard to add.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-20 04:43:36 -04:00
Vitaly Kuznetsov
fd07160bb7 xen-netfront: avoid packet loss when ethernet header crosses page boundary
Small packet loss is reported on complex multi host network configurations
including tunnels, NAT, ... My investigation led me to the following check
in netback which drops packets:

        if (unlikely(txreq.size < ETH_HLEN)) {
                netdev_err(queue->vif->dev,
                           "Bad packet size: %d\n", txreq.size);
                xenvif_tx_err(queue, &txreq, extra_count, idx);
                break;
        }

But this check itself is legitimate. SKBs consist of a linear part (which
has to have the ethernet header) and (optionally) a number of frags.
Netfront transmits the head of the linear part up to the page boundary
as the first request and all the rest becomes frags so when we're
reconstructing the SKB in netback we can't distinguish between original
frags and the 'tail' of the linear part. The first SKB needs to be at
least ETH_HLEN size. So in case we have an SKB with its linear part
starting too close to the page boundary the packet is lost.

I see two ways to fix the issue:
- Change the 'wire' protocol between netfront and netback to start keeping
  the original SKB structure. We'll have to add a flag indicating the fact
  that the particular request is a part of the original linear part and not
  a frag. We'll need to know the length of the linear part to pre-allocate
  memory.
- Avoid transmitting SKBs with linear parts starting too close to the page
  boundary. That seems preferable short-term and shouldn't bring
  significant performance degradation as such packets are rare. That's what
  this patch is trying to achieve with skb_copy().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-20 04:40:41 -04:00
Raju Lakkaraju
1a21101d21 net: phy: Add MAC-IF driver for Microsemi PHYs.
All the review comments updated and resending for review.

This is MAC interface feature.
Microsemi PHY can support RGMII, RMII or GMII/MII interface between MAC and PHY.
MAC-IF function program the right value based on Device tree configuration.

Tested on Beaglebone Black with VSC 8531 PHY.

Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microsemi.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-20 04:39:23 -04:00
Ido Schimmel
1a9234e66e mlxsw: spectrum: Fix sparse warnings
drivers/net/ethernet/mellanox/mlxsw//spectrum.c:251:28: warning: symbol
'mlxsw_sp_span_entry_find' was not declared. Should it be static?
drivers/net/ethernet/mellanox/mlxsw//spectrum.c:265:28: warning: symbol
'mlxsw_sp_span_entry_get' was not declared. Should it be static?
drivers/net/ethernet/mellanox/mlxsw//spectrum.c:367:56: warning: mixing
different enum types
drivers/net/ethernet/mellanox/mlxsw//spectrum.c:367:56:     int enum
mlxsw_sp_span_type  versus
drivers/net/ethernet/mellanox/mlxsw//spectrum.c:367:56:     int enum
mlxsw_reg_mpar_i_e
...
drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:598:32: warning:
mixing different enum types
drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:598:32:     int
enum mlxsw_reg_sbxx_dir  versus
drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:598:32:     int
enum devlink_sb_pool_type
drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:600:39: warning:
mixing different enum types
drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:600:39:     int
enum mlxsw_reg_sbpr_mode  versus
drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:600:39:     int
enum devlink_sb_threshold_type
...
drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:255:54: warning:
mixing different enum types
drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:255:54:     int
enum mlxsw_sp_l3proto  versus
drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:255:54:     int
enum mlxsw_reg_ralxx_protocol
...
drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:1749:6: warning:
symbol 'mlxsw_sp_fib_entry_put' was not declared. Should it be static?

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-20 04:32:50 -04:00
Elad Raz
18c2d2c113 mlxsw: Change the RX LAG hash function from XOR to CRC
Change the RX hash function from XOR to CRC in order to have better
distribution of the traffic.

Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-20 04:32:50 -04:00
Rob Swindell
878786d95e bnxt_en: Fix build error for kernesl without RTC-LIB
bnxt_hwrm_fw_set_time() now returns -EOPNOTSUPP when built for kernel
without RTC_LIB.  Setting the firmware time is not critical to the
successful completion of the firmware update process.

Signed-off-by: Rob Swindell <Rob.Swindell@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-20 03:53:49 -04:00
Jamal Hadi Salim
5a7a5555a3 net sched: stylistic cleanups
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 22:04:14 -04:00
Roman Mashak
f71b109f17 net sched actions police: peg drop stats for conforming traffic
setting conforming action to drop is a valid policy.
When it is set we need to at least see the stats indicating it
for debugging.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 22:04:14 -04:00
Jamal Hadi Salim
408fbc22ef net sched ife action: Introduce skb tcindex metadata encap decap
Sample use case of how this is encoded:
user space via tuntap (or a connected VM/Machine/container)
encodes the tcindex TLV.

Sample use case of decoding:
IFE action decodes it and the skb->tc_index is then used to classify.
So something like this for encoded ICMP packets:

.. first decode then reclassify... skb->tcindex will be set
sudo $TC filter add dev $ETH parent ffff: prio 2 protocol 0xbeef \
u32 match u32 0 0 flowid 1:1 \
action ife decode reclassify

...next match the decode icmp packet...
sudo $TC filter add dev $ETH parent ffff: prio 4 protocol ip \
u32 match ip protocol 1 0xff flowid 1:1 \
action continue

... last classify it using the tcindex classifier and do someaction..
sudo $TC filter add dev $ETH parent ffff: prio 5 protocol ip \
handle 0x11 tcindex classid 1:1 \
action blah..

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 21:55:28 -04:00
Jamal Hadi Salim
6a5d58b67e net sched ife action: add 16 bit helpers
encoder and checker for 16 bits metadata

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 21:55:28 -04:00
Baoyou Xie
766a0e978f net/mlx5: clean function declarations in eswitch.c up
We get 2 warnings when building kernel with W=1:
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c:463:5: warning: no previous prototype for 'esw_offloads_init' [-Wmissing-prototypes]
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c:521:6: warning: no previous prototype for 'esw_offloads_cleanup' [-Wmissing-prototypes]

In fact, both functions are declared in
drivers/net/ethernet/mellanox/mlx5/core/eswitch.c,but should be
declared in a header file, thus can be recognized in other file.

So this patch moves the declarations into
drivers/net/ethernet/mellanox/mlx5/core/eswitch.h

Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 21:52:10 -04:00
Baoyou Xie
d766e7e6b6 be2net: mark symbols static where possible
We get 4 warnings when building kernel with W=1:
drivers/net/ethernet/emulex/benet/be_main.c:4368:6: warning: no previous prototype for 'be_calculate_pf_pool_rss_tables' [-Wmissing-prototypes]
drivers/net/ethernet/emulex/benet/be_cmds.c:4385:5: warning: no previous prototype for 'be_get_nic_pf_num_list' [-Wmissing-prototypes]
drivers/net/ethernet/emulex/benet/be_cmds.c:4537:6: warning: no previous prototype for 'be_reset_nic_desc' [-Wmissing-prototypes]
drivers/net/ethernet/emulex/benet/be_cmds.c:4910:5: warning: no previous prototype for '__be_cmd_set_logical_link_config' [-Wmissing-prototypes]

In fact, these functions are only used in the file in which they are
declared and don't need a declaration, but can be made static.
so this patch marks these functions with 'static'.

Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 21:52:10 -04:00
Baoyou Xie
9940803065 phy: mark lan88xx_suspend() static
We get 1 warning when building kernel with W=1:
drivers/net/phy/microchip.c:58:5: warning: no previous prototype for 'lan88xx_suspend' [-Wmissing-prototypes]

In fact, this function is only used in the file in which it is
declared and don't need a declaration, but can be made static.
so this patch marks this function with 'static'.

Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 21:52:10 -04:00
Philippe Reynes
6b352ebccb net: ethernet: broadcom: bcmgenet: use new api ethtool_{get|set}_link_ksettings
The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 21:39:12 -04:00
Philippe Reynes
639cfa9e8c net: ethernet: broadcom: bcm63xx: use new api ethtool_{get|set}_link_ksettings
The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 21:39:12 -04:00
Philippe Reynes
625eb8667d net: ethernet: broadcom: bcm63xx: use phydev from struct net_device
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phydev in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 21:39:12 -04:00
Philippe Reynes
2406e5d4c4 net: ethernet: broadcom: b44: use new api ethtool_{get|set}_link_ksettings
The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 21:39:12 -04:00
Philippe Reynes
51f141bec1 net: ethernet: broadcom: b44: use phydev from struct net_device
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phydev in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 21:39:11 -04:00
David S. Miller
583c139c71 Merge branch 'bnxt_en-next'
Michael Chan says:

====================
bnxt: update for net-next.

Misc. changes and minor bug fixes for net-next.  Please review.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 21:32:31 -04:00
Eddie Wai
350a714960 bnxt_en: Fixed the VF link status after a link state change
The VF link state can be changed via the 'ip link set' cmd.
Currently, the new link state does not take effect immediately.

The fix is for the PF to send a link change async event to the
designated VF after a VF link state change.  This async event will
trigger the VF to update the link status.

Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 21:32:25 -04:00
Deepak Khungar
ae8e98a6fa bnxt_en: Support for "ethtool -r" command
Restart autoneg if autoneg is enabled.

Signed-off-by: Deepak Khungar <deepak.khungar@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 21:32:25 -04:00
Michael Chan
4ffcd58230 bnxt_en: Pad TX packets below 52 bytes.
The hardware has a limitation that it won't pass host to BMC loopback
packets below 52-bytes.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 21:32:25 -04:00
Michael Chan
001154eb24 bnxt_en: Call firmware to approve the random VF MAC address.
After generating the random MAC address for VF, call the firmware to
approve it.  This step serves 2 purposes.  Some hypervisor (e.g. ESX)
wants to approve the MAC address.  2nd, the call will setup the
proper forwarding database in the internal switch.

We need to unlock the hwrm_cmd_lock mutex before calling bnxt_approve_mac().
We can do that because we are at the end of the function and all the
previous firmware response data has been copied.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 21:32:25 -04:00
Michael Chan
7cc5a20e38 bnxt_en: Re-arrange bnxt_hwrm_func_qcaps().
Re-arrange the code so that the generation of the random MAC address for
the VF is at the end of the function.  The next patch will add one more step
to call bnxt_approve_mac() to get the firmware to approve the random MAC
address.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 21:32:25 -04:00
Michael Chan
47f8e8b9bb bnxt_en: Fix ethtool -l|-L inconsistent channel counts.
The existing code is inconsistent in reporting and accepting the combined
channel count.  bnxt_get_channels() reports maximum combined as the
maximum rx count.  bnxt_set_channels() accepts combined count that
cannot be bigger than max rx or max tx.

For example, if max rx = 2 and max tx = 1, we report max supported
combined to be 2.  But if the user tries to set combined to 2, it will
fail because 2 is bigger than max tx which is 1.

Fix the code to be consistent.  Max allowed combined = max(max_rx, max_tx).
We will accept a combined channel count <= max(max_rx, max_tx).

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 21:32:25 -04:00
Rob Swindell
5ac67d8bc7 bnxt_en: Added support for Secure Firmware Update
Using Ethtool flashdev command, entire NVM package (*.pkg) files
may now be staged into the "update" area of the NVM and subsequently
verified and installed by the firmware using the newly introduced
command: NVM_INSTALL_UPDATE.

We also introduce use of the new firmware command FW_SET_TIME so that the
NVM-resident package installation log contains valid time-stamps.

Signed-off-by: Rob Swindell <Rob.Swindell@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 21:32:24 -04:00
Michael Chan
441cabbbf1 bnxt_en: Update to firmware interface spec 1.5.1.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 21:32:24 -04:00
Michael Chan
adbc830545 bnxt_en: Simplify PCI device names and add additinal PCI IDs.
Remove "Single-port/Dual-port" from the device names.  Dual-port devices
will appear as 2 separate devices, so no need to call each a dual-port
device.  Use a more generic name for VF devices belonging to the same
chip fanmily.  Add some remaining NPAR device IDs.

Signed-off-by: David Christensen <david.christensen@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 21:32:24 -04:00
Michael Chan
8d6be8b627 bnxt_en: Use RSS flags defined in the bnxt_hsi.h file.
And remove redundant definitions of the same flags.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 21:32:24 -04:00
Steffen Klassert
07b26c9454 gso: Support partial splitting at the frag_list pointer
Since commit 8a29111c7 ("net: gro: allow to build full sized skb")
gro may build buffers with a frag_list. This can hurt forwarding
because most NICs can't offload such packets, they need to be
segmented in software. This patch splits buffers with a frag_list
at the frag_list pointer into buffers that can be TSO offloaded.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 20:59:34 -04:00
David S. Miller
e867e87ae8 RxRPC rewrite
-----BEGIN PGP SIGNATURE-----
 
 iQIVAwUAV90ZzvSw1s6N8H32AQIBsg//StI2J/tyGGTvvVXoeWPAGeDZp8907Qbc
 O6671V3feft0HvenQ/uFfC9PdpRTYHTlQhWidk9/wTx5nvg/5b7HGHSD8ghPnCRj
 Jyu4XbpXM6vw650qSBpRAg5TLIAkSsrdjaANzFSYw70sX6eTLdKJF3YFyoDRD2Cc
 T2Y0XIUAcACl5/A/KIHRaKjTtmzhoc5Vih9lOWQRZ/dw2u0A6GdwN0qoFfHqXwvI
 t1S4MtSkXrO8D9uWqmFhTpRzSAzO57L7bccVJbr+OzdNAm/Bkicjrxyzo6YepDAg
 VRdYaNtKCR9vPRFb9vZFlZy3vyUHb/22mrgM6/V5GG5qd/NFSnouFZ/WGxLnv1Yt
 5X+F0qJXXgxSmaGt0f6cV7d9OO24HU/YWGl7wUCHsMZ4bWDySBUWtk0X4USpCsiD
 +AUxYJTEUNKZcCsW8XmC+cfnkRLNPycX603/pCH7aiPdQ5qY+ue89ICxu3PrJrf/
 S8PuQPIoEWmEdTQ9QweG9FpIuQc2LO9fmzOJdiZXHYbGAKFOfNNz5Z64tSVRvPVE
 WbxW0HYSNCJmBt15OI3hyONmbgACgmuD+e3EFWzCATz94FwMogLBFNqPcw1YpbwX
 48RdcP62gx7FMZ4MdqfWllviZKjTlTVmacCRghwGFBOuZL3ifu1gfhsTbrAxqXCs
 CSlnXgZV7gI=
 =/2cJ
 -----END PGP SIGNATURE-----

Merge tag 'rxrpc-rewrite-20160917-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Tracepoint addition and improvement

Here is a set of patches that add some more tracepoints and improve a couple
of existing ones.  New additions include:

 (1) Connection refcount tracking.

 (2) Client connection state machine tracking.

 (3) Tx and Rx packet lifecycle.

 (4) ACK reception and transmission.

 (5) recvmsg processing.

Updates include:

 (1) Print the symbolic packet name in the Rx packet tracepoint.

 (2) Additional call refcount trace events.

 (3) Improvements to sk_buff tracking with AF_RXRPC.

In addition:

 (1) Config option to inject packet loss during both transmission and
     reception.

 (2) Removal of some printks.

This series needs to be applied on top of the previously posted fixes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:52:21 -04:00
David S. Miller
5b0c6fc8ef RxRPC rewrite
-----BEGIN PGP SIGNATURE-----
 
 iQIVAwUAV90ZnPSw1s6N8H32AQKguA//Q7lvTa3n3DFvMQcIPsyJZ6VniUqksTA1
 wmQrw4GHXRUgM8UWz7G9Y5aqxUp2q6y6Vm9BeHkQ2bYZjSrOx5Dc/AImWhBn5au+
 h+HZYcEs4mFM5AoVT7GK8o/nNODDjNt2qwcH/Nf8+SM7Xf52zYHelrSteLZZ1YWO
 S1m4pa5YUBa/ICD3+K52lWq9SCG0VMmy41UHDXU6uSakfN62rn9ZYCcNeathlJGS
 2D0cG3GzYYHiBZ1CkmdPQgGQhTM3wzI+0OvbNnidFlF78zVDlxX8C9zgWs/VlTCg
 02ok4+ftzDojgXH9W+DziEiyCNq14GTDbrdSai5WA+vHVGagr6OoSwDWgPHkXBvW
 pYQh7jqRoBOKN2fsVkU0t19hPc2CCVGYMh49A6AFv8lgS+nWoDXOlmZ0snh8deQg
 Z0HO5mx+V+4yJplBlwH6ncvbRB9ywpsvIuLriZXC/aJg6aY4a8nrU35d1+6xUaM7
 RBMud0uj+7oU+sC9N7CuM8m8HpBOg6+qAsbsfATSwadMRcMdS4LSoXcBg0WvljIH
 JmtL924yEnMDw1yPkmuDBcQ9K6DuxeOYZg4A2756tBtGulxuVjntmI1MVAQlsbqH
 CnNPWpxIDoLRQsHVcYWS5O1F16drGobzFhmj7Hf/6HmGa28x7nQhDafzfFj/3Dos
 MAdM2pdO2x8=
 =MjVT
 -----END PGP SIGNATURE-----

Merge tag 'rxrpc-rewrite-20160917-1' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Fixes & miscellany

Here are some more AF_RXRPC fix patches with a couple of miscellaneous
changes also.  Fixes include:

 (1) Make RxRPC IPv6 support conditional on IPv6 being available.

 (2) Move the condition check in rxrpc_locate_data() into the caller and
     check the error return.

 (3) Fix the detection of the last received packet in recvmsg.

 (4) Account calls that need acceptance and clean up any unaccepted ones if
     the socket gets closed.

 (5) Fix the cleanup of client connections.

 (6) Fix the soft-ACK parsing and the retransmission of packets based on
     those ACKs.

 (7) Suppress transmission of an ACK when there's no pending ACK to
     transmit because another thread stole it.

And some miscellany:

 (8) Whitespace removal.

 (9) Switch-value consistency in rxrpc_send_call_packet().

(10) Fix the basic transmission packet size to allow for spur-of-the-moment
     jumbo DATA packet production.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:51:21 -04:00
David S. Miller
029ac21146 Merge branch 'net-sched-singly-linked-list'
Florian Westphal says:

====================
sched: convert queues to single-linked list

During Netfilter Workshop 2016 Eric Dumazet pointed out that qdisc
schedulers use doubly-linked lists, even though single-linked list
would be enough.

The double-linked skb lists incur one extra write on enqueue/dequeue
operations (to change ->prev pointer of next list elem).

This series converts qdiscs to single-linked version, listhead
maintains pointers to first (for dequeue) and last skb (for enqueue).

Most qdiscs don't queue at all and instead use a leaf qdisc (typically
pfifo_fast) so only a few schedulers needed changes.

I briefly tested netem and htb and they seemed fine.

UDP_STREAM netperf with 64 byte packets via veth+pfifo_fast shows
a small (~2%) improvement.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:47:23 -04:00
Florian Westphal
48da34b7a7 sched: add and use qdisc_skb_head helpers
This change replaces sk_buff_head struct in Qdiscs with new qdisc_skb_head.

Its similar to the skb_buff_head api, but does not use skb->prev pointers.

Qdiscs will commonly enqueue at the tail of a list and dequeue at head.
While skb_buff_head works fine for this, enqueue/dequeue needs to also
adjust the prev pointer of next element.

The ->prev pointer is not required for qdiscs so we can just leave
it undefined and avoid one cacheline write access for en/dequeue.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:47:18 -04:00
Florian Westphal
ed760cb8aa sched: replace __skb_dequeue with __qdisc_dequeue_head
After previous patch these functions are identical.
Replace __skb_dequeue in qdiscs with __qdisc_dequeue_head.

Next patch will then make __qdisc_dequeue_head handle
single-linked list instead of strcut sk_buff_head argument.

Doesn't change generated code.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:47:18 -04:00
Florian Westphal
ec32336879 sched: remove qdisc arg from __qdisc_dequeue_head
Moves qdisc stat accouting to qdisc_dequeue_head.

The only direct caller of the __qdisc_dequeue_head version open-codes
this now.

This allows us to later use __qdisc_dequeue_head as a replacement
of __skb_dequeue() (which operates on sk_buff_head list).

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:47:18 -04:00
Florian Westphal
97d0678f91 sched: don't use skb queue helpers
A followup change will replace the sk_buff_head in the qdisc
struct with a slightly different list.

Use of the sk_buff_head helpers will thus cause compiler
warnings.

Open-code these accesses in an extra change to ease review.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:47:18 -04:00
Florian Westphal
1486587b2f pie: use qdisc_dequeue_head wrapper
Doesn't change generated code.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:47:18 -04:00
Wei Yongjun
106323b905 cxgb4: Fix return value check in cfg_queues_uld()
Fix the retrn value check which testing the wrong variable
in cfg_queues_uld().

Fixes: 94cdb8bb99 ("cxgb4: Add support for dynamic allocation of
resources for ULD")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:42:28 -04:00
David S. Miller
4646651e75 Merge branch 'mediatek-hw-lro'
Nelson Chang says:

====================
net: ethernet: mediatek: add HW LRO functions

The series add the large receive offload (LRO) functions by hardware and
the ethtool functions to configure RX flows of HW LRO.

changes since v3:
- Respin the patch by the newer driver
- Move the dts description of hwlro to optional properties

changes since v2:
- Add ndo_fix_features to prevent NETIF_F_LRO off while RX flow is programmed
- Rephrase the dts property is a capability if the hardware supports LRO

changes since v1:
- Add HW LRO support
- Add ethtool hooks to set LRO RX flows
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:40:54 -04:00
Nelson Chang
004e6cc6c1 net: ethernet: mediatek: add the dts property to set if the HW supports LRO
Add the dts property for the capability if the hardware supports LRO.

Signed-off-by: Nelson Chang <nelson.chang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:40:47 -04:00
Nelson Chang
7aab747e55 net: ethernet: mediatek: add ethtool functions to configure RX flows of HW LRO
The codes add ethtool functions to set RX flows for HW LRO. Because the
HW LRO hardware can only recognize the destination IP of TCP/IP RX flows,
the ethtool command to add HW LRO flow is as below:
ethtool -N [devname] flow-type tcp4 dst-ip [ip_addr] loc [0~1]

Otherwise, cause the hardware can set total four destination IPs, each
GMAC (GMAC1/GMAC2) can set two IPs separately at most.

Signed-off-by: Nelson Chang <nelson.chang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:40:47 -04:00
Nelson Chang
ee40681037 net: ethernet: mediatek: add HW LRO functions of PDMA RX rings
The codes add the large receive offload (LRO) functions by hardware as below:
1) PDMA has total four RX rings that one is the normal ring, and others can
   be configured as LRO rings.
2) Only TCP/IP RX flows can be offloaded. The hardware can set four IP
   addresses at most, if the destination IP of the RX flow matches one of
   them, it has the chance to be offloaded.
3) There three RX flows can be offloaded at most, and one flow is mapped to
   one RX ring.
4) If there are more than three candidate RX flows, the hardware can
   choose three of them by throughput comparison results.

Signed-off-by: Nelson Chang <nelson.chang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:40:47 -04:00
Hariprasad Shenai
0fbc81b3ad chcr/cxgb4i/cxgbit/RDMA/cxgb4: Allocate resources dynamically for all cxgb4 ULD's
Allocate resources dynamically to cxgb4's Upper layer driver's(ULD) like
cxgbit, iw_cxgb4 and cxgb4i. Allocate resources when they register with
cxgb4 driver and free them while unregistering. All the queues and the
interrupts for them will be allocated during ULD probe only and freed
during remove.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:37:32 -04:00
Christophe Jaillet
e8bc8f9a67 sctp: Remove some redundant code
In commit 311b21774f ("sctp: simplify sk_receive_queue locking"), a call
to 'skb_queue_splice_tail_init()' has been made explicit. Previously it was
hidden in 'sctp_skb_list_tail()'

Now, the code around it looks redundant. The '_init()' part of
'skb_queue_splice_tail_init()' should already do the same.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:34:01 -04:00
Jesper Dangaard Brouer
95357907ae mlx4: fix XDP_TX is acting like XDP_PASS on TX ring full
The XDP_TX action can fail transmitting the frame in case the TX ring
is full or port is down.  In case of TX failure it should drop the
frame, and not as now call 'break' which is the same as XDP_PASS.

Fixes: 9ecc2d8617 ("net/mlx4_en: add xdp forwarding and data write support")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Reviewed-by: Brenden Blanco <bblanco@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:32:19 -04:00