Commit Graph

576064 Commits

Author SHA1 Message Date
David Howells
ab802ee0ab rxrpc: Clear the unused part of a sockaddr_rxrpc for memcmp() use
Clear the unused part of a sockaddr_rxrpc structs so that memcmp() can be
used to compare them.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-03-04 15:59:49 +00:00
David Howells
2b15ef15bc rxrpc: rxkad: Casts are needed when comparing be32 values
Forced casts are needed to avoid sparse warning when directly comparing
be32 values.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-03-04 15:59:13 +00:00
David Howells
098a20991d rxrpc: rxkad: The version number in the response should be net byte order
The version number rxkad places in the response should be network byte
order.

Whilst we're at it, rearrange the code to be more readable.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-03-04 15:59:00 +00:00
David Howells
ee72b9fddb rxrpc: Use ACCESS_ONCE() when accessing circular buffer pointers
Use ACCESS_ONCE() when accessing the other-end pointer into a circular
buffer as it's possible the other-end pointer might change whilst we're
doing this, and if we access it twice, we might get some weird things
happening.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-03-04 15:58:06 +00:00
David Howells
b4f1342f91 rxrpc: Adjust some whitespace and comments
Remove some excess whitespace, insert some missing spaces and adjust a
couple of comments.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-03-04 15:56:19 +00:00
David Howells
351c1e6486 rxrpc: Be more selective about the types of received packets we accept
Currently, received RxRPC packets outside the range 1-13 are rejected.
There are, however, holes in the range that should also be rejected - plus
at least one type we don't yet support - so reject these also.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-03-04 15:56:06 +00:00
David Howells
ee6fe085a9 rxrpc: Fix defined range for /proc/sys/net/rxrpc/rx_mtu
The upper bound of the defined range for rx_mtu is being set in the same
member as the lower bound (extra1) rather than the correct place (extra2).
I'm not entirely sure why this compiles.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-03-04 15:55:32 +00:00
David Howells
e33b3d97bc rxrpc: The protocol family should be set to PF_RXRPC not PF_UNIX
Fix the protocol family set in the proto_ops for rxrpc to be PF_RXRPC not
PF_UNIX.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-03-04 15:54:27 +00:00
David Howells
0d12f8a402 rxrpc: Keep the skb private record of the Rx header in host byte order
Currently, a copy of the Rx packet header is copied into the the sk_buff
private data so that we can advance the pointer into the buffer,
potentially discarding the original.  At the moment, this copy is held in
network byte order, but this means we're doing a lot of unnecessary
translations.

The reasons it was done this way are that we need the values in network
byte order occasionally and we can use the copy, slightly modified, as part
of an iov array when sending an ack or an abort packet.

However, it seems more reasonable on review that it would be better kept in
host byte order and that we make up a new header when we want to send
another packet.

To this end, rename the original header struct to rxrpc_wire_header (with
BE fields) and institute a variant called rxrpc_host_header that has host
order fields.  Change the struct in the sk_buff private data into an
rxrpc_host_header and translate the values when filling it in.

This further allows us to keep values kept in various structures in host
byte order rather than network byte order and allows removal of some fields
that are byteswapped duplicates.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-03-04 15:53:46 +00:00
David Howells
4c198ad17a rxrpc: Rename call events to begin RXRPC_CALL_EV_
Rename call event names to begin RXRPC_CALL_EV_ to distinguish them from the
flags.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-03-04 15:53:46 +00:00
David Howells
5b8848d149 rxrpc: Convert call flag and event numbers into enums
Convert call flag and event numbers into enums and move their definitions
outside of the struct.

Also move the call state enum outside of the struct and add an extra
element to count the number of states.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-03-04 15:53:46 +00:00
David Howells
e721498a63 rxrpc: Fix a case where a call event bit is being used as a flag bit
Fix a case where RXRPC_CALL_RELEASE (an event) is being used to specify a
flag bit.  RXRPC_CALL_RELEASED should be used instead.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-03-04 15:53:46 +00:00
Eric Dumazet
1f27cde313 net: sched: use pfifo_fast for non real queues
Some devices declare a high number of TX queues, then set a much
lower real_num_tx_queues

This cause setups using fq_codel, sfq or fq as the default qdisc to consume
more memory than really needed.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-03 17:38:46 -05:00
David Ahern
799977d9aa net: ipv6: Fix refcnt on host routes
Andrew and Ying Huang's test robot both reported usage count problems that
trace back to the 'keep address on ifdown' patch.

>From Andrew:
We execute CRIU test on linux-next. On the current linux-next kernel
they hangs on creating a network namespace.

The kernel log contains many massages like this:
[ 1036.122108] unregister_netdevice: waiting for lo to become free.
Usage count = 2
[ 1046.165156] unregister_netdevice: waiting for lo to become free.
Usage count = 2
[ 1056.210287] unregister_netdevice: waiting for lo to become free.
Usage count = 2

I tried to revert this patch and the bug disappeared.

Here is a set of commands to reproduce this bug:

[root@linux-next-test linux-next]# uname -a
Linux linux-next-test 4.5.0-rc6-next-20160301+ #3 SMP Wed Mar 2
17:32:18 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

[root@linux-next-test ~]# unshare -n
[root@linux-next-test ~]# ip link set up dev lo
[root@linux-next-test ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
[root@linux-next-test ~]# logout
[root@linux-next-test ~]# unshare -n

 -----

The problem is a change made to RTM_DELADDR case in __ipv6_ifa_notify that
was added in an early version of the offending patch and is no longer
needed.

Fixes: f1705ec197 ("net: ipv6: Make address flushing on ifdown optional")
Cc: Andrey Wagin <avagin@gmail.com>
Cc: Ying Huang <ying.huang@linux.intel.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Tested-by: Jeremiah Mahler <jmmahler@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-03 17:26:18 -05:00
Lada Trimasova
b54b8c2d6e net: ezchip: adapt driver to little endian architecture
Since ezchip network driver is written with big endian EZChip platform it
is necessary to add support for little endian architecture.

The first issue is that the order of the bits in a bit field is
implementation specific. So all the bit fields are removed.
Named constants are used to access necessary fields.

And the second one is that network byte order is big endian.
For example, data on ethernet is transmitted with most-significant
octet (byte) first. So in case of little endian architecture
it is important to swap data byte order when we read it from
register. In case of unaligned access we can use "get_unaligned_be32"
and in other case we can use function "ioread32_rep" which reads all
data from register and works either with little endian or big endian
architecture.

And then when we are going to write data to register we need to restore
byte order using the function "put_unaligned_be32" in case of
unaligned access and in other case "iowrite32_rep".

The last little fix is a space between type and pointer to observe
coding style.

Signed-off-by: Lada Trimasova <ltrimas@synopsys.com>
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Tal Zilcer <talz@ezchip.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-03 17:20:08 -05:00
Simon Horman
274ba628a3 sh_eth, ravb: Use ARCH_RENESAS
Make use of ARCH_RENESAS in place of ARCH_SHMOBILE.

This is part of an ongoing process to migrate from ARCH_SHMOBILE to
ARCH_RENESAS the motivation for which being that RENESAS seems to be a more
appropriate name than SHMOBILE for the majority of Renesas ARM based SoCs.

Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-03 17:09:08 -05:00
Arnd Bergmann
3d1cbe839a net: mellanox: add DEVLINK dependencies
The new NET_DEVLINK infrastructure can be a loadable module, but the drivers
using it might be built-in, which causes link errors like:

drivers/net/built-in.o: In function `mlx4_load_one':
:(.text+0x2fbfda): undefined reference to `devlink_port_register'
:(.text+0x2fc084): undefined reference to `devlink_port_unregister'
drivers/net/built-in.o: In function `mlxsw_sx_port_remove':
:(.text+0x33a03a): undefined reference to `devlink_port_type_clear'
:(.text+0x33a04e): undefined reference to `devlink_port_unregister'

There are multiple ways to avoid this:

a) add 'depends on NET_DEVLINK || !NET_DEVLINK' dependencies
   for each user
b) use 'select NET_DEVLINK' from each driver that uses it
   and hide the symbol in Kconfig.
c) make NET_DEVLINK a 'bool' option so we don't have to
   list it as a dependency, and rely on the APIs to be
   stubbed out when it is disabled
d) use IS_REACHABLE() rather than IS_ENABLED() to check for
   NET_DEVLINK in include/net/devlink.h

This implements a variation of approach a) by adding an
intermediate symbol that drivers can depend on, and changes
the three drivers using it.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 09d4d087cd ("mlx4: Implement devlink interface")
Fixes: c4745500e9 ("mlxsw: Implement devlink interface")
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-03 17:08:59 -05:00
David S. Miller
aefd3fb26e Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2016-03-01

Here's our main set of Bluetooth & 802.15.4 patches for the 4.6 kernel.

 - New Bluetooth HCI driver for Intel/AG6xx controllers
 - New Broadcom ACPI IDs
 - LED trigger support for indicating Bluetooth powered state
 - Various fixes in mac802154, 6lowpan and related drivers
 - New USB IDs for AR3012 Bluetooth controllers

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-03 16:27:23 -05:00
John Fastabend
5eb4dce3b3 net: relax setup_tc ndo op handle restriction
I added this check in setup_tc to multiple drivers,

 if (handle != TC_H_ROOT || tc->type != TC_SETUP_MQPRIO)

Unfortunately restricting to TC_H_ROOT like this breaks the old
instantiation of mqprio to setup a hardware qdisc. This patch
relaxes the test to only check the type to make it equivalent
to the check before I broke it. With this the old instantiation
continues to work.

A good smoke test is to setup mqprio with,

# tc qdisc add dev eth4 root mqprio num_tc 8 \
  map 0 1 2 3 4 5 6 7 \
  queues 0@0 1@1 2@2 3@3 4@4 5@5 6@6 7@7

Fixes: e4c6734eaa ("net: rework ndo tc op to consume additional qdisc handle paramete")
Reported-by: Singh Krishneil <krishneil.k.singh@intel.com>
Reported-by: Jake Keller <jacob.e.keller@intel.com>
CC: Murali Karicheri <m-karicheri2@ti.com>
CC: Shradha Shah <sshah@solarflare.com>
CC: Or Gerlitz <ogerlitz@mellanox.com>
CC: Ariel Elior <ariel.elior@qlogic.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Bruce Allan <bruce.w.allan@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-03 16:25:15 -05:00
Miaoqing Pan
25c0f30142 ath9k: clear bb filter calibration power threshold
JP WiFi certification for bandwidth of channel 14 failed, the OBW
is lower than the requirement. Clear the bb filter calibration power
threshold to increase OBW(+2). The fix only for qca9531 chip now.

Signed-off-by: Miaoqing Pan <miaoqing@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-03-03 19:27:17 +02:00
Arnd Bergmann
e9a26010f6 ath9k: reduce stack usage in ar9003_aic_cal_post_process
In some configurations, this function uses more than the warning limit
of 1024 bytes:

drivers/net/wireless/ath/ath9k/ar9003_aic.c: In function 'ar9003_aic_cal_post_process':
drivers/net/wireless/ath/ath9k/ar9003_aic.c:434:1: error: the frame size of 1040 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]

It turns out that there are two large arrays on the stack here, but
almost all the data in them is never used outside of the loop in
which it gets written, so we can replace the array with a single
instance.

The .valid flag is used later, so I'm replacing the array of structures
with an array of bools. An obvious follow-up optimization would be
to replace it with a bitmask and set_bit()/find_first_bit()/
find_last_bit()/... operations. However, I have not tested this patch,
so I sticked to the simpler transformation that does the job of
reducing the stack usage to a harmless level.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-03-03 19:27:17 +02:00
Miaoqing Pan
82def495d1 ath9k: make NF load complete quickly and reliably
Make NF load complete quickly and reliably. NF load execution
is delayed by HW to end of frame if frame Rx or Tx is ongoing.
Increasing timeout to max frame duration. If NF cal is ongoing
before NF load, stop it before load, and restart it afterwards.

Signed-off-by: Miaoqing Pan <miaoqing@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-03-03 19:27:17 +02:00
Mohammed Shafi Shajakhan
c28e6f06ff ath10k: fix sanity check on enabling btcoex via debugfs
First check for the device state before enabling / disabling
btcoex, also return a proper error value. Enabling / disabling
btcoex ideally does a f/w + ath10k_core_restart so the checks
that are applicable for 'simulate_fw_crash' shall be applicable
for this as well

Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-03-03 19:27:17 +02:00
Anilkumar Kolli
af9a6a3ad0 ath10k: reduce number of peers to support peer stats feature
To enable per peer stats feature we are reducing the number of peers.
Firmware has introduced tx stats feature. We have memory limitation in
firmware to add these additional bytes.

These are the new variables introduced in the firmware.
========		=======================
Variable	      	Bytes required/per rate
========		=======================
TX success packets 	1
TX failed packets	1
Retry packets		1
Success bytes		2
TX failed bytes		2
Retry bytes		2
Tx duration		4
Rate			1
Bw and AMPDU flags	1
Total			16 (because of allocation in word pattern)

Firmware sends these tx_stats in pktlog.
If we consider 4 feedbacks at a time, Frimware need about ~1K memory for coding
and 8192 bytes required / per rate [ 4*16*128(peers)].
To accommodate this firmware needs to reduce 10 peers.

This fixes a firmware crash with firmware-5.bin_10.2.4.70.22-2.

Signed-off-by: Anilkumar Kolli <akolli@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-03-03 19:20:01 +02:00
Rajkumar Manoharan
da6416cac6 ath10k: process htt rx indication as batch mode
On multicore systems, it is possible that txrx tasket can run
in parallel with pci tasklet (i.e smp affinity of ath10k irq is
assigned to multiple CPUs). Feeding and consuming from the same
rx completion list leads to txrx tasklet runs for longer period.
Prevent this by processing a snapshot of rx queue by moving list
into temporary list. Consecutive received frames will be processed
in next batch.

Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-03-03 19:20:01 +02:00
Rajkumar Manoharan
e7827e512a ath10k: reduce rx_lock contention for htt rx indication
Received frame indications are queued into a skb list and latest
processed by txrx tasklet. This skb queue is protected by htt rx lock.
Since the entire rx processing till delivering frame to mac80211 and
replenish tasks are processed under rx_lock protection, there might be
some delay in queuing newly received rx frame into that list on
multicore systems. Optimize this by using skb list lock while accessing
rx completion queue instead of htt rx lock.

Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-03-03 19:20:01 +02:00
Anton Protopopov
22baa98097 ath10k: fix erroneous return value
The ath10k_pci_hif_exchange_bmi_msg() function may return the positive
value EIO instead of -EIO in case of error.

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-03-03 19:20:01 +02:00
Ashok Raj Nagarajan
1835374917 ath10k: add hw_rev to trace events to support pktlog
pktlog data is different between firmware variants (eg. 10.2 vs 10.4). To
have a unified user space script to decode pktlog trace events generated,
it is desirable to know which firmware variant has provided the events and
thereby decode the pktlogs appropriately. Hardware revision (hw_rev) helps
to determine the firmware variant sending these trace events. So add hw_rev
to trace events.

Signed-off-by: Ashok Raj Nagarajan <arnagara@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-03-03 19:20:00 +02:00
Ashok Raj Nagarajan
53a5c9bc53 ath10k: fix pktlog in QCA99X0
Currently, we are providing wrong payload data of pktlog to trace points.
Data we receive from FW through copy engine 8 contains pktlog data alone.
We don't need to parse anything in driver before handing it to trace
points.

Fixes: afb0bf7f53 ("ath10k: add support for pktlog in QCA99X0")
Signed-off-by: Ashok Raj Nagarajan <arnagara@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-03-03 19:20:00 +02:00
Mohammed Shafi Shajakhan
e0b6ce00b1 ath10k: fix pointless update of peer stats list
We periodically receive f/w stats event for updating
the rx duration and there is no reason to keep on appending
the f/w stats peer list, as this gets completely cleaned up when
the user polls for f/w stats {pdev, vdev, peer stats}. Only don't
print the warning message in the case PEER_STATS service is enabled

Fixes: 856e7c3 ("ath10k: add debugfs support for Per STA total rx duration")
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-03-03 19:17:57 +02:00
Eric Engestrom
998fc1d080 ethernet/atl1c: remove left over dead code
Left over from c24588afc5
("atl1c: using fixed TXQ configuration for l2cb and l1c")

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 15:00:55 -05:00
Eric Engestrom
a9d562358b net/ipv4: remove left over dead code
8cc785f6f4 ("net: ipv4: make the ping
/proc code AF-independent") removed the code using it, but renamed this
variable instead of removing it.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 15:00:55 -05:00
Eric Engestrom
6353e1875d net/rtnetlink: remove dead code
3b766cd832 ("net/core: Add reading VF
statistics through the PF netdevice") added that variable but it's never
been used.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 15:00:55 -05:00
David S. Miller
3ebeac1d02 Merge branch 'cxgb4-next'
Hariprasad Shenai says:

====================
cxgb4/cxgb4vf: Cleanup and minor fixes

This series sets FBMIN to 64 bytes for Chelsio's T6 series of adapters,
check to replenish fl is revised, some code cleanup in cxgb4vf sge
initialization code and removes dead code.

This patch series has been created against net-next tree and includes
patches on cxgb4 and cxgb4vf driver.

We have included all the maintainers of respective drivers. Kindly review
the change and let us know in case of any review comments.
====================
2016-03-02 14:46:30 -05:00
Hariprasad Shenai
f1ea2fe05d cxgb4vf: Remove dead functions collect_netdev_[um]c_list_addrs
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 14:46:30 -05:00
Hariprasad Shenai
5d7b80522b cxgb4vf: Remove redundant adapter ready check during probe
Function t4vf_wait_dev_ready() is already called in t4vf_prep_adapter(),
no need to call it again in adap_init0().

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 14:46:30 -05:00
Hariprasad Shenai
cb440364c7 cxgb4vf: Make sge init code more readable
Adds a new function t4vf_fl_pkt_align() and use the same in SGE
initialization code to find out freelist packet alignment

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 14:46:29 -05:00
Hariprasad Shenai
edadad80d6 cxgb4/cxgb4vf: For T6 adapter, set FBMIN to 64 bytes
T4 and T5 hardware will not coalesce Free List PCI-E Fetch Requests if
the Host Driver provides more Free List Pointers than the Fetch Burst
Minimum value.  So if we set FBMIN to 64 bytes and the Host Driver
supplies 128 bytes of Free List Pointer data, the hardware will issue two
64-byte PCI-E Fetch Requests rather than a single coallesced 128-byte
Fetch Request. T6 fixes this. So, for T4/T5 we set the FBMIN value to 128

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 14:46:29 -05:00
Hariprasad Shenai
da08e4259f cxgb4/cxgb4vf: Use fl capacity to check if fl needs to be replenished
Use freelist capacity instead of freelist size while checking, if
freelist needs to be refilled

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 14:46:29 -05:00
David S. Miller
d93e4093fb Merge branch 'stmmac-optimizations'
Alexandre TORGUE says:

====================
stmmac: enhance driver performances and update the version

According to Giuseppe, I send the v3 series.

This is a subset of patches to rework the driver in order to improve its
performances and make it more robust under stress conditions.

All patches have been ported on STi mainstream kernel branch and
tested on ARM STiH4xx platforms and newer ones.

This series also updates the driver version and prepares it
to include further development to support new chips.

In detail, these patches are:

o to rework and improve the internal DMA bus settings

  Fine tuning is mandatory on some platforms for both
  performance and stability issues.

o to rework and optimize the descriptor management.

  This will help a lot on performance side and preparing
  the inclusion on the GMAC4.x.

o to add a set of optimizations for both xmit and rx functions.

  These will help a lot on performance side and making the driver
  more robust in case of low memory conditions and under some
  stress test, performed for example on IP-STB.

Below some throughput figures obtained on some boxes before and after
the patches.

                       nuttcp (mbps)       iperf (Mbps)
------------------------------------------------------------------
                      tcp     udp          tcp      udp
                   tx   rx   tx  rx      tx   rx   tx  rx
                    ------------------------------------------
   old             680   800 480  506    760  800   600  700
   new             830   880 540  630    840  880   700   800

V2: - rx_copybreak is now managed by using ethtool.
V3: - improve comments on PCIe detailing that there are no regressions
    - rework some APIs to properly define some params as bool as expected
    - rework the formula to get the element inside the ring. Comparing V2,
	patches 4 and 13 have been merged because the same formula have been
	used. After this rework, no evident benefit has been noticed in terms
	of performances so the table above is still valid. Disassembling the
	code for SH4 and ARM, with the new formula just an instr is saved
	(depending on compiler flags) and this gives us not so relevanti gain,
	for example, on SH4 where some instr are executed in the same pipeline
	stage.
	Ring sizes are now fixed and maybe they can be reworked to be tuned
	w/o using stmmaceth= cmdline option. Indeed, nobody change these sizes
	and indeed the numbers selected by default respect the budget and
	avoid to pass invalid setup. These are the best driver default sizes
	for ring and chain.

====================
2016-03-02 14:21:34 -05:00
Giuseppe Cavallaro
3796e44ddc stmmac: update version to Oct_2015
This patch just updates the driver to the version fully
tested on STi platforms. This version is Oct_2015.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 14:21:34 -05:00
Giuseppe Cavallaro
120e87f91e stmmac: tune rx copy via threshold.
There is a threshold now used to also limit the skb allocation
when use zero-copy. This is to avoid that there are incoherence
in the ring due to a failure on skb allocation under very
aggressive testing and under low memory conditions.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 14:21:33 -05:00
Giuseppe Cavallaro
22ad383815 stmmac: do not perform zero-copy for rx frames
This patch is to allow this driver to copy tiny frames during the reception
process. This is giving more stability while stressing the driver on STi
embedded systems.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 14:21:33 -05:00
Fabrice Gasnier
8ecd80a5f6 stmmac: fix phy init when attached to a phy
phy_bus_name can be NULL when "fixed-link" property isn't used.
Then, since "stmmac: do not poll phy handler when attach a switch",
phy_bus_name ptr needs to be checked before strcmp is called.

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 14:21:33 -05:00
Giuseppe Cavallaro
8e99fc5f88 stmmac: do not poll phy handler when attach a switch
This patch avoids to call the stmmac_adjust_link when
the driver is connected to a switch by using the FIXED_PHY
support. Prior this patch the phydev->irq was set as PHY_POLL
so periodically the phy handler was invoked spending useless
time because the link cannot actually change.
Note that the stmmac_adjust_link will be called just one
time and this guarantees that the ST glue logic will be
setup according to the mode and speed fixed.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 14:21:33 -05:00
Giuseppe Cavallaro
0e80bdc9a7 stmmac: first frame prep at the end of xmit routine
This patch is to fill the first descriptor just before granting
the DMA engine so at the end of the xmit.
The patch takes care about the algorithm adopted to mitigate the
interrupts, then it fixes the last segment in case of no fragments.
Moreover, this new implementation does not pass any "ter" field when
prepare the descriptors because this is not necessary.
The patch also details the memory barrier in the xmit.

As final results, this patch guarantees the same performances
but fixing a case if small datagram are sent. In fact, this
kind of test is impacted if no coalesce is done.

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 14:21:33 -05:00
Giuseppe Cavallaro
fbc80823a9 stmmac: set dirty index out of the loop
The dirty index can be updated out of the loop where all the
tx resources are claimed. This will help on performances too.
Also a useless debug printk has been removed from the main loop.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 14:21:32 -05:00
Fabrice Gasnier
c363b6586c stmmac: optimize tx clean function
This patch "inline" get_tx_owner and get_ls routines. It Results in a
unique read to tdes0, instead of three, to check TX_OWN and LS bits,
and other status bits.

It helps improve driver TX path by removing two uncached read/writes
inside TX clean loop for enhanced descriptors but not for normal ones
because the des1 must be read in any case.

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 14:21:32 -05:00
Giuseppe Cavallaro
be434d5075 stmmac: optimize tx desc management
This patch is to optimize the way to manage the TDES inside the
xmit function. When prepare the frame, some settings (e.g. OWN
bit) can be merged. This has been reworked to improve the tx
performances.

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 14:21:32 -05:00
Fabrice Gasnier
c1fa3212be stmmac: merge get_rx_owner into rx_status routine.
The RDES0 register can be read several times while doing RX of a
packet.
This patch slightly improves RX path performance by reading rdes0
once for two operation: check rx owner, get rx status bits.

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02 14:21:32 -05:00