Commit Graph

681109 Commits

Author SHA1 Message Date
Michael Grzeschik
52ab12e4f9 arcnet: com20020-pci: handle backplane mode depending on card type
We read the backplane mode of each subcard from bits 2 and 3 of the misc
register.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:26:13 -04:00
Michael Grzeschik
ede07a1fc7 arcnet: com20020-pci: add attribute to readback backplane status
We add the sysfs interface the read back the backplane
status of the interface.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:26:13 -04:00
Michael Grzeschik
05fcd31cc4 arcnet: add err_skb package for package status feedback
We need to track the status of our queued packages. This way the driving
process knows if failed packages need to be retransmitted. For this
purpose we queue the transferred/failed packages back into the err_skb
message queue added with some status information.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:26:13 -04:00
David S. Miller
00778f7cad Merge branch 'arcnet-fixes'
Michael Grzeschik says:

====================
arcnet: Collection of latest fixes

Here we sum up the recent fixes I collected on the way to use and
stabilise the framework. Part of it is an possible deadlock that we
prevent as well to fix the calculation of the dev_id that can be setup
by an rotary encoder. Beside that we added an trivial spelling patch and
fix some wrong and missing assignments that improves the code footprint.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:18:38 -04:00
Michael Grzeschik
2a0ea04c83 arcnet: com20020-pci: add missing pdev setup in netdev structure
We add the pdev data to the pci devices netdev structure. This way
the interface get consistent device names in the userspace (udev).

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:18:37 -04:00
Michael Grzeschik
cb108619f2 arcnet: com20020-pci: fix dev_id calculation
The dev_id was miscalculated. Only the two bits 4-5 are relevant for the
MA1 card. PCIARC1 and PCIFB2 use the four bits 4-7 for id selection.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:18:36 -04:00
Michael Grzeschik
0d494fcf86 arcnet: com20020: remove needless base_addr assignment
The assignment is superfluous.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:18:36 -04:00
Colin Ian King
06908d7aee Trivial fix to spelling mistake in arc_printk message
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:18:36 -04:00
Michael Grzeschik
5b85840320 arcnet: change irq handler to lock irqsave
This patch prevents the arcnet driver from the following deadlock.

[   41.273910] ======================================================
[   41.280397] [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
[   41.287433] 4.4.0-00034-gc0ae784 #536 Not tainted
[   41.292366] ------------------------------------------------------
[   41.298863] arcecho/233 [HC0[0]:SC0[2]:HE0:SE0] is trying to acquire:
[   41.305628]  (&(&lp->lock)->rlock){+.+...}, at: [<bf083bc8>] arcnet_send_packet+0x60/0x1c0 [arcnet]
[   41.315199]
[   41.315199] and this task is already holding:
[   41.321324]  (_xmit_ARCNET#2){+.-...}, at: [<c06b934c>] packet_direct_xmit+0xfc/0x1c8
[   41.329593] which would create a new lock dependency:
[   41.334893]  (_xmit_ARCNET#2){+.-...} -> (&(&lp->lock)->rlock){+.+...}
[   41.341801]
[   41.341801] but this new dependency connects a SOFTIRQ-irq-safe lock:
[   41.350108]  (_xmit_ARCNET#2){+.-...}
... which became SOFTIRQ-irq-safe at:
[   41.357539]   [<c06f8fc8>] _raw_spin_lock+0x30/0x40
[   41.362677]   [<c063ab8c>] dev_watchdog+0x5c/0x264
[   41.367723]   [<c0094edc>] call_timer_fn+0x6c/0xf4
[   41.372759]   [<c00950b8>] run_timer_softirq+0x154/0x210
[   41.378340]   [<c0036b30>] __do_softirq+0x144/0x298
[   41.383469]   [<c0036fb4>] irq_exit+0xcc/0x130
[   41.388138]   [<c0085c50>] __handle_domain_irq+0x60/0xb4
[   41.393728]   [<c0014578>] __irq_svc+0x58/0x78
[   41.398402]   [<c0010274>] arch_cpu_idle+0x24/0x3c
[   41.403443]   [<c007127c>] cpu_startup_entry+0x1f8/0x25c
[   41.409029]   [<c09adc90>] start_kernel+0x3c0/0x3cc
[   41.414170]
[   41.414170] to a SOFTIRQ-irq-unsafe lock:
[   41.419931]  (&(&lp->lock)->rlock){+.+...}
... which became SOFTIRQ-irq-unsafe at:
[   41.427996] ...  [<c06f8fc8>] _raw_spin_lock+0x30/0x40
[   41.433409]   [<bf083d54>] arcnet_interrupt+0x2c/0x800 [arcnet]
[   41.439646]   [<c0089120>] handle_nested_irq+0x8c/0xec
[   41.445063]   [<c03c1170>] regmap_irq_thread+0x190/0x314
[   41.450661]   [<c0087244>] irq_thread_fn+0x1c/0x34
[   41.455700]   [<c0087548>] irq_thread+0x13c/0x1dc
[   41.460649]   [<c0050f10>] kthread+0xe4/0xf8
[   41.465158]   [<c000f810>] ret_from_fork+0x14/0x24
[   41.470207]
[   41.470207] other info that might help us debug this:
[   41.470207]
[   41.478627]  Possible interrupt unsafe locking scenario:
[   41.478627]
[   41.485763]        CPU0                    CPU1
[   41.490521]        ----                    ----
[   41.495279]   lock(&(&lp->lock)->rlock);
[   41.499414]                                local_irq_disable();
[   41.505636]                                lock(_xmit_ARCNET#2);
[   41.511967]                                lock(&(&lp->lock)->rlock);
[   41.518741]   <Interrupt>
[   41.521490]     lock(_xmit_ARCNET#2);
[   41.525356]
[   41.525356]  *** DEADLOCK ***
[   41.525356]
[   41.531587] 1 lock held by arcecho/233:
[   41.535617]  #0:  (_xmit_ARCNET#2){+.-...}, at: [<c06b934c>] packet_direct_xmit+0xfc/0x1c8
[   41.544355]
the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
[   41.552362] -> (_xmit_ARCNET#2){+.-...} ops: 27 {
[   41.557357]    HARDIRQ-ON-W at:
[   41.560664]                     [<c06f8fc8>] _raw_spin_lock+0x30/0x40
[   41.567445]                     [<c063ba28>] dev_deactivate_many+0x114/0x304
[   41.574866]                     [<c063bc3c>] dev_deactivate+0x24/0x38
[   41.581646]                     [<c0630374>] linkwatch_do_dev+0x40/0x74
[   41.588613]                     [<c06305d8>] __linkwatch_run_queue+0xec/0x140
[   41.596120]                     [<c0630658>] linkwatch_event+0x2c/0x34
[   41.602991]                     [<c004af30>] process_one_work+0x188/0x40c
[   41.610131]                     [<c004b200>] worker_thread+0x4c/0x480
[   41.616912]                     [<c0050f10>] kthread+0xe4/0xf8
[   41.623048]                     [<c000f810>] ret_from_fork+0x14/0x24
[   41.629735]    IN-SOFTIRQ-W at:
[   41.633039]                     [<c06f8fc8>] _raw_spin_lock+0x30/0x40
[   41.639820]                     [<c063ab8c>] dev_watchdog+0x5c/0x264
[   41.646508]                     [<c0094edc>] call_timer_fn+0x6c/0xf4
[   41.653190]                     [<c00950b8>] run_timer_softirq+0x154/0x210
[   41.660425]                     [<c0036b30>] __do_softirq+0x144/0x298
[   41.667201]                     [<c0036fb4>] irq_exit+0xcc/0x130
[   41.673518]                     [<c0085c50>] __handle_domain_irq+0x60/0xb4
[   41.680754]                     [<c0014578>] __irq_svc+0x58/0x78
[   41.687077]                     [<c0010274>] arch_cpu_idle+0x24/0x3c
[   41.693769]                     [<c007127c>] cpu_startup_entry+0x1f8/0x25c
[   41.701006]                     [<c09adc90>] start_kernel+0x3c0/0x3cc
[   41.707791]    INITIAL USE at:
[   41.711003]                    [<c06f8fc8>] _raw_spin_lock+0x30/0x40
[   41.717696]                    [<c063ba28>] dev_deactivate_many+0x114/0x304
[   41.725026]                    [<c063bc3c>] dev_deactivate+0x24/0x38
[   41.731718]                    [<c0630374>] linkwatch_do_dev+0x40/0x74
[   41.738593]                    [<c06305d8>] __linkwatch_run_queue+0xec/0x140
[   41.746011]                    [<c0630658>] linkwatch_event+0x2c/0x34
[   41.752789]                    [<c004af30>] process_one_work+0x188/0x40c
[   41.759847]                    [<c004b200>] worker_thread+0x4c/0x480
[   41.766541]                    [<c0050f10>] kthread+0xe4/0xf8
[   41.772596]                    [<c000f810>] ret_from_fork+0x14/0x24
[   41.779198]  }
[   41.780945]  ... key      at: [<c124d620>] netdev_xmit_lock_key+0x38/0x1c8
[   41.788192]  ... acquired at:
[   41.791309]    [<c007bed8>] lock_acquire+0x70/0x90
[   41.796361]    [<c06f9140>] _raw_spin_lock_irqsave+0x40/0x54
[   41.802324]    [<bf083bc8>] arcnet_send_packet+0x60/0x1c0 [arcnet]
[   41.808844]    [<c06b9380>] packet_direct_xmit+0x130/0x1c8
[   41.814622]    [<c06bc7e4>] packet_sendmsg+0x3b8/0x680
[   41.820034]    [<c05fe8b0>] sock_sendmsg+0x14/0x24
[   41.825091]    [<c05ffd68>] SyS_sendto+0xb8/0xe0
[   41.829956]    [<c05ffda8>] SyS_send+0x18/0x20
[   41.834638]    [<c000f780>] ret_fast_syscall+0x0/0x1c
[   41.839954]
[   41.841514]
the dependencies between the lock to be acquired and SOFTIRQ-irq-unsafe lock:
[   41.850302] -> (&(&lp->lock)->rlock){+.+...} ops: 5 {
[   41.855644]    HARDIRQ-ON-W at:
[   41.858945]                     [<c06f8fc8>] _raw_spin_lock+0x30/0x40
[   41.865726]                     [<bf083d54>] arcnet_interrupt+0x2c/0x800 [arcnet]
[   41.873607]                     [<c0089120>] handle_nested_irq+0x8c/0xec
[   41.880666]                     [<c03c1170>] regmap_irq_thread+0x190/0x314
[   41.887901]                     [<c0087244>] irq_thread_fn+0x1c/0x34
[   41.894593]                     [<c0087548>] irq_thread+0x13c/0x1dc
[   41.901195]                     [<c0050f10>] kthread+0xe4/0xf8
[   41.907338]                     [<c000f810>] ret_from_fork+0x14/0x24
[   41.914025]    SOFTIRQ-ON-W at:
[   41.917328]                     [<c06f8fc8>] _raw_spin_lock+0x30/0x40
[   41.924106]                     [<bf083d54>] arcnet_interrupt+0x2c/0x800 [arcnet]
[   41.931981]                     [<c0089120>] handle_nested_irq+0x8c/0xec
[   41.939028]                     [<c03c1170>] regmap_irq_thread+0x190/0x314
[   41.946264]                     [<c0087244>] irq_thread_fn+0x1c/0x34
[   41.952954]                     [<c0087548>] irq_thread+0x13c/0x1dc
[   41.959548]                     [<c0050f10>] kthread+0xe4/0xf8
[   41.965689]                     [<c000f810>] ret_from_fork+0x14/0x24
[   41.972379]    INITIAL USE at:
[   41.975595]                    [<c06f8fc8>] _raw_spin_lock+0x30/0x40
[   41.982283]                    [<bf083d54>] arcnet_interrupt+0x2c/0x800 [arcnet]
[   41.990063]                    [<c0089120>] handle_nested_irq+0x8c/0xec
[   41.997027]                    [<c03c1170>] regmap_irq_thread+0x190/0x314
[   42.004172]                    [<c0087244>] irq_thread_fn+0x1c/0x34
[   42.010766]                    [<c0087548>] irq_thread+0x13c/0x1dc
[   42.017267]                    [<c0050f10>] kthread+0xe4/0xf8
[   42.023314]                    [<c000f810>] ret_from_fork+0x14/0x24
[   42.029903]  }
[   42.031648]  ... key      at: [<bf0854cc>] __key.42091+0x0/0xfffff0f8 [arcnet]
[   42.039255]  ... acquired at:
[   42.042372]    [<c007bed8>] lock_acquire+0x70/0x90
[   42.047413]    [<c06f9140>] _raw_spin_lock_irqsave+0x40/0x54
[   42.053364]    [<bf083bc8>] arcnet_send_packet+0x60/0x1c0 [arcnet]
[   42.059872]    [<c06b9380>] packet_direct_xmit+0x130/0x1c8
[   42.065634]    [<c06bc7e4>] packet_sendmsg+0x3b8/0x680
[   42.071030]    [<c05fe8b0>] sock_sendmsg+0x14/0x24
[   42.076069]    [<c05ffd68>] SyS_sendto+0xb8/0xe0
[   42.080926]    [<c05ffda8>] SyS_send+0x18/0x20
[   42.085601]    [<c000f780>] ret_fast_syscall+0x0/0x1c
[   42.090918]
[   42.092481]
[   42.092481] stack backtrace:
[   42.097065] CPU: 0 PID: 233 Comm: arcecho Not tainted 4.4.0-00034-gc0ae784 #536
[   42.104751] Hardware name: Generic AM33XX (Flattened Device Tree)
[   42.111183] [<c0017ec8>] (unwind_backtrace) from [<c00139d0>] (show_stack+0x10/0x14)
[   42.119337] [<c00139d0>] (show_stack) from [<c02a82c4>] (dump_stack+0x8c/0x9c)
[   42.126937] [<c02a82c4>] (dump_stack) from [<c0078260>] (check_usage+0x4bc/0x63c)
[   42.134815] [<c0078260>] (check_usage) from [<c0078438>] (check_irq_usage+0x58/0xb0)
[   42.142964] [<c0078438>] (check_irq_usage) from [<c007aaa0>] (__lock_acquire+0x1524/0x20b0)
[   42.151740] [<c007aaa0>] (__lock_acquire) from [<c007bed8>] (lock_acquire+0x70/0x90)
[   42.159886] [<c007bed8>] (lock_acquire) from [<c06f9140>] (_raw_spin_lock_irqsave+0x40/0x54)
[   42.168768] [<c06f9140>] (_raw_spin_lock_irqsave) from [<bf083bc8>] (arcnet_send_packet+0x60/0x1c0 [arcnet])
[   42.179115] [<bf083bc8>] (arcnet_send_packet [arcnet]) from [<c06b9380>] (packet_direct_xmit+0x130/0x1c8)
[   42.189182] [<c06b9380>] (packet_direct_xmit) from [<c06bc7e4>] (packet_sendmsg+0x3b8/0x680)
[   42.198059] [<c06bc7e4>] (packet_sendmsg) from [<c05fe8b0>] (sock_sendmsg+0x14/0x24)
[   42.206199] [<c05fe8b0>] (sock_sendmsg) from [<c05ffd68>] (SyS_sendto+0xb8/0xe0)
[   42.213978] [<c05ffd68>] (SyS_sendto) from [<c05ffda8>] (SyS_send+0x18/0x20)
[   42.221388] [<c05ffda8>] (SyS_send) from [<c000f780>] (ret_fast_syscall+0x0/0x1c)

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>

   ---
   v1 -> v2: removed unneeded zero assignment of flags
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:18:35 -04:00
David S. Miller
65344ba998 Merge branch 'amd-xgbe-updates'
Tom Lendacky says:

====================
amd-xgbe: AMD XGBE driver updates 2016-06-28

The following updates and fixes are included in this driver update series:

- Simplify mailbox interface code
- Fix SFP supported and advertising settings
- Fix PTP initialization register usage
- Insure there is timestamp skb present before using it
- Add a timeout to timestamp register updates
- Handle return code from software reset function
- Some fixes for handling 2.5Gbps rates
- Limit I2C error messages
- Fix non-DMA interrupt handling through tasklet usage
- Add NUMA affinity support for memory allocations
- Add NUMA affinity support for interrupts
- Prepare for more fine-grained cache coherency controls
- Simplify setting the DMA burst length programming
- Performance improvements

This patch series is based on net-next.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:14:20 -04:00
Lendacky, Thomas
6f595959c0 amd-xgbe: Adjust register settings to improve performance
Add support to change some general performance settings and to provide
some performance settings based on the device that is probed.

This includes:

- Setting the maximum read/write outstanding request limit
- Reducing the AXI interface burst length size
- Selectively setting the Tx and Rx descriptor pre-fetch threshold
- Selectively setting additional cache coherency controls

Tested and verified on all versions of the hardware.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:14:19 -04:00
Lendacky, Thomas
7e1e6b86a5 amd-xgbe: Simplify the burst length settings
Currently the driver hardcodes the PBLx8 setting.  Remove the need for
specifying the PBLx8 setting and automatically calculate based on the
specified PBL value. Since the PBLx8 setting applies to both Tx and Rx
use the same PBL value for both of them.

Also, the driver currently uses a bit field to set the AXI master burst
len setting. Change to the full bit field range and set the burst length
based on the specified value.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:14:19 -04:00
Lendacky, Thomas
9916716a1b amd-xgbe: Prepare for more fine grained cache coherency controls
In prep for setting fine grained read and write DMA cache coherency
controls, allow specific values to be used to set the cache coherency
registers.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:14:18 -04:00
Lendacky, Thomas
f00ba49d8e amd-xgbe: Add NUMA affinity support for IRQ hints
For IRQ affinity, set the affinity hints for the IRQs to be (initially) on
the processors corresponding to the NUMA node of the device.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:14:18 -04:00
Lendacky, Thomas
18f9f0ac55 amd-xgbe: Add NUMA affinity support for memory allocations
Add support to perform memory allocations on the node of the device. The
original allocation or the ring structure and Tx/Rx queues allocated all
of the memory at once and then carved it up for each channel and queue.
To best ensure that we get as much memory from the NUMA node as we can,
break the channel and ring allocations into individual allocations.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:14:18 -04:00
Lendacky, Thomas
85b85c8534 amd-xgbe: Re-issue interrupt if interrupt status not cleared
Some of the device interrupts should function as level interrupts. For
some hardware configurations this requires setting some control bits
so that if the interrupt status has not been cleared the interrupt
should be reissued.

Additionally, when using MSI or MSI-X interrupts, run the interrupt
service routine as a tasklet so that the re-issuance of the interrupt
is handled properly.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:14:18 -04:00
Lendacky, Thomas
45a2005e93 amd-xgbe: Limit the I2C error messages that are output
When I2C communication fails, it tends to always fail. Rather than
continuously issue an error message (once per second in most cases),
change the message to be issued just once.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:14:17 -04:00
Lendacky, Thomas
ed3333fa6f amd-xgbe: Fixes for working with PHYs that support 2.5GbE
The driver has some missing functionality when operating in the mode that
supports 2.5GbE.  Fix the driver to fully recognize and support this speed.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:14:17 -04:00
Lendacky, Thomas
42d452dc4a amd-xgbe: Handle return code from software reset function
Currently the function that performs a software reset of the hardware
provides a return code.  During driver probe check this return code and
exit with an error if the software reset fails.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:14:17 -04:00
Lendacky, Thomas
9018ff5331 amd-xgbe: Prevent looping forever if timestamp update fails
Just to be on the safe side, should the update of the timestamp registers
not complete, issue a warning rather than looping forever waiting for the
update to complete.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:14:16 -04:00
Lendacky, Thomas
93845d5f1b amd-xgbe: Add a check for an skb in the timestamp path
Spurious Tx timestamp interrupts can cause an oops in the Tx timestamp
processing function if a Tx timestamp skb is NULL. Add a check to insure
a Tx timestamp skb is present before attempting to use it.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:14:16 -04:00
Lendacky, Thomas
3abc7cff67 amd-xgbe: Use the proper register during PTP initialization
During PTP initialization, the Timestamp Control register should be
cleared and not the Tx Configuration register.  While this typo causes
the wrong register to be cleared, the default value of each register and
and the fact that the Tx Configuration register is programmed afterwards
doesn't result in a bug, hence only fixing in net-next.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:14:16 -04:00
Lendacky, Thomas
56503d55cc amd-xgbe: Fix SFP PHY supported/advertised settings
When using SFPs, the supported and advertised settings should be initially
based on the SFP that has been detected.  The code currently indicates the
overall support of the device as opposed to what the SFP is capable of.
Update the code to change the supported link modes, auto-negotiation, etc.
to be based on the installed SFP.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:14:15 -04:00
Lendacky, Thomas
549b32af9f amd-xgbe: Simplify mailbox interface rate change code
Simplify and centralize the mailbox command rate change interface by
having a single function perform the writes to the mailbox registers
to issue the request.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 15:14:15 -04:00
Dan Carpenter
acb4b7df48 rocker: move dereference before free
My static checker complains that ofdpa_neigh_del() can sometimes free
"found".   It just makes sense to use it first before deleting it.

Fixes: ecf244f753 ("rocker: fix maybe-uninitialized warning")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 14:19:24 -04:00
David S. Miller
e3ef6983cc Merge branch 'bpf-Add-syscall-lookup-support-for-fd-array-and-htab'
Martin KaFai Lau says:

====================
bpf: Add syscall lookup support for fd array and htab

This patchset adds BPF_MAP_LOOKUP_ELEM syscall support for
BPF_MAP_TYPE_PROG_ARRAY,
BPF_MAP_TYPE_ARRAY_OF_MAPS and
BPF_MAP_TYPE_HASH_OF_MAPS
====================

Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 13:13:26 -04:00
Martin KaFai Lau
a8744f2528 bpf: Add test for syscall on fd array/htab lookup
Checks are added to the existing sockex3 and test_map_in_map test.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 13:13:26 -04:00
Martin KaFai Lau
14dc6f04f4 bpf: Add syscall lookup support for fd array and htab
This patch allows userspace to do BPF_MAP_LOOKUP_ELEM on
BPF_MAP_TYPE_PROG_ARRAY,
BPF_MAP_TYPE_ARRAY_OF_MAPS and
BPF_MAP_TYPE_HASH_OF_MAPS.

The lookup returns a prog-id or map-id to the userspace.
The userspace can then use the BPF_PROG_GET_FD_BY_ID
or BPF_MAP_GET_FD_BY_ID to get a fd.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 13:13:25 -04:00
Ido Schimmel
6b27c8adf2 mlxsw: spectrum_router: Fix NULL pointer dereference
In case a VLAN device is enslaved to a bridge we shouldn't create a
router interface (RIF) for it when it's configured with an IP address.
This is already handled by the driver for other types of netdevs, such
as physical ports and LAG devices.

If this IP address is then removed and the interface is subsequently
unlinked from the bridge, a NULL pointer dereference can happen, as the
original 802.1d FID was replaced with an rFID which was then deleted.

To reproduce:
$ ip link set dev enp3s0np9 up
$ ip link add name enp3s0np9.111 link enp3s0np9 type vlan id 111
$ ip link set dev enp3s0np9.111 up
$ ip link add name br0 type bridge
$ ip link set dev br0 up
$ ip link set enp3s0np9.111 master br0
$ ip address add dev enp3s0np9.111 192.168.0.1/24
$ ip address del dev enp3s0np9.111 192.168.0.1/24
$ ip link set dev enp3s0np9.111 nomaster

Fixes: 99724c18fc ("mlxsw: spectrum: Introduce support for router interfaces")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Petr Machata <petrm@mellanox.com>
Tested-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 12:59:48 -04:00
Gao Feng
c1a4872ebf net: sched: Fix one possible panic when no destroy callback
When qdisc fail to init, qdisc_create would invoke the destroy callback
to cleanup. But there is no check if the callback exists really. So it
would cause the panic if there is no real destroy callback like the qdisc
codel, fq, and so on.

Take codel as an example following:
When a malicious user constructs one invalid netlink msg, it would cause
codel_init->codel_change->nla_parse_nested failed.
Then kernel would invoke the destroy callback directly but qdisc codel
doesn't define one. It causes one panic as a result.

Now add one the check for destroy to avoid the possible panic.

Fixes: 87b60cfacf ("net_sched: fix error recovery at qdisc creation")
Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 12:55:12 -04:00
Jason Wang
713a98d90c virtio-net: serialize tx routine during reset
We don't hold any tx lock when trying to disable TX during reset, this
would lead a use after free since ndo_start_xmit() tries to access
the virtqueue which has already been freed. Fix this by using
netif_tx_disable() before freeing the vqs, this could make sure no tx
after vq freeing.

Reported-by: Jean-Philippe Menil <jpmenil@gmail.com>
Tested-by: Jean-Philippe Menil <jpmenil@gmail.com>
Fixes commit f600b69050 ("virtio_net: Add XDP support")
Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Robert McCabe <robert.mccabe@rockwellcollins.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 12:51:59 -04:00
Thor Thayer
77b0d36177 net: stmmac: Add additional registers for dwmac1000_dma ethtool
Version 3.70a of the Designware has additional DMA registers so
add those to the ethtool DMA Register dump.
Offset 9  - Receive Interrupt Watchdog Timer Register
Offset 10 - AXI Bus Mode Register
Offset 11 - AHB or AXI Status Register
Offset 22 - HW Feature Register

Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 12:49:54 -04:00
David S. Miller
5185ad616b mlx5-updates-2017-06-27 (Innova IPsec offload support)
This patchset adds support for Innova IPSec network interface card.
 
 About Innova device:
 --------------------
 Innova is a network card with a ConnectX chip and an FPGA chip as a
  bump-on-the-wire.
 
                Internal
 +----------+   Link       +-----------------+
 |          +--------------+      FPGA       |  +------+
 | ConnectX |              |  Shell          +--+ QSFP |
 |          +--------------+    +-------+    |  | Port |
 +----------+      I2C     |    |  SBU  |    |  +------+
                           |    +-------+    |
                           +--+----------+---+
                              |          |
                           +--+--+   +---+---+
                           | DDR |   | Flash |
                           +-----+   +-------+
 
 The FPGA synthesized logic is loaded from dedicated flash storage and has
  access to its own dedicated DDR RAM.
 The ConnectX chip firmware programs the FPGA by accessing its configuration
 space over either the slow internal I2C link or the high-speed internal link.
 
 The FPGA logic is divided into a "Shell" and a "Sandbox Unit" (SBU).
 mlx5_core driver (with CONFIG_MLX5_FPGA) handles all shell functionality,
 while other components may handle the various SBU functionalities.
 
 The driver opens high-speed reliable communication channels with the shell and
 the SBU over the internal link.
 These channels may be used for high-bandwidth configuration or for SBU-specific
 out-of-band data paths.
 
 About Innova IPSec device:
 --------------------------
 Innova IPSec is a network card that allows offloading IPSec cryptography operations
 from the host CPU to the NIC. It is an Innova card with an IPSec SBU.
 The hardware keeps the database of IPSec Security Associations (SADB) in the FPGA's
 DDR memory.
 
                Internal
 +----------+   Link       +-----------------+
 |          +--------------+      FPGA       |  +------+
 | ConnectX |              |  Shell          +--+ QSFP |
 |          +--------------+    +-------+    |  | Port |
 +----------+ Internal I2C |    | IPSec |    |  +------+
                           |    |  SBU  |    |
                           |    +-------+    |
                           +--+----------+---+
                              |          |
                           +--+--+   +---+---+
                           | DDR |   |       |
                           |     |   | Flash |
                           |SADB |   |       |
                           +-----+   +-------+
 
 Modes and ciphers:
 Currently the following modes and ciphers are supported:
 IPv4 and IPv6
 ESP tunnel and transport modes
 AES 128 and 256 bit encryption, with GCM authentication (RFC4106)
 
 IV is generated using seqiv, in sync with Linux's geniv.
 
 More modes and ciphers may be added later.
 
 Notes:
 In the future similar functionality will be included in a single-chip NIC.
 
 About the driver:
 -----------------
 Patches 1-4 prepare some existing driver code for the new feature:
   * Add support for reserved GIDs in the hardware GID table
   * Allow multiple modules to enable hardware RoCE support independently
 Patches 5-6 define structs and helper functions for QP work-queues.
 Patches 7-11 add various FPGA-related features required for Innova.
 IPSec.
 Patch 12 adds abstraction layer for Mellanox IPSec-offload capable devices.
 atches 13-16 add IPSec offload support to the mlx5 netdevice.
 
 This driver services the new IPSec offload API introduced in commit
 d77e38e612 ("xfrm: Add an IPsec hardware offloading API")
 
 Configuration Path:
 If Innova IPSec device is detected, the mlx5e netdevice gets the new
 NETIF_F_HW_ESP feature and the xdo callbacks, indicating ESP offload
 capabilities, and also the matching TX checksum and GSO features.
 
 The driver configures offloaded Security Associations (SAs) by sending
 an ADD_SA or DEL_SA message to the IPSec SBU, which updates the SADB in DDR.
 These messages and their responses are sent over a high-speed channel.
 Counters for ethtool are retrieved by the driver from the SBU.
 
 Data path:
 On receive path, the SBU decrypts ESP packets which match the offloaded SADB,
 but keeps them encapsulated.
 The SBU injects metadata (Mellanox owned ethertype) indicating that crypto-offload
 has taken place, the SA with which it was done, and the authentication result.
 
 The ConnectX chip performs RX checksum offload on the packet, and RSS using the
 ESP SPI value.  The driver detects the special ethertype, and attaches a struct
 secpath to the RX SKB, including flags to indicate that crypto offload took place,
 the authentication result, and which xfrm_state was used for decryption, in the
 olen and ovec members. The RX SKB may have useful CHECKSUM_COMPLETE. A separate
 patchset will add support for that in the xfrm stack.
 
 On transmit path, the stack encapsulates the packet but does not encrypt it, and
 indicates in the SKB's secpath that crypto offload is to be performed and the SA
 to use to do so.
 The driver avoids performing crypto-offload for ESP fragments, and packets with
 IP options, as the SBU cannot currently do that.  For eligible packets, the driver
 prepends a special ethertype with metadata instructing the hardware to perform crypto offload.
 The stack builds regular (non-GSO) SKBs so that they contain a placeholder for the ESP trailer.
 The driver trims it off, because the SBU automatically appends the trailer for offloaded packets.
 The ConnectX chip performs TX checksum offload on inner UDP or TCP packets,
 and GSO for TCP packets (duplicating the prepended metadata).
 The segmented packets then undergo encryption in the SBU before going on the wire.
 
 Performance:
 We measure single stream of TCP on Intel(R) Xeon(R) CPU E5-2643 v2 @3.50GHz
 Using AES-NI with ESP GSO we get constant 4.1 Gbps.
 Using crypto offload we get constant 18 Gbps.
 
 Note that these numbers require CHECKSUM_COMPLETE support in XFRM, which we submit separately.
 
 -  Ilan Tayari
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJZUmf1AAoJEEg/ir3gV/o+ukIIALp/5+E1W0cC9xvY1X9dTETW
 cKsHvDJ7G1CxUy18W8Mf9z+WOqC6hGCqS+yicOb+umfIqkTcLHDb2irlqprYLC+F
 oYl1HqgHTaiAYByqL90qiyPcFbfsaNIqA9KOsED2qdZ1yxjoYBiJnSDZDAdO/0lN
 Lt1czNswFc5ovnEUGn8bkjLZZH2pJoJWEI4g4hN9cq33BLLq8A795F/ZjwCJTQ1X
 qXdKcEmktBrgZiSiTVFxxpQVhO/uB0HmzaZzrY1k1P5e6yhHEr422mcOcF9KcSL4
 aeyRYHjoIh51vPMbScPjvfbO/PwooU3LWLlxLVNLG0MmkSaGyJeUXg/wHsGI910=
 =JN0A
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2017-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2017-06-27 (Innova IPsec offload support)

This patchset adds support for Innova IPSec network interface card.

About Innova device:
--------------------
Innova is a network card with a ConnectX chip and an FPGA chip as a
 bump-on-the-wire.

               Internal
+----------+   Link       +-----------------+
|          +--------------+      FPGA       |  +------+
| ConnectX |              |  Shell          +--+ QSFP |
|          +--------------+    +-------+    |  | Port |
+----------+      I2C     |    |  SBU  |    |  +------+
                          |    +-------+    |
                          +--+----------+---+
                             |          |
                          +--+--+   +---+---+
                          | DDR |   | Flash |
                          +-----+   +-------+

The FPGA synthesized logic is loaded from dedicated flash storage and has
 access to its own dedicated DDR RAM.
The ConnectX chip firmware programs the FPGA by accessing its configuration
space over either the slow internal I2C link or the high-speed internal link.

The FPGA logic is divided into a "Shell" and a "Sandbox Unit" (SBU).
mlx5_core driver (with CONFIG_MLX5_FPGA) handles all shell functionality,
while other components may handle the various SBU functionalities.

The driver opens high-speed reliable communication channels with the shell and
the SBU over the internal link.
These channels may be used for high-bandwidth configuration or for SBU-specific
out-of-band data paths.

About Innova IPSec device:
--------------------------
Innova IPSec is a network card that allows offloading IPSec cryptography operations
from the host CPU to the NIC. It is an Innova card with an IPSec SBU.
The hardware keeps the database of IPSec Security Associations (SADB) in the FPGA's
DDR memory.

               Internal
+----------+   Link       +-----------------+
|          +--------------+      FPGA       |  +------+
| ConnectX |              |  Shell          +--+ QSFP |
|          +--------------+    +-------+    |  | Port |
+----------+ Internal I2C |    | IPSec |    |  +------+
                          |    |  SBU  |    |
                          |    +-------+    |
                          +--+----------+---+
                             |          |
                          +--+--+   +---+---+
                          | DDR |   |       |
                          |     |   | Flash |
                          |SADB |   |       |
                          +-----+   +-------+

Modes and ciphers:
Currently the following modes and ciphers are supported:
IPv4 and IPv6
ESP tunnel and transport modes
AES 128 and 256 bit encryption, with GCM authentication (RFC4106)

IV is generated using seqiv, in sync with Linux's geniv.

More modes and ciphers may be added later.

Notes:
In the future similar functionality will be included in a single-chip NIC.

About the driver:
-----------------
Patches 1-4 prepare some existing driver code for the new feature:
  * Add support for reserved GIDs in the hardware GID table
  * Allow multiple modules to enable hardware RoCE support independently
Patches 5-6 define structs and helper functions for QP work-queues.
Patches 7-11 add various FPGA-related features required for Innova.
IPSec.
Patch 12 adds abstraction layer for Mellanox IPSec-offload capable devices.
atches 13-16 add IPSec offload support to the mlx5 netdevice.

This driver services the new IPSec offload API introduced in commit
d77e38e612 ("xfrm: Add an IPsec hardware offloading API")

Configuration Path:
If Innova IPSec device is detected, the mlx5e netdevice gets the new
NETIF_F_HW_ESP feature and the xdo callbacks, indicating ESP offload
capabilities, and also the matching TX checksum and GSO features.

The driver configures offloaded Security Associations (SAs) by sending
an ADD_SA or DEL_SA message to the IPSec SBU, which updates the SADB in DDR.
These messages and their responses are sent over a high-speed channel.
Counters for ethtool are retrieved by the driver from the SBU.

Data path:
On receive path, the SBU decrypts ESP packets which match the offloaded SADB,
but keeps them encapsulated.
The SBU injects metadata (Mellanox owned ethertype) indicating that crypto-offload
has taken place, the SA with which it was done, and the authentication result.

The ConnectX chip performs RX checksum offload on the packet, and RSS using the
ESP SPI value.  The driver detects the special ethertype, and attaches a struct
secpath to the RX SKB, including flags to indicate that crypto offload took place,
the authentication result, and which xfrm_state was used for decryption, in the
olen and ovec members. The RX SKB may have useful CHECKSUM_COMPLETE. A separate
patchset will add support for that in the xfrm stack.

On transmit path, the stack encapsulates the packet but does not encrypt it, and
indicates in the SKB's secpath that crypto offload is to be performed and the SA
to use to do so.
The driver avoids performing crypto-offload for ESP fragments, and packets with
IP options, as the SBU cannot currently do that.  For eligible packets, the driver
prepends a special ethertype with metadata instructing the hardware to perform crypto offload.
The stack builds regular (non-GSO) SKBs so that they contain a placeholder for the ESP trailer.
The driver trims it off, because the SBU automatically appends the trailer for offloaded packets.
The ConnectX chip performs TX checksum offload on inner UDP or TCP packets,
and GSO for TCP packets (duplicating the prepended metadata).
The segmented packets then undergo encryption in the SBU before going on the wire.

Performance:
We measure single stream of TCP on Intel(R) Xeon(R) CPU E5-2643 v2 @3.50GHz
Using AES-NI with ESP GSO we get constant 4.1 Gbps.
Using crypto offload we get constant 18 Gbps.

Note that these numbers require CHECKSUM_COMPLETE support in XFRM, which we submit separately.

-  Ilan Tayari
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 12:30:16 -04:00
David S. Miller
869684a70d Merge branch 'net-fix-sw-timestamping'
Ivan Khoronzhuk says:

====================
net: fix sw timestamping for non PTP packets

This series contains several corrections connected with timestamping
for cpsw and netcp drivers based on same cpts module.

Based on net/next
====================

Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 12:28:57 -04:00
Ivan Khoronzhuk
0ccf59ba07 net: ethernet: ti: netcp_ethss: use cpts to check if packet needs timestamping
There is cpts function to check if packet can be timstamped with cpts.
Seems that ptp_classify_raw cover all cases listed with "case".

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 12:28:57 -04:00
Ivan Khoronzhuk
f44f8417ba net: ethernet: ti: cpsw: fix sw timestamping for non PTP packets
The cpts can timestmap only ptp packets at this moment, so driver
cannot mark every packet as though it's going to be timestamped,
only because h/w timestamping for given skb is enabled with
SKBTX_HW_TSTAMP. It doesn't allow to use sw timestamping, as result
outgoing packet is not timestamped at all if it's not PTP and h/w
timestamping is enabled. So, fix it by setting SKBTX_IN_PROGRESS
only for PTP packets.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 12:28:57 -04:00
Ivan Khoronzhuk
98fdd857a3 net: ethernet: ti: cpsw: move skb timestamp to packet_submit
Move sw timestamp function close to channel submit function.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 12:28:56 -04:00
Joe Perches
bf24e136a3 cavium: thunder: Remove duplicate "netdev->name" logging output
Using netdev_<level>(netdev, "%s: ...", netdev->name) duplicates the
name in the output.  Remove those uses.

Miscellanea:

o Use the netif_<level> convenience macros at the same time

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 12:25:33 -04:00
Colin Ian King
46ccf725bf net/mlx4: fix spelling mistake: "enforcment" -> "enforcement"
Trivial fix to spelling mistake in mlx4_dbg debug message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 12:25:01 -04:00
Colin Ian King
62d4fd4733 net: atl1c: fix spelling mistake: "droppted" -> "dropped"
Trivial fix to spelling mistake in netif_info message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 12:24:26 -04:00
LABBE Corentin
5a79b4f2a5 arm: sun8i: orangepi-2: use internal phy-mode
Since the PHY used is internal, simply set phy-mode as internal.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 12:23:28 -04:00
LABBE Corentin
bdcc005bea arm: sun8i: nanopi-neo: use internal phy-mode
Since the PHY used is internal, simply set phy-mode as internal.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 12:23:28 -04:00
LABBE Corentin
4ac57180ea arm: sun8i: orangepi-one: use internal phy-mode
Since the PHY used is internal, simply set phy-mode as internal.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 12:23:28 -04:00
LABBE Corentin
6066de6848 arm: sun8i: orangepi-zero: use internal phy-mode
Since the PHY used is internal, simply set phy-mode as internal.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 12:23:27 -04:00
LABBE Corentin
3432a86e64 arm: sun8i: orangepipc: use internal phy-mode
Since the PHY used is internal, simply set phy-mode as internal.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 12:23:27 -04:00
LABBE Corentin
1c2fa5f846 net: stmmac: support future possible different internal phy mode
The current way to find if the phy is internal is to compare DT phy-mode
and emac_variant/internal_phy.
But it will negate a possible future SoC where an external PHY use the
same phy mode than the internal one.

By using phy-mode = "internal" we permit to have an external PHY with
the same mode than the internal one.

Reported-by: André Przywara <andre.przywara@arm.com>
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 12:23:27 -04:00
Michael Dilmore
eac306b4ad Bonding: Convert multiple netdev_info messages to netdev_dbg
The bond_options.c file contains multiple netdev_info statements that clutter kernel output.
This patch replaces all netdev_info with netdev_dbg and adds a netdev_dbg statement for the
packets per slave parameter. Also fixes misalignment at line 467.

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Michael J Dilmore <michael.j.dilmore@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29 12:21:37 -04:00
Tobias Klauser
6474924e2b arch: remove unused macro/function thread_saved_pc()
The only user of thread_saved_pc() in non-arch-specific code was removed
in commit 8243d55977 ("sched/core: Remove pointless printout in
sched_show_task()").  Remove the implementations as well.

Some architectures use thread_saved_pc() in their arch-specific code.
Leave their thread_saved_pc() intact.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-28 16:13:57 -07:00
Jens Axboe
9ae3b3f52c block: provide bio_uninit() free freeing integrity/task associations
Wen reports significant memory leaks with DIF and O_DIRECT:

"With nvme devive + T10 enabled, On a system it has 256GB and started
logging /proc/meminfo & /proc/slabinfo for every minute and in an hour
it increased by 15968128 kB or ~15+GB.. Approximately 256 MB / minute
leaking.

/proc/meminfo | grep SUnreclaim...

SUnreclaim:      6752128 kB
SUnreclaim:      6874880 kB
SUnreclaim:      7238080 kB
....
SUnreclaim:     22307264 kB
SUnreclaim:     22485888 kB
SUnreclaim:     22720256 kB

When testcases with T10 enabled call into __blkdev_direct_IO_simple,
code doesn't free memory allocated by bio_integrity_alloc. The patch
fixes the issue. HTX has been run with +60 hours without failure."

Since __blkdev_direct_IO_simple() allocates the bio on the stack, it
doesn't go through the regular bio free. This means that any ancillary
data allocated with the bio through the stack is not freed. Hence, we
can leak the integrity data associated with the bio, if the device is
using DIF/DIX.

Fix this by providing a bio_uninit() and export it, so that we can use
it to free this data. Note that this is a minimal fix for this issue.
Any current user of bio's that are allocated outside of
bio_alloc_bioset() suffers from this issue, most notably some drivers.
We will fix those in a more comprehensive patch for 4.13. This also
means that the commit marked as being fixed by this isn't the real
culprit, it's just the most obvious one out there.

Fixes: 542ff7bf18 ("block: new direct I/O implementation")
Reported-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-28 15:30:13 -06:00
Linus Torvalds
e547204f1f NFS client bugfixes for Linux 4.12
Bugfixes include:
 
 - Stable fix for exclusive create if the server supports the umask attribute
 - Trunking detection should handle ERESTARTSYS/EINTR
 - Stable fix for a race in the LAYOUTGET function
 - Stable fix to revert "nfs_rename() handle -ERESTARTSYS dentry left behind"
 - nfs4_callback_free_slot() cannot call nfs4_slot_tbl_drain_complete()
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZU7Z/AAoJEGcL54qWCgDyqkwQAKwHulvAMeKH9/Wy3vo7mqOR
 Yqbz2mujMzTFUaebV1bvOJzmpvP5uBmC/9ggqBCYbV0pW7mR9YYiX6RH7TfQ/IfQ
 XZjOq+isBFIhDUVQQ2VOCbdgIHBu1V5++1Oterwtz9yWfXB+GXdlVD0UwH/j4MHG
 LP/z7Xa7jdsG/XrQC8Z22Qk7byBR6+D4IBYRZlqwVtnEtte880LZJh7OjOUpwRVW
 SwH5p7EAF8plyr/9/OT8uC+dl+LPE5cs3ZXkkTiEB91VLRdCuU/wxo8r5xMU3wPY
 sT7q7O50fTvQka0to5Ag64laXwq352SjqfYwHd+90THc8Zh2XbuRftF3zhvi46d5
 kaXvdNqggV7qTJc8dcEt2dtdTOaQ9zr1SOfar9pusnfAvrmw46R6tmS7YJbMhomr
 iQzDKSwo9IZ7mTCiEopUVb/nXQwgy+/oojPNR6f0eXM2DU3z0CR/6awyAFmpgRUK
 OWcMSTSXYq/LMSaZEZq5htzTzkRl1m6V/z5vRD1M/ZqXp2hpwJWmDupciJEya/Y8
 /L+tkkPjGeBkEzgSYqJYpTMnzfxidMw+QDYkQIGYei421/nBi1BzH3IU82aKvsVI
 V6/i2DEL6ncJhw71tHHYBfn01PXIdwMdBHJVLunMZPU9hgc0UF5o1jaK3JourAWg
 Uqd1JTm/mOLSZNTgoXbO
 =w19C
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-4.12-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 "Bugfixes include:

   - stable fix for exclusive create if the server supports the umask
     attribute

   - trunking detection should handle ERESTARTSYS/EINTR

   - stable fix for a race in the LAYOUTGET function

   - stable fix to revert "nfs_rename() handle -ERESTARTSYS dentry left
     behind"

   - nfs4_callback_free_slot() cannot call nfs4_slot_tbl_drain_complete()"

* tag 'nfs-for-4.12-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFSv4.1: nfs4_callback_free_slot() cannot call nfs4_slot_tbl_drain_complete()
  Revert "NFS: nfs_rename() handle -ERESTARTSYS dentry left behind"
  NFSv4.1: Fix a race in nfs4_proc_layoutget
  NFS: Trunking detection should handle ERESTARTSYS/EINTR
  NFSv4.2: Don't send mode again in post-EXCLUSIVE4_1 SETATTR with umask
2017-06-28 13:27:15 -07:00