Commit Graph

949055 Commits

Author SHA1 Message Date
David S. Miller
c4b83061dc Merge branch 'sfc-driver-for-EF100-family-NICs-part-2'
Edward Cree says:

====================
sfc: driver for EF100 family NICs, part 2

This series implements the data path and various other functionality
 for Xilinx/Solarflare EF100 NICs.

Changed from v2:
 * Improved error handling of design params (patch #3)
 * Removed 'inline' from .c file in patch #4
 * Don't report common stats to ethtool -S (patch #8)

Changed from v1:
 * Fixed build errors on CONFIG_RFS_ACCEL=n (patch #5) and 32-bit
   (patch #8)
 * Dropped patch #10 (ethtool ops) as it's buggy and will need a
   bigger rework to fix.
====================

Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:55 -07:00
Edward Cree
d61592a112 sfc_ef100: add nic-type for VFs, and bind to them
We don't yet have a .sriov_configure() to create them, though.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:55 -07:00
Edward Cree
ef2c57b956 sfc_ef100: read pf_index at probe time
We'll need it later, for VF representors.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:55 -07:00
Edward Cree
43c3df0d56 sfc_ef100: functions for selftests
Self-tests for event and interrupt reception and NVRAM.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:55 -07:00
Edward Cree
b593b6f1b4 sfc_ef100: statistics gathering
MAC stats work much the same as on EF10, with a periodic DMA to a region
 specified via an MCDI.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:55 -07:00
Edward Cree
b780feac36 sfc_ef100: plumb in fini_dmaq
Bring down the TX and RX queues at ifdown, so that we can then fini the
 EVQs (otherwise the MC would return EBUSY because they're still in use).

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:55 -07:00
Edward Cree
8e57daf706 sfc_ef100: RX path for EF100
Includes RSS spreading.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:54 -07:00
Edward Cree
a9dc3d5612 sfc_ef100: RX filter table management and related gubbins
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:54 -07:00
Edward Cree
d19a537218 sfc_ef100: TX path for EF100 NICs
Includes checksum offload and TSO, so declare those in our netdev features.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:54 -07:00
Edward Cree
adcfc3482f sfc_ef100: read Design Parameters at probe time
Several parts of the EF100 architecture are parameterised (to allow
 varying capabilities on FPGAs according to resource constraints), and
 these parameters are exposed to the driver through a TLV-encoded
 region of the BAR.
For the most part we either don't care about these values at all or
 just need to sanity-check them against the driver's assumptions, but
 there are a number of TSO limits which we record so that we will be
 able to check against them in the TX path when handling GSO skbs.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:54 -07:00
Edward Cree
4496363bec sfc_ef100: fail the probe if NIC uses unsol_ev credits
In the future, EF100 is planned to have a credit-based scheme for
 handling unsolicited events, which drivers will need to use in order
 to function correctly.  However, current EF100 hardware does not yet
 generate unsolicited events and the credit scheme has not yet been
 implemented in firmware.  To prevent compatibility problems later if
 the current driver is used with future firmware which does implement
 it, we check for the corresponding capability flag (which that
 future firmware will set), and if found, we refuse to probe.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:54 -07:00
Edward Cree
8e737145e8 sfc_ef100: check firmware version at start-of-day
Early in EF100 development there was a different format of event
 descriptor; if the NIC is somehow running the very old firmware
 which will use that format, fail the probe.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:22:54 -07:00
Jiafei Pan
215602a8d2 enetc: use napi_schedule to be compatible with PREEMPT_RT
The driver calls napi_schedule_irqoff() from a context where, in RT,
hardirqs are not disabled, since the IRQ handler is force-threaded.

In the call path of this function, __raise_softirq_irqoff() is modifying
its per-CPU mask of pending softirqs that must be processed, using
or_softirq_pending(). The or_softirq_pending() function is not atomic,
but since interrupts are supposed to be disabled, nobody should be
preempting it, and the operation should be safe.

Nonetheless, when running with hardirqs on, as in the PREEMPT_RT case,
it isn't safe, and the pending softirqs mask can get corrupted,
resulting in softirqs being lost and never processed.

To have common code that works with PREEMPT_RT and with mainline Linux,
we can use plain napi_schedule() instead. The difference is that
napi_schedule() (via __napi_schedule) also calls local_irq_save, which
disables hardirqs if they aren't already. But, since they already are
disabled in non-RT, this means that in practice we don't see any
measurable difference in throughput or latency with this patch.

Signed-off-by: Jiafei Pan <Jiafei.Pan@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:21:30 -07:00
Jiafei Pan
6c33ae1ad5 dpaa2-eth: use napi_schedule to be compatible with PREEMPT_RT
The driver calls napi_schedule_irqoff() from a context where, in RT,
hardirqs are not disabled, since the IRQ handler is force-threaded.

In the call path of this function, __raise_softirq_irqoff() is modifying
its per-CPU mask of pending softirqs that must be processed, using
or_softirq_pending(). The or_softirq_pending() function is not atomic,
but since interrupts are supposed to be disabled, nobody should be
preempting it, and the operation should be safe.

Nonetheless, when running with hardirqs on, as in the PREEMPT_RT case,
it isn't safe, and the pending softirqs mask can get corrupted,
resulting in softirqs being lost and never processed.

To have common code that works with PREEMPT_RT and with mainline Linux,
we can use plain napi_schedule() instead. The difference is that
napi_schedule() (via __napi_schedule) also calls local_irq_save, which
disables hardirqs if they aren't already. But, since they already are
disabled in non-RT, this means that in practice we don't see any
measurable difference in throughput or latency with this patch.

Signed-off-by: Jiafei Pan <Jiafei.Pan@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:21:30 -07:00
David S. Miller
d8f375ea46 Merge branch 'net-dsa-loop-Preparatory-changes-for-802-1Q-data-path'
net: dsa: loop: Preparatory changes for 802.1Q data path
Florian Fainelli says:

====================
These patches are all meant to help pave the way for a 802.1Q data path
added to the mockup driver, making it more useful than just testing for
configuration. Sending those out now since there is no real need to
wait.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:19:23 -07:00
Florian Fainelli
947b6ef9f7 net: dsa: loop: Set correct number of ports
We only support DSA_LOOP_NUM_PORTS in the switch, do not tell the DSA
core to allocate up to DSA_MAX_PORTS which is nearly the double (6 vs.
11).

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:19:23 -07:00
Florian Fainelli
c99194eded net: dsa: loop: Wire-up MTU callbacks
For now we simply store the port MTU into a per-port member.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:19:23 -07:00
Florian Fainelli
6c84a58997 net: dsa: loop: Move data structures to header
In preparation for adding support for a mockup data path, move the
driver data structures to include/linux/dsa/loop.h such that we can
share them between net/dsa/ and drivers/net/dsa/ later on.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:19:23 -07:00
Florian Fainelli
916a8d168e net: dsa: loop: Support 4K VLANs
Allocate a 4K array of VLANs instead of limiting ourselves to just 5
which is arbitrary.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:19:22 -07:00
Florian Fainelli
81d4e8e073 net: dsa: loop: PVID should be per-port
The PVID should be per-port, this is a preliminary change to support a
802.1Q data path in the driver.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:19:22 -07:00
Rahul Lakkireddy
59b328cf56 cxgb4: add TC-MATCHALL IPv6 support
Matching IPv6 traffic require allocating their own individual slots
in TCAM. So, fetch additional slots to insert IPv6 rules. Also, fetch
the cumulative stats of all the slots occupied by the Matchall rule.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:17:08 -07:00
Vladimir Oltean
af9fdd2bf8 net: dsa: sja1105: poll for extts events from a timer
The current poll interval is enough to ensure that rising and falling
edge events are not lost for a 1 PPS signal with 50% duty cycle.

But when we deliver the events to user space, it will try to infer if
they were corresponding to a rising or to a falling edge (the kernel
driver doesn't know that either). User space will try to make that
inference based on the time at which the PPS master had emitted the
pulse (i.e. if it's a .0 time, it's rising edge, if it's .5 time, it's
falling edge).

But there is no in-kernel API for retrieving the precise timestamp
corresponding to a PPS master (aka perout) pulse. So user space has to
guess even that. It will read the PTP time on the PPS master right after
we've delivered the extts event, and declare that the PPS master time
was just the closest integer second, based on 2 thresholds (lower than
.25, or higher than .75, and ignore anything else).

Except that, if we poll for extts events (and our hardware doesn't
really help us, by not providing an interrupt), then there is a risk
that the poll period (and therefore the time at which the event is
delivered) might confuse user space.

Because we are always scheduling the next extts poll at
SJA1105_EXTTS_INTERVAL "from now" (that's the only thing that the
schedule_delayed_work() API gives us), it means that the start time of
the next delayed workqueue will always be shifted to the right a little
bit (shifted with the SPI access duration of this workqueue run).
In turn, because user space sees extts events that are non-periodic
compared to the PPS master's time, this means that it might start making
wrong guesses about rising/falling edge.

To understand the effect, here is the output of ts2phc currently. Notice
the 'src' timestamps of the 'SKIP extts' events, and how they have a
large wander. They keep increasing until the upper limit for the ignore
threshold (.75 seconds), after which the application starts ignoring the
_other_ edge.

ts2phc[26.624]: /dev/ptp3 SKIP extts index 0 at 21.449898912 src 21.657784518
ts2phc[27.133]: adding tstamp 21.949894240 to clock /dev/ptp3
ts2phc[27.133]: adding tstamp 22.000000000 to clock /dev/ptp1
ts2phc[27.133]: /dev/ptp3 offset        640 s2 freq   +5112
ts2phc[27.636]: /dev/ptp3 SKIP extts index 0 at 22.449889360 src 22.669398022
ts2phc[28.140]: adding tstamp 22.949884376 to clock /dev/ptp3
ts2phc[28.140]: adding tstamp 23.000000000 to clock /dev/ptp1
ts2phc[28.140]: /dev/ptp3 offset         96 s2 freq   +4760
ts2phc[28.644]: /dev/ptp3 SKIP extts index 0 at 23.449879504 src 23.677420422
ts2phc[29.153]: adding tstamp 23.949874704 to clock /dev/ptp3
ts2phc[29.153]: adding tstamp 24.000000000 to clock /dev/ptp1
ts2phc[29.153]: /dev/ptp3 offset       -264 s2 freq   +4429
ts2phc[29.656]: /dev/ptp3 SKIP extts index 0 at 24.449870008 src 24.689407238
ts2phc[30.160]: adding tstamp 24.949865376 to clock /dev/ptp3
ts2phc[30.160]: adding tstamp 25.000000000 to clock /dev/ptp1
ts2phc[30.160]: /dev/ptp3 offset       -280 s2 freq   +4334
ts2phc[30.664]: /dev/ptp3 SKIP extts index 0 at 25.449860760 src 25.697449926
ts2phc[31.168]: adding tstamp 25.949856176 to clock /dev/ptp3
ts2phc[31.168]: adding tstamp 26.000000000 to clock /dev/ptp1
ts2phc[31.168]: /dev/ptp3 offset       -176 s2 freq   +4354
ts2phc[31.672]: /dev/ptp3 SKIP extts index 0 at 26.449851584 src 26.705433606
ts2phc[32.180]: adding tstamp 26.949846992 to clock /dev/ptp3
ts2phc[32.180]: adding tstamp 27.000000000 to clock /dev/ptp1
ts2phc[32.180]: /dev/ptp3 offset        -80 s2 freq   +4397
ts2phc[32.684]: /dev/ptp3 SKIP extts index 0 at 27.449842384 src 27.717415110
ts2phc[33.192]: adding tstamp 27.949837768 to clock /dev/ptp3
ts2phc[33.192]: adding tstamp 28.000000000 to clock /dev/ptp1
ts2phc[33.192]: /dev/ptp3 offset          0 s2 freq   +4453
ts2phc[33.696]: /dev/ptp3 SKIP extts index 0 at 28.449833128 src 28.729412902
ts2phc[34.200]: adding tstamp 28.949828472 to clock /dev/ptp3
ts2phc[34.200]: adding tstamp 29.000000000 to clock /dev/ptp1
ts2phc[34.200]: /dev/ptp3 offset          8 s2 freq   +4461
ts2phc[34.704]: /dev/ptp3 SKIP extts index 0 at 29.449823816 src 29.737416038
ts2phc[35.208]: adding tstamp 29.949819152 to clock /dev/ptp3
ts2phc[35.208]: adding tstamp 30.000000000 to clock /dev/ptp1
ts2phc[35.208]: /dev/ptp3 offset         -8 s2 freq   +4447
ts2phc[35.712]: /dev/ptp3 SKIP extts index 0 at 30.449814496 src 30.745554982
ts2phc[36.216]: adding tstamp 30.949809840 to clock /dev/ptp3
ts2phc[36.216]: adding tstamp 31.000000000 to clock /dev/ptp1
ts2phc[36.216]: /dev/ptp3 offset         -8 s2 freq   +4445
ts2phc[36.468]: /dev/ptp3 SKIP extts index 0 at 31.449805184 src 31.501109446
ts2phc[36.972]: adding tstamp 31.949800536 to clock /dev/ptp3
ts2phc[36.972]: adding tstamp 32.000000000 to clock /dev/ptp1
ts2phc[36.972]: /dev/ptp3 offset         -8 s2 freq   +4442
ts2phc[37.480]: /dev/ptp3 SKIP extts index 0 at 32.449795896 src 32.513320070
ts2phc[37.984]: adding tstamp 32.949791248 to clock /dev/ptp3
ts2phc[37.984]: adding tstamp 33.000000000 to clock /dev/ptp1
ts2phc[37.984]: /dev/ptp3 offset          0 s2 freq   +4448

Fix that by taking the following measures:
- Schedule the poll from a timer. Because we are really scheduling the
  timer periodically, the extts events delivered to user space are
  periodic too, and don't suffer from the "shift-to-the-right" effect.
- Increase the poll period to 6 times a second. This imposes a smaller
  upper bound to the shift that can occur to the delivery time of extts
  events, and makes user space (ts2phc) to always interpret correctly
  which events should be skipped and which shouldn't.
- Move the SPI readout itself to the main PTP kernel thread, instead of
  the generic workqueue. This is because the timer runs in atomic
  context, but is also better than before, because if needed, we can
  chrt & taskset this kernel thread, to ensure it gets enough priority
  under load.

After this patch, one can notice that the wander is greatly reduced, and
that the latencies of one extts poll are not propagated to the next. The
'src' timestamp that is skipped is never larger than .65 seconds (which
means .15 seconds larger than the time at which the real event occurred
at, and .10 seconds smaller than the .75 upper threshold for ignoring
the falling edge):

ts2phc[40.076]: adding tstamp 34.949261296 to clock /dev/ptp3
ts2phc[40.076]: adding tstamp 35.000000000 to clock /dev/ptp1
ts2phc[40.076]: /dev/ptp3 offset         48 s2 freq   +4631
ts2phc[40.568]: /dev/ptp3 SKIP extts index 0 at 35.449256496 src 35.595791078
ts2phc[41.064]: adding tstamp 35.949251744 to clock /dev/ptp3
ts2phc[41.064]: adding tstamp 36.000000000 to clock /dev/ptp1
ts2phc[41.064]: /dev/ptp3 offset       -224 s2 freq   +4374
ts2phc[41.552]: /dev/ptp3 SKIP extts index 0 at 36.449247088 src 36.579825574
ts2phc[42.044]: adding tstamp 36.949242456 to clock /dev/ptp3
ts2phc[42.044]: adding tstamp 37.000000000 to clock /dev/ptp1
ts2phc[42.044]: /dev/ptp3 offset       -240 s2 freq   +4290
ts2phc[42.536]: /dev/ptp3 SKIP extts index 0 at 37.449237848 src 37.563828774
ts2phc[43.028]: adding tstamp 37.949233264 to clock /dev/ptp3
ts2phc[43.028]: adding tstamp 38.000000000 to clock /dev/ptp1
ts2phc[43.028]: /dev/ptp3 offset       -144 s2 freq   +4314
ts2phc[43.520]: /dev/ptp3 SKIP extts index 0 at 38.449228656 src 38.547823238
ts2phc[44.012]: adding tstamp 38.949224048 to clock /dev/ptp3
ts2phc[44.012]: adding tstamp 39.000000000 to clock /dev/ptp1
ts2phc[44.012]: /dev/ptp3 offset        -80 s2 freq   +4335
ts2phc[44.508]: /dev/ptp3 SKIP extts index 0 at 39.449219432 src 39.535846118
ts2phc[44.996]: adding tstamp 39.949214816 to clock /dev/ptp3
ts2phc[44.996]: adding tstamp 40.000000000 to clock /dev/ptp1
ts2phc[44.996]: /dev/ptp3 offset        -32 s2 freq   +4359
ts2phc[45.488]: /dev/ptp3 SKIP extts index 0 at 40.449210192 src 40.515824678
ts2phc[45.980]: adding tstamp 40.949205568 to clock /dev/ptp3
ts2phc[45.980]: adding tstamp 41.000000000 to clock /dev/ptp1
ts2phc[45.980]: /dev/ptp3 offset          8 s2 freq   +4390
ts2phc[46.636]: /dev/ptp3 SKIP extts index 0 at 41.449200928 src 41.664176902
ts2phc[47.132]: adding tstamp 41.949196288 to clock /dev/ptp3
ts2phc[47.132]: adding tstamp 42.000000000 to clock /dev/ptp1
ts2phc[47.132]: /dev/ptp3 offset          0 s2 freq   +4384
ts2phc[47.620]: /dev/ptp3 SKIP extts index 0 at 42.449191656 src 42.648117190
ts2phc[48.112]: adding tstamp 42.949187016 to clock /dev/ptp3
ts2phc[48.112]: adding tstamp 43.000000000 to clock /dev/ptp1
ts2phc[48.112]: /dev/ptp3 offset          0 s2 freq   +4384
ts2phc[48.604]: /dev/ptp3 SKIP extts index 0 at 43.449182384 src 43.632112582
ts2phc[49.100]: adding tstamp 43.949177736 to clock /dev/ptp3
ts2phc[49.100]: adding tstamp 44.000000000 to clock /dev/ptp1
ts2phc[49.100]: /dev/ptp3 offset         -8 s2 freq   +4376
ts2phc[49.588]: /dev/ptp3 SKIP extts index 0 at 44.449173096 src 44.616136774
ts2phc[50.080]: adding tstamp 44.949168464 to clock /dev/ptp3
ts2phc[50.080]: adding tstamp 45.000000000 to clock /dev/ptp1
ts2phc[50.080]: /dev/ptp3 offset          8 s2 freq   +4390
ts2phc[50.572]: /dev/ptp3 SKIP extts index 0 at 45.449163816 src 45.600134662
ts2phc[51.064]: adding tstamp 45.949159160 to clock /dev/ptp3
ts2phc[51.064]: adding tstamp 46.000000000 to clock /dev/ptp1
ts2phc[51.064]: /dev/ptp3 offset         -8 s2 freq   +4376
ts2phc[51.556]: /dev/ptp3 SKIP extts index 0 at 46.449154528 src 46.584588550
ts2phc[52.048]: adding tstamp 46.949149896 to clock /dev/ptp3
ts2phc[52.048]: adding tstamp 47.000000000 to clock /dev/ptp1
ts2phc[52.048]: /dev/ptp3 offset          0 s2 freq   +4382
ts2phc[52.540]: /dev/ptp3 SKIP extts index 0 at 47.449145256 src 47.568132198
ts2phc[53.032]: adding tstamp 47.949140616 to clock /dev/ptp3
ts2phc[53.032]: adding tstamp 48.000000000 to clock /dev/ptp1
ts2phc[53.032]: /dev/ptp3 offset          0 s2 freq   +4382
ts2phc[53.524]: /dev/ptp3 SKIP extts index 0 at 48.449135968 src 48.552121446
ts2phc[54.016]: adding tstamp 48.949131320 to clock /dev/ptp3
ts2phc[54.016]: adding tstamp 49.000000000 to clock /dev/ptp1
ts2phc[54.016]: /dev/ptp3 offset          0 s2 freq   +4382
ts2phc[54.512]: /dev/ptp3 SKIP extts index 0 at 49.449126680 src 49.540147014
ts2phc[55.000]: adding tstamp 49.949122040 to clock /dev/ptp3
ts2phc[55.000]: adding tstamp 50.000000000 to clock /dev/ptp1
ts2phc[55.000]: /dev/ptp3 offset          0 s2 freq   +4382
ts2phc[55.492]: /dev/ptp3 SKIP extts index 0 at 50.449117400 src 50.520119078
ts2phc[55.988]: adding tstamp 50.949112768 to clock /dev/ptp3
ts2phc[55.988]: adding tstamp 51.000000000 to clock /dev/ptp1
ts2phc[55.988]: /dev/ptp3 offset          8 s2 freq   +4390
ts2phc[56.476]: /dev/ptp3 SKIP extts index 0 at 51.449108120 src 51.504175910
ts2phc[57.132]: adding tstamp 51.949103480 to clock /dev/ptp3
ts2phc[57.132]: adding tstamp 52.000000000 to clock /dev/ptp1
ts2phc[57.132]: /dev/ptp3 offset          0 s2 freq   +4384
ts2phc[57.624]: /dev/ptp3 SKIP extts index 0 at 52.449098840 src 52.651833574
ts2phc[58.116]: adding tstamp 52.949094200 to clock /dev/ptp3
ts2phc[58.116]: adding tstamp 53.000000000 to clock /dev/ptp1
ts2phc[58.116]: /dev/ptp3 offset          8 s2 freq   +4392
ts2phc[58.612]: /dev/ptp3 SKIP extts index 0 at 53.449089560 src 53.639826918
ts2phc[59.100]: adding tstamp 53.949084920 to clock /dev/ptp3
ts2phc[59.100]: adding tstamp 54.000000000 to clock /dev/ptp1
ts2phc[59.100]: /dev/ptp3 offset          8 s2 freq   +4394
ts2phc[59.592]: /dev/ptp3 SKIP extts index 0 at 54.449080272 src 54.619842278
ts2phc[60.084]: adding tstamp 54.949075624 to clock /dev/ptp3
ts2phc[60.084]: adding tstamp 55.000000000 to clock /dev/ptp1
ts2phc[60.084]: /dev/ptp3 offset          8 s2 freq   +4397
ts2phc[60.576]: /dev/ptp3 SKIP extts index 0 at 55.449070968 src 55.603885542
ts2phc[61.068]: adding tstamp 55.949066312 to clock /dev/ptp3
ts2phc[61.068]: adding tstamp 56.000000000 to clock /dev/ptp1
ts2phc[61.068]: /dev/ptp3 offset          0 s2 freq   +4391
ts2phc[61.560]: /dev/ptp3 SKIP extts index 0 at 56.449061680 src 56.587885798
ts2phc[62.052]: adding tstamp 56.949057032 to clock /dev/ptp3
ts2phc[62.052]: adding tstamp 57.000000000 to clock /dev/ptp1
ts2phc[62.052]: /dev/ptp3 offset         -8 s2 freq   +4383

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:16:02 -07:00
Paolo Abeni
8555c6bfd5 mptcp: fix bogus sendmsg() return code under pressure
In case of memory pressure, mptcp_sendmsg() may call
sk_stream_wait_memory() after succesfully xmitting some
bytes. If the latter fails we currently return to the
user-space the error code, ignoring the succeful xmit.

Address the issue always checking for the xmitted bytes
before mptcp_sendmsg() completes.

Fixes: f296234c98 ("mptcp: Add handling of incoming MP_JOIN requests")
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:14:20 -07:00
David S. Miller
f8deaea06f Merge branch 'mlxsw-Add-support-for-buffer-drop-traps'
Ido Schimmel says:

====================
mlxsw: Add support for buffer drop traps

Petr says:

A recent patch set added the ability to mirror buffer related drops
(e.g., early drops) through a netdev. This patch set adds the ability to
trap such packets to the local CPU for analysis.

The trapping towards the CPU is configured by using tc-trap action
instead of tc-mirred as was done when the packets were mirrored through
a netdev. A future patch set will also add the ability to sample the
dropped packets using tc-sample action.

The buffer related drop traps are added to devlink, which means that the
dropped packets can be reported to user space via the kernel's
drop_monitor module.

Patch set overview:

Patch #1 adds the early_drop trap to devlink

Patch #2 adds extack to a few devlink operations to facilitate better
error reporting to user space. This is necessary - among other things -
because the action of buffer drop traps cannot be changed in mlxsw

Patch #3 performs a small refactoring in mlxsw, patch #4 fixes a bug that
this patchset would trigger.

Patches #5-#6 add the infrastructure required to support different traps
/ trap groups in mlxsw per-ASIC. This is required because buffer drop
traps are not supported by Spectrum-1

Patch #7 extends mlxsw to register the early_drop trap

Patch #8 adds the offload logic for the "trap" action at a qevent block.

Patch #9 adds a mlxsw-specific selftest.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:06:47 -07:00
Petr Machata
8fb6ac457d selftests: mlxsw: RED: Test offload of trapping on RED qevents
Add a selftest for RED early_drop and mark qevents when a trap action is
attached at the associated block.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:06:46 -07:00
Petr Machata
54a9238589 mlxsw: spectrum_qdisc: Offload action trap for qevents
When offloading action trap on a qevent, pass to_dev of NULL to the SPAN
module to trigger the mirror to the CPU port. Query the buffer drops
policer and use it for policing of the trapped traffic.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:06:46 -07:00
Ido Schimmel
6687e953f4 mlxsw: spectrum_trap: Add early_drop trap
As previously explained, packets that are dropped due to buffer related
reasons (e.g., tail drop, early drop) can be mirrored to the CPU port.
These packets are then trapped with one of the "mirror session" traps
and their CQE includes the reason for which the packet was mirrored.

Register with devlink a new trap, early_drop, and initialize the
corresponding Rx listener with the appropriate mirror reason. Return an
error in case user tries to change the traps' action, as this is not
supported.

Since Spectrum-1 does not support these traps, the above is only done
for Spectrum-2 onwards.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:06:46 -07:00
Ido Schimmel
869c7be940 mlxsw: spectrum_trap: Allow for per-ASIC traps initialization
Subsequent patches will need to register different traps for Spectrum-1
and Spectrum-2 onwards.

Enable that by invoking a per-ASIC operation during traps
initialization.

Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:06:46 -07:00
Ido Schimmel
36d1fd687d mlxsw: spectrum_trap: Allow for per-ASIC trap groups initialization
Subsequent patches will need to register different trap groups for
Spectrum-1 and Spectrum-2 onwards.

Enable that by invoking a per-ASIC operation during trap groups
initialization.

Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:06:46 -07:00
Petr Machata
928345c08b mlxsw: spectrum_span: On policer_id_base_ref_count, use dec_and_test
When unsetting policer base, the SPAN code currently uses refcount_dec().
However that function splats when the counter reaches zero, because
reaching zero without actually testing is in general indicative of a
missing cleanup. There is no cleanup to be done here, but nonetheless, use
refcount_dec_and_test() as required.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:06:46 -07:00
Ido Schimmel
76ba292cc7 mlxsw: spectrum_trap: Use 'size_t' for array sizes
Use 'size_t' instead of 'u64' for array sizes, as this this is correct
type to use for expressions involving sizeof().

Suggested-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:06:46 -07:00
Ido Schimmel
c88e11e047 devlink: Pass extack when setting trap's action and group's parameters
A later patch will refuse to set the action of certain traps in mlxsw
and also to change the policer binding of certain groups. Pass extack so
that failure could be communicated clearly to user space.

Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:06:46 -07:00
Amit Cohen
08e335f6ad devlink: Add early_drop trap
Add the packet trap that can report packets that were ECN marked due to RED
AQM.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:06:46 -07:00
Zhihao Cheng
9feffe1466 f2fs: update_sit_entry: Make the judgment condition of f2fs_bug_on more intuitive
Current judgment condition of f2fs_bug_on in function update_sit_entry():
  new_vblocks >> (sizeof(unsigned short) << 3) ||
	new_vblocks > sbi->blocks_per_seg

which equivalents to:
  new_vblocks < 0 || new_vblocks > sbi->blocks_per_seg

The latter is more intuitive.

Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reported-by: Jack Qiu <jack.qiu@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-08-03 18:05:13 -07:00
Yufen Yu
58f7e00ffb f2fs: replace test_and_set/clear_bit() with set/clear_bit()
Since set/clear_inode_flag() don't need to return value to show
if flag is set, we can just call set/clear_bit() here.

Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-08-03 18:05:10 -07:00
Daeho Jeong
567c4bf54a f2fs: make file immutable even if releasing zero compression block
When we use F2FS_IOC_RELEASE_COMPRESS_BLOCKS ioctl, if we can't find
any compressed blocks in the file even with large file size, the
ioctl just ends up without changing the file's status as immutable.
It makes the user, who expects that the file is immutable when it
returns successfully, confused.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-08-03 18:05:03 -07:00
Linus Torvalds
09a0bd0776 platform-drivers-x86 for v5.9-1
* ASUS WMI driver honors BAT1 name of the battery
   (quite a few new laptops are using it)
 * Dell WMI driver supports new key codes and backlight events
 * ThinkPad ACPI driver now may use standard charge threshold interface,
   it also has been updated to provide Laptop or Desktop mode to the user
 * Intel Speed Select Technology gained support on Sapphire Rapids platform
 * Regular update of Speed Select Technology tools
 * Mellanox has been updated to support complex attributes
 * PMC core driver has been fixed to show correct names for LPM0 register
 * HTTP links were replaced by HTTPS ones where it applies
 * Miscellaneous fixes and cleanups here and there
 
 The following is an automated git shortlog grouped by driver:
 
 acerhdf:
  -  Replace HTTP links with HTTPS ones
 
 Add new intel_atomisp2_led driver:
  - Add new intel_atomisp2_led driver
 
 apple-gmux:
  -  Replace HTTP links with HTTPS ones
 
 asus-nb-wmi:
  -  Drop duplicate DMI quirk structures
  -  add support for ASUS ROG Zephyrus G14 and G15
 
 asus-wmi:
  -  allow BAT1 battery name
 
 dell-wmi:
  -  add new dmi mapping for keycode 0xffff
  -  add new keymap type 0x0012
  -  add new backlight events
 
 intel_cht_int33fe:
  -  Drop double check for ACPI companion device
 
 intel-hid:
  -  Fix return value check in check_acpi_dev()
 
 intel_pmc_core:
  -  fix bound check in pmc_core_mphy_pg_show()
  -  update TGL's LPM0 reg bit map name
 
 intel-vbtn:
  -  Fix return value check in check_acpi_dev()
 
 ISST:
  -  drop a duplicated word in isst_if.h
  -  Add new PCI device ids
 
 pcengines-apuv2:
  -  revert wiring up simswitch GPIO as LED
 
 platform/mellanox:
  -  Introduce string_upper() and string_lower() helpers
  -  Add string_upper() and string_lower() tests
  -  Extend FAN platform data description
  -  Add more definitions for system attributes
  -  Add new attribute for mlxreg-io sysfs interfaces
  -  Add presence register field for FAN devices
  -  Add support for complex attributes
  -  mlxreg-io: Add support for complex attributes
  -  mlxreg-hotplug: Add environmental data to uevent
  -  mlxreg-hotplug: Use capability register for attribute creation
  -  mlxreg-hotplug: Modify module license
 
 system76-acpi:
  -  Fix brightness_set schedule while atomic
 
 thinkpad_acpi:
  -  Make some symbols static
  -  add documentation for battery charge control
  -  use standard charge control attribute names
  -  remove unused defines
  -  Replace HTTP links with HTTPS ones
  -  not loading brightness_init when _BCL invalid
  -  lap or desk mode interface
  -  Revert "Use strndup_user() in dispatch_proc_write()"
 
 tools/power/x86/intel-speed-select:
  -  Update version for v5.9
  -  Add retries for mail box commands
  -  Add option to delay mbox commands
  -  Ignore -o option processing on error
  -  Change path for caching topology info
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEqaflIX74DDDzMJJtb7wzTHR8rCgFAl8n9UIACgkQb7wzTHR8
 rCg0cRAAnHC699qMns53T+jSMAc9kY5vrwbf2Bacwjp5vYS7uw3S12kLNgQh6JWM
 B8imS1RAMlHfCvGT9vUs+mbpemCfYpFZ0HQ4z+VMzziKayJu8pCD9O2XDS9lRUHj
 uKqA19FX6lXfJnrhRoi3vM0JC35FXZQZkueLyJ728t1xqcDkNS2/O/by3c76XnK1
 r81jl9m+qMJwB5zt+6qeSWjZAgFCTDvOHA/gZTEKF1VBOcafN1RHWTLH49JtpBH7
 +pPoy9mw3zfDJReqKxFhaXgDlueBlRwTBG5kEQk9aIxgA9T8q2jqEyAw5XtAKOWp
 RCebgCpC5NHXyz/2CVzaJWYikS/Iuvr4ynx6F2nKXVClGMtF7Obb2/3rvSCOTMri
 ge1bJ2NomUL9F5Cd/iR/F4/jNtxk9LSuzW+otA+/m+g8UkK574cefQQmBdrhrfDU
 GqK6C1kB7OQW2v3tPKVsWvduXQTaaCVzINEkdUxK50PXZayY7+w0djM+QFR/sadw
 2/okr1zuNjFnEt7HtZRP5bzEm1ofXVo5JIaCnpwgsz6dCQH4NiuVxsc/GyanRpBJ
 AadUAYcD+PRRnOybsfhx5MlUt/JIr6cV0OvT8LlWPINJjTpWPlqMeBYdDUOLCndX
 3LXmoZ0k8IQ2LLmONc5UBX1+iayCO+PkHcBDoJdNuWT1eoO06M8=
 =yvnW
 -----END PGP SIGNATURE-----

Merge tag 'platform-drivers-x86-v5.9-1' of git://git.infradead.org/linux-platform-drivers-x86

Pull x86 platform driver updates from Andy Shevchenko:

 - ASUS WMI driver honors BAT1 name of the battery (quite a few new
   laptops are using it)

 - Dell WMI driver supports new key codes and backlight events

 - ThinkPad ACPI driver now may use standard charge threshold interface,
   it also has been updated to provide Laptop or Desktop mode to the
   user

 - Intel Speed Select Technology gained support on Sapphire Rapids
   platform

 - Regular update of Speed Select Technology tools

 - Mellanox has been updated to support complex attributes

 - PMC core driver has been fixed to show correct names for LPM0
   register

 - HTTP links were replaced by HTTPS ones where it applies

 - Miscellaneous fixes and cleanups here and there

* tag 'platform-drivers-x86-v5.9-1' of git://git.infradead.org/linux-platform-drivers-x86: (42 commits)
  platform/x86: asus-nb-wmi: Drop duplicate DMI quirk structures
  platform/x86: thinkpad_acpi: Make some symbols static
  platform/x86: thinkpad_acpi: add documentation for battery charge control
  platform/x86: thinkpad_acpi: use standard charge control attribute names
  platform/x86: thinkpad_acpi: remove unused defines
  platform/x86: ISST: drop a duplicated word in isst_if.h
  tools/power/x86/intel-speed-select: Update version for v5.9
  tools/power/x86/intel-speed-select: Add retries for mail box commands
  tools/power/x86/intel-speed-select: Add option to delay mbox commands
  tools/power/x86/intel-speed-select: Ignore -o option processing on error
  tools/power/x86/intel-speed-select: Change path for caching topology info
  platform/x86: acerhdf: Replace HTTP links with HTTPS ones
  platform/x86: apple-gmux: Replace HTTP links with HTTPS ones
  platform/x86: pcengines-apuv2: revert wiring up simswitch GPIO as LED
  platform/x86: mlx-platform: Extend FAN platform data description
  platform_data/mlxreg: Add presence register field for FAN devices
  Documentation/ABI: Add new attribute for mlxreg-io sysfs interfaces
  platform/mellanox: mlxreg-io: Add support for complex attributes
  platform/x86: mlx-platform: Add more definitions for system attributes
  platform_data/mlxreg: Add support for complex attributes
  ...
2020-08-03 18:02:53 -07:00
YueHaibing
80fbbb1672 fib: Fix undef compile warning
net/core/fib_rules.c:26:7: warning: "CONFIG_IP_MULTIPLE_TABLES" is not defined, evaluates to 0 [-Wundef]
 #elif CONFIG_IP_MULTIPLE_TABLES
       ^~~~~~~~~~~~~~~~~~~~~~~~~

Fixes: 8b66a6fd34 ("fib: fix another fib_rules_ops indirect call wrapper problem")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-By: Brian Vazquez <brianvv@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:01:49 -07:00
Geliang Tang
190f8b060e mptcp: use mptcp_for_each_subflow in mptcp_stream_accept
Use mptcp_for_each_subflow in mptcp_stream_accept instead of
open-coding.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:01:14 -07:00
David S. Miller
ee494f42a3 A few more changes, notably:
* handle new SAE (WPA3 authentication) status codes in the correct way
  * fix a while that should be an if instead, avoiding infinite loops
  * handle beacon filtering changing better
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAl8n/fgACgkQB8qZga/f
 l8TiOA/+PwGoXiP7WPulI+V8Bzv/Tmg6rqKEUFX1hxaGQOAlr9hqV2jvdxkOZmIM
 sQc19vT4TUMiwrdYHOKcKlY5jhi5r8aIzxSA98mGqqKsvERhJWuzoaPR5aHuqOUD
 VxUidOieV+H4Ol1YxV1Tx4eCmLR8ZPjg9EebMGNPOrASg7JXCjVXb5Pp8XXmsuDn
 uXdPc9QQ+DMLqCEF+1C+jhH2lDUfyk78AgPuXw7E3CP/copiKljKL7AApoFebpkM
 Cu0g6hYLBDAnkuZgL7PBw/MQThN5oMLDeX8RNK0uXDy/kdnXd7RL0IMcQpmwTn81
 ZeewZjYq/Ha64sscR87kbzzFM15Vo/UhyhDfqQki/2zrQ26GoQdtgbj/OB3CMMd/
 8kBtZPoItlIkpO18kN9gBO/nB0cGghtsLn8u8pKdQ5RiIv1Q4JqerN4rHCyMgLrk
 FeOH25feAtuJ5MlOkIuHDYTfxNZnw1DE6DQv0hZuui1dbMPHUZuMFrCgZscuVBfj
 HOzhcyvs2bdCU4ZKLGiiq1zTxPply9rESmk4tvlDo5strqCKUEez8VgtZIPw1LZp
 pD43UHAfRiBEAF1pO/l1E6LyTvHMHYFKMiqzwCEXZXf2L276hlQ1rYoHeWEP0hhO
 N03gwHiMs1xug6MeYhsdeyR0tqXb1N7XwnkpSwgtVOH2z+pCBx4=
 =9iOZ
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-davem-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
A few more changes, notably:
 * handle new SAE (WPA3 authentication) status codes in the correct way
 * fix a while that should be an if instead, avoiding infinite loops
 * handle beacon filtering changing better
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:00:22 -07:00
Jisheng Zhang
01f4d47a5b net: stmmac: fix failed to suspend if phy based WOL is enabled
With the latest net-next tree, if test suspend/resume after enabling
WOL, we get error as below:

[  487.086365] dpm_run_callback(): mdio_bus_suspend+0x0/0x30 returns -16
[  487.086375] PM: Device stmmac-0:00 failed to suspend: error -16

-16 means -EBUSY, this is because I didn't enable wakeup of the correct
device when implementing phy based WOL feature. To be honest, I caught
the issue when implementing phy based WOL and then fix it locally, but
forgot to amend the phy based wol patch. Today, I found the issue by
testing net-next tree.

Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 17:59:39 -07:00
Ioana-Ruxandra Stăncioi
88fab21c69 seg6_iptunnel: Refactor seg6_lwt_headroom out of uapi header
Refactor the function seg6_lwt_headroom out of the seg6_iptunnel.h uapi
header, because it is only used in seg6_iptunnel.c. Moreover, it is only
used in the kernel code, as indicated by the "#ifdef __KERNEL__".

Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Ioana-Ruxandra Stăncioi <stancioi@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 17:57:40 -07:00
Jane Chu
7f674025d9 libnvdimm/security: ensure sysfs poll thread woke up and fetch updated attr
commit 7d988097c5 ("acpi/nfit, libnvdimm/security: Add security DSM overwrite support")
adds a sysfs_notify_dirent() to wake up userspace poll thread when the "overwrite"
operation has completed. But the notification is issued before the internal
dimm security state and flags have been updated, so the userspace poll thread
wakes up and fetches the not-yet-updated attr and falls back to sleep, forever.
But if user from another terminal issue "ndctl wait-overwrite nmemX" again,
the command returns instantly.

Link: https://lore.kernel.org/r/1596494499-9852-3-git-send-email-jane.chu@oracle.com
Fixes: 7d988097c5 ("acpi/nfit, libnvdimm/security: Add security DSM overwrite support")
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jane Chu <jane.chu@oracle.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2020-08-03 18:54:13 -06:00
Jane Chu
7c02d53dfe libnvdimm/security: the 'security' attr never show 'overwrite' state
'security' attribute displays the security state of an nvdimm.
During normal operation, the nvdimm state maybe one of 'disabled',
'unlocked' or 'locked'.  When an admin issues
  # ndctl sanitize-dimm nmem0 --overwrite
the attribute is expected to change to 'overwrite' until the overwrite
operation completes.

But tests on our systems show that 'overwrite' is never shown during
the overwrite operation. i.e.
  # cat /sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/nmem0/security
  unlocked
the attribute remain 'unlocked' through out the operation, consequently
"ndctl wait-overwrite nmem0" command doesn't wait at all.

The driver tracks the state in 'nvdimm->sec.flags': when the operation
starts, it adds an overwrite bit to the flags; and when the operation
completes, it removes the bit. Hence security_show() should check the
'overwrite' bit first, in order to indicate the actual state when multiple
bits are set in the flags.

Link: https://lore.kernel.org/r/1596494499-9852-2-git-send-email-jane.chu@oracle.com
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jane Chu <jane.chu@oracle.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2020-08-03 18:54:13 -06:00
Jane Chu
dad42d1755 libnvdimm/security: fix a typo
commit d78c620a2e ("libnvdimm/security: Introduce a 'frozen' attribute")
introduced a typo, causing a 'nvdimm->sec.flags' update being overwritten
by the subsequent update meant for 'nvdimm->sec.ext_flags'.

Link: https://lore.kernel.org/r/1596494499-9852-1-git-send-email-jane.chu@oracle.com
Fixes: d78c620a2e ("libnvdimm/security: Introduce a 'frozen' attribute")
Cc: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jane Chu <jane.chu@oracle.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2020-08-03 18:54:13 -06:00
Jianfeng Wang
730e700e2c tcp: apply a floor of 1 for RTT samples from TCP timestamps
For retransmitted packets, TCP needs to resort to using TCP timestamps
for computing RTT samples. In the common case where the data and ACK
fall in the same 1-millisecond interval, TCP senders with millisecond-
granularity TCP timestamps compute a ca_rtt_us of 0. This ca_rtt_us
of 0 propagates to rs->rtt_us.

This value of 0 can cause performance problems for congestion control
modules. For example, in BBR, the zero min_rtt sample can bring the
min_rtt and BDP estimate down to 0, reduce snd_cwnd and result in a
low throughput. It would be hard to mitigate this with filtering in
the congestion control module, because the proper floor to apply would
depend on the method of RTT sampling (using timestamp options or
internally-saved transmission timestamps).

This fix applies a floor of 1 for the RTT sample delta from TCP
timestamps, so that seq_rtt_us, ca_rtt_us, and rs->rtt_us will be at
least 1 * (USEC_PER_SEC / TCP_TS_HZ).

Note that the receiver RTT computation in tcp_rcv_rtt_measure() and
min_rtt computation in tcp_update_rtt_min() both already apply a floor
of 1 timestamp tick, so this commit makes the code more consistent in
avoiding this edge case of a value of 0.

Signed-off-by: Jianfeng Wang <jfwang@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Kevin Yang <yyd@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 17:54:03 -07:00
Linus Torvalds
e53bc3ff99 Boris is on vacation and he asked us to send you the pending RAS bits:
- Print the PPIN field on CPUs that fill them out
  - Fix an MCE injection bug
  - Simplify a kzalloc in dev_mcelog_init_device()
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oZ+ERHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iIVQ//XLP3qky/X/DOcOhdqw7JEDUpElK+sLdb
 nFLcSP0vgdoip0jjz0voEf2tb0nRbZRqjnCIuHj5Dq8fTWbM/qcD0zM35N5lk5bJ
 N+y9cE+FQg2oayU/0O3IezoT0lAW3YzdE2HdzhZ2ZEiDKx3fAj9QoJbKlc4TxshC
 Tg93IpMq5dmnj6Ge3BfNE56O4WCLIQxs2MCIWHWjBEU8qb4Nry7vEL9hijmFQNRd
 85i9pDrOkdnLDKhTagHUR4HbhkAacWeJKe0UoraJFr8/1nCm8Hcii5hI1t5N9KXg
 /+asjkhLX75uYy2aMmwF7qelSLNbOieMNji7rt42vevBsqrAtX0zuSvbEBb8bNdx
 f4WS5GcmUTZVHTuppPzAX3YNmYdrGqmecerAY+1j+uHtWFGs07ZCnKx+kQbiK0m4
 LqaQmi16KsV/Jj9BYZu8wLd/uIAsRqnY5nJlbe7of0/kFPzlJkRkT8Z3DPhSp/ee
 qHc3eR1EFMQvNr8F1e17TPCTcq5YjLM9BXFX4JJppaX30YOb6ZH0z5jYMbouJeKD
 8qtQ1BtawWKqkoi/0PtbH6khRFXH6oUNWZ+2aAKQqWwfJDdP8Q7fFyIvgN487s79
 u7fN4h8qj7w1AeIRJQH5v1G/Es14gdBcotgkM2AflQLF+c7QormnHdCJZad4gTAj
 /X8Ks5DE51w=
 =z0MD
 -----END PGP SIGNATURE-----

Merge tag 'ras-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 RAS updates from Ingo Molnar:
 "Boris is on vacation and he asked us to send you the pending RAS bits:

   - Print the PPIN field on CPUs that fill them out

   - Fix an MCE injection bug

   - Simplify a kzalloc in dev_mcelog_init_device()"

* tag 'ras-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce, EDAC/mce_amd: Print PPIN in machine check records
  x86/mce/dev-mcelog: Use struct_size() helper in kzalloc()
  x86/mce/inject: Fix a wrong assignment of i_mce.status
2020-08-03 17:42:23 -07:00
Linus Torvalds
a92ad11fb2 A single commit which sets the X86_FEATURE_TSC_KNOWN_FREQ flag for Xen guests,
to avoid recalibration.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oYlARHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hA+Q//YeTzZc8UnXyLzIB0AnP/jAvgbn2KMmSD
 PmsBBTBD/q24hkMoj/hiyn0lwOHzJ6fnb4T08WYfpDbhbWt8bJVrfu6QWpK4duLP
 dr58iiqyZ78o/G4Nuzr755JU2/Yy18682t9if4038vuWu/osmlRbNExYj/dMTLFW
 5Q63dy9y8vB+/rWI2/MoxXjym+qy/S4Px2X79XMrazOsPBkbxUHwt99SsEHsjZRf
 +94+z04z/sXV5HznbdXHeKJsvvAaz+kelIOJ9kDskt4VTwgcPdYSlX8Wpun6O7uR
 YuJTCpk2Hsx3ERSKNCFIJ1+7uZsA8E2blcK35iiScaTCa7qCvpA9KC1hDpVqtH1B
 jEuJIG2VpBVx5oxYpZV2kiwEQzcsnBuIeiMS2Fd+SCLcAkPBxARgOlHu0yLSTtkl
 92cbRXp+w6N1SnIEMFwgkZAy3AjWTAXROXXyKiMMjLkaTFRLMgzcYWYWFVng3GrK
 4AUIUESrHeGHWvOa1vwTXmUCxSvazPH/WxyuG4pwkeZpH++peiRlKzHwo1TEXXCw
 59i8Sw4+OqdIN/ZmWQ8wB2v1oUGvA4j/Ta/GFmdyoPEII67XVcmtMuQV3tjUxLAb
 FdEgBzrdQp0yM/UYF2pQYJiV4qjdh2/K3wFkJI1c2rD6e5On54t51E0Cf9C1oyPH
 wn32OE5yr9U=
 =S8sw
 -----END PGP SIGNATURE-----

Merge tag 'x86-timers-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 timer update from Ingo Molnar:
 "Set the X86_FEATURE_TSC_KNOWN_FREQ flag for Xen guests, to avoid
  recalibration"

* tag 'x86-timers-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/xen/time: Set the X86_FEATURE_TSC_KNOWN_FREQ flag in xen_tsc_khz()
2020-08-03 17:41:06 -07:00
Linus Torvalds
5183a617ec The biggest change is the removal of SGI UV1 support, which allowed the
removal of the legacy EFI old_mmap code as well.
 
 This removes quite a bunch of old code & quirks.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oYBoRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1izAg//adabmMctmGxpWETEj1UmZdgJcKu5l+H3
 SiC6hoB46NB5J6E2whhOXuJGqgBGY/qx9Qy3LMFwgXNki85TQEEFTcGfd+RcXTXz
 B+oASL5Z+KOJ+QQqFf8W/dYye4ZXXxQN8gFSl9zaY3iXSY28JLMFrwkckkIrBwLZ
 9DmB7NaxeMQ22IEan641sK/YafGIriHvMpe34s2/p2WpemyncG1y1tZcPxZ/9K9U
 7OZ+4O6gr9n3FloD9yp3z1jFXiU4gSWEgbZvy3bD/AjbXs+oaaaIFDZ6yM/C3JYo
 Qw0PPTBSXkKTmG3UaK4vml4NPROLqyAzv8k5UTJmYQiVlSkyAVW80mzMQ9qDAhfK
 ny2+XE7ahjsFiNLFbboSOqLuTyHogkzcFCBrwW1wCTnPDEX5QCQF9t0agrY4pqqZ
 8lSMLj0BmkAPoi5XHq4WVEJL+0GyYFWmJwdtUAzd2Hj5w6mse0TiL7PtqAUfdFH0
 E5nBnY5RvqtWwfEungGl+zFmSRvH8UzOUydMMuG8koNlGaNZF4TwNVrMukbVqArX
 34Zt1bQrLUY/T4xNM+Zx1BWOFd5D3BUuI7xVhE5zMg20A8PHtPzMByoomoQrgGX2
 MDVUoqnPxeXWUil06+ND6koOtaL+xJy0JsUIFs7h9SGsF+JxsrYKbbLfjjDvtpuK
 sdb+fYzmgBU=
 =NChe
 -----END PGP SIGNATURE-----

Merge tag 'x86-platform-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 platform updates from Ingo Molnar:
 "The biggest change is the removal of SGI UV1 support, which allowed
  the removal of the legacy EFI old_mmap code as well.

  This removes quite a bunch of old code & quirks"

* tag 'x86-platform-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/efi: Remove unused EFI_UV1_MEMMAP code
  x86/platform/uv: Remove uv bios and efi code related to EFI_UV1_MEMMAP
  x86/efi: Remove references to no-longer-used efi_have_uv1_memmap()
  x86/efi: Delete SGI UV1 detection.
  x86/platform/uv: Remove efi=old_map command line option
  x86/platform/uv: Remove vestigial mention of UV1 platform from bios header
  x86/platform/uv: Remove support for UV1 platform from uv
  x86/platform/uv: Remove support for uv1 platform from uv_hub
  x86/platform/uv: Remove support for UV1 platform from uv_bau
  x86/platform/uv: Remove support for UV1 platform from uv_mmrs
  x86/platform/uv: Remove support for UV1 platform from x2apic_uv_x
  x86/platform/uv: Remove support for UV1 platform from uv_tlb
  x86/platform/uv: Remove support for UV1 platform from uv_time
2020-08-03 17:38:43 -07:00
Linus Torvalds
e96ec8cf9c The biggest change is to not sync the vmalloc and ioremap ranges for x86-64 anymore.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oW+oRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hEsBAAkFaP2QXd8Nc3nqA+eQoBwSty5NumQWNi
 /VC4nt2R3vqa547XgMgT0RslrftUeCZ6bI0gs7yG6vCrjHbsiFzqliOUPJPXHoUE
 MQMw0tw1x7ODVsAGkzJ5YYrhBvZ2rDvaewAHNRLdkmdLJwXobpCHk4YM48O0oInh
 rlTC9Lgu4jOGOUBazZt4a1xAZGbMK4uk6Er14HGONoqwHV0XC4lDnJm7qNWJLquE
 KEFPWyEC6/T4Vwtmc98eDd+/gJ3sDC2tRtRalW0U5BZQXWjbMLUh/g5aJF8g0NG1
 kmHgbIf5todhFgl/D7ZWJ/i6jucGSZ1u43IsOQwuQ2U/flsUrHDPxbLW55r1axld
 K/RdvWvanWBb1A77Z/2h9eMwm0saEBjQT/UmtT4fNQpgUHJ1AJaVLHcgeFWQsVVX
 AbrP6DAGkEV7QqxQjCl9PK4kOi6BNwvKIfKmFsy5+Ypgx9e39OZmd9FMNt7mJslC
 2htTdXsUYUgwCrQH9brUjUs943mrvWXUYdencvVw1ByWwkSTRm5owXL0w3XtKuwN
 fEtcq0Y6ALeTwjyN0sja3johR93qaHWmyI8t5NLzT+g1rQgrd9Hae6dChyx1nzlz
 I5xIQF4ropZhhP6vcifiyDQcltOZsspVOMQzLTepuqZEpqDEoUtbjZhI4JcS51nc
 qe5rEOGGa00=
 =D9UW
 -----END PGP SIGNATURE-----

Merge tag 'x86-mm-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 mmm update from Ingo Molnar:
 "The biggest change is to not sync the vmalloc and ioremap ranges for
  x86-64 anymore"

* tag 'x86-mm-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm/64: Make sync_global_pgds() static
  x86/mm/64: Do not sync vmalloc/ioremap mappings
  x86/mm: Pre-allocate P4D/PUD pages for vmalloc area
2020-08-03 17:25:42 -07:00