EF100 will use the same mechanisms for PCI error recovery.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
EF100 needs to map multiple BARs (sequentially, not concurrently) in
order to read the Function Control Window during probe.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
EF100 will share EF10's model of filtering, hashing and spreading.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Link speeds, FEC, and autonegotiation are all things EF100 will share.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The new nic_common.h contains the inlines for NIC-type function dispatch,
declarations for NIC-generic functions in nic.c, and other similar NIC-
generic functionality. Retained in nic.h are NIC-specific declarations
such as the siena and ef10 nic_data structs and various farch functions.
The EF100 driver will thus include nic_common.h but not nic.h.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Separate the generation-count handling from the format conversion, to
make it easier to re-use both for EF100.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Calculate efx->max_vis at probe time, and check against it in
efx_allocate_msix_channels() when considering whether to create XDP TX
channels.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that we have an _OFST definition for each individual flag bit,
callers of efx_has_cap() don't need to specify which flag word it's
in; we can just use the flag name directly in MCDI_CAPABILITY_OFST.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The script used to generate these now includes _OFST definitions for
flags, to identify the containing flag word.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since 'tcfp_burst' with TICK factor, driver side always need to recover
it to the original value, this patch moves the generic calculation and
recover to the 'burst' original value before offloading to device driver.
Signed-off-by: Po Liu <po.liu@nxp.com>
Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Davide Caratti says:
====================
MPTCP: improve fallback to TCP
there are situations where MPTCP sockets should fall-back to regular TCP:
this series reworks the fallback code to pursue the following goals:
1) cleanup the non fallback code, removing most of 'if (<fallback>)' in
the data path
2) improve performance for non-fallback sockets, avoiding locks in poll()
further work will also leverage on this changes to achieve:
a) more consistent behavior of gestockopt()/setsockopt() on passive sockets
after fallback
b) support for "infinite maps" as per RFC8684, section 3.7
the series is made of the following items:
- patch 1 lets sendmsg() / recvmsg() / poll() use the main socket also
after fallback
- patch 2 fixes 'simultaneous connect' scenario after fallback. The
problem was present also before the rework, but the fix is much easier
to implement after patch 1
- patch 3, 4, 5 are clean-ups for code that is no more needed after the
fallback rework
- patch 6 fixes a race condition between close() and poll(). The problem
was theoretically present before the rework, but it became almost
systematic after patch 1
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
mptcp_poll always return POLLOUT for unblocking
connect(), ensure that the socket is a suitable
state.
The MPTCP_DATA_READY bit is never cleared on accept:
ensure we don't leave mptcp_accept() with an empty
accept queue and such bit set.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently __mptcp_tcp_fallback() always return NULL
on incoming connections, because MPTCP does not create
the additional socket for the first subflow.
Since the previous commit no __mptcp_tcp_fallback()
caller needs a struct socket, so let __mptcp_tcp_fallback()
return the first subflow sock and cope correctly even with
incoming connections.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the hardware MACTYPE hardware configuration pins are set to "XFI
with Rate Matching" the PHY interface operate at fixed 10Gbps speed. The
MAC buffer packets in both directions to match various wire speeds.
Read the MAC Type field in the Port Control register, and set the MAC
interface speed accordingly.
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed says:
====================
mlx5-tls-2020-06-26
1) Improve hardware layouts and structure for kTLS support
2) Generalize ICOSQ (Internal Channel Operations Send Queue)
Due to the asynchronous nature of adding new kTLS flows and handling
HW asynchronous kTLS resync requests, the XSK ICOSQ was extended to
support generic async operations, such as kTLS add flow and resync, in
addition to the existing XSK usages.
3) kTLS hardware flow steering and classification:
The driver already has the means to classify TCP ipv4/6 flows to send them
to the corresponding RSS HW engine, as reflected in patches 3 through 5,
the series will add a steering layer that will hook to the driver's TCP
classifiers and will match on well known kTLS connection, in case of a
match traffic will be redirected to the kTLS decryption engine, otherwise
traffic will continue flowing normally to the TCP RSS engine.
3) kTLS add flow RX HW offload support
New offload contexts post their static/progress params WQEs
(Work Queue Element) to communicate the newly added kTLS contexts
over the per-channel async ICOSQ.
The Channel/RQ is selected according to the socket's rxq index.
A new TLS-RX workqueue is used to allow asynchronous addition of
steering rules, out of the NAPI context.
It will be also used in a downstream patch in the resync procedure.
Feature is OFF by default. Can be turned on by:
$ ethtool -K <if> tls-hw-rx-offload on
4) Added mlx5 kTLS sw stats and new counters are documented in
Documentation/networking/tls-offload.rst
rx_tls_ctx - number of TLS RX HW offload contexts added to device for
decryption.
rx_tls_ooo - number of RX packets which were part of a TLS stream
but did not arrive in the expected order and triggered the resync
procedure.
rx_tls_del - number of TLS RX HW offload contexts deleted from device
(connection has finished).
rx_tls_err - number of RX packets which were part of a TLS stream
but were not decrypted due to unexpected error in the state machine.
5) Asynchronous RX resync
a. The NIC driver indicates that it would like to resync on some TLS
record within the received packet (P), but the driver does not
know (yet) which of the TLS records within the packet.
At this stage, the NIC driver will query the device to find the exact
TCP sequence for resync (tcpsn), however, the driver does not wait
for the device to provide the response.
b. Eventually, the device responds, and the driver provides the tcpsn
within the resync packet to KTLS. Now, KTLS can check the tcpsn against
any processed TLS records within packet P, and also against any record
that is processed in the future within packet P.
The asynchronous resync path simplifies the device driver, as it can
save bits on the packet completion (32-bit TCP sequence), and pass this
information on an asynchronous command instead.
Performance:
CPU: Intel(R) Xeon(R) CPU E5-2687W v4 @ 3.00GHz, 24 cores, HT off
NIC: ConnectX-6 Dx 100GbE dual port
Goodput (app-layer throughput) comparison:
+---------------+-------+-------+---------+
| # connections | 1 | 4 | 8 |
+---------------+-------+-------+---------+
| SW (Gbps) | 7.26 | 24.70 | 50.30 |
+---------------+-------+-------+---------+
| HW (Gbps) | 18.50 | 64.30 | 92.90 |
+---------------+-------+-------+---------+
| Speedup | 2.55x | 2.56x | 1.85x * |
+---------------+-------+-------+---------+
* After linerate is reached, diff is observed in CPU util
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
genl_family_rcv_msg_attrs_parse() reuses the global family->attrbuf
when family->parallel_ops is false. However, family->attrbuf is not
protected by any lock on the genl_family_rcv_msg_doit() code path.
This leads to several different consequences, one of them is UAF,
like the following:
genl_family_rcv_msg_doit(): genl_start():
genl_family_rcv_msg_attrs_parse()
attrbuf = family->attrbuf
__nlmsg_parse(attrbuf);
genl_family_rcv_msg_attrs_parse()
attrbuf = family->attrbuf
__nlmsg_parse(attrbuf);
info->attrs = attrs;
cb->data = info;
netlink_unicast_kernel():
consume_skb()
genl_lock_dumpit():
genl_dumpit_info(cb)->attrs
Note family->attrbuf is an array of pointers to the skb data, once
the skb is freed, any dereference of family->attrbuf will be a UAF.
Maybe we could serialize the family->attrbuf with genl_mutex too, but
that would make the locking more complicated. Instead, we can just get
rid of family->attrbuf and always allocate attrbuf from heap like the
family->parallel_ops==true code path. This may add some performance
overhead but comparing with taking the global genl_mutex, it still
looks better.
Fixes: 75cdbdd089 ("net: ieee802154: have genetlink code to parse the attrs during dumpit")
Fixes: 057af70713 ("net: tipc: have genetlink code to parse the attrs during dumpit")
Reported-and-tested-by: syzbot+3039ddf6d7b13daf3787@syzkaller.appspotmail.com
Reported-and-tested-by: syzbot+80cad1e3cb4c41cde6ff@syzkaller.appspotmail.com
Reported-and-tested-by: syzbot+736bcbcb11b60d0c0792@syzkaller.appspotmail.com
Reported-and-tested-by: syzbot+520f8704db2b68091d44@syzkaller.appspotmail.com
Reported-and-tested-by: syzbot+c96e4dfb32f8987fdeed@syzkaller.appspotmail.com
Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata says:
====================
TC: Introduce qevents
The Spectrum hardware allows execution of one of several actions as a
result of queue management decisions: tail-dropping, early-dropping,
marking a packet, or passing a configured latency threshold or buffer
size. Such packets can be mirrored, trapped, or sampled.
Modeling the action to be taken as simply a TC action is very attractive,
but it is not obvious where to put these actions. At least with ECN marking
one could imagine a tree of qdiscs and classifiers that effectively
accomplishes this task, albeit in an impractically complex manner. But
there is just no way to match on dropped-ness of a packet, let alone
dropped-ness due to a particular reason.
To allow configuring user-defined actions as a result of inner workings of
a qdisc, this patch set introduces a concept of qevents. Those are attach
points for TC blocks, where filters can be put that are executed as the
packet hits well-defined points in the qdisc algorithms. The attached
blocks can be shared, in a manner similar to clsact ingress and egress
blocks, arbitrary classifiers with arbitrary actions can be put on them,
etc.
For example:
red limit 500K avpkt 1K qevent early_drop block 10
matchall action mirred egress mirror dev eth1
The central patch #2 introduces several helpers to allow easy and uniform
addition of qevents to qdiscs: initialization, destruction, qevent block
number change validation, and qevent handling, i.e. dispatch of the filters
attached to the block bound to a qevent.
Patch #1 adds root_lock argument to qdisc enqueue op. The problem this is
tackling is that if a qevent filter pushes packets to the same qdisc tree
that holds the qevent in the first place, attempt to take qdisc root lock
for the second time will lead to a deadlock. To solve the issue, qevent
handler needs to unlock and relock the root lock around the filter
processing. Passing root_lock around makes it possible to get the lock
where it is needed, and visibly so, such that it is obvious the lock will
be used when invoking a qevent.
The following two patches, #3 and #4, then add two qevents to the RED
qdisc: "early_drop" qevent fires when a packet is early-dropped; "mark"
qevent, when it is ECN-marked.
Patch #5 contains a selftest. I have mentioned this test when pushing the
RED ECN nodrop mode and said that "I have no confidence in its portability
to [...] different configurations". That still holds. The backlog and
packet size are tuned to make the test deterministic. But it is better than
nothing, and on the boxes that I ran it on it does work and shows that
qevents work the way they are supposed to, and that their addition has not
broken the other tested features.
This patch set does not deal with offloading. The idea there is that a
driver will be able to figure out that a given block is used in qevent
context by looking at binder type. A future patch-set will add a qdisc
pointer to struct flow_block_offload, which a driver will be able to
consult to glean the TC or other relevant attributes.
Changes from RFC to v1:
- Move a "q = qdisc_priv(sch)" from patch #3 to patch #4
- Fix deadlock caused by mirroring packet back to the same qdisc tree.
- Rename "tail" qevent to "tail_drop".
- Adapt to the new 100-column standard.
- Add a selftest
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This test is inspired by the mlxsw RED selftest. It is much simpler to set
up (also because there is no point in testing PRIO / RED encapsulation). It
tests bare RED, ECN and ECN+nodrop modes of operation. On top of that it
tests RED early_drop and mark qevents.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to allow acting on dropped and/or ECN-marked packets, add two new
qevents to the RED qdisc: "early_drop" and "mark". Filters attached at
"early_drop" block are executed as packets are early-dropped, those
attached at the "mark" block are executed as packets are ECN-marked.
Two new attributes are introduced: TCA_RED_EARLY_DROP_BLOCK with the block
index for the "early_drop" qevent, and TCA_RED_MARK_BLOCK for the "mark"
qevent. Absence of these attributes signifies "don't care": no block is
allocated in that case, or the existing blocks are left intact in case of
the change callback.
For purposes of offloading, blocks attached to these qevents appear with
newly-introduced binder types, FLOW_BLOCK_BINDER_TYPE_RED_EARLY_DROP and
FLOW_BLOCK_BINDER_TYPE_RED_MARK.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the following patches, RED will get two qevents. The implementation will
be clearer if the callback for change is not a pure subset of the callback
for init. Split the two and promote attribute parsing to the callbacks
themselves from the common code, because it will be handy there.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Qevents are attach points for TC blocks, where filters can be put that are
executed when "interesting events" take place in a qdisc. The data to keep
and the functions to invoke to maintain a qevent will be largely the same
between qevents. Therefore introduce sched-wide helpers for qevent
management.
Currently, similarly to ingress and egress blocks of clsact pseudo-qdisc,
blocks attachment cannot be changed after the qdisc is created. To that
end, add a helper tcf_qevent_validate_change(), which verifies whether
block index attribute is not attached, or if it is, whether its value
matches the current one (i.e. there is no material change).
The function tcf_qevent_handle() should be invoked when qdisc hits the
"interesting event" corresponding to a block. This function releases root
lock for the duration of executing the attached filters, to allow packets
generated through user actions (notably mirred) to be reinserted to the
same qdisc tree.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A following patch introduces qevents, points in qdisc algorithm where
packet can be processed by user-defined filters. Should this processing
lead to a situation where a new packet is to be enqueued on the same port,
holding the root lock would lead to deadlocks. To solve the issue, qevent
handler needs to unlock and relock the root lock when necessary.
To that end, add the root lock argument to the qdisc op enqueue, and
propagate throughout.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Grygorii Strashko says:
====================
net: ethernet: ti: am65-cpsw: update and enable sr2.0 soc
This series contains set of improvements for TI AM654x/J721E CPSW2G driver and
adds support for TI AM654x SR2.0 SoC.
Patch 1: adds vlans restoration after "if down/up"
Patches 2-5: improvments
Patch 6: adds support for TI AM654x SR2.0 SoC which allows to disable errata i2027 W/A.
By default, errata i2027 W/A (TX csum offload disabled) is enabled on AM654x SoC
for backward compatibility, unless SR2.0 SoC is identified using SOC BUS framework.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The AM65x SR2.0 MCU CPSW has fixed errata i2027 "CPSW: CPSW Does Not
Support CPPI Receive Checksum (Host to Ethernet) Offload Feature". This
errata also fixed for J271E SoC.
Use SOC bus data for K3 SoC identification and apply i2027 errata w/a only
for the AM65x SR1.0 SoC.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ensure that critical setting can only be configured when there are no
running netdevs - all ports are down.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Skip HW configuration when p0-rx-ptype-rrobin is changed as it will be done
by .ndev_open(),
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MAC SL has to be initialized for each port otherwise
am65_cpsw_nuss_slave_disable_unused() will crash for disabled ports.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The pf_p0_rx_ptype_rrobin is global parameter so move its initialization in
probe.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The vlan configuration is not restored after interface down/up sequence.
Steps to check:
# ip link add link eth0 name eth0.100 type vlan id 100
# ifconfig eth0 down
# ifconfig eth0 up
This patch fixes it, restoring vlan ALE entries on .ndo_open().
Fixes: 93a7653031 ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg says:
====================
Couple of fixes/small things:
* TX control port status check fixed to not assume frame format
* mesh control port fixes
* error handling/leak fixes when starting AP, with HE attributes
* fix broadcast packet handling with encapsulation offload
* add new AKM suites
* and a small code cleanup
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the following sparse warning:
arch/arm64/xen/../../arm/xen/enlighten.c:244: warning: macro
"GRANT_TABLE_PHYSADDR" is not used [-Wunused-macros]
It is an isolated macro, and should be removed when its last user
was deleted in the following commit 3cf4095d74 ("arm/xen: Use
xen_xlate_map_ballooned_pages to setup grant table")
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Fixed an inconsistent use of GFP flags in nft_obj_notify() that used
GFP_KERNEL when a GFP flag was passed in to that function. Given this
allocated memory was then used in audit_log_nfcfg() it led to an audit
of all other GFP allocations in net/netfilter/nf_tables_api.c and a
modification of audit_log_nfcfg() to accept a GFP parameter.
Reported-by: Dan Carptenter <dan.carpenter@oracle.com>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
When starting at 744MHz, the Mali 450 core crashes on S805X based boards:
lima d00c0000.gpu: IRQ ppmmu3 not found
lima d00c0000.gpu: IRQ ppmmu4 not found
lima d00c0000.gpu: IRQ ppmmu5 not found
lima d00c0000.gpu: IRQ ppmmu6 not found
lima d00c0000.gpu: IRQ ppmmu7 not found
Internal error: synchronous external abort: 96000210 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.2+ #492
Hardware name: Libre Computer AML-S805X-AC (DT)
pstate: 40000005 (nZcv daif -PAN -UAO)
pc : lima_gp_init+0x28/0x188
...
Call trace:
lima_gp_init+0x28/0x188
lima_device_init+0x334/0x534
lima_pdev_probe+0xa4/0xe4
...
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
Reverting to a safer 666Mhz frequency on the S805X that doesn't use the
GP0 PLL makes it more stable.
Fixes: fd47716479 ("ARM64: dts: add S805X based P241 board")
Fixes: 0449b8e371 ("arm64: dts: meson: add libretech aml-s805x-ac board")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Link: https://lore.kernel.org/r/20200618132737.14243-1-narmstrong@baylibre.com
All the functionality in this driver has been reimplemented
in the new DRM driver in drivers/gpu/drm/pl111/* and all
the boards using it have been migrated to use the DRM driver
with all configuration coming from the device tree.
I started the work to migrate the CLCD driver to DRM in
april 2017 and it took a little more than 3 years to do this
properly without leaving any platforms behind.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Russell King <linux@armlinux.org.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200609200446.153209-2-linus.walleij@linaro.org
In i.MX8QXP LPCG binding's example, "fsl,imx7d-usdhc" as fallback
compatible is incorrect, remove it to avoid below build error:
Documentation/devicetree/bindings/clock/imx8qxp-lpcg.example.dt.yaml:
mmc@5b010000: compatible: Additional items are not allowed ('fsl,imx7d-usdhc' was unexpected)
Documentation/devicetree/bindings/clock/imx8qxp-lpcg.example.dt.yaml:
mmc@5b010000: compatible: ['fsl,imx8qxp-usdhc', 'fsl,imx7d-usdhc'] is too long
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Link: https://lore.kernel.org/r/1592450578-30140-3-git-send-email-Anson.Huang@nxp.com
Signed-off-by: Rob Herring <robh@kernel.org>