Commit Graph

1014064 Commits

Author SHA1 Message Date
Thomas Bogendoerfer
af80425e05 net: korina: Only pass mac address via platform data
Get rid of access to struct korina_device by just passing the mac
address via platform data and use drvdata for passing netdev to remove
function.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:58:14 -07:00
Thomas Bogendoerfer
0fc96939a9 net: korina: Use DMA API
Instead of messing with MIPS specific macros use DMA API for mapping
descriptors and skbs.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:58:14 -07:00
Thomas Bogendoerfer
0fe632471a net: korina: Remove nested helpers
Remove helpers, which are only used in one call site.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:58:14 -07:00
Thomas Bogendoerfer
e42f10533d net: korina: Remove not needed cache flushes
Descriptors are mapped uncached so there is no need to do any cache
handling for them.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:58:14 -07:00
Thomas Bogendoerfer
b4cd249a8c net: korina: Use devres functions
Simplify probe/remove code by using devm_ functions.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:58:14 -07:00
Thomas Bogendoerfer
89f9d5400b net: korina: Fix MDIO functions
Fixed MDIO functions to work reliable and not just by accident.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:58:14 -07:00
David S. Miller
9652168304 Merge branch 'mtk_ppe_offload-fixes'
Pablo Neira Ayuso says:

====================
mtk_ppe_offload fixes

A few incremental fixes for the initial flowtable offload support
and this driver:

1) Fix undefined reference to `dsa_port_from_netdev' due to missing
   dependencies in Kconfig, reported by Kbuild robot.

2) Missing mutex to serialize flow events via workqueue to the driver.

3) Handle FLOW_ACTION_VLAN_POP tag action.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:55:05 -07:00
Pablo Neira Ayuso
f5c2cb583a net: ethernet: mtk_eth_soc: handle VLAN pop action
Do not hit EOPNOTSUPP when flowtable offload provides a VLAN pop action.

Fixes: efce49dfe6 ("netfilter: flowtable: add vlan pop action offload support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:55:05 -07:00
Pablo Neira Ayuso
014d029876 net: ethernet: mtk_eth_soc: missing mutex
Patch 2ed37183ab ("netfilter: flowtable: separate replace, destroy and
stats to different workqueues") splits the workqueue per event type. Add
a mutex to serialize updates.

Fixes: 502e84e238 ("net: ethernet: mtk_eth_soc: add flow offloading support")
Reported-by: Frank Wunderlich <frank-w@public-files.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:55:05 -07:00
Pablo Neira Ayuso
0e389028ad net: ethernet: mtk_eth_soc: fix undefined reference to `dsa_port_from_netdev'
Caused by:

 CONFIG_NET_DSA=m
 CONFIG_NET_MEDIATEK_SOC=y

mtk_ppe_offload.c:undefined reference to `dsa_port_from_netdev'

Fixes: 502e84e238 ("net: ethernet: mtk_eth_soc: add flow offloading support")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:55:04 -07:00
David S. Miller
6dd06ec7c1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

1) Add vlan match and pop actions to the flowtable offload,
   patches from wenxu.

2) Reduce size of the netns_ct structure, which itself is
   embedded in struct net Make netns_ct a read-mostly structure.
   Patches from Florian Westphal.

3) Add FLOW_OFFLOAD_XMIT_UNSPEC to skip dst check from garbage
   collector path, as required by the tc CT action. From Roi Dayan.

4) VLAN offload fixes for nftables: Allow for matching on both s-vlan
   and c-vlan selectors. Fix match of VLAN id due to incorrect
   byteorder. Add a new routine to properly populate flow dissector
   ethertypes.

5) Missing keys in ip{6}_route_me_harder() results in incorrect
   routes. This includes an update for selftest infra. Patches
   from Ido Schimmel.

6) Add counter hardware offload support through FLOW_CLS_STATS.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:49:50 -07:00
Linus Walleij
8d892d6094 net: ethernet: ixp4xx: Set the DMA masks explicitly
The former fix only papered over the actual problem: the
ethernet core expects the netdev .dev member to have the
proper DMA masks set, or there will be BUG_ON() triggered
in kernel/dma/mapping.c.

Fix this by simply copying dma_mask and dma_mask_coherent
from the parent device.

Fixes: e45d0fad4a ("net: ethernet: ixp4xx: Use parent dev for DMA pool")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:47:56 -07:00
Alexey Dobriyan
c6400e3fc3 netlink: simplify nl_set_extack_cookie_u64(), nl_set_extack_cookie_u32()
Taking address of a function argument directly works just fine.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:41:28 -07:00
DENG Qingfang
6ecaf81d4a net: ethernet: mediatek: fix a typo bug in flow offloading
Issue was traffic problems after a while with increased ping times if
flow offload is active. It turns out that key_offset with cookie is
needed in rhashtable_params but was re-assigned to head_offset.
Fix the assignment.

Fixes: 502e84e238 ("net: ethernet: mtk_eth_soc: add flow offloading support")
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Tested-by: Frank Wunderlich <frank-w@public-files.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:39:52 -07:00
David S. Miller
32c1fdb5e9 Merge branch 'hns3-next'
Huazhong Tan says:

====================
net: hns3: misc updates for -next

This series includes some misc updates for the HNS3 ethernet driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:38:05 -07:00
Huazhong Tan
e407efdd94 net: hns3: change the value of the SEPARATOR_VALUE macro in hclgevf_main.c
The SEPARATOR_VALUE macro is used as separator when getting
the register value, but the value of this macro is different
between pf and vf, it is a bit confusing for the user, so
synchronize the value of vf with pf.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:38:05 -07:00
Huazhong Tan
8ed64dbe0b net: hns3: cleanup inappropriate spaces in struct hlcgevf_tqp_stats
Modify some inappropriate spaces in comments of struct
hlcgevf_tqp_stats.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:38:05 -07:00
Huazhong Tan
1c5a2ba679 net: hns3: remove a duplicate pf reset counting
When enter suspend mode the counter of pf reset will be increased
twice, since both hclge_prepare_general() and hclge_prepare_wait()
increase this counter. So remove the duplicate counting in
hclge_prepare_general().

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:38:05 -07:00
Randy Dunlap
46fd447161 net: xilinx: drivers need/depend on HAS_IOMEM
kernel test robot reports build errors in 3 Xilinx ethernet drivers.
They all use ioremap functions that are only available when HAS_IOMEM
is set/enabled. If it is not enabled, they all have build errors,
so make these 3 drivers depend on HAS_IOMEM.

ld: drivers/net/ethernet/xilinx/xilinx_emaclite.o: in function `xemaclite_of_probe':
xilinx_emaclite.c:(.text+0x9fc): undefined reference to `devm_ioremap_resource'

ld: drivers/net/ethernet/xilinx/xilinx_axienet_main.o: in function `axienet_probe':
xilinx_axienet_main.c:(.text+0x942): undefined reference to `devm_ioremap_resource'

ld: drivers/net/ethernet/xilinx/ll_temac_main.o: in function `temac_probe':
ll_temac_main.c:(.text+0x1283): undefined reference to `devm_platform_ioremap_resource_byname'
ld: ll_temac_main.c:(.text+0x13ad): undefined reference to `devm_of_iomap'
ld: ll_temac_main.c:(.text+0x162e): undefined reference to `devm_platform_ioremap_resource'

Fixes: 8a3b7a252d ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Zhang Changzhong <zhangchangzhong@huawei.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: stable@vger.kernel.org
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:36:00 -07:00
David S. Miller
6ed33c1312 Merge branch 'enetc-flow-control'
Vladimir Oltean says:

====================
Flow control for NXP ENETC

This patch series contains logic for enabling the lossless mode on the
RX rings of the ENETC, and the PAUSE thresholds on the internal FIFO
memory.

During testing it was found that, with the default FIFO configuration,
a sender which isn't persuaded by our PAUSE frames and keeps sending
will cause some MAC RX frame errors. To mitigate this, we need to ensure
that the FIFO never runs completely full, so we need to fix up a setting
that was supposed to be configured well out of reset. Unfortunately this
requires the addition of a new mini-driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:31:45 -07:00
Vladimir Oltean
a864888788 net: enetc: add support for flow control
In the ENETC receive path, a frame received by the MAC is first stored
in a 256KB 'FIFO' memory, then transferred to DRAM when enqueuing it to
the RX ring. The FIFO is a shared resource for all ENETC ports, but
every port keeps track of its own memory utilization, on RX and on TX.

There is a setting for RX rings through which they can either operate in
'lossy' mode (where the lack of a free buffer causes an immediate
discard of the frame) or in 'lossless' mode (where the lack of a free
buffer in the ring makes the frame stay longer in the FIFO).

In turn, when the memory utilization of the FIFO exceeds a certain
margin, the MAC can be configured to emit PAUSE frames.

There is enough FIFO memory to buffer up to 3 MTU-sized frames per RX
port while not jeopardizing the other use cases (jumbo frames), and
also not consume bytes from the port TX allocations. Also, 3 MTU-sized
frames worth of memory is enough to ensure zero loss for 64 byte packets
at 1G line rate.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:31:45 -07:00
Vladimir Oltean
b764dc6cc1 arm64: dts: ls1028a: declare the Integrated Endpoint Register Block node
Add a node describing the address in the SoC memory space for the IERB.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:31:45 -07:00
Vladimir Oltean
e7d48e5fbf net: enetc: add a mini driver for the Integrated Endpoint Register Block
The NXP ENETC is a 4-port Ethernet controller which 'smells' to
operating systems like 4 distinct PCIe PFs with SR-IOV, each PF having
its own driver instance, but in fact there are some hardware resources
which are shared between all ports, like for example the 256 KB SRAM
FIFO between the MACs and the Host Transfer Agent which DMAs frames to
DRAM.

To hide the stuff that cannot be neatly exposed per port, the hardware
designers came up with this idea of having a dedicated register block
which is supposed to be populated by the bootloader, and contains
everything configuration-related: MAC addresses, FIFO partitioning, etc.

When a port is reset using PCIe Function Level Reset, its defaults are
transferred from the IERB configuration. Most of the time, the settings
made through the IERB are read-only in the port's memory space (if they
are even visible), so they cannot be modified at runtime.

Linux doesn't have any advanced FIFO partitioning requirements at all,
but when reading through the hardware manual, it became clear that, even
though there are many good 'recommendations' for default values, many of
them were not actually put in practice on LS1028A. So we end up with a
default configuration that:

(a) does not have enough TX and RX byte credits to support the max MTU
    of 9600 (which the Linux driver claims already) properly (at full speed)
(b) allows the FIFO to be overrun with RX traffic, potentially
    overwriting internal data structures.

The last part sounds a bit catastrophic, but it isn't. Frames are
supposed to transit the FIFO for a very short time, but they can
actually accumulate there under 2 conditions:

(a) there is very severe congestion on DRAM memory, or
(b) the RX rings visible to the operating system were configured for
    lossless operation, and they just ran out of free buffers to copy
    the frame to. This is what is used to put backpressure onto the MAC
    with flow control.

So since ENETC has not supported flow control thus far, RX FIFO overruns
were never seen with Linux. But with the addition of flow control, we
should configure some registers to prevent this from happening. What we
are trying to protect against are bad actors which continue to send us
traffic despite the fact that we have signaled a PAUSE condition. Of
course we can't be lossless in that case, but it is best to configure
the FIFO to do tail dropping rather than letting it overrun.

So in a nutshell, this driver is a fixup for all the IERB default values
that should have been but aren't.

The IERB configuration needs to be done _before_ the PFs are enabled.
So every PF searches for the presence of the "fsl,ls1028a-enetc-ierb"
node in the device tree, and if it finds it, it "registers" with the
IERB, which means that it requests the IERB to fix up its default
values. This is done through -EPROBE_DEFER. The IERB driver is part of
the fsl_enetc module, but is technically a platform driver, since the
IERB is a good old fashioned MMIO region, as opposed to ENETC ports
which pretend to be PCIe devices.

The driver was already configuring ENETC_PTXMBAR (FIFO allocation for
TX) because due to an omission, TXMBAR is a read/write register in the
PF memory space. But the manual is quite clear that the formula for this
should depend upon the TX byte credits (TXBCR). In turn, the TX byte
credits are only readable/writable through the IERB. So if we want to
ensure that the TXBCR register also has a value that is correct and in
line with TXMBAR, there is simply no way this can be done from the PF
driver, access to the IERB is needed.

I could have modified U-Boot to fix up the IERB values, but that is
quite undesirable, as old U-Boot versions are likely to be floating
around for quite some time from now.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:31:45 -07:00
Vladimir Oltean
4ac7acc67f dt-bindings: net: fsl: enetc: add the IERB documentation
Mention the required compatible string and base address for the
Integrated Endpoint Register Block node.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:31:45 -07:00
Vladimir Oltean
87614b931c net: enetc: create a common enetc_pf_to_port helper
Even though ENETC interfaces are exposed as individual PCIe PFs with
their own driver instances, the ENETC is still fundamentally a
multi-port Ethernet controller, and some parts of the IP take a port
number (as can be seen in the PSFP implementation).

Create a common helper that can be used outside of the TSN code for
retrieving the ENETC port number based on the PF number. This is only
correct for LS1028A, the only Linux-capable instantiation of ENETC thus
far.

Note that ENETC port 3 is PF 6. The TSN code did not care about this
because ENETC port 3 does not support TSN, so the wrong mapping done by
enetc_get_port for PF 6 could have never been hit.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:31:45 -07:00
Du Cheng
ed8157f1eb net: sched: tapr: prevent cycle_time == 0 in parse_taprio_schedule
There is a reproducible sequence from the userland that will trigger a WARN_ON()
condition in taprio_get_start_time, which causes kernel to panic if configured
as "panic_on_warn". Catch this condition in parse_taprio_schedule to
prevent this condition.

Reported as bug on syzkaller:
https://syzkaller.appspot.com/bug?extid=d50710fd0873a9c6b40c

Reported-by: syzbot+d50710fd0873a9c6b40c@syzkaller.appspotmail.com
Signed-off-by: Du Cheng <ducheng2@gmail.com>
Acked-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:29:07 -07:00
Alexei Starovoitov
900367b208 Merge branch 'Add a snprintf eBPF helper'
Florent Revest says:

====================

We have a usecase where we want to audit symbol names (if available) in
callback registration hooks. (ex: fentry/nf_register_net_hook)

A few months back, I proposed a bpf_kallsyms_lookup series but it was
decided in the reviews that a more generic helper, bpf_snprintf, would
be more useful.

This series implements the helper according to the feedback received in
https://lore.kernel.org/bpf/20201126165748.1748417-1-revest@google.com/T/#u

- A new arg type guarantees the NULL-termination of string arguments and
  lets us pass format strings in only one arg
- A new helper is implemented using that guarantee. Because the format
  string is known at verification time, the format string validation is
  done by the verifier
- To implement a series of tests for bpf_snprintf, the logic for
  marshalling variadic args in a fixed-size array is reworked as per:
https://lore.kernel.org/bpf/20210310015455.1095207-1-revest@chromium.org/T/#u

---
Changes in v5:
- Fixed the bpf_printf_buf_used counter logic in try_get_fmt_tmp_buf
- Added a couple of extra incorrect specifiers tests
- Call test_snprintf_single__destroy unconditionally
- Fixed a C++-style comment

---
Changes in v4:
- Moved bpf_snprintf, bpf_printf_prepare and bpf_printf_cleanup to
  kernel/bpf/helpers.c so that they get built without CONFIG_BPF_EVENTS
- Added negative test cases (various invalid format strings)
- Renamed put_fmt_tmp_buf() as bpf_printf_cleanup()
- Fixed a mistake that caused temporary buffers to be unconditionally
  freed in bpf_printf_prepare
- Fixed a mistake that caused missing 0 character to be ignored
- Fixed a warning about integer to pointer conversion
- Misc cleanups

---
Changes in v3:
- Simplified temporary buffer acquisition with try_get_fmt_tmp_buf()
- Made zero-termination check more consistent
- Allowed NULL output_buffer
- Simplified the BPF_CAST_FMT_ARG macro
- Three new test cases: number padding, simple string with no arg and
  string length extraction only with a NULL output buffer
- Clarified helper's description for edge cases (eg: str_size == 0)
- Lots of cosmetic changes

---
Changes in v2:
- Extracted the format validation/argument sanitization in a generic way
  for all printf-like helpers.
- bpf_snprintf's str_size can now be 0
- bpf_snprintf is now exposed to all BPF program types
- We now preempt_disable when using a per-cpu temporary buffer
- Addressed a few cosmetic changes
====================

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-04-19 15:27:47 -07:00
Gustavo A. R. Silva
c1d9e34e11 ethtool: ioctl: Fix out-of-bounds warning in store_link_ksettings_for_user()
Fix the following out-of-bounds warning:

net/ethtool/ioctl.c:492:2: warning: 'memcpy' offset [49, 84] from the object at 'link_usettings' is out of the bounds of referenced subobject 'base' with type 'struct ethtool_link_settings' at offset 0 [-Warray-bounds]

The problem is that the original code is trying to copy data into a
some struct members adjacent to each other in a single call to
memcpy(). This causes a legitimate compiler warning because memcpy()
overruns the length of &link_usettings.base. Fix this by directly
using &link_usettings and _from_ as destination and source addresses,
instead.

This helps with the ongoing efforts to globally enable -Warray-bounds
and get us closer to being able to tighten the FORTIFY_SOURCE routines
on memcpy().

Link: https://github.com/KSPP/linux/issues/109
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:27:37 -07:00
Florent Revest
c2e39c6bdc selftests/bpf: Add a series of tests for bpf_snprintf
The "positive" part tests all format specifiers when things go well.

The "negative" part makes sure that incorrect format strings fail at
load time.

Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210419155243.1632274-7-revest@chromium.org
2021-04-19 15:27:37 -07:00
Florent Revest
58c2b1f5e0 libbpf: Introduce a BPF_SNPRINTF helper macro
Similarly to BPF_SEQ_PRINTF, this macro turns variadic arguments into an
array of u64, making it more natural to call the bpf_snprintf helper.

Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210419155243.1632274-6-revest@chromium.org
2021-04-19 15:27:37 -07:00
Florent Revest
83cd92b464 libbpf: Initialize the bpf_seq_printf parameters array field by field
When initializing the __param array with a one liner, if all args are
const, the initial array value will be placed in the rodata section but
because libbpf does not support relocation in the rodata section, any
pointer in this array will stay NULL.

Fixes: c09add2fbc ("tools/libbpf: Add bpf_iter support")
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210419155243.1632274-5-revest@chromium.org
2021-04-19 15:27:37 -07:00
Florent Revest
7b15523a98 bpf: Add a bpf_snprintf helper
The implementation takes inspiration from the existing bpf_trace_printk
helper but there are a few differences:

To allow for a large number of format-specifiers, parameters are
provided in an array, like in bpf_seq_printf.

Because the output string takes two arguments and the array of
parameters also takes two arguments, the format string needs to fit in
one argument. Thankfully, ARG_PTR_TO_CONST_STR is guaranteed to point to
a zero-terminated read-only map so we don't need a format string length
arg.

Because the format-string is known at verification time, we also do
a first pass of format string validation in the verifier logic. This
makes debugging easier.

Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210419155243.1632274-4-revest@chromium.org
2021-04-19 15:27:36 -07:00
Florent Revest
fff13c4bb6 bpf: Add a ARG_PTR_TO_CONST_STR argument type
This type provides the guarantee that an argument is going to be a const
pointer to somewhere in a read-only map value. It also checks that this
pointer is followed by a zero character before the end of the map value.

Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210419155243.1632274-3-revest@chromium.org
2021-04-19 15:27:36 -07:00
Florent Revest
d9c9e4db18 bpf: Factorize bpf_trace_printk and bpf_seq_printf
Two helpers (trace_printk and seq_printf) have very similar
implementations of format string parsing and a third one is coming
(snprintf). To avoid code duplication and make the code easier to
maintain, this moves the operations associated with format string
parsing (validation and argument sanitization) into one generic
function.

The implementation of the two existing helpers already drifted quite a
bit so unifying them entailed a lot of changes:

- bpf_trace_printk always expected fmt[fmt_size] to be the terminating
  NULL character, this is no longer true, the first 0 is terminating.
- bpf_trace_printk now supports %% (which produces the percentage char).
- bpf_trace_printk now skips width formating fields.
- bpf_trace_printk now supports the X modifier (capital hexadecimal).
- bpf_trace_printk now supports %pK, %px, %pB, %pi4, %pI4, %pi6 and %pI6
- argument casting on 32 bit has been simplified into one macro and
  using an enum instead of obscure int increments.

- bpf_seq_printf now uses bpf_trace_copy_string instead of
  strncpy_from_kernel_nofault and handles the %pks %pus specifiers.
- bpf_seq_printf now prints longs correctly on 32 bit architectures.

- both were changed to use a global per-cpu tmp buffer instead of one
  stack buffer for trace_printk and 6 small buffers for seq_printf.
- to avoid per-cpu buffer usage conflict, these helpers disable
  preemption while the per-cpu buffer is in use.
- both helpers now support the %ps and %pS specifiers to print symbols.

The implementation is also moved from bpf_trace.c to helpers.c because
the upcoming bpf_snprintf helper will be made available to all BPF
programs and will need it.

Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210419155243.1632274-2-revest@chromium.org
2021-04-19 15:27:36 -07:00
Dexuan Cui
ca9c54d2d6 net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)
Add a VF driver for Microsoft Azure Network Adapter (MANA) that will be
available in the future.

Co-developed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Co-developed-by: Shachar Raindel <shacharr@microsoft.com>
Signed-off-by: Shachar Raindel <shacharr@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:24:25 -07:00
Taehee Yoo
83c1ca257a mld: remove unnecessary prototypes
Some prototypes are unnecessary, so delete it.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:23:00 -07:00
David S. Miller
c589fa10f1 Merge branch 'nh-flushing'
Ido Schimmel says:

====================
nexthop: Support large scale nexthop flushing

Patch #1 fixes a day-one bug in the nexthop code and allows "ip nexthop
flush" to work correctly with large number of nexthops that do not fit
in a single-part dump.

Patch #2 adds a test case.

Targeting at net-next since this use case never worked, the flow is
pretty obscure and such a large number of nexthops is unlikely to be
used in any real-world scenario.

Tested with fib_nexthops.sh:

Tests passed: 219
Tests failed:   0
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:20:35 -07:00
Ido Schimmel
bf5eb67dc8 selftests: fib_nexthops: Test large scale nexthop flushing
Test that all the nexthops are flushed when a multi-part nexthop dump is
required for the flushing.

Without previous patch:

 # ./fib_nexthops.sh
 TEST: Large scale nexthop flushing                                  [FAIL]

With previous patch:

 # ./fib_nexthops.sh
 TEST: Large scale nexthop flushing                                  [ OK ]

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:20:34 -07:00
Ido Schimmel
9e46fb656f nexthop: Restart nexthop dump based on last dumped nexthop identifier
Currently, a multi-part nexthop dump is restarted based on the number of
nexthops that have been dumped so far. This can result in a lot of
nexthops not being dumped when nexthops are simultaneously deleted:

 # ip nexthop | wc -l
 65536
 # ip nexthop flush
 Dump was interrupted and may be inconsistent.
 Flushed 36040 nexthops
 # ip nexthop | wc -l
 29496

Instead, restart the dump based on the nexthop identifier (fixed number)
of the last successfully dumped nexthop:

 # ip nexthop | wc -l
 65536
 # ip nexthop flush
 Dump was interrupted and may be inconsistent.
 Flushed 65536 nexthops
 # ip nexthop | wc -l
 0

Reported-by: Maksym Yaremchuk <maksymy@nvidia.com>
Tested-by: Maksym Yaremchuk <maksymy@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:20:34 -07:00
Linus Torvalds
7af0814097 Revert "gcov: clang: fix clang-11+ build"
This reverts commit 04c53de57c.

Nathan Chancellor points out that it should not have been merged into
mainline by itself. It was a fix for "gcov: use kvmalloc()", which is
still in -mm/-next. Merging it alone has broken the build.

Link: https://github.com/ClangBuiltLinux/continuous-integration2/runs/2384465683?check_suite_focus=true
Reported-by: Nathan Chancellor <nathan@kernel.org>
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-19 15:08:49 -07:00
Tom Lendacky
a3ba26ecfb KVM: SVM: Make sure GHCB is mapped before updating
Access to the GHCB is mainly in the VMGEXIT path and it is known that the
GHCB will be mapped. But there are two paths where it is possible the GHCB
might not be mapped.

The sev_vcpu_deliver_sipi_vector() routine will update the GHCB to inform
the caller of the AP Reset Hold NAE event that a SIPI has been delivered.
However, if a SIPI is performed without a corresponding AP Reset Hold,
then the GHCB might not be mapped (depending on the previous VMEXIT),
which will result in a NULL pointer dereference.

The svm_complete_emulated_msr() routine will update the GHCB to inform
the caller of a RDMSR/WRMSR operation about any errors. While it is likely
that the GHCB will be mapped in this situation, add a safe guard
in this path to be certain a NULL pointer dereference is not encountered.

Fixes: f1c6366e30 ("KVM: SVM: Add required changes to support intercepts under SEV-ES")
Fixes: 647daca25d ("KVM: SVM: Add support for booting APs in an SEV-ES guest")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: stable@vger.kernel.org
Message-Id: <a5d3ebb600a91170fc88599d5a575452b3e31036.1617979121.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-19 18:04:47 -04:00
Wanpeng Li
a1fa4cbd53 KVM: X86: Do not yield to self
If the target is self we do not need to yield, we can avoid malicious
guest to play this.

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1617941911-5338-3-git-send-email-wanpengli@tencent.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-19 18:04:46 -04:00
Wanpeng Li
4a7132efff KVM: X86: Count attempted/successful directed yield
To analyze some performance issues with lock contention and scheduling,
it is nice to know when directed yield are successful or failing.

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1617941911-5338-2-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-19 18:04:46 -04:00
Wanpeng Li
2b519b5797 x86/kvm: Don't bother __pv_cpu_mask when !CONFIG_SMP
Enable PV TLB shootdown when !CONFIG_SMP doesn't make sense. Let's
move it inside CONFIG_SMP. In addition, we can avoid define and
alloc __pv_cpu_mask when !CONFIG_SMP and get rid of 'alloc' variable
in kvm_alloc_cpumask.

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1617941911-5338-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-19 18:04:45 -04:00
Ben Gardon
4c6654bd16 KVM: x86/mmu: Tear down roots before kvm_mmu_zap_all_fast returns
To avoid saddling a vCPU thread with the work of tearing down an entire
paging structure, take a reference on each root before they become
obsolete, so that the thread initiating the fast invalidation can tear
down the paging structure and (most likely) release the last reference.
As a bonus, this teardown can happen under the MMU lock in read mode so
as not to block the progress of vCPU threads.

Signed-off-by: Ben Gardon <bgardon@google.com>
Message-Id: <20210401233736.638171-14-bgardon@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-19 18:04:45 -04:00
Ben Gardon
b7cccd397f KVM: x86/mmu: Fast invalidation for TDP MMU
Provide a real mechanism for fast invalidation by marking roots as
invalid so that their reference count will quickly fall to zero
and they will be torn down.

One negative side affect of this approach is that a vCPU thread will
likely drop the last reference to a root and be saddled with the work of
tearing down an entire paging structure. This issue will be resolved in
a later commit.

Signed-off-by: Ben Gardon <bgardon@google.com>
Message-Id: <20210401233736.638171-13-bgardon@google.com>
[Move the loop to tdp_mmu.c, otherwise compilation fails on 32-bit. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-19 18:04:35 -04:00
David S. Miller
56aa7b21a5 wireless-drivers-next patches for v5.13
Second set of patches for v5.13. A lot of iwlwifi and mt76 patches
 this time, and also smaller features and fixes all over.
 
 mt76
 
 * mt7915/mt7615 decapsulation offload support
 
 * threaded NAPI support
 
 * new device IDs
 
 * mt7921 device reset support
 
 * rx timestamp support
 
 iwlwifi
 
 * passive scan support for 6GHz
 
 * new hardware support
 
 wilc1000
 
 * CRC support for SPI bus
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJge+xHAAoJEG4XJFUm622b3+MH/RLorFA9iU+jdbq9nA1a9f/K
 roiB9+lJogZKdy0rphF3d+/ItOPe1CZ791HP0GRRA3SsE+jO3LAxzEWvh27k7NoB
 G3gdGSXtMkUY70Fab89QG8pyn07/R/403gr6H6sxMYg3Y/2gUHUbCqPN2N4DXw0/
 2MVtxvrV4WERxirChJV3uzord8BGce1H69S6+jVLQiuoJM1PZZVr6vw+OSw7qntL
 Zc2fMjBlBulU2R3EFiOkm08PwM8mJRB8JJiQ8vnB2nMeKpFOnCgm3Hozly7q/zCB
 xCWlZcKBPYhtCRLoipvWnTt0SopZXyqslJbBBqlFVUkjPdrLqBBcVv3j5K7OZd4=
 =DVDR
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-next-2021-04-18' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
wireless-drivers-next patches for v5.13

Second set of patches for v5.13. A lot of iwlwifi and mt76 patches
this time, and also smaller features and fixes all over.

mt76

* mt7915/mt7615 decapsulation offload support

* threaded NAPI support

* new device IDs

* mt7921 device reset support

* rx timestamp support

iwlwifi

* passive scan support for 6GHz

* new hardware support

wilc1000

* CRC support for SPI bus
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 14:35:05 -07:00
Stefano Garzarella
e16edc99d6 vsock/vmci: log once the failed queue pair allocation
VMCI feature is not supported in conjunction with the vSphere Fault
Tolerance (FT) feature.

VMware Tools can repeatedly try to create a vsock connection. If FT is
enabled the kernel logs is flooded with the following messages:

    qp_alloc_hypercall result = -20
    Could not attach to queue pair with -20

"qp_alloc_hypercall result = -20" was hidden by commit e8266c4c33
("VMCI: Stop log spew when qp allocation isn't possible"), but "Could
not attach to queue pair with -20" is still there flooding the log.

Since the error message can be useful in some cases, print it only once.

Fixes: d021c34405 ("VSOCK: Introduce VM Sockets")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 14:30:23 -07:00
J. Bruce Fields
aba2072f45 nfsd: grant read delegations to clients holding writes
It's OK to grant a read delegation to a client that holds a write,
as long as it's the only client holding the write.

We originally tried to do this in commit 94415b06eb ("nfsd4: a
client's own opens needn't prevent delegations"), which had to be
reverted in commit 6ee65a7730 ("Revert "nfsd4: a client's own
opens needn't prevent delegations"").

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-04-19 16:41:36 -04:00
J. Bruce Fields
ebd9d2c2f5 nfsd: reshuffle some code
No change in behavior, I'm just moving some code around to avoid forward
references in a following patch.

(To do someday: figure out how to split up nfs4state.c.  It's big and
disorganized.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-04-19 16:41:36 -04:00