Commit Graph

934121 Commits

Author SHA1 Message Date
Denis Kirjanov
2cef30d7bd xen: netif.h: add a new extra type for XDP
The patch adds a new extra type to be able to diffirentiate
between RX responses on xen-netfront side with the adjusted offset
required for XDP processing.

The offset value from a guest is passed via xenstore.

Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 15:25:14 -07:00
Alexei Starovoitov
91f77560e4 Merge branch 'test_progs-improvements'
Jesper Dangaard Brouer says:

====================
V3: Reorder patches to cause less code churn.

The BPF selftest 'test_progs' contains many tests, that cover all the
different areas of the kernel where BPF is used.  The CI system sees this
as one test, which is impractical for identifying what team/engineer is
responsible for debugging the problem.

This patchset add some options that makes it easier to create a shell
for-loop that invoke each (top-level) test avail in test_progs. Then each
test FAIL/PASS result can be presented the CI system to have a separate
bullet. (For Red Hat use-case in Beaker https://beaker-project.org/)

Created a public script[1] that uses these features in an advanced way.
Demonstrating howto reduce the number of (top-level) tests by grouping tests
together via using the existing test pattern selection feature, and then
using the new --list feature combined with exclude (-b) to get a list of
remaining test names that was not part of the groups.

[1] https://github.com/netoptimizer/prototype-kernel/blob/master/scripts/bpf_selftests_grouping.sh
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-07-01 15:22:13 -07:00
Jesper Dangaard Brouer
c1f1f3656e selftests/bpf: Test_progs option for listing test names
The program test_progs have some very useful ability to specify a list of
test name substrings for selecting which tests to run.

This patch add the ability to list the selected test names without running
them. This is practical for seeing which tests gets selected with given
select arguments (which can also contain a exclude list via --name-blacklist).

This output can also be used by shell-scripts in a for-loop:

 for N in $(./test_progs --list -t xdp); do \
   ./test_progs -t $N 2>&1 > result_test_${N}.log & \
 done ; wait

This features can also be used for looking up a test number and returning
a testname. If the selection was empty then a shell EXIT_FAILURE is
returned.  This is useful for scripting. e.g. like this:

 n=1;
 while [ $(./test_progs --list -n $n) ] ; do \
   ./test_progs -n $n ; n=$(( n+1 )); \
 done

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159363985751.930467.9610992940793316982.stgit@firesoul
2020-07-01 15:22:13 -07:00
Jesper Dangaard Brouer
643e7233aa selftests/bpf: Test_progs option for getting number of tests
It can be practial to get the number of tests that test_progs contain.
This could for example be used to create a shell for-loop construct that
runs the individual tests.

Like:
 for N in $(seq 1 $(./test_progs -c)); do
   ./test_progs -n $N 2>&1 > result_test_${N}.log &
 done ; wait

V2: Add the ability to return the count for the selected tests. This is
useful for getting a count e.g. after excluding some tests with option -b.
The current beakers test script like to report the max test count upfront.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159363985244.930467.12617117873058936829.stgit@firesoul
2020-07-01 15:22:13 -07:00
Jesper Dangaard Brouer
6c92bd5cd4 selftests/bpf: Test_progs indicate to shell on non-actions
When a user selects a non-existing test the summary is printed with
indication 0 for all info types, and shell "success" (EXIT_SUCCESS) is
indicated. This can be understood by a human end-user, but for shell
scripting is it useful to indicate a shell failure (EXIT_FAILURE).

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159363984736.930467.17956007131403952343.stgit@firesoul
2020-07-01 15:22:13 -07:00
Andrii Nakryiko
17bbf925c6 tools/bpftool: Turn off -Wnested-externs warning
Turn off -Wnested-externs to avoid annoying warnings in BUILD_BUG_ON macro when
compiling bpftool:

In file included from /data/users/andriin/linux/tools/include/linux/build_bug.h:5,
                 from /data/users/andriin/linux/tools/include/linux/kernel.h:8,
                 from /data/users/andriin/linux/kernel/bpf/disasm.h:10,
                 from /data/users/andriin/linux/kernel/bpf/disasm.c:8:
/data/users/andriin/linux/kernel/bpf/disasm.c: In function ‘__func_get_name’:
/data/users/andriin/linux/tools/include/linux/compiler.h:37:38: warning: nested extern declaration of ‘__compiletime_assert_0’ [-Wnested-externs]
  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
                                      ^~~~~~~~~~~~~~~~~~~~~
/data/users/andriin/linux/tools/include/linux/compiler.h:16:15: note: in definition of macro ‘__compiletime_assert’
   extern void prefix ## suffix(void) __compiletime_error(msg); \
               ^~~~~~
/data/users/andriin/linux/tools/include/linux/compiler.h:37:2: note: in expansion of macro ‘_compiletime_assert’
  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
  ^~~~~~~~~~~~~~~~~~~
/data/users/andriin/linux/tools/include/linux/build_bug.h:39:37: note: in expansion of macro ‘compiletime_assert’
 #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
                                     ^~~~~~~~~~~~~~~~~~
/data/users/andriin/linux/tools/include/linux/build_bug.h:50:2: note: in expansion of macro ‘BUILD_BUG_ON_MSG’
  BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
  ^~~~~~~~~~~~~~~~
/data/users/andriin/linux/kernel/bpf/disasm.c:20:2: note: in expansion of macro ‘BUILD_BUG_ON’
  BUILD_BUG_ON(ARRAY_SIZE(func_id_str) != __BPF_FUNC_MAX_ID);
  ^~~~~~~~~~~~

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200701212816.2072340-1-andriin@fb.com
2020-07-02 00:18:28 +02:00
Hao Luo
8d821b5db7 selftests/bpf: Switch test_vmlinux to use hrtimer_range_start_ns.
The test_vmlinux test uses hrtimer_nanosleep as hook to test tracing
programs. But in a kernel built by clang, which performs more aggresive
inlining, that function gets inlined into its caller SyS_nanosleep.
Therefore, even though fentry and kprobe do hook on the function,
they aren't triggered by the call to nanosleep in the test.

A possible fix is switching to use a function that is less likely to
be inlined, such as hrtimer_range_start_ns. The EXPORT_SYMBOL functions
shouldn't be inlined based on the description of [1], therefore safe
to use for this test. Also the arguments of this function include the
duration of sleep, therefore suitable for test verification.

[1] af3b56289b time: don't inline EXPORT_SYMBOL functions

Tested:
 In a clang build kernel, before this change, the test fails:

 test_vmlinux:PASS:skel_open 0 nsec
 test_vmlinux:PASS:skel_attach 0 nsec
 test_vmlinux:PASS:tp 0 nsec
 test_vmlinux:PASS:raw_tp 0 nsec
 test_vmlinux:PASS:tp_btf 0 nsec
 test_vmlinux:FAIL:kprobe not called
 test_vmlinux:FAIL:fentry not called

 After switching to hrtimer_range_start_ns, the test passes:

 test_vmlinux:PASS:skel_open 0 nsec
 test_vmlinux:PASS:skel_attach 0 nsec
 test_vmlinux:PASS:tp 0 nsec
 test_vmlinux:PASS:raw_tp 0 nsec
 test_vmlinux:PASS:tp_btf 0 nsec
 test_vmlinux:PASS:kprobe 0 nsec
 test_vmlinux:PASS:fentry 0 nsec

Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200701175315.1161242-1-haoluo@google.com
2020-07-01 15:10:27 -07:00
Radoslaw Tyl
a296d665ea ixgbe: Add ethtool support to enable 2.5 and 5.0 Gbps support
Added full support for new version Ethtool API. New API allow use
2500Gbase-T and 5000base-T supported and advertised link speed modes.

Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 15:04:54 -07:00
Jeff Kirsher
bb0967c04e ixgbe: Cleanup unneeded delay in ethtool test
There is a 4 seconds delay in ixgbe_diag_test() that is holding up other
ioctls such as SIOCGIFCONF that Oracle database applications use.
One of Oracle's product runs "ethtool -t ethX online" periodically for
system monitoring and that is impacting database applications that use
SIOCGIFCONF at that same time.

This 4 second delay was needed in out early 1GbE parts to give the PHY
time to recover from a reset.  This code was carried forward to the 10 GbE
driver even it was not needed for the supported PHYs in the ixgbe driver.

CC: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
CC: Jack Vogel <jack.vogel@oracle.com>
Reported-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 14:47:58 -07:00
Tony Nguyen
9358076642 iavf: Fix updating statistics
Commit bac8486116 ("iavf: Refactor the watchdog state machine") inverted
the logic for when to update statistics. Statistics should be updated when
no other commands are pending, instead they were only requested when a
command was processed. iavf_request_stats() would see a pending request
and not request statistics to be updated. This caused statistics to never
be updated; fix the logic.

Fixes: bac8486116 ("iavf: Refactor the watchdog state machine")
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
2020-07-01 14:45:59 -07:00
Ciara Loftus
44ea803e2f i40e: introduce new dump desc XDP command
Interfaces already exist for dumping Rx and Tx descriptor information.
Introduce another for doing the same for XDP descriptors.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 14:44:17 -07:00
Ciara Loftus
890c402c7b i40e: add XDP ring statistics to dump VSI debug output
Prior to this, only the Rx and Tx ring statistics were dumped. The XDP
ring statistics are now dumped as well.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 14:40:07 -07:00
Ciara Loftus
e2968260e1 i40e: add XDP ring statistics to VSI stats
Prior to this, only Rx and Tx ring statistics were accounted for.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 14:35:54 -07:00
Magnus Karlsson
1fd972ebe5 i40e: move check of full Tx ring to outside of send loop
Move the check if the HW Tx ring is full to outside the send
loop. Currently it is checked for every single descriptor that we
send. Instead, tell the send loop to only process a maximum number of
packets equal to the number of available slots in the Tx ring. This
way, we can remove the check inside the send loop to and gain some
performance.

Suggested-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 14:31:41 -07:00
Magnus Karlsson
4b5539c01d i40e: eliminate division in napi_poll data path
Eliminate a division in the napi_poll data path. This division is
executed even though it is only needed in the rare case when there are
not enough interrupt lines so they have to be shared between queue
pairs. Instead, just test for this case and only execute the division
if needed. The code has been lifted from the ice driver.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 14:27:11 -07:00
Magnus Karlsson
5574ff7b7b i40e: optimize AF_XDP Tx completion path
Improve the performance of the AF_XDP zero-copy Tx completion
path. When there are no XDP buffers being sent using XDP_TX or
XDP_REDIRECT, we do not have go through the SW ring to clean up any
entries since the AF_XDP path does not use these. In these cases, just
fast forward the next-to-use counter and skip going through the SW
ring. The limit on the maximum number of entries to complete is also
removed since the algorithm is now O(1). To simplify the code path, the
maximum number of entries to complete for the XDP path is therefore
also increased from 256 to 512 (the default number of Tx HW
descriptors). This should be fine since the completion in the XDP path
is faster than in the SKB path that has 256 as the maximum number.

This patch provides around 4% throughput improvement for the l2fwd
application in xdpsock on my machine.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 14:24:14 -07:00
Wei Yongjun
753f3884f2 iavf: fix error return code in iavf_init_get_resources()
Fix to return negative error code -ENOMEM from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: b66c7bc1cd ("iavf: Refactor init state machine")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 14:18:54 -07:00
Arkadiusz Kubalewski
d5ec9e2ce4 i40e: Add support for a new feature Total Port Shutdown
After OS requests to down a link on a physical network port, the
traffic is no longer being processed but the physical link with
a link partner is still established.

Currently there is a feature (Link down on close) which allows
to physically bring the link down (after OS request).

With this patch new feature with similar capability is introduced:
TOTAL_PORT_SHUTDOWN
Allows to physically disable the link on the NIC's port.
If enabled, (after link down request from the OS)
no link, traffic or led activity is possible on that port.

If I40E_FLAG_TOTAL_PORT_SHUTDOWN is enabled, the
I40E_FLAG_LINK_DOWN_ON_CLOSE_ENABLED must be explicitly forced to
true and cannot be disabled at that time.
The functionalities are exclusive in terms of configuration, but
they also have similar behavior (allowing to disable physical link
of the port), with following differences:
- LINK_DOWN_ON_CLOSE_ENABLED is configurable at host OS run-time
  and is supported by whole family of 7xx Intel Ethernet Controllers
- TOTAL_PORT_SHUTDOWN may be enabled only before OS loads (in BIOS)
  only if motherboard's BIOS and NIC's FW has support of it
- when LINK_DOWN_ON_CLOSE_ENABLED is used, the link is being brought
  down by sending phy_type=0 to NIC's FW
- when TOTAL_PORT_SHUTDOWN is used, phy_type is not altered, instead
  the link is being brought down by clearing bit
  (I40E_AQ_PHY_ENABLE_LINK) in abilities field of
  i40e_aq_set_phy_config structure

Introduced changes:
- new private flag I40E_FLAG_TOTAL_PORT_SHUTDOWN for handling the
  feature
- probe of NVM if the feature was enabled at driver's port
  initialization
- special handling on link-down procedure to let FW physically
  shutdown the port if the feature was enabled

Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 14:17:16 -07:00
Jeff Kirsher
5463fce643 ethernet/intel: Convert fallthrough code comments
Convert all the remaining 'fall through" code comments to the newer
'fallthrough;' keyword.

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 13:47:43 -07:00
David S. Miller
6d79dc6765 Merge branch 'net-ethernet-use-generic-power-management'
Vaibhav Gupta says:

====================
net: ethernet: use generic power management

Linux Kernel Mentee: Remove Legacy Power Management.

The purpose of this patch series is to remove legacy power management callbacks
from net ethernet drivers.

The callbacks performing suspend() and resume() operations are still calling
pci_save_state(), pci_set_power_state(), etc. and handling the power management
themselves, which is not recommended.

The conversion requires the removal of the those function calls and change the
callback definition accordingly and make use of dev_pm_ops structure.

All patches are compile-tested only.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:34 -07:00
Vaibhav Gupta
40c1b1ee55 natsemi: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and takes care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

Thus, there is no need to call the PCI helper functions like
pci_enable_device, which is not recommended. Hence, removed.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:33 -07:00
Vaibhav Gupta
4c2ad1263b vxge: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and takes care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

Use "struct dev_pm_ops" variable to bind the callbacks.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:33 -07:00
Vaibhav Gupta
64120615d1 ksz884x: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and takes care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

Thus, there is no need to call the PCI helper functions like
pci_enable_wake(), pci_save/restore_sate() and
pci_set_power_state().

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:33 -07:00
Vaibhav Gupta
0e3e206a3e mlx4: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and takes care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

Use "struct dev_pm_ops" variable to bind the callbacks.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:33 -07:00
Vaibhav Gupta
e9a7f8c586 benet: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and takes care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

Thus, there is no need to call the PCI helper functions like
pci_enable/disable_device(), pci_save/restore_sate() and
pci_set_power_state().

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:33 -07:00
Vaibhav Gupta
78cad4cec6 sundance: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and takes care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

Thus, there is no need to call the PCI helper functions like
pci_enable/disable_device(), pci_save/restore_sate() and
pci_set_power_state().

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:33 -07:00
Vaibhav Gupta
1c2e4839ec liquidio: use generic power management
Drivers should not use legacy power management as they have to manage power
states and related operations, for the device, themselves. This driver was
handling them with the help of PCI helper functions.

With generic PM, all essentials will be handled by the PCI core. Driver
needs to do only device-specific operations.

The driver defined empty-body .suspend() and .resume() callbacks earlier.
They can now be define NULL and bind with "struct dev_pm_ops" variable.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:33 -07:00
Vaibhav Gupta
817a89ae10 ena_netdev: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and takes care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:33 -07:00
Vaibhav Gupta
a7c48c7211 starfire: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and takes care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

Thus, there is no need to call the PCI helper functions like
pci_save/restore_sate() and pci_set_power_state().

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:33 -07:00
Vaibhav Gupta
33b7a252c8 ne2k-pci: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and takes care of register states.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

Thus, there is no need to call the PCI helper functions like
pci_enable/disable_device(), pci_save/restore_sate() and
pci_set_power_state().

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:33 -07:00
Vaibhav Gupta
7b46681cf4 typhoon: use generic power management
With legacy PM, drivers themselves were responsible for managing the
device's power states and takes care of register states. And they use PCI
helper functions to do it.

After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.

In this driver:
typhoon_resume() calls typhoon_wakeup() which then calls PCI helper
functions pci_set_power_state() and pci_restore_state(). The only other
function, using typhoon_wakeup() is typhoon_open().

Thus remove the pci_*() calls from tyhpoon_wakeup() and place them in
typhoon_open(), maintaining the order, to retain the normal behavior of
the function

Now, typhoon_suspend() calls typhoon_sleep() which then calls PCI helper
functions pci_enable_wake(), pci_disable_device() and
pci_set_power_state(). Other functions:
 - typhoon_open()
 - typhoon_close()
 - typhoon_init_one()
are also invoking typhoon_sleep(). Thus, in this case, cannot simply
move PCI helper functions call.

Hence, define a new function typhoon_sleep_early() which will do all the
operations, which typhoon_sleep() was doing before calling PCI helper
functions. Now typhoon_sleep() will call typhoon_sleep_early() to do
those tasks, hence, the behavior for _open(), _close and _init_one() remain
unchanged. And typhon_suspend() only requires typhoon_sleep_early().

Compile-tested only.

Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:58:33 -07:00
Hulk Robot
4f195d2803 qed: Make symbol 'qed_hw_err_type_descr' static
Fix sparse build warning:

drivers/net/ethernet/qlogic/qed/qed_main.c:2480:6: warning:
 symbol 'qed_hw_err_type_descr' was not declared. Should it be static?

Signed-off-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:46:11 -07:00
Colin Ian King
2a6d6c31f1 net/packet: remove redundant initialization of variable err
The variable err is being initialized with a value that is never read
and it is being updated later with a new value.  The initialization is
redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01 12:44:52 -07:00
Randy Dunlap
6b207d66aa bpf: Fix net/core/filter build errors when INET is not enabled
Fix build errors when CONFIG_INET is not set/enabled.

(.text+0x2b1b): undefined reference to `tcp_prot'
(.text+0x2b3b): undefined reference to `tcp_prot'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/b1a858ec-7e04-56bc-248a-62cb9bbee726@infradead.org
2020-07-01 08:36:58 -07:00
Alexei Starovoitov
64f0013c07 Merge branch 'bpf_get_task_stack'
Song Liu says:

====================
This set introduces a new helper bpf_get_task_stack(). The primary use case
is to dump all /proc/*/stack to seq_file via bpf_iter__task.

A few different approaches have been explored and compared:

  1. A simple wrapper around stack_trace_save_tsk(), as v1 [1].

     This approach introduces new syntax, which is different to existing
     helper bpf_get_stack(). Therefore, this is not ideal.

  2. Extend get_perf_callchain() to support "task" as argument.

     This approach reuses most of bpf_get_stack(). However, extending
     get_perf_callchain() requires non-trivial changes to architecture
     specific code. Which is error prone.

  3. Current (v2) approach, leverages most of existing bpf_get_stack(), and
     uses stack_trace_save_tsk() to handle architecture specific logic.

[1] https://lore.kernel.org/netdev/20200623070802.2310018-1-songliubraving@fb.com/

Changes v4 => v5:
1. Rebase and work around git-am issue. (Alexei)
2. Update commit log for 4/4. (Yonghong)

Changes v3 => v4:
1. Simplify the selftests with bpf_iter.h. (Yonghong)
2. Add example output to commit log of 4/4. (Yonghong)

Changes v2 => v3:
1. Rebase on top of bpf-next. (Yonghong)
2. Sanitize get_callchain_entry(). (Peter)
3. Use has_callchain_buf for bpf_get_task_stack. (Andrii)
4. Other small clean up. (Yonghong, Andrii).

Changes v1 => v2:
1. Reuse most of bpf_get_stack() logic. (Andrii)
2. Fix unsigned long vs. u64 mismatch for 32-bit systems. (Yonghong)
3. Add %pB support in bpf_trace_printk(). (Daniel)
4. Fix buffer size to bytes.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-07-01 08:26:22 -07:00
Song Liu
c7568114bc selftests/bpf: Add bpf_iter test with bpf_get_task_stack()
The new test is similar to other bpf_iter tests. It dumps all
/proc/<pid>/stack to a seq_file. Here is some example output:

pid:     2873 num_entries:        3
[<0>] worker_thread+0xc6/0x380
[<0>] kthread+0x135/0x150
[<0>] ret_from_fork+0x22/0x30

pid:     2874 num_entries:        9
[<0>] __bpf_get_stack+0x15e/0x250
[<0>] bpf_prog_22a400774977bb30_dump_task_stack+0x4a/0xb3c
[<0>] bpf_iter_run_prog+0x81/0x170
[<0>] __task_seq_show+0x58/0x80
[<0>] bpf_seq_read+0x1c3/0x3b0
[<0>] vfs_read+0x9e/0x170
[<0>] ksys_read+0xa7/0xe0
[<0>] do_syscall_64+0x4c/0xa0
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Note: bpf_iter test as-is doesn't print the contents of the seq_file. To
see the example above, it is necessary to add printf() to do_dummy_read.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200630062846.664389-5-songliubraving@fb.com
2020-07-01 08:23:59 -07:00
Song Liu
2df6bb5493 bpf: Allow %pB in bpf_seq_printf() and bpf_trace_printk()
This makes it easy to dump stack trace in text.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200630062846.664389-4-songliubraving@fb.com
2020-07-01 08:23:59 -07:00
Song Liu
fa28dcb82a bpf: Introduce helper bpf_get_task_stack()
Introduce helper bpf_get_task_stack(), which dumps stack trace of given
task. This is different to bpf_get_stack(), which gets stack track of
current task. One potential use case of bpf_get_task_stack() is to call
it from bpf_iter__task and dump all /proc/<pid>/stack to a seq_file.

bpf_get_task_stack() uses stack_trace_save_tsk() instead of
get_perf_callchain() for kernel stack. The benefit of this choice is that
stack_trace_save_tsk() doesn't require changes in arch/. The downside of
using stack_trace_save_tsk() is that stack_trace_save_tsk() dumps the
stack trace to unsigned long array. For 32-bit systems, we need to
translate it to u64 array.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200630062846.664389-3-songliubraving@fb.com
2020-07-01 08:23:19 -07:00
Song Liu
d141b8bc57 perf: Expose get/put_callchain_entry()
Sanitize and expose get/put_callchain_entry(). This would be used by bpf
stack map.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200630062846.664389-2-songliubraving@fb.com
2020-07-01 08:22:08 -07:00
Alexei Starovoitov
bba1dc0b55 bpf: Remove redundant synchronize_rcu.
bpf_free_used_maps() or close(map_fd) will trigger map_free callback.
bpf_free_used_maps() is called after bpf prog is no longer executing:
bpf_prog_put->call_rcu->bpf_prog_free->bpf_free_used_maps.
Hence there is no need to call synchronize_rcu() to protect map elements.

Note that hash_of_maps and array_of_maps update/delete inner maps via
sys_bpf() that calls maybe_wait_bpf_programs() and synchronize_rcu().

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/bpf/20200630043343.53195-2-alexei.starovoitov@gmail.com
2020-07-01 08:07:13 -07:00
Julian Anastasov
f9200a52ee ipvs: avoid expiring many connections from timer
Add new functions ip_vs_conn_del() and ip_vs_conn_del_put()
to release many IPVS connections in process context.
They are suitable for connections found in table
when we do not want to overload the timers.

Currently, the change is useful for the dropentry delayed
work but it will be used also in following patch
when flushing connections to failed destinations.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-07-01 10:18:20 +02:00
Andrii Nakryiko
8c18311067 selftests/bpf: Add byte swapping selftest
Add simple selftest validating byte swap built-ins and compile-time macros.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200630152125.3631920-3-andriin@fb.com
2020-07-01 09:06:12 +02:00
Andrii Nakryiko
30ad688094 libbpf: Make bpf_endian co-exist with vmlinux.h
Make bpf_endian.h compatible with vmlinux.h. It is a frequent request from
users wanting to use bpf_endian.h in their BPF applications using CO-RE and
vmlinux.h.

To achieve that, re-implement byte swap macros and drop all the header
includes. This way it can be used both with linux header includes, as well as
with a vmlinux.h.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200630152125.3631920-2-andriin@fb.com
2020-07-01 09:06:12 +02:00
David S. Miller
2b04a66156 Merge branch 'cxgb4-add-mirror-action-support-for-TC-MATCHALL'
Rahul Lakkireddy says:

====================
cxgb4: add mirror action support for TC-MATCHALL

This series of patches add support to mirror all ingress traffic
for TC-MATCHALL ingress offload.

Patch 1 adds support to dynamically create a mirror Virtual Interface
(VI) that accepts all mirror ingress traffic when mirror action is
set in TC-MATCHALL offload.

Patch 2 adds support to allocate mirror Rxqs and setup RSS for the
mirror VI.

Patch 3 adds support to replicate all the main VI configuration to
mirror VI. This includes replicating MTU, promiscuous mode,
all-multicast mode, and enabled netdev Rx feature offloads.

v3:
- Replace mirror VI refcount_t with normal u32 variable in all patches.
- Add back calling cxgb4_port_mirror_start() in cxgb_open(), which
  was there in v1, but got missed in v2 during refactoring, in patch
  3.

v2:
- Add mutex to protect all mirror VI data, instead of just
  mirror Rxqs, in patch 1 and 2.
- Remove the un-needed mirror Rxq mutex in patch 2.
- Simplify the replication code by refactoring t4_set_rxmode()
  to handle mirror VI, instead of duplicating the t4_set_rxmode()
  calls in multiple places in patch 3.
====================

Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 18:35:00 -07:00
Rahul Lakkireddy
696c278fdf cxgb4: add main VI to mirror VI config replication
When mirror VI is enabled, replicate various VI config params
enabled on main VI to mirror VI. These include replicating MTU,
promiscuous mode, all-multicast mode, and enabled netdev Rx
feature offloads.

v3:
- Replace mirror VI refcount_t with normal u32 variable.
- Add back calling cxgb4_port_mirror_start() in cxgb_open(), which
  was there in v1, but got missed in v2 during refactoring.

v2:
- Simplify the replication code by refactoring t4_set_rxmode()
  to handle mirror VI, instead of duplicating the t4_set_rxmode()
  calls in multiple places.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 18:34:34 -07:00
Rahul Lakkireddy
2b465ed00f cxgb4: add support for mirror Rxqs
When mirror VI is enabled, allocate the mirror Rxqs and setup the
mirror VI RSS table. The mirror Rxqs are allocated/freed when
the mirror VI is created/destroyed or when underlying port is
brought up/down, respectively.

v3:
- Replace mirror VI refcount_t with normal u32 variable.

v2:
- Use mutex to protect all mirror VI data, instead of just
  mirror Rxqs.
- Remove the un-needed mirror Rxq mutex.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 18:34:34 -07:00
Rahul Lakkireddy
fd2261d8ed cxgb4: add mirror action to TC-MATCHALL offload
Add mirror Virtual Interface (VI) support to receive all ingress
mirror traffic from the underlying device. The mirror VI is
created dynamically, if the TC-MATCHALL rule has a corresponding
mirror action. Also request MSI-X vectors needed for the mirror VI
Rxqs. If no vectors are available, then disable mirror VI support.

v3:
- Replace mirror VI refcount_t with normal u32 variable.

v2:
- Add mutex to protect all mirror VI data, instead of just
  mirror Rxqs.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 18:34:34 -07:00
Nathan Chancellor
75603a3112 pcnet32: Mark PM functions as __maybe_unused
In certain configurations without power management support, the
following warnings happen:

../drivers/net/ethernet/amd/pcnet32.c:2928:12: warning:
'pcnet32_pm_resume' defined but not used [-Wunused-function]
 2928 | static int pcnet32_pm_resume(struct device *device_d)
      |            ^~~~~~~~~~~~~~~~~
../drivers/net/ethernet/amd/pcnet32.c:2916:12: warning:
'pcnet32_pm_suspend' defined but not used [-Wunused-function]
 2916 | static int pcnet32_pm_suspend(struct device *device_d)
      |            ^~~~~~~~~~~~~~~~~~

Mark these functions as __maybe_unused to make it clear to the compiler
that this is going to happen based on the configuration, which is the
standard for these types of functions.

Fixes: a86688fbef ("pcnet32: Convert to generic power management")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 18:17:54 -07:00
Nathan Chancellor
0adcd2981d amd8111e: Mark PM functions as __maybe_unused
In certain configurations without power management support, the
following warnings happen:

../drivers/net/ethernet/amd/amd8111e.c:1623:12: warning:
'amd8111e_resume' defined but not used [-Wunused-function]
 1623 | static int amd8111e_resume(struct device *dev_d)
      |            ^~~~~~~~~~~~~~~
../drivers/net/ethernet/amd/amd8111e.c:1584:12: warning:
'amd8111e_suspend' defined but not used [-Wunused-function]
 1584 | static int amd8111e_suspend(struct device *dev_d)
      |            ^~~~~~~~~~~~~~~~

Mark these functions as __maybe_unused to make it clear to the compiler
that this is going to happen based on the configuration, which is the
standard for these types of functions.

Fixes: 2caf751fe0 ("amd8111e: Convert to generic power management")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 18:17:53 -07:00
David S. Miller
2429ec265d Merge branch 'net-improve-devres-helpers'
Bartosz Golaszewski says:

====================
net: improve devres helpers

So it seems like there's no support for relaxing certain networking devres
helpers to not require previously allocated structures to also be managed.
However the way mdio devres variants are implemented is still wrong and I
modified my series to address it while keeping the functions strict.

First two patches modify the ixgbe driver to get rid of the last user of
devm_mdiobus_free().

Patches 3, 4, 5 and 6 are mostly cosmetic.

Patch 7 fixes the way devm_mdiobus_register() is implemented.

Patches 8 & 9 provide a managed variant of of_mdiobus_register() and
last patch uses it in mtk-star-emac driver.

v1 -> v2:
- drop the patch relaxing devm_register_netdev()
- require struct mii_bus to be managed in devm_mdiobus_register() and
  devm_of_mdiobus_register() but don't store that information in the
  structure itself: use devres_find() instead
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 15:57:34 -07:00