Patch series "Calculate pcp->high based on zone sizes and active CPUs", v2.
The per-cpu page allocator (PCP) is meant to reduce contention on the zone
lock but the sizing of batch and high is archaic and neither takes the
zone size into account or the number of CPUs local to a zone. With larger
zones and more CPUs per node, the contention is getting worse.
Furthermore, the fact that vm.percpu_pagelist_fraction adjusts both batch
and high values means that the sysctl can reduce zone lock contention but
also increase allocation latencies.
This series disassociates pcp->high from pcp->batch and then scales
pcp->high based on the size of the local zone with limited impact to
reclaim and accounting for active CPUs but leaves pcp->batch static. It
also adapts the number of pages that can be on the pcp list based on
recent freeing patterns.
The motivation is partially to adjust to larger memory sizes but is also
driven by the fact that large batches of page freeing via release_pages()
often shows zone contention as a major part of the problem. Another is a
bug report based on an older kernel where a multi-terabyte process can
takes several minutes to exit. A workaround was to use
vm.percpu_pagelist_fraction to increase the pcp->high value but testing
indicated that a production workload could not use the same values because
of an increase in allocation latencies. Unfortunately, I cannot reproduce
this test case myself as the multi-terabyte machines are in active use but
it should alleviate the problem.
The series aims to address both and partially acts as a pre-requisite.
pcp only works with order-0 which is useless for SLUB (when using high
orders) and THP (unconditionally). To store high-order pages on PCP, the
pcp->high values need to be increased first.
This patch (of 6):
The vm.percpu_pagelist_fraction is used to increase the batch and high
limits for the per-cpu page allocator (PCP). The intent behind the sysctl
is to reduce zone lock acquisition when allocating/freeing pages but it
has a problem. While it can decrease contention, it can also increase
latency on the allocation side due to unreasonably large batch sizes.
This leads to games where an administrator adjusts
percpu_pagelist_fraction on the fly to work around contention and
allocation latency problems.
This series aims to alleviate the problems with zone lock contention while
avoiding the allocation-side latency problems. For the purposes of
review, it's easier to remove this sysctl now and reintroduce a similar
sysctl later in the series that deals only with pcp->high.
Link: https://lkml.kernel.org/r/20210525080119.5455-1-mgorman@techsingularity.net
Link: https://lkml.kernel.org/r/20210525080119.5455-2-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pm-cpufreq:
cpufreq: Make cpufreq_online() call driver->offline() on errors
cpufreq: loongson2: Remove unused linux/sched.h headers
cpufreq: sh: Remove unused linux/sched.h headers
cpufreq: stats: Clean up local variable in cpufreq_stats_create_table()
cpufreq: intel_pstate: hybrid: Fix build with CONFIG_ACPI unset
cpufreq: sc520_freq: add 'fallthrough' to one case
cpufreq: intel_pstate: Add Cometlake support in no-HWP mode
cpufreq: intel_pstate: Add Icelake servers support in no-HWP mode
cpufreq: intel_pstate: hybrid: CPU-specific scaling factor
cpufreq: intel_pstate: hybrid: Avoid exposing two global attributes
* pm-cpuidle:
cpuidle: teo: remove unneeded semicolon in teo_select()
cpuidle: teo: Use kerneldoc documentation in admin-guide
cpuidle: teo: Rework most recent idle duration values treatment
cpuidle: teo: Change the main idle state selection logic
cpuidle: teo: Cosmetic modification of teo_select()
cpuidle: teo: Cosmetic modifications of teo_update()
intel_idle: Adjust the SKX C6 parameters if PC6 is disabled
Pull pstore updates from Kees Cook:
"Use normal block device I/O path for pstore/blk. (Christoph Hellwig,
Kees Cook, Pu Lehui)"
* tag 'pstore-v5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
pstore/blk: Include zone in pstore_device_info
pstore/blk: Fix kerndoc and redundancy on blkdev param
pstore/blk: Use the normal block device I/O path
pstore/blk: Move verify_size() macro out of function
pstore/blk: Improve failure reporting
Pull documentation updates from Jonathan Corbet:
"This was a reasonably active cycle for documentation; this includes:
- Some kernel-doc cleanups. That script is still regex onslaught from
hell, but it has gotten a little better.
- Improvements to the checkpatch docs, which are also used by the
tool itself.
- A major update to the pathname lookup documentation.
- Elimination of :doc: markup, since our automarkup magic can create
references from filenames without all the extra noise.
- The flurry of Chinese translation activity continues.
Plus, of course, the usual collection of updates, typo fixes, and
warning fixes"
* tag 'docs-5.14' of git://git.lwn.net/linux: (115 commits)
docs: path-lookup: use bare function() rather than literals
docs: path-lookup: update symlink description
docs: path-lookup: update get_link() ->follow_link description
docs: path-lookup: update WALK_GET, WALK_PUT desc
docs: path-lookup: no get_link()
docs: path-lookup: update i_op->put_link and cookie description
docs: path-lookup: i_op->follow_link replaced with i_op->get_link
docs: path-lookup: Add macro name to symlink limit description
docs: path-lookup: remove filename_mountpoint
docs: path-lookup: update do_last() part
docs: path-lookup: update path_mountpoint() part
docs: path-lookup: update path_to_nameidata() part
docs: path-lookup: update follow_managed() part
docs: Makefile: Use CONFIG_SHELL not SHELL
docs: Take a little noise out of the build process
docs: x86: avoid using ReST :doc:`foo` markup
docs: virt: kvm: s390-pv-boot.rst: avoid using ReST :doc:`foo` markup
docs: userspace-api: landlock.rst: avoid using ReST :doc:`foo` markup
docs: trace: ftrace.rst: avoid using ReST :doc:`foo` markup
docs: trace: coresight: coresight.rst: avoid using ReST :doc:`foo` markup
...
Pull media updates from Mauro Carvalho Chehab:
- V4L2 core control API was split into separate files
- New RC maps: tango and tc-90405
- Hantro driver got support for G2/HEVC decoder
- av7710 is moving to staging, together with some legacy APIs
- several cleanups related to compat_ioctl32 code
- Move the MPEG-2 stateless control type out of staging
- Address several issues with RPM get logic on media drivers
- Lots of cleanups, bug fixes and improvements.
* tag 'media/v5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (394 commits)
media: s5p-mfc: Fix display delay control creation
media: mtk-vpu: on suspend, read/write regs only if vpu is running
media: video-mux: Skip dangling endpoints
media: Fix Media Controller API config checks
media: i2c: rdacm20: Re-work ov10635 reset
media: i2c: rdacm20: Check return values
media: i2c: rdacm20: Report camera module name
media: i2c: rdacm20: Enable noise immunity
media: i2c: rdacm20: Embed 'serializer' field
media: i2c: rdacm21: Power up OV10640 before OV490
media: i2c: rdacm21: Fix OV10640 powerup
media: i2c: rdacm21: Add delay after OV490 reset
media: i2c: max9271: Introduce wake_up() function
media: i2c: max9271: Check max9271_write() return
media: i2c: max9286: Rework comments in .bound()
media: i2c: max9286: Define high channel amplitude
media: i2c: max9286: Cache channel amplitude
media: i2c: max9286: Rename reverse_channel_mv
media: i2c: max9286: Adjust parameters indent
media: hantro: add support for Rockchip RK3036
...
Commit 95b88f4d71 ("dm writecache: pause
writeback if cache full and origin being written directly") introduced a
code that pauses cache flushing if we are issuing writes directly to the
origin.
Improve that initial commit by making the timeout code configurable
(via the option "pause_writeback"). Also change the default from 1s to
3s because it performed better.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Pull x86 splitlock updates from Ingo Molnar:
- Add the "ratelimit:N" parameter to the split_lock_detect= boot
option, to rate-limit the generation of bus-lock exceptions.
This is both easier on system resources and kinder to offending
applications than the current policy of outright killing them.
- Document the split-lock detection feature and its parameters.
* tag 'x86-splitlock-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Documentation/x86: Add ratelimit in buslock.rst
Documentation/admin-guide: Add bus lock ratelimit
x86/bus_lock: Set rate limit for bus lock
Documentation/x86: Add buslock.rst
Pull x86 cleanups from Ingo Molnar:
"Misc cleanups & removal of obsolete code"
* tag 'x86-cleanups-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sgx: Correct kernel-doc's arg name in sgx_encl_release()
doc: Remove references to IBM Calgary
x86/setup: Document that Windows reserves the first MiB
x86/crash: Remove crash_reserve_low_1M()
x86/setup: Remove CONFIG_X86_RESERVE_LOW and reservelow= options
x86/alternative: Align insn bytes vertically
x86: Fix leftover comment typos
x86/asm: Simplify __smp_mb() definition
x86/alternatives: Make the x86nops[] symbol static
Add a "metadata_only" parameter that when present: only metadata is
promoted to the cache. This option improves performance for heavier
REQ_META workloads (e.g. device-mapper-test-suite's "git clone and
checkout" benchmark improves from 341s to 312s).
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
When the clocksource watchdog marks a clock as unstable, this might
be due to that clock being unstable or it might be due to delays that
happen to occur between the reads of the two clocks. It would be good
to have a way of testing the clocksource watchdog's ability to
distinguish between these two causes of clock skew and instability.
Therefore, provide a new clocksource-wdtest module selected by a new
TEST_CLOCKSOURCE_WATCHDOG Kconfig option. This module has a single module
parameter named "holdoff" that provides the number of seconds of delay
before testing should start, which defaults to zero when built as a module
and to 10 seconds when built directly into the kernel. Very large systems
that boot slowly may need to increase the value of this module parameter.
This module uses hand-crafted clocksource structures to do its testing,
thus avoiding messing up timing for the rest of the kernel and for user
applications. This module first verifies that the ->uncertainty_margin
field of the clocksource structures are set sanely. It then tests the
delay-detection capability of the clocksource watchdog, increasing the
number of consecutive delays injected, first provoking console messages
complaining about the delays and finally forcing a clock-skew event.
Unexpected test results cause at least one WARN_ON_ONCE() console splat.
If there are no splats, the test has passed. Finally, it fuzzes the
value returned from a clocksource to test the clocksource watchdog's
ability to detect time skew.
This module checks the state of its clocksource after each test, and
uses WARN_ON_ONCE() to emit a console splat if there are any failures.
This should enable all types of test frameworks to detect any such
failures.
This facility is intended for diagnostic use only, and should be avoided
on production systems.
Reported-by: Chris Mason <clm@fb.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Feng Tang <feng.tang@intel.com>
Link: https://lore.kernel.org/r/20210527190124.440372-5-paulmck@kernel.org
Currently, if skew is detected on a clock marked CLOCK_SOURCE_VERIFY_PERCPU,
that clock is checked on all CPUs. This is thorough, but might not be
what you want on a system with a few tens of CPUs, let alone a few hundred
of them.
Therefore, by default check only up to eight randomly chosen CPUs. Also
provide a new clocksource.verify_n_cpus kernel boot parameter. A value of
-1 says to check all of the CPUs, and a non-negative value says to randomly
select that number of CPUs, without concern about selecting the same CPU
multiple times. However, make use of a cpumask so that a given CPU will be
checked at most once.
Suggested-by: Thomas Gleixner <tglx@linutronix.de> # For verify_n_cpus=1.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Feng Tang <feng.tang@intel.com>
Link: https://lore.kernel.org/r/20210527190124.440372-3-paulmck@kernel.org
When the clocksource watchdog marks a clock as unstable, this might be due
to that clock being unstable or it might be due to delays that happen to
occur between the reads of the two clocks. Yes, interrupts are disabled
across those two reads, but there are no shortage of things that can delay
interrupts-disabled regions of code ranging from SMI handlers to vCPU
preemption. It would be good to have some indication as to why the clock
was marked unstable.
Therefore, re-read the watchdog clock on either side of the read from the
clock under test. If the watchdog clock shows an excessive time delta
between its pair of reads, the reads are retried.
The maximum number of retries is specified by a new kernel boot parameter
clocksource.max_cswd_read_retries, which defaults to three, that is, up to
four reads, one initial and up to three retries. If more than one retry
was required, a message is printed on the console (the occasional single
retry is expected behavior, especially in guest OSes). If the maximum
number of retries is exceeded, the clock under test will be marked
unstable. However, the probability of this happening due to various sorts
of delays is quite small. In addition, the reason (clock-read delays) for
the unstable marking will be apparent.
Reported-by: Chris Mason <clm@fb.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Feng Tang <feng.tang@intel.com>
Link: https://lore.kernel.org/r/20210527190124.440372-1-paulmck@kernel.org
Introduce an rq-qos policy that assigns an I/O priority to requests based
on blk-cgroup configuration settings. This policy has the following
advantages over the ioprio_set() system call:
- This policy is cgroup based so it has all the advantages of cgroups.
- While ioprio_set() does not affect page cache writeback I/O, this rq-qos
controller affects page cache writeback I/O for filesystems that support
assiociating a cgroup with writeback I/O. See also
Documentation/admin-guide/cgroup-v2.rst.
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20210618004456.7280-5-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The conversion tools used during DocBook/LaTeX/html/Markdown->ReST
conversion and some cut-and-pasted text contain some characters that
aren't easily reachable on standard keyboards and/or could cause
troubles when parsed by the documentation build system.
Replace the occurences of the following characters:
- U+00a0 (' '): NO-BREAK SPACE
as it can cause lines being truncated on PDF output
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/551a2af0e654226067e5c376d3e2d959cc738f39.1623826294.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Currently we need to use as many acpi_mask_gpe options as we want to have
GPEs to be masked. Even with two it already becomes inconveniently large
the kernel command line.
Instead, allow acpi_mask_gpe to represent bitmap list.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add a kernel command line option that disables printing of events to
console at late_initcall_sync(). This is useful when needing to see
specific events written to console on boot up, but not wanting it when
user space starts, as user space may make the console so noisy that the
system becomes inoperable.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Commit bf382fb0bcef4 ("block: remove legacy IO schedulers", Oct 12 2018)
removes the CFQ scheduler, together with blkio.weight and
blkio.weight_device described in cgroup v1 documentation. Users are
supposed to use the BFQ scheduler, which cgroup file for setting weight
is blkio.bfq.weight, but there is no way to set per-device weight.
Later, commit 795fe54c2a per-device weights for BFQ, meaning that
blkio.bfq.weight and blkio.bfq.weight_device can be used in a way
similar to the old CFQ cgroup interface.
Yet, the cgroup v1 docs were never updated. Fix this:
- use the new file names;
- fix the range for weight (used to be 10..1000, now 1..1000);
- link to BFQ scheduler docs.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Remove redundant details of blkdev and fix up resulting kerndoc.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
Stop poking into block layer internals and just open the block device
file an use kernel_read and kernel_write on it. Note that this means
the transformation from name_to_dev_t can't be used anymore when
pstore_blk is loaded as a module: a full filesystem device path name
must be used instead. Additionally removes ":internal:" kerndoc link,
since no such documentation remains.
Co-developed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
The :doc:`foo` tag is auto-generated via automarkup.py.
So, use the filename at the sources, instead of :doc:`foo`.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
The :doc:`foo` tag is auto-generated via automarkup.py.
So, use the filename at the sources, instead of :doc:`foo`.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Some parts of the guide are aged, hence need be updated.
1) The backup area of the 1st 640K on X86_64 has been removed
by below commits, update the description accordingly.
commit 7c321eb2b8 ("x86/kdump: Remove the backup region handling")
commit 6f599d8423 ("x86/kdump: Always reserve the low 1M when the crashkernel option is specified")
2) Sort out the descripiton of "crashkernel syntax" part.
3) And some other minor cleanups.
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Link: https://lore.kernel.org/r/20210609083218.GB591017@MiWiFi-R3L-srv
[jc: added blank line to fix added build warning]
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
There are two descriptions of the TEO (Timer Events Oriented) cpuidle
governor in the kernel source tree, one in the C file containing its
code and one in cpuidle.rst which is part of admin-guide.
Instead of trying to keep them both in sync and in order to reduce
text duplication, include the governor description from the C file
directly into cpuidle.rst.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
PSI accounts stalls for each cgroup separately and aggregates it at each
level of the hierarchy. This causes additional overhead with psi_avgs_work
being called for each cgroup in the hierarchy. psi_avgs_work has been
highly optimized, however on systems with large number of cgroups the
overhead becomes noticeable.
Systems which use PSI only at the system level could avoid this overhead
if PSI can be configured to skip per-cgroup stall accounting.
Add "cgroup_disable=pressure" kernel command-line option to allow
requesting system-wide only pressure stall accounting. When set, it
keeps system-wide accounting under /proc/pressure/ but skips accounting
for individual cgroups and does not expose PSI nodes in cgroup hierarchy.
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
The CONFIG_X86_RESERVE_LOW build time and reservelow= command line option
allowed to control the amount of memory under 1M that would be reserved at
boot to avoid using memory that can be potentially clobbered by BIOS.
Since the entire range under 1M is always reserved there is no need for
these options anymore and they can be removed.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210601075354.5149-3-rppt@kernel.org
The conversion tools used during DocBook/LaTeX/html/Markdown->ReST
conversion and some cut-and-pasted text contain some characters that
aren't easily reachable on standard keyboards and/or could cause
troubles when parsed by the documentation build system.
Replace the occurences of the following characters:
- U+201c ('“'): LEFT DOUBLE QUOTATION MARK
- U+201d ('”'): RIGHT DOUBLE QUOTATION MARK
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
The header defines the user space interface but may be mistaken as
kernel-only header due to its location. Add "uapi" directory under
driver's include directory and move the header there.
Suggested-by: Greg KH <gregkh@linuxfoundation.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Bingbu Cao <bingbu.cao@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
With help from platform firmware (ACPI) it is possible to power on
retimers even when there is no USB4 link (e.g nothing is connected to
the USB4 ports). This allows us to bring the USB4 sideband up so that we
can access retimers and upgrade their NVM firmware.
If the platform has support for this, we expose two additional
attributes under USB4 ports: offline and rescan. These can be used to
bring the port offline, rescan for the retimers and put the port online
again. The retimer NVM upgrade itself works the same way than with cable
connected.
Signed-off-by: Rajmohan Mani <rajmohan.mani@intel.com>
Co-developed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull networking fixes from Jakub Kicinski:
"Networking fixes for 5.13-rc4, including fixes from bpf, netfilter,
can and wireless trees. Notably including fixes for the recently
announced "FragAttacks" WiFi vulnerabilities. Rather large batch,
touching some core parts of the stack, too, but nothing hair-raising.
Current release - regressions:
- tipc: make node link identity publish thread safe
- dsa: felix: re-enable TAS guard band mode
- stmmac: correct clocks enabled in stmmac_vlan_rx_kill_vid()
- stmmac: fix system hang if change mac address after interface
ifdown
Current release - new code bugs:
- mptcp: avoid OOB access in setsockopt()
- bpf: Fix nested bpf_bprintf_prepare with more per-cpu buffers
- ethtool: stats: fix a copy-paste error - init correct array size
Previous releases - regressions:
- sched: fix packet stuck problem for lockless qdisc
- net: really orphan skbs tied to closing sk
- mlx4: fix EEPROM dump support
- bpf: fix alu32 const subreg bound tracking on bitwise operations
- bpf: fix mask direction swap upon off reg sign change
- bpf, offload: reorder offload callback 'prepare' in verifier
- stmmac: Fix MAC WoL not working if PHY does not support WoL
- packetmmap: fix only tx timestamp on request
- tipc: skb_linearize the head skb when reassembling msgs
Previous releases - always broken:
- mac80211: address recent "FragAttacks" vulnerabilities
- mac80211: do not accept/forward invalid EAPOL frames
- mptcp: avoid potential error message floods
- bpf, ringbuf: deny reserve of buffers larger than ringbuf to
prevent out of buffer writes
- bpf: forbid trampoline attach for functions with variable arguments
- bpf: add deny list of functions to prevent inf recursion of tracing
programs
- tls splice: check SPLICE_F_NONBLOCK instead of MSG_DONTWAIT
- can: isotp: prevent race between isotp_bind() and
isotp_setsockopt()
- netfilter: nft_set_pipapo_avx2: Add irq_fpu_usable() check,
fallback to non-AVX2 version
Misc:
- bpf: add kconfig knob for disabling unpriv bpf by default"
* tag 'net-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (172 commits)
net: phy: Document phydev::dev_flags bits allocation
mptcp: validate 'id' when stopping the ADD_ADDR retransmit timer
mptcp: avoid error message on infinite mapping
mptcp: drop unconditional pr_warn on bad opt
mptcp: avoid OOB access in setsockopt()
nfp: update maintainer and mailing list addresses
net: mvpp2: add buffer header handling in RX
bnx2x: Fix missing error code in bnx2x_iov_init_one()
net: zero-initialize tc skb extension on allocation
net: hns: Fix kernel-doc
sctp: fix the proc_handler for sysctl encap_port
sctp: add the missing setting for asoc encap_port
bpf, selftests: Adjust few selftest result_unpriv outcomes
bpf: No need to simulate speculative domain for immediates
bpf: Fix mask direction swap upon off reg sign change
bpf: Wrap aux data inside bpf_sanitize_info container
bpf: Fix BPF_LSM kconfig symbol dependency
selftests/bpf: Add test for l3 use of bpf_redirect_peer
bpftool: Add sock_release help info for cgroup attach/prog load command
net: dsa: microchip: enable phy errata workaround on 9567
...