Use cache-friendly helpers to make better use of CPU caches
while reading /proc/net/netstat.
Tested on a platform with 256 threads (AMD Rome)
Before: 305 usec spent in netstat_seq_show()
After: 130 usec spent in netstat_seq_show()
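The message does not name the helpers, so the following is only a userspace sketch of one plausible reading: summing per-CPU counters in CPU-major order, so that each CPU's block is walked once and sequentially, instead of revisiting every CPU for each field. The array sizes and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_TOY 4	/* hypothetical CPU count */
#define NR_FIELDS   8	/* hypothetical number of counters */

/* One contiguous block of counters per CPU (stand-in for per-CPU MIB data). */
static uint64_t percpu_mib[NR_CPUS_TOY][NR_FIELDS];

/* Field-major order: each field visits every CPU's block in turn. */
static void sum_field_major(uint64_t out[NR_FIELDS])
{
	for (size_t f = 0; f < NR_FIELDS; f++) {
		out[f] = 0;
		for (size_t cpu = 0; cpu < NR_CPUS_TOY; cpu++)
			out[f] += percpu_mib[cpu][f];
	}
}

/* CPU-major (batched) order: walk each CPU's block once, sequentially. */
static void sum_cpu_major(uint64_t out[NR_FIELDS])
{
	for (size_t f = 0; f < NR_FIELDS; f++)
		out[f] = 0;
	for (size_t cpu = 0; cpu < NR_CPUS_TOY; cpu++)
		for (size_t f = 0; f < NR_FIELDS; f++)
			out[f] += percpu_mib[cpu][f];
}
```

Both functions produce the same totals; only the memory access pattern differs.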
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20210128162145.1703601-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The commit 41b14fb872 ("net: Do not clear the sock TX queue in
sk_set_socket()") removes sk_tx_queue_clear() from sk_set_socket() and adds
it instead in sk_alloc() and sk_clone_lock() to fix an issue introduced in
the commit e022f0b4a0 ("net: Introduce sk_tx_queue_mapping"). On the
other hand, the original commit had already put sk_tx_queue_clear() in
sk_prot_alloc(): the callee of sk_alloc() and sk_clone_lock(). Thus
sk_tx_queue_clear() is called twice in each path.
If we remove sk_tx_queue_clear() in sk_alloc() and sk_clone_lock(), it
currently works well because (i) sk_tx_queue_mapping is defined between
sk_dontcopy_begin and sk_dontcopy_end, and (ii) sock_copy() called after
sk_prot_alloc() in sk_clone_lock() does not overwrite sk_tx_queue_mapping.
However, if we move sk_tx_queue_mapping out of the no copy area, it
introduces a bug unintentionally.
Therefore, this patch adds a compile-time check to take care of the order
of sock_copy() and sk_tx_queue_clear() and removes sk_tx_queue_clear() from
sk_prot_alloc() so that it only does the allocation and its callers
initialize the fields.
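The exact check added by the patch is not quoted here. As a rough userspace sketch of the idea, assuming a heavily reduced stand-in for struct sock (toy_sock, its fields and toy_sock_copy() are all hypothetical), a static assertion can pin a field inside the no-copy area so that moving it later fails the build:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for struct sock. */
struct toy_sock {
	int family;		/* copied on clone */
	int refcnt;		/* first field of the no-copy area */
	int tx_queue_mapping;	/* must stay inside the no-copy area */
	int priority;		/* first field after the no-copy area */
};

#define NOCOPY_BEGIN offsetof(struct toy_sock, refcnt)
#define NOCOPY_END   offsetof(struct toy_sock, priority)

/* Compile-time guard: breaks the build if the field leaves the area. */
_Static_assert(offsetof(struct toy_sock, tx_queue_mapping) >= NOCOPY_BEGIN &&
	       offsetof(struct toy_sock, tx_queue_mapping) < NOCOPY_END,
	       "tx_queue_mapping must stay inside the no-copy area");

/* Copy everything except the no-copy area, as a clone path might do. */
static void toy_sock_copy(struct toy_sock *dst, const struct toy_sock *src)
{
	memcpy(dst, src, NOCOPY_BEGIN);
	memcpy((char *)dst + NOCOPY_END, (const char *)src + NOCOPY_END,
	       sizeof(*src) - NOCOPY_END);
}
```

With the guard in place, a copy made after the field was cleared cannot silently overwrite it.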
CC: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20210128150217.6060-1-kuniyu@amazon.co.jp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The main arcnet interrupt handler calls arcnet_close() then
arcnet_open(), if the RESET status flag is encountered.
This is invalid:
1) In general, interrupt handlers should never call ->ndo_stop() and
->ndo_open() functions. They are usually full of blocking calls and
other methods that are expected to be called only from the driver's
init and exit code paths.
2) arcnet_close() contains a del_timer_sync(). If the irq handler
interrupts the to-be-deleted timer, del_timer_sync() will just loop
forever.
3) arcnet_close() also calls tasklet_kill(), which has a warning if
called from irq context.
4) For device reset, the sequence "arcnet_close(); arcnet_open();" is
not complete. Some child arcnet drivers have special init/exit
code sequences, which then embed a call to arcnet_open() and
arcnet_close() accordingly. See drivers/net/arcnet/com20020.c.
Run the device RESET sequence from a scheduled workqueue instead.
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20210128194802.727770-1-a.darwish@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
To allow querying tm info for nodes, priorities and qsets for
debugging, add three debugfs files, tm_nodes, tm_priority and
tm_qset, in a newly created tm directory.
Unlike previous debugfs commands, these three files support only
read ops, so their info can only be dumped with the cat command.
The new tm file style follows Jakub Kicinski's suggestion in
https://lkml.org/lkml/2020/9/29/2101.
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add interfaces to get tm priority and qset information, so that
they can be used by debugfs.
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Xin Long says:
====================
net: add support for ip generic checksum offload for gre
This patchset first adds ip generic csum processing in
skb_csum_hwoffload_help() in Patch 1/2 and then adds csum
offload support for the GRE header in Patch 2/2.
====================
Link: https://lore.kernel.org/r/cover.1611825446.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch is to add csum offload support for gre header:
On the TX path in gre_build_header(), when CHECKSUM_PARTIAL's set
for inner proto, it will calculate the csum for outer proto, and
inner csum will be offloaded later. Otherwise, CHECKSUM_PARTIAL
and csum_start/offset will be set for outer proto, and the outer
csum will be offloaded later.
On the GSO path in gre_gso_segment(), when CHECKSUM_PARTIAL is
not set for inner proto and the hardware supports csum offload,
CHECKSUM_PARTIAL and csum_start/offset will be set for outer
proto, and outer csum will be offloaded later. Otherwise, it
will do csum for outer proto by calling gso_make_checksum().
Note that SCTP has to do the csum by itself for non GSO path in
sctp_packet_pack(), as gre_build_header() can't handle the csum
with CHECKSUM_PARTIAL set for SCTP CRC csum offload.
v1->v2:
- remove the SCTP part, as GRE dev doesn't support SCTP CRC CSUM
and it will always do checksum for SCTP in sctp_packet_pack()
when it's not a GSO packet.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The NETIF_F_IP|IPV6_CSUM feature flags indicate UDP and TCP csum offload,
while the NETIF_F_HW_CSUM feature flag indicates ip generic csum offload
for HW, which covers not only TCP/UDP csum but also other protocols'
csum, like GRE's.
However, skb_csum_hwoffload_help() only checks features against
NETIF_F_CSUM_MASK(NETIF_F_HW|IP|IPV6_CSUM). So if it's a non TCP/UDP
packet and the features don't include NETIF_F_HW_CSUM, but include
NETIF_F_IP|IPV6_CSUM only, it would still return 0 and leave the HW
to do csum.
This patch is to support ip generic csum processing by checking
NETIF_F_HW_CSUM for all protocols, and check (NETIF_F_IP_CSUM |
NETIF_F_IPV6_CSUM) only for TCP and UDP.
Note that we're using skb->csum_offset to check if it's a TCP/UDP
protocol, which might be fragile. However, as Alex said, for now we
only have a few L4 protocols that are requesting Tx csum offload,
so we'd better defer fixing this until a new protocol comes with the
same csum offset.
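The decision described above can be sketched in userspace C as follows. This is not the kernel function: the TOY_F_* flag values and hw_can_checksum() are hypothetical stand-ins, and only the checksum field offsets (16 in the TCP header, 6 in the UDP header) are taken from the protocol header layouts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values; the real NETIF_F_* bits are defined by the kernel. */
#define TOY_F_IP_CSUM	(1u << 0)
#define TOY_F_IPV6_CSUM	(1u << 1)
#define TOY_F_HW_CSUM	(1u << 2)

/* Offset of the checksum field within the TCP and UDP headers. */
#define TCP_CSUM_OFFSET	16
#define UDP_CSUM_OFFSET	6

/* Return true if the device can be left to compute the checksum. */
static bool hw_can_checksum(uint32_t features, uint16_t csum_offset)
{
	/* Generic offload handles any protocol. */
	if (features & TOY_F_HW_CSUM)
		return true;

	/* Protocol-specific offload only covers TCP and UDP. */
	if (features & (TOY_F_IP_CSUM | TOY_F_IPV6_CSUM)) {
		switch (csum_offset) {
		case TCP_CSUM_OFFSET:
		case UDP_CSUM_OFFSET:
			return true;
		}
	}

	/* Otherwise the checksum has to be computed in software. */
	return false;
}
```

A GRE checksum, which sits at offset 4 in the GRE header, would fall through to software on a device that only advertises the IP/IPv6 flags.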
v1->v2:
- do not extend skb->csum_not_inet, but use skb->csum_offset to tell
if it's a UDP/TCP csum packet.
v2->v3:
- add a note in the changelog, as Willem suggested.
Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Merge tag 'linux-can-next-for-5.12-20210129' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:
====================
linux-can-next-for-5.12-20210129
All patches are by me and target the mcp251xfd driver. The first 4
patches update the information regarding the "85% of (FSYSCLK/2)"
errata. The other 4 are misc cleanups: unify error messages, add a
missing postfix to a macro, simplify the return of a function, and
make use of dev_err_probe() in the mcp251xfd_probe() function.
====================
Link: https://lore.kernel.org/r/20210129084302.3040284-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use the new mhi_get_free_desc_count helper to track queue usage
instead of relying on the locally maintained rx_queued count.
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The RX queue size can be determined at runtime by retrieving the
number of available transfer descriptors.
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This section was missed during the conversion to ReST, so convert it in the
same style as the surrounding section titles.
Signed-off-by: Jan Luebbe <jlu@pengutronix.de>
Link: https://lore.kernel.org/r/20210128111930.29473-1-jlu@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This converts the driver to use the new tasklet API introduced in
commit 12cc923f1c ("tasklet: Introduce new initialization API")
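As a userspace illustration of the pattern the new API is built around: the callback receives a pointer to the tasklet itself and recovers its enclosing driver structure from it, rather than receiving an opaque unsigned long. Every toy_* name below is hypothetical; in the kernel the corresponding pieces would presumably be tasklet_setup() and from_tasklet().

```c
#include <stddef.h>

/* Hypothetical miniature of a tasklet: the callback gets the tasklet itself. */
struct toy_tasklet {
	void (*callback)(struct toy_tasklet *t);
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical driver state embedding its tasklet. */
struct toy_dev {
	int irq_events;
	struct toy_tasklet bh;
};

/* Bottom half: no opaque cast needed to find the device. */
static void toy_dev_bh(struct toy_tasklet *t)
{
	struct toy_dev *dev = toy_container_of(t, struct toy_dev, bh);

	dev->irq_events++;
}

static void toy_tasklet_setup(struct toy_tasklet *t,
			      void (*cb)(struct toy_tasklet *))
{
	t->callback = cb;
}

static void toy_tasklet_run(struct toy_tasklet *t)
{
	t->callback(t);
}
```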
Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Link: https://lore.kernel.org/r/20210127173256.13954-2-kernel@esmil.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Previously a temporary tasklet structure was initialized on the stack
using DECLARE_TASKLET_OLD() and then copied over and modified. Nothing
else in the kernel seems to use this pattern, so let's just call
tasklet_init() like everyone else.
Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Link: https://lore.kernel.org/r/20210127173256.13954-1-kernel@esmil.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Until now the code assumed that only a reduced-size part of the STE
needs to be copied, because the rest is the mask part, which shouldn't
be changed. This is not true for all types of HW (like STEv1).
Take all 64B from the new STE and write them in place of the replaced STE.
This change will make it easier to handle all STE HW types because we have
all the data that is about to be written into HW.
Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
The STEv0 and STEv1 HW formats are different; each has a
different field order:
STEv0: CTRL 32B, TAG 16B, BITMASK 16B
STEv1: CTRL 32B, BITMASK 16B, TAG 16B
To make this transparent to upper layers we introduce a
new ste_ctx function to format the STE prior to writing it.
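A minimal userspace sketch of such a formatting step, assuming the upper layers build the STE in STEv0 order and the STEv1 path swaps the last two 16B areas just before the write. The function name and this direction of conversion are assumptions, not taken from the driver.

```c
#include <stdint.h>
#include <string.h>

#define STE_CTRL_SIZE	32
#define STE_TAG_SIZE	16
#define STE_MASK_SIZE	16
#define STE_SIZE	(STE_CTRL_SIZE + STE_TAG_SIZE + STE_MASK_SIZE)

/* In place: CTRL | TAG | BITMASK  ->  CTRL | BITMASK | TAG */
static void toy_ste_v0_to_v1(uint8_t ste[STE_SIZE])
{
	uint8_t tag[STE_TAG_SIZE];
	uint8_t *tag_pos = ste + STE_CTRL_SIZE;
	uint8_t *mask_pos = tag_pos + STE_TAG_SIZE;

	memcpy(tag, tag_pos, STE_TAG_SIZE);		/* save the tag */
	memcpy(tag_pos, mask_pos, STE_MASK_SIZE);	/* move the mask up */
	memcpy(tag_pos + STE_MASK_SIZE, tag, STE_TAG_SIZE); /* tag goes last */
}
```

The control area is left untouched, so callers that only care about the first 32B see no difference between the two layouts.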
Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
In these cases we need to update only the ctrl area of the STE.
So it is better to write only the control 32B and avoid copying
the unneeded reduced 48B (control 32B + tag 16B).
Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Add HW specific modify header fields and logic to the STEv1 file.
Since the STEv0 and STEv1 modify action values are different, each
version has its own implementation.
Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Add HW specific action apply logic to STEv1.
Since the STEv0 and STEv1 action formats are different, each
version has its own implementation.
Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Add HW specific setters and getters to the STEv1 file.
Since the STEv0 and STEv1 formats are different, each version
should implement different setters and getters.
Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Some flex parser protocols are native as part of STEv1.
The check for supported protocols was modified to allow this.
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Add STEv1 match logic to a new file.
This file will be used for HW specific STEv1.
Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Add mlx5_ifc_dr_ste_v1.h - a new header with HW specific
STE structs for version 1.
Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Paul Blakey says:
====================
net/sched: cls_flower: Add support for matching on ct_state reply flag
This patchset adds software match support and offload of flower
match ct_state reply flag (+/-rpl).
The first patch adds the definition for the flag and match to flower.
The second patch gives the direction of the connection to the offloading
drivers via the ct_metadata flow offload action.
The last patch offloads this new ct_state by using the supplied
connection's direction.
====================
Link: https://lore.kernel.org/r/1611757967-18236-1-git-send-email-paulb@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Give offloading drivers the direction of the offloaded ct flow;
this will be used for matches on direction (ct_state +/-rpl).
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Bongsu Jeon says:
====================
Add nci suite and virtual nci device driver
1/2 is the Virtual NCI device driver.
2/2 is the NCI selftest suite
====================
Link: https://lore.kernel.org/r/20210127130829.4026-1-bongsu.jeon@samsung.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This is the NCI test suite. It tests the NFC/NCI module using a virtual NCI
device. Test cases consist of turning the virtual NCI device on and off and
controlling the device's polling for NCI versions 1.0 and 2.0.
Signed-off-by: Bongsu Jeon <bongsu.jeon@samsung.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The NCI virtual device simulates an NCI device to the user. It can be used to
validate the NCI module and applications. This driver supports
communication between the virtual NCI device and the NCI module.
Signed-off-by: Bongsu Jeon <bongsu.jeon@samsung.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
It's better to make 'pkt_sk()' inline here, as non-inline functions
shouldn't be defined in headers. Besides, this function is simple
enough to be inline.
Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
Link: https://lore.kernel.org/r/20210127123302.29842-1-dong.menglong@zte.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pointers to receive-buffer packets sent by Hyper-V are used within the
guest VM. Hyper-V can send packets with erroneous values or modify
packet fields after they are processed by the guest. To defend against
these scenarios, copy (sections of) the incoming packet after validating
their length and offset fields in netvsc_filter_receive(). In this way,
the packet can no longer be modified by the host.
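The validate-then-copy idea can be sketched in userspace C as below. This is not the netvsc code; toy_copy_validated() and its parameters are hypothetical. The offset and length are taken by value, checked against the shared buffer without risking integer overflow, and only then used to copy into memory the other side cannot touch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy len bytes at offset from an untrusted shared buffer into private
 * memory. Returns 0 on success, -1 if the section is out of bounds or
 * does not fit in the destination.
 */
static int toy_copy_validated(uint8_t *dst, size_t dst_len,
			      const uint8_t *shared, size_t shared_len,
			      uint32_t offset, uint32_t len)
{
	/* Written so that offset + len is never computed and cannot wrap. */
	if (offset > shared_len || len > shared_len - offset)
		return -1;
	if (len > dst_len)
		return -1;

	memcpy(dst, shared + offset, len);
	return 0;
}
```

After a successful call, all further processing should read only from the private copy, so later changes to the shared buffer have no effect.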
Reported-by: Juan Vazquez <juvazq@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20210126162907.21056-1-parri.andrea@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Fix the virt_addr_valid() returning true for < PAGE_OFFSET addresses.
- Do not blindly trust the DMA masks from ACPI/IORT.
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
ACPI/IORT: Do not blindly trust DMA masks from firmware
arm64: Fix kernel address detection of __is_lm_address()
Merge tag 'for-5.11-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"A few more fixes for a late rc:
- fix lockdep complaint on 32bit arches and also remove an unsafe
memory use due to device vs filesystem lifetime
- two fixes for free space tree:
* race during log replay and cache rebuild, now more likely to
happen due to changes in this dev cycle
* possible free space tree corruption with online conversion
during initial tree population"
* tag 'for-5.11-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix log replay failure due to race with space cache rebuild
btrfs: fix lockdep warning due to seqcount_mutex on 32bit arch
btrfs: fix possible free space tree corruption with online conversion
Merge tag 'block-5.11-2021-01-29' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"All over the place fixes for this release:
- blk-cgroup iteration teardown resched fix (Baolin)
- NVMe pull request from Christoph:
- add another Write Zeroes quirk (Chaitanya Kulkarni)
- handle a no path available corner case (Daniel Wagner)
- use the proper RCU aware list_add helper (Chao Leng)
- bcache regression fix (Coly)
- bdev->bd_size_lock IRQ fix. This will be fixed in drivers for 5.12,
but for now, we'll make it IRQ safe (Damien)
- null_blk zoned init fix (Damien)
- add_partition() error handling fix (Dinghao)
- s390 dasd kobject fix (Jan)
- nbd fix for freezing queue while adding connections (Josef)
- tag queueing regression fix (Ming)
- revert of a patch that inadvertently meant that we regressed write
performance on raid (Maxim)"
* tag 'block-5.11-2021-01-29' of git://git.kernel.dk/linux-block:
null_blk: cleanup zoned mode initialization
nvme-core: use list_add_tail_rcu instead of list_add_tail for nvme_init_ns_head
nvme-multipath: Early exit if no path is available
nvme-pci: add the DISABLE_WRITE_ZEROES quirk for a SPCC device
bcache: only check feature sets when sb->version >= BCACHE_SB_VERSION_CDEV_WITH_FEATURES
block: fix bd_size_lock use
blk-cgroup: Use cond_resched() when destroy blkgs
Revert "block: simplify set_init_blocksize" to regain lost performance
nbd: freeze the queue while we're adding connections
s390/dasd: Fix inconsistent kobject removal
block: Fix an error handling in add_partition
blk-mq: test QUEUE_FLAG_HCTX_ACTIVE for sbitmap_shared in hctx_may_queue
Merge tag 'io_uring-5.11-2021-01-29' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"We got the cancelation story sorted now, so for all intents and
purposes, this should be it for 5.11 outside of any potential little
fixes that may come in. This contains:
- task_work task state fixes (Hao, Pavel)
- Cancelation fixes (me, Pavel)
- Fix for an inflight req patch in this release (Pavel)
- Fix for a lock deadlock issue (Pavel)"
* tag 'io_uring-5.11-2021-01-29' of git://git.kernel.dk/linux-block:
io_uring: reinforce cancel on flush during exit
io_uring: fix sqo ownership false positive warning
io_uring: fix list corruption for splice file_get
io_uring: fix flush cqring overflow list while TASK_INTERRUPTIBLE
io_uring: fix wqe->lock/completion_lock deadlock
io_uring: fix cancellation taking mutex while TASK_UNINTERRUPTIBLE
io_uring: fix __io_uring_files_cancel() with TASK_UNINTERRUPTIBLE
io_uring: only call io_cqring_ev_posted() if events were posted
io_uring: if we see flush on exit, cancel related tasks
The LiteX SOC controller driver makes use of IOMEM functions like
devm_platform_ioremap_resource(), which are only available if
CONFIG_HAS_IOMEM is defined.
However, the driver is still enabled under make ARCH=um allyesconfig,
even though it won't build there.
By adding a dependency on HAS_IOMEM, the driver will not be enabled on
architectures which don't support it.
Fixes: 22447a99c9 ("drivers/soc/litex: add LiteX SoC Controller driver")
Signed-off-by: David Gow <davidgow@google.com>
[shorne@gmail.com: Fix typo in commit message pointed out in review]
Signed-off-by: Stafford Horne <shorne@gmail.com>
Merge tag 'iommu-fixes-v5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
- AMD IOMMU fix to make sure features are detected before they are
queried.
- Intel IOMMU address alignment check fix for an IOTLB flushing
command.
- Performance fix for Intel IOMMU to make sure the code does not do
full IOTLB flushes all the time. Those flushes are very expensive
on emulated IOMMUs.
* tag 'iommu-fixes-v5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/vt-d: Do not use flush-queue when caching-mode is on
iommu/vt-d: Correctly check addr alignment in qi_flush_dev_iotlb_pasid()
iommu/amd: Use IVHD EFR for early initialization of IOMMU features
Merge tag 'pm-5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix a deadlock in the 'kexec jump' code and address a possible
hibernation image creation issue.
Specifics:
- Fix a deadlock caused by attempting to acquire the same mutex twice
in a row in the "kexec jump" code (Baoquan He)
- Modify the hibernation image saving code to flush the unwritten
data to the swap storage later so as to avoid failing to write the
image signature which is possible in some cases (Laurent Badel)"
* tag 'pm-5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: hibernate: flush swap writer after marking
kernel: kexec: remove the lock operation of system_transition_mutex
Merge tag 'acpi-5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These fix the handling of notifications in the ACPI thermal driver and
address a device enumeration issue leading to the presence of multiple
'MODALIAS=' entries in one uevent file in sysfs in some cases.
Specifics:
- Modify the ACPI thermal driver to avoid evaluating _TMP directly in
its Notify () handler callback and running too many thermal checks
for one thermal zone at the same time so as to address a work item
accumulation issue observed on some systems that fail to shut down
as a result of it (Rafael Wysocki)
- Modify the ACPI uevent file creation code to avoid putting multiple
'MODALIAS=' entries in one uevent file in sysfs which breaks
systemd-udevd (Kai-Heng Feng)"
* tag 'acpi-5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: thermal: Do not call acpi_thermal_check() directly
ACPI: sysfs: Prefer "compatible" modalias
Merge tag 'drm-fixes-2021-01-29' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Weekly fixes for graphics, nothing too major, nouveau has a few
regression fixes for various fallout from header changes previously,
vc4 has two fixes, two amdgpu, and a smattering of i915 fixes.
All seems on course for a quieter rc7, fingers crossed.
nouveau:
- fix svm init conditions
- fix nv50 modesetting regression
- fix cursor plane modifiers
- fix > 64x64 cursor regression
vc4:
- Fix LBM size calculation
- Fix high resolutions for hvs5
i915:
- Fix ICL MG PHY vswing
- Fix subplatform handling
- Fix selftest memleak
- Clear CACHE_MODE prior to clearing residuals
- Always flush the active worker before returning from the wait
- Always try to reserve GGTT address 0x0
amdgpu:
- Fix a fan control regression on some boards
- Fix clang warning"
* tag 'drm-fixes-2021-01-29' of git://anongit.freedesktop.org/drm/drm:
drm/nouveau/kms/gk104-gp1xx: Fix > 64x64 cursors
drm/nouveau/kms/nv50-: Report max cursor size to userspace
drivers/nouveau/kms/nv50-: Reject format modifiers for cursor planes
drm/nouveau/svm: fail NOUVEAU_SVM_INIT ioctl on unsupported devices
drm/nouveau/dispnv50: Restore pushing of all data.
amdgpu: fix clang build warning
Revert "drm/amdgpu/swsmu: drop set_fan_speed_percent (v2)"
drm/i915/gt: Always try to reserve GGTT address 0x0
drm/i915: Always flush the active worker before returning from the wait
drm/i915/selftest: Fix potential memory leak
drm/i915: Check for all subplatform bits
drm/i915: Fix ICL MG PHY vswing handling
drm/i915/gt: Clear CACHE_MODE prior to clearing residuals
drm/vc4: Correct POS1_SCL for hvs5
drm/vc4: Correct lbm size and calculation
drm/nouveau/nvif: fix method count when pushing an array
It turns out that the vfs_iocb_iter_{read,write}() functions are
entirely broken, and don't actually use the passed-in file pointer for
IO - only for the preparatory work (permission checking and for the
write_iter function lookup).
That worked fine for overlayfs, which always builds the new iocb with
the same file pointer that it passes in, but in the general case it ends
up doing nonsensical things (and could cause an iterator call that
doesn't even match the passed-in file pointer).
This subtly broke the tty conversion to write_iter in commit
9bb48c82ac ("tty: implement write_iter"), because the console
redirection didn't actually end up redirecting anything, since the
passed-in file pointer was basically ignored, and the actual write was
done with the original non-redirected console tty after all.
The main visible effect of this is that the console messages were no
longer logged to /var/log/boot.log during graphical boot.
Fix the issue by simply not using the vfs write "helper" function at
all, and just redirecting the write entirely internally to the tty
layer. Do the target writability permission checks when actually
registering the target tty with TIOCCONS instead of at write time.
Fixes: 9bb48c82ac ("tty: implement write_iter")
Reported-and-tested-by: Hans de Goede <hdegoede@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
To avoid potential compilation problems, replace the badly written
MB_TO_SECTS() macro (missing parentheses around the argument) with
the inline function mb_to_sects(). While at it, simplify the
calculation of the total number of zones of the device using the
round_up() macro.
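The hazard with an unparenthesized macro argument can be reproduced in userspace. The old macro's text is not quoted in this message, so MB_TO_SECTS_BAD below is a reconstruction for illustration only, and the type and constants are local stand-ins for the kernel definitions.

```c
#include <stdint.h>

typedef uint64_t sector_t;	/* local stand-in for the kernel type */
#define SZ_1M		0x100000UL
#define SECTOR_SHIFT	9

/* Illustrative broken macro: "mb" is used without parentheses. */
#define MB_TO_SECTS_BAD(mb) (((sector_t)mb * SZ_1M) >> SECTOR_SHIFT)

/* Function form: the argument is evaluated once, as a whole. */
static inline sector_t mb_to_sects(unsigned long mb)
{
	return ((sector_t)mb * SZ_1M) >> SECTOR_SHIFT;
}
```

With an expression argument such as 1 + 1, the macro expands to (sector_t)1 + 1 * SZ_1M, so the cast and the multiplication each bind to only one operand, while the function receives the value 2.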
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>