Commit Graph

1295510 Commits

Author SHA1 Message Date
Matthew Brost
90be4cc6f7
drm/xe: Add xe_gt_tlb_invalidation_fence_init helper
Other layers should not be touching struct xe_gt_tlb_invalidation_fence
directly, add helper for initialization.

v2:
 - Add dma_fence_get and list init to xe_gt_tlb_invalidation_fence_init

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240719172905.1527927-2-matthew.brost@intel.com
(cherry picked from commit a522b285c6)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-15 09:44:22 -04:00
Michal Wajdeczko
4f7652dcd3
drm/xe/pf: Fix VF config validation on multi-GT platforms
When validating VF config on the media GT, we may wrongly report
that VF is already partially configured on it, as we consider GGTT
and LMEM provisioning done on the primary GT (since both GGTT and
LMEM are tile-level resources, not a GT-level).

This will cause skipping a VF auto-provisioning on the media-GT and
in result will block a VF from successfully initialize that GT.

Fix that by considering GGTT and LMEM configurations only when
checking if a VF provisioning is complete, and omit GGTT and LMEM
when reporting empty/partial provisioning.

Fixes: 234670cea9 ("drm/xe/pf: Skip fair VFs provisioning if already provisioned")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240806180516.618-1-michal.wajdeczko@intel.com
(cherry picked from commit 5bdacb0907)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-15 09:44:22 -04:00
Matthew Brost
55ea73aacf
drm/xe: Build PM into GuC CT layer
Take PM ref when any G2H are outstanding, drop when none are
outstanding.

To safely ensure we have PM ref when in the GuC CT layer, a PM ref needs
to be held when scheduler messages are pending too.

v2:
 - Add outer PM protections to xe_file_close (CI)
v3:
 - Only take PM ref 0->1 and drop on 1->0 (Matthew Auld)
v4:
 - Add assert to G2H increment function
v5:
 - Rebase
v6:
 - Declare xe as local variable in xe_file_close (CI)

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240719172905.1527927-5-matthew.brost@intel.com
(cherry picked from commit d930c19fdf)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-15 09:44:22 -04:00
Michal Wajdeczko
64da63cd3f
drm/xe/vf: Fix register value lookup
We should use the number of actual entries stored in the runtime
register buffer, not the maximum number of entries that this buffer
can hold, otherwise bsearch() may fail and we may miss the data and
wrongly report unexpected access to some registers.

Fixes: 4edadc41a3 ("drm/xe/vf: Use register values obtained from the PF")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240718203155.486-1-michal.wajdeczko@intel.com
(cherry picked from commit ad16682db1)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-15 09:44:22 -04:00
Umesh Nerlige Ramappa
817c70e2ba
drm/xe: Fix use after free when client stats are captured
xe_file_close triggers an asynchronous queue cleanup and then frees up
the xef object. Since queue cleanup flushes all pending jobs and the KMD
stores client usage stats into the xef object after jobs are flushed, we
see a use-after-free for the xef object. Resolve this by taking a
reference to xef from xe_exec_queue.

While at it, revert an earlier change that contained a partial work
around for this issue.

v2:
- Take a ref to xef even for the VM bind queue (Matt)
- Squash patches relevant to that fix and work around (Lucas)

v3: Fix typo (Lucas)

Fixes: ce62827bc2 ("drm/xe: Do not access xe file when updating exec queue run_ticks")
Fixes: 6109f24f87 ("drm/xe: Add helper to accumulate exec queue runtime")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1908
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240718210548.3580382-5-umesh.nerlige.ramappa@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 2149ded630)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-15 09:44:22 -04:00
Umesh Nerlige Ramappa
6309f9b1fc
drm/xe: Take a ref to xe file when user creates a VM
Take a reference to xef when user creates the VM and put the reference
when user destroys the VM.

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240718210548.3580382-4-umesh.nerlige.ramappa@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit a2387e6949)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-15 09:44:21 -04:00
Umesh Nerlige Ramappa
d28bb0120f
drm/xe: Add ref counting for xe_file
Add ref counting for xe_file.

v2:
- Add kernel doc for exported functions (Matt)
- Instead of xe_file_destroy, export the get/put helpers (Lucas)

v3: Fixup the kernel-doc format and description (Matt, Lucas)

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240718210548.3580382-3-umesh.nerlige.ramappa@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit ce8c161cba)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-15 09:44:21 -04:00
Umesh Nerlige Ramappa
e98a032c03
drm/xe: Move part of xe_file cleanup to a helper
In order to make xe_file ref counted, move destruction of xe_file
members to a helper.

v2: Move xe_vm_close_and_put back into xe_file_close (Matt)

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240718210548.3580382-2-umesh.nerlige.ramappa@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 3d0c4a62cc)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-15 09:44:21 -04:00
Matthew Brost
ddeb7989a9
drm/xe: Validate user fence during creation
Fail invalid addresses during user fence creation.

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240717140429.1396820-1-matthew.brost@intel.com
(cherry picked from commit 0fde907da2)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-15 09:44:21 -04:00
Paolo Abeni
9c5af2d7df netfilter pull request 24-08-15
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAma9K0sACgkQ1V2XiooU
 IOQnDBAAr/f1e4LPZMrzV3D2eN4+pajqmREG7Qha8LtEWBQvOOeabfqolsefHx18
 Plzy/LIg/ntEJN8pYG+/BCrQujGMQd+ihsD099c3b2C3/t7lXofaZsxmWu+/z/Lw
 RovagzMGSt2ziprqrbV45U7YkmNe+vkGIsseD4y2VVUGWFNM+DEtyh2uwp3dxrGD
 E5uPN1uUelVsfJsAMdKGsthiKkJvrGN2S80GzD4xJaupc7CltOmc82R4D80gMSmw
 9hBTsbslwJ0TyvFjYPXaVAhGYfrLECqUxwJW8sJdlVcGSJBcx1Q6WN9+rpqYwKKE
 lgQGTQqLBmQ5mC1Z0RJNcKELYejYoUVhsleQr/WA+zWxbTIp01cp0W4QE9VZdBbn
 LCiAnvc6TAp6GN94e04/dBHUYq+eL0Wy1kvu5g3LQg0iTzqYGUww0VHG/iix/L8I
 xiVZNtauVZ8SdS5xN3ARcSWzV32pBEWnq67PZExniw5RrYZ99nsY9yXuYhaAumy0
 f7iKz52ROsxLEMAilDEucb/ont1PSx0q5S6JZVzUVzijYEz4hqiAjC3L+PiPuL7M
 HIAKT2QT+8EbpwjHO8dcMSvsaJSmMHb0k9YULqB/gMJDhBk1znqOICkL5K2hROD4
 SIjISgoYSX3Wb5/yTL+OCRmByzRrXy4NensZMCuZMIRLdES5pgc=
 =H17I
 -----END PGP SIGNATURE-----

Merge tag 'nf-24-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Ignores ifindex for types other than mcast/linklocal in ipv6 frag
   reasm, from Tom Hughes.

2) Initialize extack for begin/end netlink message marker in batch,
   from Donald Hunter.

3) Initialize extack for flowtable offload support, also from Donald.

4) Dropped packets with cloned unconfirmed conntracks in nfqueue,
   later it should be possible to explore lookup after reinject but
   Florian prefers this approach at this stage. From Florian Westphal.

5) Add selftest for cloned unconfirmed conntracks in nfqueue for
   previous update.

6) Audit after filling netlink header successfully in object dump,
   from Phil Sutter.

7-8) Fix concurrent dump and reset which could result in underflow
     counter / quota objects.

netfilter pull request 24-08-15

* tag 'nf-24-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nf_tables: Add locking for NFT_MSG_GETOBJ_RESET requests
  netfilter: nf_tables: Introduce nf_tables_getobj_single
  netfilter: nf_tables: Audit log dump reset after the fact
  selftests: netfilter: add test for br_netfilter+conntrack+queue combination
  netfilter: nf_queue: drop packets with cloned unconfirmed conntracks
  netfilter: flowtable: initialise extack before use
  netfilter: nfnetlink: Initialise extack before use in ACKs
  netfilter: allow ipv6 fragments to arrive on different devices
====================

Link: https://patch.msgid.link/20240814222042.150590-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-15 13:25:06 +02:00
Paolo Abeni
34dfdf210d Merge branch 'there-are-some-bugfix-for-the-hns3-ethernet-driver'
Jijie Shao says:

====================
There are some bugfix for the HNS3 ethernet driver
====================

Link: https://patch.msgid.link/20240813141024.1707252-1-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-15 13:07:11 +02:00
Peiyang Wang
7660833d21 net: hns3: use correct release function during uninitialization
pci_request_regions is called to apply for PCI I/O and memory resources
when the driver is initialized, Therefore, when the driver is uninstalled,
pci_release_regions should be used to release PCI I/O and memory resources
instead of pci_release_mem_regions is used to release memory reasouces
only.

Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-15 13:07:08 +02:00
Peiyang Wang
86db7bfb06 net: hns3: void array out of bound when loop tnl_num
When query reg inf of SSU, it loops tnl_num times. However, tnl_num comes
from hardware and the length of array is a fixed value. To void array out
of bound, make sure the loop time is not greater than the length of array

Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-15 13:07:08 +02:00
Jie Wang
be5e816d00 net: hns3: fix a deadlock problem when config TC during resetting
When config TC during the reset process, may cause a deadlock, the flow is
as below:
                             pf reset start
                                 │
                                 ▼
                              ......
setup tc                         │
    │                            ▼
    ▼                      DOWN: napi_disable()
napi_disable()(skip)             │
    │                            │
    ▼                            ▼
  ......                      ......
    │                            │
    ▼                            │
napi_enable()                    │
                                 ▼
                           UINIT: netif_napi_del()
                                 │
                                 ▼
                              ......
                                 │
                                 ▼
                           INIT: netif_napi_add()
                                 │
                                 ▼
                              ......                 global reset start
                                 │                      │
                                 ▼                      ▼
                           UP: napi_enable()(skip)    ......
                                 │                      │
                                 ▼                      ▼
                              ......                 napi_disable()

In reset process, the driver will DOWN the port and then UINIT, in this
case, the setup tc process will UP the port before UINIT, so cause the
problem. Adds a DOWN process in UINIT to fix it.

Fixes: bb6b94a896 ("net: hns3: Add reset interface implementation in client")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-15 13:07:08 +02:00
Peiyang Wang
30545e17ea net: hns3: use the user's cfg after reset
Consider the followed case that the user change speed and reset the net
interface. Before the hw change speed successfully, the driver get old
old speed from hw by timer task. After reset, the previous speed is config
to hw. As a result, the new speed is configed successfully but lost after
PF reset. The followed pictured shows more dirrectly.

+------+              +----+                 +----+
| USER |              | PF |                 | HW |
+---+--+              +-+--+                 +-+--+
    |  ethtool -s 100G  |                      |
    +------------------>|   set speed 100G     |
    |                   +--------------------->|
    |                   |  set successfully    |
    |                   |<---------------------+---+
    |                   |query cfg (timer task)|   |
    |                   +--------------------->|   | handle speed
    |                   |     return 200G      |   | changing event
    |  ethtool --reset  |<---------------------+   | (100G)
    +------------------>|  cfg previous speed  |<--+
    |                   |  after reset (200G)  |
    |                   +--------------------->|
    |                   |                      +---+
    |                   |query cfg (timer task)|   |
    |                   +--------------------->|   | handle speed
    |                   |     return 100G      |   | changing event
    |                   |<---------------------+   | (200G)
    |                   |                      |<--+
    |                   |query cfg (timer task)|
    |                   +--------------------->|
    |                   |     return 200G      |
    |                   |<---------------------+
    |                   |                      |
    v                   v                      v

This patch save new speed if hw change speed successfully, which will be
used after reset successfully.

Fixes: 2d03eacc0b ("net: hns3: Only update mac configuation when necessary")
Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-15 13:07:08 +02:00
Jie Wang
8445d9d3c0 net: hns3: fix wrong use of semaphore up
Currently, if hns3 PF or VF FLR reset failed after five times retry,
the reset done process will directly release the semaphore
which has already released in hclge_reset_prepare_general.
This will cause down operation fail.

So this patch fixes it by adding reset state judgement. The up operation is
only called after successful PF FLR reset.

Fixes: 8627bdedc4 ("net: hns3: refactor the precedure of PF FLR")
Fixes: f28368bb45 ("net: hns3: refactor the procedure of VF FLR")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-15 13:07:08 +02:00
Matthieu Baerts (NGI0)
7965a7f32a selftests: net: lib: kill PIDs before del netns
When deleting netns, it is possible to still have some tasks running,
e.g. background tasks like tcpdump running in the background, not
stopped because the test has been interrupted.

Before deleting the netns, it is then safer to kill all attached PIDs,
if any. That should reduce some noises after the end of some tests, and
help with the debugging of some issues. That's why this modification is
seen as a "fix".

Fixes: 25ae948b44 ("selftests/net: add lib.sh")
Acked-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20240813-upstream-net-20240813-selftests-net-lib-kill-v1-1-27b689b248b8@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-15 12:57:09 +02:00
Oleksij Rempel
cdc90f7538 pse-core: Conditionally set current limit during PI regulator registration
Fix an issue where `devm_regulator_register()` would fail for PSE
controllers that do not support current limit control, such as simple
GPIO-based controllers like the podl-pse-regulator. The
`REGULATOR_CHANGE_CURRENT` flag and `max_uA` constraint are now
conditionally set only if the `pi_set_current_limit` operation is
supported. This change prevents the regulator registration routine from
attempting to call `pse_pi_set_current_limit()`, which would return
`-EOPNOTSUPP` and cause the registration to fail.

Fixes: 4a83abcef5 ("net: pse-pd: Add new power limit get and set c33 features")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Tested-by: Kyle Swenson <kyle.swenson@est.tech>
Link: https://patch.msgid.link/20240813073719.2304633-1-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-15 12:51:32 +02:00
Alex Bee
fd45cc614b drm/rockchip: inno-hdmi: Fix infoframe upload
HDMI analyser shows that the AVI infoframe is no being longer send.

The switch to the HDMI connector api should have used the frame content
which is now given in the buffer parameter, but instead still uses the
(now) empty and superfluous packed_frame variable.

Fix it.

Fixes: 65548c8ff0 ("drm/rockchip: inno_hdmi: Switch to HDMI connector")
Signed-off-by: Alex Bee <knaerzche@gmail.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240805110855.274140-2-knaerzche@gmail.com
2024-08-15 12:31:47 +02:00
Marc Zyngier
1f1b194284 net: thunder_bgx: Fix netdev structure allocation
Commit 94833addfa ("net: thunderx: Unembed netdev structure") had
a go at dynamically allocating the netdev structures for the thunderx_bgx
driver.  This change results in my ThunderX box catching fire (to be fair,
it is what it does best).

The issues with this change are that:

- bgx_lmac_enable() is called *after* bgx_acpi_register_phy() and
  bgx_init_of_phy(), both expecting netdev to be a valid pointer.

- bgx_init_of_phy() populates the MAC addresses for *all* LMACs
  attached to a given BGX instance, and thus needs netdev for each of
  them to have been allocated.

There is a few things to be said about how the driver mixes LMAC and
BGX states which leads to this sorry state, but that's beside the point.

To address this, go back to a situation where all netdev structures
are allocated before the driver starts relying on them, and move the
freeing of these structures to driver removal. Someone brave enough
can always go and restructure the driver if they want.

Fixes: 94833addfa ("net: thunderx: Unembed netdev structure")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Breno Leitao <leitao@debian.org>
Cc: Sunil Goutham <sgoutham@marvell.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20240812141322.1742918-1-maz@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-15 12:29:33 +02:00
Danielle Ratson
fde25c20f5 net: ethtool: Allow write mechanism of LPL and both LPL and EPL
CMIS 5.2 standard section 9.4.2 defines four types of firmware update
supported mechanism: None, only LPL, only EPL, both LPL and EPL.

Currently, only LPL (Local Payload) type of write firmware block is
supported. However, if the module supports both LPL and EPL the flashing
process wrongly fails for no supporting LPL.

Fix that, by allowing the write mechanism to be LPL or both LPL and
EPL.

Fixes: c4f78134d4 ("ethtool: cmis_fw_update: add a layer for supporting firmware update using CDB")
Reported-by: Vladyslav Mykhaliuk <vmykhaliuk@nvidia.com>
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20240812140824.3718826-1-danieller@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-15 12:20:14 +02:00
Cong Wang
69139d2919 vsock: fix recursive ->recvmsg calls
After a vsock socket has been added to a BPF sockmap, its prot->recvmsg
has been replaced with vsock_bpf_recvmsg(). Thus the following
recursiion could happen:

vsock_bpf_recvmsg()
 -> __vsock_recvmsg()
  -> vsock_connectible_recvmsg()
   -> prot->recvmsg()
    -> vsock_bpf_recvmsg() again

We need to fix it by calling the original ->recvmsg() without any BPF
sockmap logic in __vsock_recvmsg().

Fixes: 634f1a7110 ("vsock: support sockmap")
Reported-by: syzbot+bdb4bd87b5e22058e2a4@syzkaller.appspotmail.com
Tested-by: syzbot+bdb4bd87b5e22058e2a4@syzkaller.appspotmail.com
Cc: Bobby Eshleman <bobby.eshleman@bytedance.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20240812022153.86512-1-xiyou.wangcong@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-15 12:07:04 +02:00
Samuel Holland
f75c235565 arm64: Fix KASAN random tag seed initialization
Currently, kasan_init_sw_tags() is called before setup_per_cpu_areas(),
so per_cpu(prng_state, cpu) accesses the same address regardless of the
value of "cpu", and the same seed value gets copied to the percpu area
for every CPU. Fix this by moving the call to smp_prepare_boot_cpu(),
which is the first architecture hook after setup_per_cpu_areas().

Fixes: 3c9e3aa110 ("kasan: add tag related helper functions")
Fixes: 3f41b60938 ("kasan: fix random seed generation for tag-based mode")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Link: https://lore.kernel.org/r/20240814091005.969756-1-samuel.holland@sifive.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-08-15 11:04:56 +01:00
Griffin Kroah-Hartman
0863bffda1 Revert "serial: 8250_omap: Set the console genpd always on if no console suspend"
This reverts commit 68e6939ea9.

Kevin reported that this causes a crash during suspend on platforms that
dont use PM domains.

Link: https://lore.kernel.org/r/7ha5hgpchq.fsf@baylibre.com
Cc: Thomas Richard <thomas.richard@bootlin.com>
Fixes: 68e6939ea9 ("serial: 8250_omap: Set the console genpd always on if no console suspend")
Cc: stable <stable@kernel.org>
Reported-by: Kevin Hilman <khilman@kernel.org>
Signed-off-by: Griffin Kroah-Hartman <griffin@kroah.com>
Link: https://lore.kernel.org/r/20240814111747.82371-1-griffin@kroah.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-15 07:22:10 +02:00
Jakub Kicinski
b2ca1661c7 wireless fixes for v6.11
We have few fixes to drivers. The most important here is a fix for iwlwifi which
 caused major slowdowns for several users.
 -----BEGIN PGP SIGNATURE-----
 
 iQFFBAABCgAvFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAma85fkRHGt2YWxvQGtl
 cm5lbC5vcmcACgkQbhckVSbrbZsh7Af9GN+/ctiQx0/RTJoLn9C14KCT4ti/dkam
 ChgzoKIZBXf1YjHr/nicWIKykhiigFEaoiS0Cu6WX8Hd0xXCKBUpGxHemBergx7j
 jo3XnTOXLXKOTuUUrFJVTzawBKNEVLHLgMRM8E5i9Vif9sXcKKZeQYFpK5O0/Kkr
 ZP1LQTAJpHIdsihwfBesoagQfo23HxwELbbDOduycsgMEFpXM5MYz4BGprOU+bma
 pApctAD3wyc9mn5LZhwA9tX5OcM+54nBILe9ftUaMnSyEZHVZemaalUrhZdNYLqr
 y4CacQUN1tw2Q9b8OkREfzEUvcjhfcd1Vpg3lLAMHjwIsYDNDPH8Ng==
 =ZRLx
 -----END PGP SIGNATURE-----

Merge tag 'wireless-2024-08-14' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless

Kalle Valo says:

====================
wireless fixes for v6.11

We have few fixes to drivers. The most important here is a fix for
iwlwifi which caused major slowdowns for several users.

* tag 'wireless-2024-08-14' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: iwlwifi: correctly lookup DMA address in SG table
  wifi: mt76: mt7921: fix NULL pointer access in mt7921_ipv6_addr_change
  wifi: brcmfmac: cfg80211: Handle SSID based pmksa deletion
  wifi: rtlwifi: rtl8192du: Initialise value32 in _rtl92du_init_queue_reserved_page
  wifi: ath12k: use 128 bytes aligned iova in transmit path for WCN7850
====================

Link: https://patch.msgid.link/20240814171606.E14A0C116B1@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-14 20:40:43 -07:00
Abhinav Jain
6c569b77f0 selftest: af_unix: Fix kselftest compilation warnings
Change expected_buf from (const void *) to (const char *)
in function __recvpair().
This change fixes the below warnings during test compilation:

```
In file included from msg_oob.c:14:
msg_oob.c: In function ‘__recvpair’:

../../kselftest_harness.h:106:40: warning: format ‘%s’ expects argument
of type ‘char *’,but argument 6 has type ‘const void *’ [-Wformat=]

../../kselftest_harness.h:101:17: note: in expansion of macro ‘__TH_LOG’
msg_oob.c:235:17: note: in expansion of macro ‘TH_LOG’

../../kselftest_harness.h:106:40: warning: format ‘%s’ expects argument
of type ‘char *’,but argument 6 has type ‘const void *’ [-Wformat=]

../../kselftest_harness.h:101:17: note: in expansion of macro ‘__TH_LOG’
msg_oob.c:259:25: note: in expansion of macro ‘TH_LOG’
```

Fixes: d098d77232 ("selftest: af_unix: Add msg_oob.c.")
Signed-off-by: Abhinav Jain <jain.abhinav177@gmail.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20240814080743.1156166-1-jain.abhinav177@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-14 20:38:27 -07:00
Linus Torvalds
1fb918967b for-6.11-rc3-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAma9KksACgkQxWXV+ddt
 WDvMYw/8Ds6A3IcMd1AByPDryhHZnOpqU/YQS/HhneisTg08MYHD0TMZLX02GXw0
 vzqUyBHQ9yOYEt18SVtX67Dzapy0SWZWTK8Er/tJCn+14SLWVMdRiJCO2Y3Rdm8T
 J3S/b610iEVl5z6S6SpFD+zc56liCyVHfpK6obwSFBCzAyN6vm2p0ls5vq+hGsjb
 s/dOPJfOMnFTOFXVIumJ5KJRCubGuwG+PhZO9engwxiFIr1O4xxedzhKocXNSiiE
 +jt96gnKwW/K/Wh59YFGLbKk7h/jsIzM2CqE+JrEZlwIN+oNODFvP+Z37DbSC6jN
 0x7G8gqY8vXgxOUk1rub+IqP+/wIjMmCTzkxpO2uy80hB0h3YzqzZnUNHk6DlIu/
 zOhgcMnp5SMuvtJIXBpP2HRzzG/UbxfrjaPKDmUvwKvCUydrw0xL+XWdDMhi3bSn
 NPDW/Ixl1XGMS131YYla1v/KXTKJwZ2hK045svx7A8Aok0WaXAcYfwValQ9GMO0j
 gk89DRN0tMikBebaVD3aakE1FqyC/3dEZ80D3LSs7cgDGQA27wjdESHcSTtY373/
 +fCBXDH/N9ubanZwsu+gUEzNp3DukEoaw2r73IicxcbsiskDpEqvddcKQmrZQ1xW
 03UVziw1LpzkNSsDMZrlBwIoo5SnqbEbsEkTCMtjivJYOzhfyYk=
 =din2
 -----END PGP SIGNATURE-----

Merge tag 'for-6.11-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - extend tree-checker verification of directory item type

 - fix regression in page/folio and extent state tracking in xarray, the
   dirty status can get out of sync and can cause problems e.g. a hang

 - in send, detect last extent and allow to clone it instead of sending
   it as write, reduces amount of data transferred in the stream

 - fix checking extent references when cleaning deleted subvolumes

 - fix one more case in the extent map shrinker, let it run only in the
   kswapd context so it does not cause latency spikes during other
   operations

* tag 'for-6.11-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fix invalid mapping of extent xarray state
  btrfs: send: allow cloning non-aligned extent if it ends at i_size
  btrfs: only run the extent map shrinker from kswapd tasks
  btrfs: tree-checker: reject BTRFS_FT_UNKNOWN dir type
  btrfs: check delayed refs when we're checking if a ref exists
2024-08-14 17:56:15 -07:00
Breno Leitao
14d069d929 i2c: tegra: Do not mark ACPI devices as irq safe
On ACPI machines, the tegra i2c module encounters an issue due to a
mutex being called inside a spinlock. This leads to the following bug:

	BUG: sleeping function called from invalid context at kernel/locking/mutex.c:585
	...

	Call trace:
	__might_sleep
	__mutex_lock_common
	mutex_lock_nested
	acpi_subsys_runtime_resume
	rpm_resume
	tegra_i2c_xfer

The problem arises because during __pm_runtime_resume(), the spinlock
&dev->power.lock is acquired before rpm_resume() is called. Later,
rpm_resume() invokes acpi_subsys_runtime_resume(), which relies on
mutexes, triggering the error.

To address this issue, devices on ACPI are now marked as not IRQ-safe,
considering the dependency of acpi_subsys_runtime_resume() on mutexes.

Fixes: bd2fdedbf2 ("i2c: tegra: Add the ACPI support")
Cc: <stable@vger.kernel.org> # v5.17+
Co-developed-by: Michael van der Westhuizen <rmikey@meta.com>
Signed-off-by: Michael van der Westhuizen <rmikey@meta.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Andy Shevchenko <andy@kernel.org>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-08-15 00:22:28 +02:00
Phil Sutter
bd662c4218 netfilter: nf_tables: Add locking for NFT_MSG_GETOBJ_RESET requests
Objects' dump callbacks are not concurrency-safe per-se with reset bit
set. If two CPUs perform a reset at the same time, at least counter and
quota objects suffer from value underrun.

Prevent this by introducing dedicated locking callbacks for nfnetlink
and the asynchronous dump handling to serialize access.

Fixes: 43da04a593 ("netfilter: nf_tables: atomic dump and reset for stateful objects")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-08-14 23:44:55 +02:00
Phil Sutter
69fc3e9e90 netfilter: nf_tables: Introduce nf_tables_getobj_single
Outsource the reply skb preparation for non-dump getrule requests into a
distinct function. Prep work for object reset locking.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-08-14 23:37:44 +02:00
Phil Sutter
e0b6648b04 netfilter: nf_tables: Audit log dump reset after the fact
In theory, dumpreset may fail and invalidate the preceeding log message.
Fix this and use the occasion to prepare for object reset locking, which
benefits from a few unrelated changes:

* Add an early call to nfnetlink_unicast if not resetting which
  effectively skips the audit logging but also unindents it.
* Extract the table's name from the netlink attribute (which is verified
  via earlier table lookup) to not rely upon validity of the looked up
  table pointer.
* Do not use local variable family, it will vanish.

Fixes: 8e6cf365e1 ("audit: log nftables configuration change events")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-08-14 23:37:35 +02:00
Florian Westphal
ea2306f033 selftests: netfilter: add test for br_netfilter+conntrack+queue combination
Trigger cloned skbs leaving softirq protection.
This triggers splat without the preceeding change
("netfilter: nf_queue: drop packets with cloned unconfirmed
 conntracks"):

WARNING: at net/netfilter/nf_conntrack_core.c:1198 __nf_conntrack_confirm..

because local delivery and forwarding will race for confirmation.

Based on a reproducer script from Yi Chen.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-08-14 23:37:27 +02:00
Florian Westphal
7d8dc1c7be netfilter: nf_queue: drop packets with cloned unconfirmed conntracks
Conntrack assumes an unconfirmed entry (not yet committed to global hash
table) has a refcount of 1 and is not visible to other cores.

With multicast forwarding this assumption breaks down because such
skbs get cloned after being picked up, i.e.  ct->use refcount is > 1.

Likewise, bridge netfilter will clone broad/mutlicast frames and
all frames in case they need to be flood-forwarded during learning
phase.

For ip multicast forwarding or plain bridge flood-forward this will
"work" because packets don't leave softirq and are implicitly
serialized.

With nfqueue this no longer holds true, the packets get queued
and can be reinjected in arbitrary ways.

Disable this feature, I see no other solution.

After this patch, nfqueue cannot queue packets except the last
multicast/broadcast packet.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-08-14 23:37:23 +02:00
Donald Hunter
e976713730 netfilter: flowtable: initialise extack before use
Fix missing initialisation of extack in flow offload.

Fixes: c29f74e0df ("netfilter: nf_flow_table: hardware offload support")
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-08-14 23:37:16 +02:00
Donald Hunter
d1a7b382a9 netfilter: nfnetlink: Initialise extack before use in ACKs
Add missing extack initialisation when ACKing BATCH_BEGIN and BATCH_END.

Fixes: bf2ac490d2 ("netfilter: nfnetlink: Handle ACK flags for batch messages")
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-08-14 23:27:38 +02:00
Linus Torvalds
d07b43284a s390:
* Fix failure to start guests with kvm.use_gisa=0
 
 * Panic if (un)share fails to maintain security.
 
 ARM:
 
 * Use kvfree() for the kvmalloc'd nested MMUs array
 
 * Set of fixes to address warnings in W=1 builds
 
 * Make KVM depend on assembler support for ARMv8.4
 
 * Fix for vgic-debug interface for VMs without LPIs
 
 * Actually check ID_AA64MMFR3_EL1.S1PIE in get-reg-list selftest
 
 * Minor code / comment cleanups for configuring PAuth traps
 
 * Take kvm->arch.config_lock to prevent destruction / initialization
   race for a vCPU's CPUIF which may lead to a UAF
 
 x86:
 
 * Disallow read-only memslots for SEV-ES and SEV-SNP (and TDX)
 
 * Fix smatch issues
 
 * Small cleanups
 
 * Make x2APIC ID 100% readonly
 
 * Fix typo in uapi constant
 
 Generic:
 
 * Use synchronize_srcu_expedited() on irqfd shutdown
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAma85nQUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPq5gf8DVZjs2yNwdPLAUN7AOpElsobFnVN
 etJ1V09XsfSYMIAV6ksvC1VBzHpyOJN2QKuF0nfIiISsV8W9xLcV2rKIodsGBvdV
 K5ODL/yxeYI27t6Uferra1AGlchtn3tlpZzVarZIgRvZa3NMXaQPYJdcQr2Oybou
 7hZsboMTl6jaSl6NELzcBRksfkcOqQLQoUqVBqlkBTM3yyFRmV85BisRkOWCIBfA
 9c+kn7ZWBfOqYaXjiLeCrdCKrUuFLfR3ejJibYFan5MULYHIL95W/WJo9uQJ1QBr
 BNjMfmtVZ2JOWya40uUSKrvxJ0IErAlMNgmnpjeA4cBqYK5GRHUKvO5jNw==
 =A8SH
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "s390:

   - Fix failure to start guests with kvm.use_gisa=0

   - Panic if (un)share fails to maintain security.

  ARM:

   - Use kvfree() for the kvmalloc'd nested MMUs array

   - Set of fixes to address warnings in W=1 builds

   - Make KVM depend on assembler support for ARMv8.4

   - Fix for vgic-debug interface for VMs without LPIs

   - Actually check ID_AA64MMFR3_EL1.S1PIE in get-reg-list selftest

   - Minor code / comment cleanups for configuring PAuth traps

   - Take kvm->arch.config_lock to prevent destruction / initialization
     race for a vCPU's CPUIF which may lead to a UAF

  x86:

   - Disallow read-only memslots for SEV-ES and SEV-SNP (and TDX)

   - Fix smatch issues

   - Small cleanups

   - Make x2APIC ID 100% readonly

   - Fix typo in uapi constant

  Generic:

   - Use synchronize_srcu_expedited() on irqfd shutdown"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits)
  KVM: SEV: uapi: fix typo in SEV_RET_INVALID_CONFIG
  KVM: x86: Disallow read-only memslots for SEV-ES and SEV-SNP (and TDX)
  KVM: eventfd: Use synchronize_srcu_expedited() on shutdown
  KVM: selftests: Add a testcase to verify x2APIC is fully readonly
  KVM: x86: Make x2APIC ID 100% readonly
  KVM: x86: Use this_cpu_ptr() instead of per_cpu_ptr(smp_processor_id())
  KVM: x86: hyper-v: Remove unused inline function kvm_hv_free_pa_page()
  KVM: SVM: Fix an error code in sev_gmem_post_populate()
  KVM: SVM: Fix uninitialized variable bug
  KVM: arm64: vgic: Hold config_lock while tearing down a CPU interface
  KVM: selftests: arm64: Correct feature test for S1PIE in get-reg-list
  KVM: arm64: Tidying up PAuth code in KVM
  KVM: arm64: vgic-debug: Exit the iterator properly w/o LPI
  KVM: arm64: Enforce dependency on an ARMv8.4-aware toolchain
  s390/uv: Panic for set and remove shared access UVC errors
  KVM: s390: fix validity interception issue when gisa is switched off
  docs: KVM: Fix register ID of SPSR_FIQ
  KVM: arm64: vgic: fix unexpected unlock sparse warnings
  KVM: arm64: fix kdoc warnings in W=1 builds
  KVM: arm64: fix override-init warnings in W=1 builds
  ...
2024-08-14 13:46:24 -07:00
Evan Green
1f5288874d
RISC-V: hwprobe: Add SCALAR to misaligned perf defines
In preparation for misaligned vector performance hwprobe keys, rename
the hwprobe key values associated with misaligned scalar accesses to
include the term SCALAR. Leave the old defines in place to maintain
source compatibility.

This change is intended to be a functional no-op.

Signed-off-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240809214444.3257596-3-evan@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-14 13:13:24 -07:00
Evan Green
c42e2f0767
RISC-V: hwprobe: Add MISALIGNED_PERF key
RISCV_HWPROBE_KEY_CPUPERF_0 was mistakenly flagged as a bitmask in
hwprobe_key_is_bitmask(), when in reality it was an enum value. This
causes problems when used in conjunction with RISCV_HWPROBE_WHICH_CPUS,
since SLOW, FAST, and EMULATED have values whose bits overlap with
each other. If the caller asked for the set of CPUs that was SLOW or
EMULATED, the returned set would also include CPUs that were FAST.

Introduce a new hwprobe key, RISCV_HWPROBE_KEY_MISALIGNED_PERF, which
returns the same values in response to a direct query (with no flags),
but is properly handled as an enumerated value. As a result, SLOW,
FAST, and EMULATED are all correctly treated as distinct values under
the new key when queried with the WHICH_CPUS flag.

Leave the old key in place to avoid disturbing applications which may
have already come to rely on the key, with or without its broken
behavior with respect to the WHICH_CPUS flag.

Fixes: e178bf146e ("RISC-V: hwprobe: Introduce which-cpus flag")
Signed-off-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240809214444.3257596-2-evan@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-14 13:13:23 -07:00
Haibo Xu
a445699879
RISC-V: ACPI: NUMA: initialize all values of acpi_early_node_map to NUMA_NO_NODE
Currently, only acpi_early_node_map[0] was initialized to NUMA_NO_NODE.
To ensure all the values were properly initialized, switch to initialize
all of them to NUMA_NO_NODE.

Fixes: eabd9db64e ("ACPI: RISCV: Add NUMA support based on SRAT and SLIT")
Reported-by: Andrew Jones <ajones@ventanamicro.com>
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Link: https://lore.kernel.org/r/0d362a8ae50558b95685da4c821b2ae9e8cf78be.1722828421.git.haibo1.xu@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-14 13:12:41 -07:00
Nam Cao
57d76bc51f
riscv: change XIP's kernel_map.size to be size of the entire kernel
With XIP kernel, kernel_map.size is set to be only the size of data part of
the kernel. This is inconsistent with "normal" kernel, who sets it to be
the size of the entire kernel.

More importantly, XIP kernel fails to boot if CONFIG_DEBUG_VIRTUAL is
enabled, because there are checks on virtual addresses with the assumption
that kernel_map.size is the size of the entire kernel (these checks are in
arch/riscv/mm/physaddr.c).

Change XIP's kernel_map.size to be the size of the entire kernel.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: <stable@vger.kernel.org> # v6.1+
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240508191917.2892064-1-namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-14 13:12:33 -07:00
Celeste Liu
6111939463
riscv: entry: always initialize regs->a0 to -ENOSYS
Otherwise when the tracer changes syscall number to -1, the kernel fails
to initialize a0 with -ENOSYS and subsequently fails to return the error
code of the failed syscall to userspace. For example, it will break
strace syscall tampering.

Fixes: 52449c17bd ("riscv: entry: set a0 = -ENOSYS only when syscall != -1")
Reported-by: "Dmitry V. Levin" <ldv@strace.io>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Cc: stable@vger.kernel.org
Signed-off-by: Celeste Liu <CoelacanthusHex@gmail.com>
Link: https://lore.kernel.org/r/20240627142338.5114-2-CoelacanthusHex@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-14 13:12:22 -07:00
Tom Hughes
3cd740b985 netfilter: allow ipv6 fragments to arrive on different devices
Commit 264640fc2c ("ipv6: distinguish frag queues by device
for multicast and link-local packets") modified the ipv6 fragment
reassembly logic to distinguish frag queues by device for multicast
and link-local packets but in fact only the main reassembly code
limits the use of the device to those address types and the netfilter
reassembly code uses the device for all packets.

This means that if fragments of a packet arrive on different interfaces
then netfilter will fail to reassemble them and the fragments will be
expired without going any further through the filters.

Fixes: 648700f76b ("inet: frags: use rhashtables for reassembly units")
Signed-off-by: Tom Hughes <tom@compton.nu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-08-14 21:16:12 +02:00
Richard Fitzgerald
71833e79a4 i2c: Use IS_REACHABLE() for substituting empty ACPI functions
Replace IS_ENABLED() with IS_REACHABLE() to substitute empty stubs for:
    i2c_acpi_get_i2c_resource()
    i2c_acpi_client_count()
    i2c_acpi_find_bus_speed()
    i2c_acpi_new_device_by_fwnode()
    i2c_adapter *i2c_acpi_find_adapter_by_handle()
    i2c_acpi_waive_d0_probe()

commit f17c06c660 ("i2c: Fix conditional for substituting empty ACPI
functions") partially fixed this conditional to depend on CONFIG_I2C,
but used IS_ENABLED(), which is wrong since CONFIG_I2C is tristate.

CONFIG_ACPI is boolean but let's also change it to use IS_REACHABLE()
to future-proof it against becoming tristate.

Somehow despite testing various combinations of CONFIG_I2C and CONFIG_ACPI
we missed the combination CONFIG_I2C=m, CONFIG_ACPI=y.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: f17c06c660 ("i2c: Fix conditional for substituting empty ACPI functions")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202408141333.gYnaitcV-lkp@intel.com/
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2024-08-14 19:34:19 +02:00
Amit Shah
1c0e588169 KVM: SEV: uapi: fix typo in SEV_RET_INVALID_CONFIG
"INVALID" is misspelt in "SEV_RET_INAVLID_CONFIG". Since this is part of
the UAPI, keep the current definition and add a new one with the fix.

Fix-suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Amit Shah <amit.shah@amd.com>
Message-ID: <20240814083113.21622-1-amit@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-08-14 13:05:42 -04:00
Haibo Xu
a21dcf0ea8 arm64: ACPI: NUMA: initialize all values of acpi_early_node_map to NUMA_NO_NODE
Currently, only acpi_early_node_map[0] was initialized to NUMA_NO_NODE.
To ensure all the values were properly initialized, switch to initialize
all of them to NUMA_NO_NODE.

Fixes: e189624916 ("arm64: numa: rework ACPI NUMA initialization")
Cc: <stable@vger.kernel.org> # 4.19.x
Reported-by: Andrew Jones <ajones@ventanamicro.com>
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Link: https://lore.kernel.org/r/853d7f74aa243f6f5999e203246f0d1ae92d2b61.1722828421.git.haibo1.xu@intel.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-08-14 17:51:39 +01:00
Mark Rutland
f94511df53 arm64: uaccess: correct thinko in __get_mem_asm()
In the CONFIG_CC_HAS_ASM_GOTO_OUTPUT=y version of __get_mem_asm(), we
incorrectly use _ASM_EXTABLE_##type##ACCESS_ERR() such that upon a fault
the extable fixup handler writes -EFAULT into "%w0", which is the
register containing 'x' (the result of the load).

This was a thinko in commit:

  86a6a68feb ("arm64: start using 'asm goto' for get_user() when available")

Prior to that commit _ASM_EXTABLE_##type##ACCESS_ERR_ZERO() was used
such that the extable fixup handler wrote -EFAULT into "%w0" (the
register containing 'err'), and zero into "%w1" (the register containing
'x'). When the 'err' variable was removed, the extable entry was updated
incorrectly.

Writing -EFAULT to the value register is unnecessary but benign:

* We never want -EFAULT in the value register, and previously this would
  have been zeroed in the extable fixup handler.

* In __get_user_error() the value is overwritten with zero explicitly in
  the error path.

* The asm goto outputs cannot be used when the goto label is taken, as
  older compilers (e.g. clang < 16.0.0) do not guarantee that asm goto
  outputs are usable in this path and may use a stale value rather than
  the value in an output register. Consequently, zeroing in the extable
  fixup handler is insufficient to ensure callers see zero in the error
  path.

* The expected usage of unsafe_get_user() and get_kernel_nofault()
  requires that the value is not consumed in the error path.

Some versions of GCC would mis-compile asm goto with outputs, and
erroneously omit subsequent assignments, breaking the error path
handling in __get_user_error(). This was discussed at:

  https://lore.kernel.org/lkml/ZpfxLrJAOF2YNqCk@J2N7QTR9R3.cambridge.arm.com/

... and was fixed by removing support for asm goto with outputs on those
broken compilers in commit:

  f2f6a8e887 ("init/Kconfig: remove CONFIG_GCC_ASM_GOTO_OUTPUT_WORKAROUND")

With that out of the way, we can safely replace the usage of
_ASM_EXTABLE_##type##ACCESS_ERR() with _ASM_EXTABLE_##type##ACCESS(),
leaving the value register unchanged in the case a fault is taken, as
was originally intended. This matches other architectures and matches
our __put_mem_asm().

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240807103731.2498893-1-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-08-14 17:51:11 +01:00
Sean Christopherson
66155de93b KVM: x86: Disallow read-only memslots for SEV-ES and SEV-SNP (and TDX)
Disallow read-only memslots for SEV-{ES,SNP} VM types, as KVM can't
directly emulate instructions for ES/SNP, and instead the guest must
explicitly request emulation.  Unless the guest explicitly requests
emulation without accessing memory, ES/SNP relies on KVM creating an MMIO
SPTE, with the subsequent #NPF being reflected into the guest as a #VC.

But for read-only memslots, KVM deliberately doesn't create MMIO SPTEs,
because except for ES/SNP, doing so requires setting reserved bits in the
SPTE, i.e. the SPTE can't be readable while also generating a #VC on
writes.  Because KVM never creates MMIO SPTEs and jumps directly to
emulation, the guest never gets a #VC.  And since KVM simply resumes the
guest if ES/SNP guests trigger emulation, KVM effectively puts the vCPU
into an infinite #NPF loop if the vCPU attempts to write read-only memory.

Disallow read-only memory for all VMs with protected state, i.e. for
upcoming TDX VMs as well as ES/SNP VMs.  For TDX, it's actually possible
to support read-only memory, as TDX uses EPT Violation #VE to reflect the
fault into the guest, e.g. KVM could configure read-only SPTEs with RX
protections and SUPPRESS_VE=0.  But there is no strong use case for
supporting read-only memslots on TDX, e.g. the main historical usage is
to emulate option ROMs, but TDX disallows executing from shared memory.
And if someone comes along with a legitimate, strong use case, the
restriction can always be lifted for TDX.

Don't bother trying to retroactively apply the restriction to SEV-ES
VMs that are created as type KVM_X86_DEFAULT_VM.  Read-only memslots can't
possibly work for SEV-ES, i.e. disallowing such memslots is really just
means reporting an error to userspace instead of silently hanging vCPUs.
Trying to deal with the ordering between KVM_SEV_INIT and memslot creation
isn't worth the marginal benefit it would provide userspace.

Fixes: 26c44aa9e0 ("KVM: SEV: define VM types for SEV and SEV-ES")
Fixes: 1dfe571c12 ("KVM: SEV: Add initial SEV-SNP support")
Cc: Peter Gonda <pgonda@google.com>
Cc: Michael Roth <michael.roth@amd.com>
Cc: Vishal Annapurve <vannapurve@google.com>
Cc: Ackerly Tng <ackerleytng@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20240809190319.1710470-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-08-14 12:28:24 -04:00
Linus Torvalds
9d5906799f selinux/stable-6.11 PR 20240814
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAma8yoAUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXOJWw/8D8ZhOLdK9HFKpywIA855CHXbTVHz
 VSM9avEykSHTPqWhGx9dqaS8hG/xG7Fh5t3Me2goh0E7dAZjJQCGf/PXyh8Ntotj
 5QJB3CEzi9o4yM5yCgNgEYtnVdP52Y8SxfzE9TQWST7EbWsrWDjE68a/08h0cRkr
 XTScFIB+SVHOJGBcdSKdYyVqnpgoE8OeWOxVa2OnThug8VPB5WMmJIPkMQ2ykOR4
 4dGykI8ULNSiZ+0skfdhxXaykYDdajzCSrk21OP2iaD/41X+I627HaJzCxa3m9s0
 z04XtaOV31Jye8McPF6dCHtloFQvZ/deXXeow2yytR7WnT5/KDwBt+8A7mDcKL0V
 uhW9DiwxCit6Pao4a/YuEerXqxSH4DdJmTCbqrHITnQWdAW9UQ0jsY1msxwal/aq
 l9kK1SJoB3/5A0Dcsc57TFCvh2dhwRnAU3Ik00SCW2LlZrsCmBSPMRCsTHDBVg49
 R5ya6qFB9+KEcAjZgJqa15X8WC8cmx+/96mDScd6VjS2Eyh57tZmJgonjlUwJplH
 A1erboJtlo2NHzMaxmIXEsaxJYvrOcwKDeQ80H81SBIDPLweW2FYDVZpp4/nx++2
 +8myHz9KM7iWFVceJuqvNRix4iR3RkXxvgocm1CzUedgXYUJOVM/li+0pjgJ/bUM
 gPo7Sr3BAk9T43k=
 =B712
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20240814' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux fixes from Paul Moore:

 - Fix a xperms counting problem where we adding to the xperms count
   even if we failed to add the xperm.

 - Propogate errors from avc_add_xperms_decision() back to the caller so
   that we can trigger the proper cleanup and error handling.

 - Revert our use of vma_is_initial_heap() in favor of our older logic
   as vma_is_initial_heap() doesn't correctly handle the no-heap case
   and it is causing issues with the SELinux process/execheap access
   control. While the older SELinux logic may not be perfect, it
   restores the expected user visible behavior.

   Hopefully we will be able to resolve the problem with the
   vma_is_initial_heap() macro with the mm folks, but we need to fix
   this in the meantime.

* tag 'selinux-pr-20240814' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: revert our use of vma_is_initial_heap()
  selinux: add the processing of the failure of avc_add_xperms_decision()
  selinux: fix potential counting error in avc_add_xperms_decision()
2024-08-14 09:23:20 -07:00
Linus Torvalds
4ac0f08f44 vfs-6.11-rc4.fixes
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZrym4AAKCRCRxhvAZXjc
 oqT3AP9ydoUNavaZcRayH8r3ybvz9+aJGJ6Q7NznFVCk71vn0gD/buLzmq96Muns
 M5DWHbft2AFwK0Rz2nx8j5OXUeHwrQg=
 =HZBL
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.11-rc4.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:
 "VFS:

   - Fix the name of file lease slab cache. When file leases were split
     out of file locks the name of the file lock slab cache was used for
     the file leases slab cache as well.

   - Fix a type in take_fd() helper.

   - Fix infinite directory iteration for stable offsets in tmpfs.

   - When the icache is pruned all reclaimable inodes are marked with
     I_FREEING and other processes that try to lookup such inodes will
     block.

     But some filesystems like ext4 can trigger lookups in their inode
     evict callback causing deadlocks. Ext4 does such lookups if the
     ea_inode feature is used whereby a separate inode may be used to
     store xattrs.

     Introduce I_LRU_ISOLATING which pins the inode while its pages are
     reclaimed. This avoids inode deletion during inode_lru_isolate()
     avoiding the deadlock and evict is made to wait until
     I_LRU_ISOLATING is done.

  netfs:

   - Fault in smaller chunks for non-large folio mappings for
     filesystems that haven't been converted to large folios yet.

   - Fix the CONFIG_NETFS_DEBUG config option. The config option was
     renamed a short while ago and that introduced two minor issues.
     First, it depended on CONFIG_NETFS whereas it wants to depend on
     CONFIG_NETFS_SUPPORT. The former doesn't exist, while the latter
     does. Second, the documentation for the config option wasn't fixed
     up.

   - Revert the removal of the PG_private_2 writeback flag as ceph is
     using it and fix how that flag is handled in netfs.

   - Fix DIO reads on 9p. A program watching a file on a 9p mount
     wouldn't see any changes in the size of the file being exported by
     the server if the file was changed directly in the source
     filesystem. Fix this by attempting to read the full size specified
     when a DIO read is requested.

   - Fix a NULL pointer dereference bug due to a data race where a
     cachefiles cookies was retired even though it was still in use.
     Check the cookie's n_accesses counter before discarding it.

  nsfs:

   - Fix ioctl declaration for NS_GET_MNTNS_ID from _IO() to _IOR() as
     the kernel is writing to userspace.

  pidfs:

   - Prevent the creation of pidfds for kthreads until we have a
     use-case for it and we know the semantics we want. It also confuses
     userspace why they can get pidfds for kthreads.

  squashfs:

   - Fix an unitialized value bug reported by KMSAN caused by a
     corrupted symbolic link size read from disk. Check that the
     symbolic link size is not larger than expected"

* tag 'vfs-6.11-rc4.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  Squashfs: sanity check symbolic link size
  9p: Fix DIO read through netfs
  vfs: Don't evict inode under the inode lru traversing context
  netfs: Fix handling of USE_PGPRIV2 and WRITE_TO_CACHE flags
  netfs, ceph: Revert "netfs: Remove deprecated use of PG_private_2 as a second writeback flag"
  file: fix typo in take_fd() comment
  pidfd: prevent creation of pidfds for kthreads
  netfs: clean up after renaming FSCACHE_DEBUG config
  libfs: fix infinite directory reads for offset dir
  nsfs: fix ioctl declaration
  fs/netfs/fscache_cookie: add missing "n_accesses" check
  filelock: fix name of file_lease slab cache
  netfs: Fault in smaller chunks for non-large folio mappings
2024-08-14 09:06:28 -07:00
Linus Torvalds
02f8ca3d49 bpf-6.11-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAma75lcACgkQ6rmadz2v
 bToAARAAjGtU7TXWr7mRO9IuwjhbnIv4STWsfZQAOm9qIVxl72vqE6YQVISSI6+l
 lBgVbrzvrd5DhO3y/xz5/4e/uS2zrS566jiwI+g7rvHYTI4CWBXrfewXIwPEjprT
 nXntq+tRM7DpdiZk3KmO3nf/CQyj0Pft5AJQR3B3fmehoqjrhDNk6YnSum1SoRPx
 yBparmWpIn4ysk2y0RCB4bpzjQIxCQwhYAVpQeiFar7blfWMKGYi+/ti0fPnvWFG
 atUU7JZhIBaXcDfVEC82vECd/CkxZpWrgtv9kgp3b75ylE3vhR6C9Kq2nd0zBOFU
 +wqrAwStLRDFnv+WkAHkPH2vIXQ+9ACwefINYrzzjFmoQr75jEYlo7oal2GU7sQI
 Y5M9qGW6HcVUtw9soGT5eoqlwK2Uot/FpvsAVKCZq6TcniCGDeQomXEy6Rmc5aSz
 HHIbmYSsSwuoWkXZ8Gn2He2QKTmGSG9d3EII8Ge6C7Ev2Rx1ARLZKG+jAy+igMfn
 QDL9capp3LSTvbPGYU1z0llN6nW4u9JrkCPjFdopwM8RZoXphArtZMki87StXmH/
 eZxA6PEJcFdBhCUkge14s/D7YW+dw05YBRsJ6QnEQULHx6pd77jlhT4Sv+U/SGL/
 ghtmqIdT8JreL4e4dCS/sH7fkaT9ZmO12Awi3dPz8cs/B5LwdWY=
 =sGTX
 -----END PGP SIGNATURE-----

Merge tag 'bpf-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Alexei Starovoitov:

 - Fix bpftrace regression from Kyle Huey.

   Tracing bpf prog was called with perf_event input arguments causing
   bpftrace produce garbage output.

 - Fix verifier crash in stacksafe() from Yonghong Song.

   Daniel Hodges reported verifier crash when playing with sched-ext.
   The stack depth in the known verifier state was larger than stack
   depth in being explored state causing out-of-bounds access.

 - Fix update of freplace prog in prog_array from Leon Hwang.

   freplace prog type wasn't recognized correctly.

* tag 'bpf-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  perf/bpf: Don't call bpf_overflow_handler() for tracing events
  selftests/bpf: Add a test to verify previous stacksafe() fix
  bpf: Fix a kernel verifier crash in stacksafe()
  bpf: Fix updating attached freplace prog in prog_array map
2024-08-14 08:57:24 -07:00