Commit Graph

1309651 Commits

Author SHA1 Message Date
Eric Dumazet
3cb7cf1540 net/sched: accept TCA_STAB only for root qdisc
Most qdiscs maintain their backlog using qdisc_pkt_len(skb)
on the assumption it is invariant between the enqueue()
and dequeue() handlers.

Unfortunately syzbot can crash a host rather easily using
a TBF + SFQ combination, with an STAB on SFQ [1]

We can't support TCA_STAB on arbitrary level, this would
require to maintain per-qdisc storage.

[1]
[   88.796496] BUG: kernel NULL pointer dereference, address: 0000000000000000
[   88.798611] #PF: supervisor read access in kernel mode
[   88.799014] #PF: error_code(0x0000) - not-present page
[   88.799506] PGD 0 P4D 0
[   88.799829] Oops: Oops: 0000 [#1] SMP NOPTI
[   88.800569] CPU: 14 UID: 0 PID: 2053 Comm: b371744477 Not tainted 6.12.0-rc1-virtme #1117
[   88.801107] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[   88.801779] RIP: 0010:sfq_dequeue (net/sched/sch_sfq.c:272 net/sched/sch_sfq.c:499) sch_sfq
[ 88.802544] Code: 0f b7 50 12 48 8d 04 d5 00 00 00 00 48 89 d6 48 29 d0 48 8b 91 c0 01 00 00 48 c1 e0 03 48 01 c2 66 83 7a 1a 00 7e c0 48 8b 3a <4c> 8b 07 4c 89 02 49 89 50 08 48 c7 47 08 00 00 00 00 48 c7 07 00
All code
========
   0:	0f b7 50 12          	movzwl 0x12(%rax),%edx
   4:	48 8d 04 d5 00 00 00 	lea    0x0(,%rdx,8),%rax
   b:	00
   c:	48 89 d6             	mov    %rdx,%rsi
   f:	48 29 d0             	sub    %rdx,%rax
  12:	48 8b 91 c0 01 00 00 	mov    0x1c0(%rcx),%rdx
  19:	48 c1 e0 03          	shl    $0x3,%rax
  1d:	48 01 c2             	add    %rax,%rdx
  20:	66 83 7a 1a 00       	cmpw   $0x0,0x1a(%rdx)
  25:	7e c0                	jle    0xffffffffffffffe7
  27:	48 8b 3a             	mov    (%rdx),%rdi
  2a:*	4c 8b 07             	mov    (%rdi),%r8		<-- trapping instruction
  2d:	4c 89 02             	mov    %r8,(%rdx)
  30:	49 89 50 08          	mov    %rdx,0x8(%r8)
  34:	48 c7 47 08 00 00 00 	movq   $0x0,0x8(%rdi)
  3b:	00
  3c:	48                   	rex.W
  3d:	c7                   	.byte 0xc7
  3e:	07                   	(bad)
	...

Code starting with the faulting instruction
===========================================
   0:	4c 8b 07             	mov    (%rdi),%r8
   3:	4c 89 02             	mov    %r8,(%rdx)
   6:	49 89 50 08          	mov    %rdx,0x8(%r8)
   a:	48 c7 47 08 00 00 00 	movq   $0x0,0x8(%rdi)
  11:	00
  12:	48                   	rex.W
  13:	c7                   	.byte 0xc7
  14:	07                   	(bad)
	...
[   88.803721] RSP: 0018:ffff9a1f892b7d58 EFLAGS: 00000206
[   88.804032] RAX: 0000000000000000 RBX: ffff9a1f8420c800 RCX: ffff9a1f8420c800
[   88.804560] RDX: ffff9a1f81bc1440 RSI: 0000000000000000 RDI: 0000000000000000
[   88.805056] RBP: ffffffffc04bb0e0 R08: 0000000000000001 R09: 00000000ff7f9a1f
[   88.805473] R10: 000000000001001b R11: 0000000000009a1f R12: 0000000000000140
[   88.806194] R13: 0000000000000001 R14: ffff9a1f886df400 R15: ffff9a1f886df4ac
[   88.806734] FS:  00007f445601a740(0000) GS:ffff9a2e7fd80000(0000) knlGS:0000000000000000
[   88.807225] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   88.807672] CR2: 0000000000000000 CR3: 000000050cc46000 CR4: 00000000000006f0
[   88.808165] Call Trace:
[   88.808459]  <TASK>
[   88.808710] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
[   88.809261] ? page_fault_oops (arch/x86/mm/fault.c:715)
[   88.809561] ? exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:87 ./arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539)
[   88.809806] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
[   88.810074] ? sfq_dequeue (net/sched/sch_sfq.c:272 net/sched/sch_sfq.c:499) sch_sfq
[   88.810411] sfq_reset (net/sched/sch_sfq.c:525) sch_sfq
[   88.810671] qdisc_reset (./include/linux/skbuff.h:2135 ./include/linux/skbuff.h:2441 ./include/linux/skbuff.h:3304 ./include/linux/skbuff.h:3310 net/sched/sch_generic.c:1036)
[   88.810950] tbf_reset (./include/linux/timekeeping.h:169 net/sched/sch_tbf.c:334) sch_tbf
[   88.811208] qdisc_reset (./include/linux/skbuff.h:2135 ./include/linux/skbuff.h:2441 ./include/linux/skbuff.h:3304 ./include/linux/skbuff.h:3310 net/sched/sch_generic.c:1036)
[   88.811484] netif_set_real_num_tx_queues (./include/linux/spinlock.h:396 ./include/net/sch_generic.h:768 net/core/dev.c:2958)
[   88.811870] __tun_detach (drivers/net/tun.c:590 drivers/net/tun.c:673)
[   88.812271] tun_chr_close (drivers/net/tun.c:702 drivers/net/tun.c:3517)
[   88.812505] __fput (fs/file_table.c:432 (discriminator 1))
[   88.812735] task_work_run (kernel/task_work.c:230)
[   88.813016] do_exit (kernel/exit.c:940)
[   88.813372] ? trace_hardirqs_on (kernel/trace/trace_preemptirq.c:58 (discriminator 4))
[   88.813639] ? handle_mm_fault (./arch/x86/include/asm/irqflags.h:42 ./arch/x86/include/asm/irqflags.h:97 ./arch/x86/include/asm/irqflags.h:155 ./include/linux/memcontrol.h:1022 ./include/linux/memcontrol.h:1045 ./include/linux/memcontrol.h:1052 mm/memory.c:5928 mm/memory.c:6088)
[   88.813867] do_group_exit (kernel/exit.c:1070)
[   88.814138] __x64_sys_exit_group (kernel/exit.c:1099)
[   88.814490] x64_sys_call (??:?)
[   88.814791] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1))
[   88.815012] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
[   88.815495] RIP: 0033:0x7f44560f1975

Fixes: 175f9c1bba ("net_sched: Add size table for qdiscs")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Link: https://patch.msgid.link/20241007184130.3960565-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08 15:38:56 -07:00
Yu Liao
34d5b60017 selftests: vDSO: Explicitly include sched.h
The previous commit introduced the use of CLONE_NEWTIME without including
<sched.h> which contains its definition.

Add an explicit include of <sched.h> to ensure that CLONE_NEWTIME
is correctly defined before it is used.

Fixes: 2aec90036d ("selftests: vDSO: ensure vgetrandom works in a time namespace")
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-10-08 15:48:09 -06:00
Vitaly Lifshits
9d9e5347b0 e1000e: change I219 (19) devices to ADP
Sporadic issues, such as PHY access loss, have been observed on I219 (19)
devices. It was found that these devices have hardware more closely
related to ADP than MTP and the issues were caused by taking MTP-specific
flows.

Change the MAC and board types of these devices from MTP to ADP to
correctly reflect the LAN hardware, and flows, of these devices.

Fixes: db2d737d63 ("e1000e: Separate MTP board type from ADP")
Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-10-08 14:39:22 -07:00
Mohamed Khalfella
330a699ecb igb: Do not bring the device up after non-fatal error
Commit 004d25060c ("igb: Fix igb_down hung on surprise removal")
changed igb_io_error_detected() to ignore non-fatal pcie errors in order
to avoid hung task that can happen when igb_down() is called multiple
times. This caused an issue when processing transient non-fatal errors.
igb_io_resume(), which is called after igb_io_error_detected(), assumes
that device is brought down by igb_io_error_detected() if the interface
is up. This resulted in panic with stacktrace below.

[ T3256] igb 0000:09:00.0 haeth0: igb: haeth0 NIC Link is Down
[  T292] pcieport 0000:00:1c.5: AER: Uncorrected (Non-Fatal) error received: 0000:09:00.0
[  T292] igb 0000:09:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[  T292] igb 0000:09:00.0:   device [8086:1537] error status/mask=00004000/00000000
[  T292] igb 0000:09:00.0:    [14] CmpltTO [  200.105524,009][  T292] igb 0000:09:00.0: AER:   TLP Header: 00000000 00000000 00000000 00000000
[  T292] pcieport 0000:00:1c.5: AER: broadcast error_detected message
[  T292] igb 0000:09:00.0: Non-correctable non-fatal error reported.
[  T292] pcieport 0000:00:1c.5: AER: broadcast mmio_enabled message
[  T292] pcieport 0000:00:1c.5: AER: broadcast resume message
[  T292] ------------[ cut here ]------------
[  T292] kernel BUG at net/core/dev.c:6539!
[  T292] invalid opcode: 0000 [#1] PREEMPT SMP
[  T292] RIP: 0010:napi_enable+0x37/0x40
[  T292] Call Trace:
[  T292]  <TASK>
[  T292]  ? die+0x33/0x90
[  T292]  ? do_trap+0xdc/0x110
[  T292]  ? napi_enable+0x37/0x40
[  T292]  ? do_error_trap+0x70/0xb0
[  T292]  ? napi_enable+0x37/0x40
[  T292]  ? napi_enable+0x37/0x40
[  T292]  ? exc_invalid_op+0x4e/0x70
[  T292]  ? napi_enable+0x37/0x40
[  T292]  ? asm_exc_invalid_op+0x16/0x20
[  T292]  ? napi_enable+0x37/0x40
[  T292]  igb_up+0x41/0x150
[  T292]  igb_io_resume+0x25/0x70
[  T292]  report_resume+0x54/0x70
[  T292]  ? report_frozen_detected+0x20/0x20
[  T292]  pci_walk_bus+0x6c/0x90
[  T292]  ? aer_print_port_info+0xa0/0xa0
[  T292]  pcie_do_recovery+0x22f/0x380
[  T292]  aer_process_err_devices+0x110/0x160
[  T292]  aer_isr+0x1c1/0x1e0
[  T292]  ? disable_irq_nosync+0x10/0x10
[  T292]  irq_thread_fn+0x1a/0x60
[  T292]  irq_thread+0xe3/0x1a0
[  T292]  ? irq_set_affinity_notifier+0x120/0x120
[  T292]  ? irq_affinity_notify+0x100/0x100
[  T292]  kthread+0xe2/0x110
[  T292]  ? kthread_complete_and_exit+0x20/0x20
[  T292]  ret_from_fork+0x2d/0x50
[  T292]  ? kthread_complete_and_exit+0x20/0x20
[  T292]  ret_from_fork_asm+0x11/0x20
[  T292]  </TASK>

To fix this issue igb_io_resume() checks if the interface is running and
the device is not down this means igb_io_error_detected() did not bring
the device down and there is no need to bring it up.

Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
Reviewed-by: Yuanyuan Zhong <yzhong@purestorage.com>
Fixes: 004d25060c ("igb: Fix igb_down hung on surprise removal")
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-10-08 14:39:21 -07:00
Aleksandr Loktionov
dac6c7b3d3 i40e: Fix macvlan leak by synchronizing access to mac_filter_hash
This patch addresses a macvlan leak issue in the i40e driver caused by
concurrent access to vsi->mac_filter_hash. The leak occurs when multiple
threads attempt to modify the mac_filter_hash simultaneously, leading to
inconsistent state and potential memory leaks.

To fix this, we now wrap the calls to i40e_del_mac_filter() and zeroing
vf->default_lan_addr.addr with spin_lock/unlock_bh(&vsi->mac_filter_hash_lock),
ensuring atomic operations and preventing concurrent access.

Additionally, we add lockdep_assert_held(&vsi->mac_filter_hash_lock) in
i40e_add_mac_filter() to help catch similar issues in the future.

Reproduction steps:
1. Spawn VFs and configure port vlan on them.
2. Trigger concurrent macvlan operations (e.g., adding and deleting
	portvlan and/or mac filters).
3. Observe the potential memory leak and inconsistent state in the
	mac_filter_hash.

This synchronization ensures the integrity of the mac_filter_hash and prevents
the described leak.

Fixes: fed0d9f132 ("i40e: Fix VF's MAC Address change on VM")
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-10-08 14:39:21 -07:00
Jason A. Donenfeld
3953a1d137 selftests: vDSO: improve getrandom and chacha error messages
Improve the error and skip condition messages to let the developer know
precisely where a test has failed. Also make better use of the ksft api
for this.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-10-08 15:21:52 -06:00
Jason A. Donenfeld
fe6305cbc7 selftests: vDSO: unconditionally build getrandom test
Rather than building on supported archs, build on all archs, and then
use the presence of the symbol in the vDSO to either skip the test or
move forward with it.

Note that this means that this test no longer checks whether the symbol
was correctly added to the kernel. But hopefully this will be clear
enough to developers and we'll cross our fingers that symbols aren't
removed by accident and not caught after this change.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-10-08 15:21:36 -06:00
Jason A. Donenfeld
3b5992eaf7 selftests: vDSO: unconditionally build chacha test
Rather than using symlinks to find the vgetrandom-chacha.S file for each
arch, store this in a file that uses the compiler to determine
architecture, and then make use of weak symbols to skip the test on
architectures that don't provide the code.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-10-08 15:21:29 -06:00
Marcin Szycik
bce9af1b03 ice: Fix increasing MSI-X on VF
Increasing MSI-X value on a VF leads to invalid memory operations. This
is caused by not reallocating some arrays.

Reproducer:
  modprobe ice
  echo 0 > /sys/bus/pci/devices/$PF_PCI/sriov_drivers_autoprobe
  echo 1 > /sys/bus/pci/devices/$PF_PCI/sriov_numvfs
  echo 17 > /sys/bus/pci/devices/$VF0_PCI/sriov_vf_msix_count

Default MSI-X is 16, so 17 and above triggers this issue.

KASAN reports:

  BUG: KASAN: slab-out-of-bounds in ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice]
  Read of size 8 at addr ffff8888b937d180 by task bash/28433
  (...)

  Call Trace:
   (...)
   ? ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice]
   kasan_report+0xed/0x120
   ? ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice]
   ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice]
   ice_vsi_cfg_def+0x3360/0x4770 [ice]
   ? mutex_unlock+0x83/0xd0
   ? __pfx_ice_vsi_cfg_def+0x10/0x10 [ice]
   ? __pfx_ice_remove_vsi_lkup_fltr+0x10/0x10 [ice]
   ice_vsi_cfg+0x7f/0x3b0 [ice]
   ice_vf_reconfig_vsi+0x114/0x210 [ice]
   ice_sriov_set_msix_vec_count+0x3d0/0x960 [ice]
   sriov_vf_msix_count_store+0x21c/0x300
   (...)

  Allocated by task 28201:
   (...)
   ice_vsi_cfg_def+0x1c8e/0x4770 [ice]
   ice_vsi_cfg+0x7f/0x3b0 [ice]
   ice_vsi_setup+0x179/0xa30 [ice]
   ice_sriov_configure+0xcaa/0x1520 [ice]
   sriov_numvfs_store+0x212/0x390
   (...)

To fix it, use ice_vsi_rebuild() instead of ice_vf_reconfig_vsi(). This
causes the required arrays to be reallocated taking the new queue count
into account (ice_vsi_realloc_stat_arrays()). Set req_txq and req_rxq
before ice_vsi_rebuild(), so that realloc uses the newly set queue
count.

Additionally, ice_vsi_rebuild() does not remove VSI filters
(ice_fltr_remove_all()), so ice_vf_init_host_cfg() is no longer
necessary.

Reported-by: Jacob Keller <jacob.e.keller@intel.com>
Fixes: 2a2cb4c6c1 ("ice: replace ice_vf_recreate_vsi() with ice_vf_reconfig_vsi()")
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-10-08 14:08:19 -07:00
Wojciech Drewek
fbcb968a98 ice: Flush FDB entries before reset
Triggering the reset while in switchdev mode causes
errors[1]. Rules are already removed by this time
because switch content is flushed in case of the reset.
This means that rules were deleted from HW but SW
still thinks they exist so when we get
SWITCHDEV_FDB_DEL_TO_DEVICE notification we try to
delete not existing rule.

We can avoid these errors by clearing the rules
early in the reset flow before they are removed from HW.
Switchdev API will get notified that the rule was removed
so we won't get SWITCHDEV_FDB_DEL_TO_DEVICE notification.
Remove unnecessary ice_clear_sw_switch_recipes.

[1]
ice 0000:01:00.0: Failed to delete FDB forward rule, err: -2
ice 0000:01:00.0: Failed to delete FDB guard rule, err: -2

Fixes: 7c945a1a8e ("ice: Switchdev FDB events support")
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-10-08 14:08:19 -07:00
Marcin Szycik
8e60dbcbaa ice: Fix netif_is_ice() in Safe Mode
netif_is_ice() works by checking the pointer to netdev ops. However, it
only checks for the default ice_netdev_ops, not ice_netdev_safe_mode_ops,
so in Safe Mode it always returns false, which is unintuitive. While it
doesn't look like netif_is_ice() is currently being called anywhere in Safe
Mode, this could change and potentially lead to unexpected behaviour.

Fixes: df006dd4b1 ("ice: Add initial support framework for LAG")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-10-08 14:08:19 -07:00
Marcin Szycik
b972060a47 ice: Fix entering Safe Mode
If DDP package is missing or corrupted, the driver should enter Safe Mode.
Instead, an error is returned and probe fails.

To fix this, don't exit init if ice_init_ddp_config() returns an error.

Repro:
* Remove or rename DDP package (/lib/firmware/intel/ice/ddp/ice.pkg)
* Load ice

Fixes: cc5776fe18 ("ice: Enable switching default Tx scheduler topology")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-10-08 14:08:19 -07:00
Linus Torvalds
75b607fab3 sched_ext: Fixes for v6.12-rc2
- ops.enqueue() didn't have a way to tell whether select_task_rq_scx() and
   thus ops.select() were skipped. Some schedulers were incorrectly using
   SCX_ENQ_WAKEUP. Add SCX_ENQ_CPU_SELECTED and fix scx_qmap using it.
 
 - Remove a spurious WARN_ON_ONCE() in scx_cgroup_exit().
 
 - Fix error information clobbering during load.
 
 - Add missing __weak markers to BPF helper declarations.
 
 - Doc update.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZwWKkA4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGelnAQDTA8GSIahTEHKM0c3yXE6K1/M56zo8Spp5OOA7
 kXHR3AD/Y0RcXgaCvMI13aozmQWq756gyB6/qczN0+X3jx6wZwI=
 =6xbe
 -----END PGP SIGNATURE-----

Merge tag 'sched_ext-for-6.12-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext

Pull sched_ext fixes from Tejun Heo:

 - ops.enqueue() didn't have a way to tell whether select_task_rq_scx()
   and thus ops.select() were skipped. Some schedulers were incorrectly
   using SCX_ENQ_WAKEUP. Add SCX_ENQ_CPU_SELECTED and fix scx_qmap using
   it.

 - Remove a spurious WARN_ON_ONCE() in scx_cgroup_exit()

 - Fix error information clobbering during load

 - Add missing __weak markers to BPF helper declarations

 - Doc update

* tag 'sched_ext-for-6.12-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
  sched_ext: Documentation: Update instructions for running example schedulers
  sched_ext, scx_qmap: Add and use SCX_ENQ_CPU_SELECTED
  sched/core: Add ENQUEUE_RQ_SELECTED to indicate whether ->select_task_rq() was called
  sched/core: Make select_task_rq() take the pointer to wake_flags instead of value
  sched_ext: scx_cgroup_exit() may be called without successful scx_cgroup_init()
  sched_ext: Improve error reporting during loading
  sched_ext: Add __weak markers to BPF helper function decalarations
2024-10-08 12:54:04 -07:00
Uwe Kleine-König
01ecc142ef fbdev: Switch back to struct platform_driver::remove()
After commit 0edb555a65 ("platform: Make platform_driver::remove()
return void") .remove() is (again) the right callback to implement for
platform drivers.

Convert all platform drivers below drivers/video/fbdev to use .remove(),
with the eventual goal to drop struct platform_driver::remove_new(). As
.remove() and .remove_new() have the same prototypes, conversion is done
by just changing the structure member name in the driver initializer.

While touching these files, make indention of the struct initializer
consistent in several files.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Helge Deller <deller@gmx.de>
2024-10-08 21:47:18 +02:00
Zhang Rui
3fb0eea8a1 thermal: intel: int340x: processor: Add MMIO RAPL PL4 support
Similar to the MSR RAPL interface, MMIO RAPL supports PL4 too, so add
MMIO RAPL PL4d support to the processor_thermal driver.

As a result, the powercap sysfs for MMIO RAPL will show a new "peak
power" constraint.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20240930081801.28502-7-rui.zhang@intel.com
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-08 21:39:33 +02:00
Zhang Rui
bfc6819e4b thermal: intel: int340x: processor: Remove MMIO RAPL CPU hotplug support
CPU0/package0 is always online and the MMIO RAPL driver runs on single
package systems only, so there is no need to handle CPU hotplug in it.

Always register a RAPL package device for package 0 and remove the
unnecessary CPU hotplug support.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20240930081801.28502-6-rui.zhang@intel.com
[ rjw: Subject edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-08 21:39:33 +02:00
Sumeet Pawnikar
f517ff174a powercap: intel_rapl_msr: Add PL4 support for Arrowlake-U
Add PL4 support for ArrowLake-U platform.

Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20240930081801.28502-5-rui.zhang@intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-08 21:39:33 +02:00
Zhang Rui
1d39092397 powercap: intel_rapl_tpmi: Ignore minor version change
The hardware definition of every TPMI feature contains a major and minor
version. When there is a change in the MMIO offset or change in the
definition of a field, hardware will change major version. For addition
of new fields without modifying existing MMIO offsets or fields, only
the minor version is changed.

If the driver has not been updated to recognize a new hardware major
version, it cannot provide the RAPL interface to users due to possible
register layout incompatibilities. However, the driver does not need to
be updated every time the hardware minor version changes because in that
case it will just miss some new functionality exposed by the hardware.

The current implementation causes the driver to refuse to work for any
hardware version change which is unnecessarily restrictive.

If there is a minor version mismatch, log an information message and
continue, but if there is a major version mismatch, log a warning and
exit (as before).

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20240930081801.28502-4-rui.zhang@intel.com
Fixes: 9eef7f9da9 ("powercap: intel_rapl: Introduce RAPL TPMI interface driver")
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-08 21:38:49 +02:00
Devaansh-Kumar
e0ed52154e sched_ext: Documentation: Update instructions for running example schedulers
Since the artifact paths for tools changed, we need to update the documentation to reflect that path.

Signed-off-by: Devaansh-Kumar <devaanshk840@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-10-08 08:49:18 -10:00
Linus Torvalds
5b7c893ed5 Changes for 6.12-rc3
New:
 	implement fallocate for compressed files;
 	add support for the compression attribute;
 	optimize large writes to sparse files.
 Fixed:
 	fix several potential deadlock scenarios;
 	fix various internal bugs detected by syzbot;
 	add checks before accessing NTFS structures during parsing;
 	correct the format of output messages.
 Refactored:
 	replace fsparam_flag_no with fsparam_flag in options parser;
 	remove unused functions and macros.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEh0DEKNP0I9IjwfWEqbAzH4MkB7YFAmcFRkgACgkQqbAzH4Mk
 B7aQdQ//VT6lzBaxSZ0dm3xs/ygQYE3ter1OczAMQxh70O5px+Z0li5WCXQQe31W
 I74MmAy2G7C+JdzMkSc4F7+pFF21DO+1WQjODl9rP7yoHiLteiLFkMI+P5tHuFFT
 VPupAiCDQgm6QCtW2wofp/WR0ooxpLzC/yQsILrhUXJHdEY21oHLy/MQSOhVnv2k
 VeFdWA70+YS+JEQ+YyZBSeGeMVMQG9VROA57dZ+eAy1vgBUJGcKhYp9v4eiWG1uT
 A8rnV81FiHu+49OrE+ZwStguHkxu30dvlHuOqKW0pLW9/XoKCFf9w0WqLOGrAxLs
 lKznVgQZ521+QW+lNURMHBZt/bqQxQQInuOOH7jwLyBh0tmKoZlIU9Vw4V3yrm3t
 9NELb0FR/FeH02O89nMHl0SeaxDJvgFHDnfoOShuHaffur61d4RjChxqB4QcibOl
 Ibvtns71Uo89ngPIXdeTJK+7XZQVmAZHBXXYL+Rwr84Yv0tyn2q8DbrqzmphiYlR
 wbLyVBGqtx0ZcN/geuFLiVt2W/HSMPocXqrxv791ALX6cryL+CZdolQP7vWjt4Vh
 rKoaRZYN8fdcfT+BIYy95Kz0k12T7ColOoMfxVwugxK9ypF6tH6BJBwYXjCW4MOv
 AhkszJcsISxMKidsbzZA9xESM6M3sLHcW0swjAwWWRhc5zbSe6E=
 =j71V
 -----END PGP SIGNATURE-----

Merge tag 'ntfs3_for_6.12' of https://github.com/Paragon-Software-Group/linux-ntfs3

Pull ntfs3 updates from Konstantin Komarov:
"New:
   - implement fallocate for compressed files
   - add support for the compression attribute
   - optimize large writes to sparse files

 Fixes:
   - fix several potential deadlock scenarios
   - fix various internal bugs detected by syzbot
   - add checks before accessing NTFS structures during parsing
   - correct the format of output messages

  Refactoring:
   - replace fsparam_flag_no with fsparam_flag in options parser
   - remove unused functions and macros"

* tag 'ntfs3_for_6.12' of https://github.com/Paragon-Software-Group/linux-ntfs3: (25 commits)
  fs/ntfs3: Format output messages like others fs in kernel
  fs/ntfs3: Additional check in ntfs_file_release
  fs/ntfs3: Fix general protection fault in run_is_mapped_full
  fs/ntfs3: Sequential field availability check in mi_enum_attr()
  fs/ntfs3: Additional check in ni_clear()
  fs/ntfs3: Fix possible deadlock in mi_read
  ntfs3: Change to non-blocking allocation in ntfs_d_hash
  fs/ntfs3: Remove unused al_delete_le
  fs/ntfs3: Rename ntfs3_setattr into ntfs_setattr
  fs/ntfs3: Replace fsparam_flag_no -> fsparam_flag
  fs/ntfs3: Add support for the compression attribute
  fs/ntfs3: Implement fallocate for compressed files
  fs/ntfs3: Make checks in run_unpack more clear
  fs/ntfs3: Add rough attr alloc_size check
  fs/ntfs3: Stale inode instead of bad
  fs/ntfs3: Refactor enum_rstbl to suppress static checker
  fs/ntfs3: Fix sparse warning in ni_fiemap
  fs/ntfs3: Fix warning possible deadlock in ntfs_set_state
  fs/ntfs3: Fix sparse warning for bigendian
  fs/ntfs3: Separete common code for file_read/write iter/splice
  ...
2024-10-08 10:53:06 -07:00
Linus Torvalds
b2760b8390 perf tools fixes for v6.12:
- Fix an assert() to handle captured and unprocessed ARM CoreSight CPU traces.
 
 - Fix static build compilation error when libdw isn't installed or is too old.
 
 - Add missing include when building with !HAVE_DWARF_GETLOCATIONS_SUPPORT.
 
 - Add missing refcount put on 32-bit DSOs.
 
 - Fix disassembly of user space binaries by setting the binary_type of DSO when
   loading.
 
 - Update headers with the kernel sources, including asound.h, sched.h, fcntl,
   msr-index.h, irq_vectors.h, socket.h, list_sort.c and arm64's cputype.h.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCZwU2dgAKCRCyPKLppCJ+
 J8uaAQDEbp0lMf1S/Y6vOGbnP6mGQCewQsXtIpSA4gcRMWlCCgD+O6ZxbnBCHOzn
 nQfBmbT62qUGuUA38Mg7pCyRXBd8FgU=
 =s4JZ
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-fixes-for-v6.12-1-2024-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix an assert() to handle captured and unprocessed ARM CoreSight CPU
   traces

 - Fix static build compilation error when libdw isn't installed or is
   too old

 - Add missing include when building with
   !HAVE_DWARF_GETLOCATIONS_SUPPORT

 - Add missing refcount put on 32-bit DSOs

 - Fix disassembly of user space binaries by setting the binary_type of
   DSO when loading

 - Update headers with the kernel sources, including asound.h, sched.h,
   fcntl, msr-index.h, irq_vectors.h, socket.h, list_sort.c and arm64's
   cputype.h

* tag 'perf-tools-fixes-for-v6.12-1-2024-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
  perf cs-etm: Fix the assert() to handle captured and unprocessed cpu trace
  perf build: Fix build feature-dwarf_getlocations fail for old libdw
  perf build: Fix static compilation error when libdw is not installed
  perf dwarf-aux: Fix build with !HAVE_DWARF_GETLOCATIONS_SUPPORT
  tools headers arm64: Sync arm64's cputype.h with the kernel sources
  perf tools: Cope with differences for lib/list_sort.c copy from the kernel
  tools check_headers.sh: Add check variant that excludes some hunks
  perf beauty: Update copy of linux/socket.h with the kernel sources
  tools headers UAPI: Sync the linux/in.h with the kernel sources
  perf trace beauty: Update the arch/x86/include/asm/irq_vectors.h copy with the kernel sources
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  tools include UAPI: Sync linux/fcntl.h copy with the kernel sources
  tools include UAPI: Sync linux/sched.h copy with the kernel sources
  tools include UAPI: Sync sound/asound.h copy with the kernel sources
  perf vdso: Missed put on 32-bit dsos
  perf symbol: Set binary_type of dso when loading
2024-10-08 10:43:22 -07:00
Greg Thelen
1fd9e4f257 selftests: make kselftest-clean remove libynl outputs
Starting with 6.12 commit 85585b4bc8 ("selftests: add ncdevmem, netcat
for devmem TCP") kselftest-all creates additional outputs that
kselftest-clean does not cleanup:
  $ make defconfig
  $ make kselftest-all
  $ make kselftest-clean
  $ git clean -ndxf | grep tools/net
  Would remove tools/net/ynl/lib/__pycache__/
  Would remove tools/net/ynl/lib/ynl.a
  Would remove tools/net/ynl/lib/ynl.d
  Would remove tools/net/ynl/lib/ynl.o

Make kselftest-clean remove the newly added net/ynl outputs.

Fixes: 85585b4bc8 ("selftests: add ncdevmem, netcat for devmem TCP")
Signed-off-by: Greg Thelen <gthelen@google.com>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://patch.msgid.link/20241005215600.852260-1-gthelen@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08 08:24:08 -07:00
Jakub Kicinski
c0a30936db Merge branch 'selftests-net-add-missing-gitignore-and-extra_clean-entries'
Javier Carrasco says:

====================
selftests: net: add missing gitignore and EXTRA_CLEAN entries.

This series is a cherry-pick on top of v6.12-rc1 from the one I sent
for selftests with other patches that were not net-related:

https://lore.kernel.org/all/20240925-selftests-gitignore-v3-0-9db896474170@gmail.com/

The patches have not been modified, and the Reviewed-by tags have
been kept.

v1: https://lore.kernel.org/20240930-net-selftests-gitignore-v1-0-65225a855946@gmail.com
====================

Link: https://patch.msgid.link/20241005-net-selftests-gitignore-v2-0-3a0b2876394a@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08 08:16:34 -07:00
Javier Carrasco
0e43a5a7b2 selftests: net: rds: add gitignore file for include.sh
The generated include.sh should be ignored by git. Create a new
gitignore and add the file to the list.

Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Link: https://patch.msgid.link/20241005-net-selftests-gitignore-v2-3-3a0b2876394a@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08 08:16:34 -07:00
Javier Carrasco
4227b50cff selftests: net: rds: add include.sh to EXTRA_CLEAN
The include.sh file is generated when building the net/rds selftests,
but there is no rule to delete it with the clean target. Add the file to
EXTRA_CLEAN in order to remove it when required.

Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Link: https://patch.msgid.link/20241005-net-selftests-gitignore-v2-2-3a0b2876394a@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08 08:16:34 -07:00
Javier Carrasco
9c4beb2dfe selftests: net: add msg_oob to gitignore
This executable is missing from the corresponding gitignore file.
Add msg_oob to the net gitignore list.

Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Link: https://patch.msgid.link/20241005-net-selftests-gitignore-v2-1-3a0b2876394a@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08 08:16:32 -07:00
Juergen Gross
bf56c41016 x86/xen: mark boot CPU of PV guest in MSR_IA32_APICBASE
Recent topology checks of the x86 boot code uncovered the need for
PV guests to have the boot cpu marked in the APICBASE MSR.

Fixes: 9d22c96316 ("x86/topology: Handle bogus ACPI tables correctly")
Reported-by: Niels Dettenbach <nd@syndicat.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
2024-10-08 16:18:57 +02:00
Christian König
32fda56506 drm/radeon: always set GEM function pointer
Make sure to always set the GEM function pointer even for in kernel
allocations. This fixes a NULL pointer deref caused by switching to GEM
references.

Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: fd69ef0502 ("drm/radeon: use GEM references instead of TTMs")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 01b64bc063)
2024-10-08 10:18:34 -04:00
Billy Tsai
a6191a3d18 gpio: aspeed: Use devm_clk api to manage clock source
Replace of_clk_get with devm_clk_get_enabled to manage the clock source.

Fixes: 5ae4cb94b3 ("gpio: aspeed: Add debounce support")
Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Signed-off-by: Billy Tsai <billy_tsai@aspeedtech.com>
Link: https://lore.kernel.org/r/20241008081450.1490955-3-billy_tsai@aspeedtech.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-10-08 16:01:58 +02:00
Billy Tsai
1bb5a99e1f gpio: aspeed: Add the flush write to ensure the write complete.
Performing a dummy read ensures that the register write operation is fully
completed, mitigating any potential bus delays that could otherwise impact
the frequency of bitbang usage. E.g., if the JTAG application uses GPIO to
control the JTAG pins (TCK, TMS, TDI, TDO, and TRST), and the application
sets the TCK clock to 1 MHz, the GPIO's high/low transitions will rely on
a delay function to ensure the clock frequency does not exceed 1 MHz.
However, this can lead to rapid toggling of the GPIO because the write
operation is POSTed and does not wait for a bus acknowledgment.

Fixes: 361b79119a ("gpio: Add Aspeed driver")
Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Signed-off-by: Billy Tsai <billy_tsai@aspeedtech.com>
Link: https://lore.kernel.org/r/20241008081450.1490955-2-billy_tsai@aspeedtech.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-10-08 16:01:58 +02:00
Yonatan Maman
835745a377 nouveau/dmem: Fix vulnerability in migrate_to_ram upon copy error
The `nouveau_dmem_copy_one` function ensures that the copy push command is
sent to the device firmware but does not track whether it was executed
successfully.

In the case of a copy error (e.g., firmware or hardware failure), the
copy push command will be sent via the firmware channel, and
`nouveau_dmem_copy_one` will likely report success, leading to the
`migrate_to_ram` function returning a dirty HIGH_USER page to the user.

This can result in a security vulnerability, as a HIGH_USER page that may
contain sensitive or corrupted data could be returned to the user.

To prevent this vulnerability, we allocate a zero page. Thus, in case of
an error, a non-dirty (zero) page will be returned to the user.

Fixes: 5be73b6908 ("drm/nouveau/dmem: device memory helpers for SVM")
Signed-off-by: Yonatan Maman <Ymaman@Nvidia.com>
Co-developed-by: Gal Shalom <GalShalom@Nvidia.com>
Signed-off-by: Gal Shalom <GalShalom@Nvidia.com>
Reviewed-by: Ben Skeggs <bskeggs@nvidia.com>
Cc: stable@vger.kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20241008115943.990286-3-ymaman@nvidia.com
2024-10-08 14:23:38 +02:00
Yonatan Maman
04e0481526 nouveau/dmem: Fix privileged error in copy engine channel
When `nouveau_dmem_copy_one` is called, the following error occurs:

[272146.675156] nouveau 0000:06:00.0: fifo: PBDMA9: 00000004 [HCE_PRIV]
ch 1 00000300 00003386

This indicates that a copy push command triggered a Host Copy Engine
Privileged error on channel 1 (Copy Engine channel). To address this
issue, modify the Copy Engine channel to allow privileged push commands

Fixes: 6de125383a ("drm/nouveau/fifo: expose runlist topology info on all chipsets")
Signed-off-by: Yonatan Maman <Ymaman@Nvidia.com>
Co-developed-by: Gal Shalom <GalShalom@Nvidia.com>
Signed-off-by: Gal Shalom <GalShalom@Nvidia.com>
Reviewed-by: Ben Skeggs <bskeggs@nvidia.com>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20241008115943.990286-2-ymaman@nvidia.com
2024-10-08 14:23:35 +02:00
Paolo Abeni
f15b8d6eb6 Merge branch 'net-dsa-b53-assorted-jumbo-frame-fixes'
Jonas Gorski says:

====================
net: dsa: b53: assorted jumbo frame fixes

While investigating the capabilities of BCM63XX's integrated switch and
its DMA engine, I noticed a few issues in b53's jumbo frame code.

Mostly a confusion of MTU vs frame length, but also a few missing cases
for 100M switches.

Tested on BCM63XX and BCM53115 with intel 1G and realtek 1G NICs,
which support MTUs of 9000 or slightly above, but significantly less
than the 9716/9720 supported by BCM53115, so I couldn't verify the
actual maximum frame length.

Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
---
====================

Link: https://patch.msgid.link/20241004-b53_jumbo_fixes-v1-0-ce1e54aa7b3c@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-08 10:42:31 +02:00
Jonas Gorski
2f3dcd0d39 net: dsa: b53: fix jumbo frames on 10/100 ports
All modern chips support and need the 10_100 bit set for supporting jumbo
frames on 10/100 ports, so instead of enabling it only for 583XX enable
it for everything except bcm63xx, where the bit is writeable, but does
nothing.

Tested on BCM53115, where jumbo frames were dropped at 10/100 speeds
without the bit set.

Fixes: 6ae5834b98 ("net: dsa: b53: add MTU configuration support")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-08 10:42:27 +02:00
Jonas Gorski
e4b294f88a net: dsa: b53: allow lower MTUs on BCM5325/5365
While BCM5325/5365 do not support jumbo frames, they do support slightly
oversized frames, so do not error out if requesting a supported MTU for
them.

Fixes: 6ae5834b98 ("net: dsa: b53: add MTU configuration support")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-08 10:42:27 +02:00
Jonas Gorski
ca8c1f71c1 net: dsa: b53: fix max MTU for BCM5325/BCM5365
BCM5325/BCM5365 do not support jumbo frames, so we should not report a
jumbo frame mtu for them. But they do support so called "oversized"
frames up to 1536 bytes long by default, so report an appropriate MTU.

Fixes: 6ae5834b98 ("net: dsa: b53: add MTU configuration support")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-08 10:42:27 +02:00
Jonas Gorski
680a8217dc net: dsa: b53: fix max MTU for 1g switches
JMS_MAX_SIZE is the ethernet frame length, not the MTU, which is payload
without ethernet headers.

According to the datasheets maximum supported frame length for most
gigabyte swithes is 9720 bytes, so convert that to the expected MTU when
using VLAN tagged frames.

Fixes: 6ae5834b98 ("net: dsa: b53: add MTU configuration support")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-08 10:42:27 +02:00
Jonas Gorski
42fb3acf68 net: dsa: b53: fix jumbo frame mtu check
JMS_MIN_SIZE is the full ethernet frame length, while mtu is just the
data payload size. Comparing these two meant that mtus between 1500 and
1518 did not trigger enabling jumbo frames.

So instead compare the set mtu ETH_DATA_LEN, which is equal to
JMS_MIN_SIZE - ETH_HLEN - ETH_FCS_LEN;

Also do a check that the requested mtu is actually greater than the
minimum length, else we do not need to enable jumbo frames.

In practice this only introduced a very small range of mtus that did not
work properly. Newer chips allow 2000 byte large frames by default, and
older chips allow 1536 bytes long, which is equivalent to an mtu of
1514. So effectivly only mtus of 1515~1517 were broken.

Fixes: 6ae5834b98 ("net: dsa: b53: add MTU configuration support")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-08 10:42:27 +02:00
Paolo Abeni
60ed96bd1e Merge branch 'fix-ti-am65-cpsw-nuss-module-removal'
Nicolas Pitre says:

====================
fix ti-am65-cpsw-nuss module removal

Fix issues preventing rmmod of ti-am65-cpsw-nuss from working properly.

v3:

  - more patch submission minutiae

v2: https://lore.kernel.org/netdev/20241003172105.2712027-2-nico@fluxnic.net/T/

  - conform to netdev patch submission customs
  - address patch review trivias

v1: https://lore.kernel.org/netdev/20240927025301.1312590-2-nico@fluxnic.net/T/
====================

Link: https://patch.msgid.link/20241004041218.2809774-1-nico@fluxnic.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-08 10:30:32 +02:00
Nicolas Pitre
03c96bc9d3 net: ethernet: ti: am65-cpsw: avoid devm_alloc_etherdev, fix module removal
Usage of devm_alloc_etherdev_mqs() conflicts with
am65_cpsw_nuss_cleanup_ndev() as the same struct net_device instances
get unregistered twice. Switch to alloc_etherdev_mqs() and make sure
am65_cpsw_nuss_cleanup_ndev() unregisters and frees those net_device
instances properly.

With this, it is finally possible to rmmod the driver without oopsing
the kernel.

Fixes: 93a7653031 ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver")
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Reviewed-by: Roger Quadros <roger@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-08 10:30:30 +02:00
Nicolas Pitre
47f9605484 net: ethernet: ti: am65-cpsw: prevent WARN_ON upon module removal
In am65_cpsw_nuss_remove(), move the call to am65_cpsw_unregister_devlink()
after am65_cpsw_nuss_cleanup_ndev() to avoid triggering the
WARN_ON(devlink_port->type != DEVLINK_PORT_TYPE_NOTSET) in
devl_port_unregister(). Makes it coherent with usage in
m65_cpsw_nuss_register_ndevs()'s cleanup path.

Fixes: 58356eb31d ("net: ti: am65-cpsw-nuss: Add devlink support")
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-08 10:30:30 +02:00
Lorenzo Bianconi
3dc6e998d1 net: airoha: Update tx cpu dma ring idx at the end of xmit loop
Move the tx cpu dma ring index update out of transmit loop of
airoha_dev_xmit routine in order to not start transmitting the packet
before it is fully DMA mapped (e.g. fragmented skbs).

Fixes: 23020f0493 ("net: airoha: Introduce ethernet support for EN7581 SoC")
Reported-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241004-airoha-eth-7581-mapping-fix-v1-1-8e4279ab1812@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-07 17:29:22 -07:00
Christian Marangi
f50b5d74c6 net: phy: Remove LED entry from LEDs list on unregister
Commit c938ab4da0 ("net: phy: Manual remove LEDs to ensure correct
ordering") correctly fixed a problem with using devm_ but missed
removing the LED entry from the LEDs list.

This cause kernel panic on specific scenario where the port for the PHY
is torn down and up and the kmod for the PHY is removed.

On setting the port down the first time, the assosiacted LEDs are
correctly unregistered. The associated kmod for the PHY is now removed.
The kmod is now added again and the port is now put up, the associated LED
are registered again.
On putting the port down again for the second time after these step, the
LED list now have 4 elements. With the first 2 already unregistered
previously and the 2 new one registered again.

This cause a kernel panic as the first 2 element should have been
removed.

Fix this by correctly removing the element when LED is unregistered.

Reported-by: Daniel Golle <daniel@makrotopia.org>
Tested-by: Daniel Golle <daniel@makrotopia.org>
Cc: stable@vger.kernel.org
Fixes: c938ab4da0 ("net: phy: Manual remove LEDs to ensure correct ordering")
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241004182759.14032-1-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-07 17:15:50 -07:00
Jakub Kicinski
f61060fb29 bluetooth pull request for net:
- RFCOMM: FIX possible deadlock in rfcomm_sk_state_change
  - hci_conn: Fix UAF in hci_enhanced_setup_sync
  - btusb: Don't fail external suspend requests
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE7E6oRXp8w05ovYr/9JCA4xAyCykFAmcAVtUZHGx1aXoudm9u
 LmRlbnR6QGludGVsLmNvbQAKCRD0kIDjEDILKfgwD/4qQ6WOCaRXeGl3qv1Whbzn
 LaEGLU9KKHeuuIsVQDzIS7lWITKwzyGTz/A/V7BjsbskfKTiUZOaFq46y1KHbnHu
 uXxdd5pIyqL7TwqVgKvMRpfnzbSJojnuyM5uqBBNMkXz9HYlEV6fuxb3IwBxNTNG
 bq7vTtAM2CG6OnsMtJQDNjsOpXpNA6MBeYR/kTtVQTRt5zLqSTbxneHRhHlB0tsS
 7BIjJ8lPmepbZKx5IxNWY+tnLz1YUNjXNN6DcIZY0tfP1R052dm/CN5jxn+H+hm1
 zyuV9T6BPqlftCXHjbakUjMwlOTHbubJkinseFT8tNhI8p0oauQh/KrE+VDvfZDT
 hEe5Hxoxh+2WN7/yzKZZwEleuxuMaPVtQQ7PUheSQBMusdYk/VcOML1N6gvWhboC
 2EtQmZhE/4pkGQ/5qHM0yMMmpE0zufkP+GQdIbsM1gapHGgFFEqFUKXtn7jSf4EF
 n9q+vviOhjVNPV0jb+iO8x419QQG4mHeL5DFqBc23w8juPwu/P8Aw7TTNPH4PKO9
 1786a7Bw3SK0T9EjnmDylU0jaKTKsrN/o+HlJhbHrryymOidHb/Rgi56tHm71NbC
 F6T+7RhXUuNOMlLII/SdYqHmpq9xZ4Zl2Ajtpad7QpixThdhKcfP+/DCPVFEjXTo
 OYXccG05hNfGSg5qBHuSNQ==
 =uwxN
 -----END PGP SIGNATURE-----

Merge tag 'for-net-2024-10-04' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth

Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

 - RFCOMM: FIX possible deadlock in rfcomm_sk_state_change
 - hci_conn: Fix UAF in hci_enhanced_setup_sync
 - btusb: Don't fail external suspend requests

* tag 'for-net-2024-10-04' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
  Bluetooth: btusb: Don't fail external suspend requests
  Bluetooth: hci_conn: Fix UAF in hci_enhanced_setup_sync
  Bluetooth: RFCOMM: FIX possible deadlock in rfcomm_sk_state_change
====================

Link: https://patch.msgid.link/20241004210124.4010321-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-07 17:05:20 -07:00
Christophe JAILLET
83211ae164 net: ethernet: adi: adin1110: Fix some error handling path in adin1110_read_fifo()
If 'frame_size' is too small or if 'round_len' is an error code, it is
likely that an error code should be returned to the caller.

Actually, 'ret' is likely to be 0, so if one of these sanity checks fails,
'success' is returned.

Return -EINVAL instead.

Fixes: bc93e19d08 ("net: ethernet: adi: Add ADIN1110 support")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://patch.msgid.link/8ff73b40f50d8fa994a454911b66adebce8da266.1727981562.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-07 16:49:43 -07:00
Jakub Kicinski
5546da79e6 Revert "net: stmmac: set PP_FLAG_DMA_SYNC_DEV only if XDP is enabled"
This reverts commit b514c47ebf.

The commit describes that we don't have to sync the page when
recycling, and it tries to optimize that case. But we do need
to sync after allocation. Recycling side should be changed to
pass the right sync size instead.

Fixes: b514c47ebf ("net: stmmac: set PP_FLAG_DMA_SYNC_DEV only if XDP is enabled")
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://lore.kernel.org/20241004070846.2502e9ea@kernel.org
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Furong Xu <0x1207@gmail.com>
Link: https://patch.msgid.link/20241004142115.910876-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-07 16:47:07 -07:00
Anatolij Gustschin
5c14e51d2d net: dsa: lan9303: ensure chip reset and wait for READY status
Accessing device registers seems to be not reliable, the chip
revision is sometimes detected wrongly (0 instead of expected 1).

Ensure that the chip reset is performed via reset GPIO and then
wait for 'Device Ready' status in HW_CFG register before doing
any register initializations.

Cc: stable@vger.kernel.org
Fixes: a1292595e0 ("net: dsa: add new DSA switch driver for the SMSC-LAN9303")
Signed-off-by: Anatolij Gustschin <agust@denx.de>
[alex: reworked using read_poll_timeout()]
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/20241004113655.3436296-1-alexander.sverdlin@siemens.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-07 16:38:02 -07:00
Ignat Korchagin
6310831433 net: explicitly clear the sk pointer, when pf->create fails
We have recently noticed the exact same KASAN splat as in commit
6cd4a78d96 ("net: do not leave a dangling sk pointer, when socket
creation fails"). The problem is that commit did not fully address the
problem, as some pf->create implementations do not use sk_common_release
in their error paths.

For example, we can use the same reproducer as in the above commit, but
changing ping to arping. arping uses AF_PACKET socket and if packet_create
fails, it will just sk_free the allocated sk object.

While we could chase all the pf->create implementations and make sure they
NULL the freed sk object on error from the socket, we can't guarantee
future protocols will not make the same mistake.

So it is easier to just explicitly NULL the sk pointer upon return from
pf->create in __sock_create. We do know that pf->create always releases the
allocated sk object on error, so if the pointer is not NULL, it is
definitely dangling.

Fixes: 6cd4a78d96 ("net: do not leave a dangling sk pointer, when socket creation fails")
Signed-off-by: Ignat Korchagin <ignat@cloudflare.com>
Cc: stable@vger.kernel.org
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241003170151.69445-1-ignat@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-07 16:21:59 -07:00
Filipe Manana
6ef8fbce01 btrfs: fix missing error handling when adding delayed ref with qgroups enabled
When adding a delayed ref head, at delayed-ref.c:add_delayed_ref_head(),
if we fail to insert the qgroup record we don't error out, we ignore it.
In fact we treat it as if there was no error and there was already an
existing record - we don't distinguish between the cases where
btrfs_qgroup_trace_extent_nolock() returns 1, meaning a record already
existed and we can free the given record, and the case where it returns
a negative error value, meaning the insertion into the xarray that is
used to track records failed.

Effectively we end up ignoring that we are lacking qgroup record in the
dirty extents xarray, resulting in incorrect qgroup accounting.

Fix this by checking for errors and return them to the callers.

Fixes: 3cce39a8ca ("btrfs: qgroup: use xarray to track dirty extents in transaction")
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-10-07 23:22:30 +02:00
Luca Stefani
69313850dc btrfs: add cancellation points to trim loops
There are reports that system cannot suspend due to running trim because
the task responsible for trimming the device isn't able to finish in
time, especially since we have a free extent discarding phase, which can
trim a lot of unallocated space. There are no limits on the trim size
(unlike the block group part).

Since trime isn't a critical call it can be interrupted at any time,
in such cases we stop the trim, report the amount of discarded bytes and
return an error.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180
Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737
CC: stable@vger.kernel.org # 5.15+
Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-10-07 23:21:56 +02:00