linux/kernel
Dongli Zhang a6c11c0a52 genirq/cpuhotplug, x86/vector: Prevent vector leak during CPU offline
The absence of IRQD_MOVE_PCNTXT prevents immediate effectiveness of
interrupt affinity reconfiguration via procfs. Instead, the change is
deferred until the next instance of the interrupt being triggered on the
original CPU.

When the interrupt next triggers on the original CPU, the new affinity is
enforced within __irq_move_irq(). A vector is allocated from the new CPU,
but the old vector on the original CPU remains and is not immediately
reclaimed. Instead, apicd->move_in_progress is flagged, and the reclaiming
process is delayed until the next trigger of the interrupt on the new CPU.

Upon the subsequent triggering of the interrupt on the new CPU,
irq_complete_move() adds a task to the old CPU's vector_cleanup list if it
remains online. Subsequently, the timer on the old CPU iterates over its
vector_cleanup list, reclaiming old vectors.

However, a rare scenario arises if the old CPU is outgoing before the
interrupt triggers again on the new CPU.

In that case irq_force_complete_move() is not invoked on the outgoing CPU
to reclaim the old apicd->prev_vector because the interrupt isn't currently
affine to the outgoing CPU, and irq_needs_fixup() returns false. Even
though __vector_schedule_cleanup() is later called on the new CPU, it
doesn't reclaim apicd->prev_vector; instead, it simply resets both
apicd->move_in_progress and apicd->prev_vector to 0.

As a result, the vector remains unreclaimed in vector_matrix, leading to a
CPU vector leak.

To address this issue, move the invocation of irq_force_complete_move()
before the irq_needs_fixup() call to reclaim apicd->prev_vector, if the
interrupt is currently or used to be affine to the outgoing CPU.

Additionally, reclaim the vector in __vector_schedule_cleanup() as well,
following a warning message, although theoretically it should never see
apicd->move_in_progress with apicd->prev_cpu pointing to an offline CPU.

Fixes: f0383c24b4 ("genirq/cpuhotplug: Add support for cleaning up move in progress")
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240522220218.162423-1-dongli.zhang@oracle.com
2024-05-23 21:51:50 +02:00
..
bpf The usual shower of singleton fixes and minor series all over MM, 2024-05-19 09:21:03 -07:00
cgroup Misc fixes: 2024-05-19 11:38:15 -07:00
configs hardening updates for 6.10-rc1 2024-05-13 14:14:05 -07:00
debug kdb: Simplify management of tmpbuffer in kdb_read() 2024-04-26 17:13:31 +01:00
dma dma-mapping updates for Linux 6.10 2024-05-20 10:23:39 -07:00
entry entry: Respect changes to system call number by trace_sys_enter() 2024-03-12 13:23:32 +01:00
events The usual shower of singleton fixes and minor series all over MM, 2024-05-19 09:21:03 -07:00
futex printk: Change type of CONFIG_BASE_SMALL to bool 2024-05-06 17:39:09 +02:00
gcov gcov: annotate struct gcov_iterator with __counted_by 2023-10-18 14:43:22 -07:00
irq genirq/cpuhotplug, x86/vector: Prevent vector leak during CPU offline 2024-05-23 21:51:50 +02:00
kcsan kcsan, compiler_types: Introduce __data_racy type qualifier 2024-05-07 11:39:50 -07:00
livepatch livepatch: Rename KLP_* to KLP_TRANSITION_* 2024-05-09 15:48:01 +02:00
locking - Core Frameworks 2024-05-22 10:49:54 -07:00
module Driver core changes for 6.10-rc1 2024-05-22 12:13:40 -07:00
power getting rid of bogus set_blocksize() uses, switching it 2024-05-21 08:34:51 -07:00
printk TTY/Serial changes for 6.10-rc1 2024-05-22 11:53:02 -07:00
rcu Merge branches 'fixes.2024.04.15a', 'misc.2024.04.12a', 'rcu-sync-normal-improve.2024.04.15a', 'rcu-tasks.2024.04.15a' and 'rcutorture.2024.04.15a' into rcu-merge.2024.04.15a 2024-05-01 13:04:02 +02:00
sched bitmap patches for 6.10 2024-05-21 15:29:01 -07:00
time sysctl changes for v6.10-rc1 2024-05-17 17:31:24 -07:00
trace Mainly singleton patches, documented in their respective changelogs. 2024-05-19 14:02:03 -07:00
.gitignore
acct.c kernel misc: Remove the now superfluous sentinel elements from ctl_table array 2024-04-24 09:43:53 +02:00
async.c async: Use a dedicated unbound workqueue with raised min_active 2024-02-09 11:13:59 -10:00
audit_fsnotify.c
audit_tree.c fsnotify: create a wrapper fsnotify_find_inode_mark() 2024-04-04 16:24:16 +02:00
audit_watch.c fsnotify: create a wrapper fsnotify_find_inode_mark() 2024-04-04 16:24:16 +02:00
audit.c audit: use KMEM_CACHE() instead of kmem_cache_create() 2024-01-25 10:12:22 -05:00
audit.h audit: correct audit_filter_inodes() definition 2023-07-21 12:17:25 -04:00
auditfilter.c audit: remove unnecessary assignment in audit_dupe_lsm_field() 2024-01-25 09:59:27 -05:00
auditsc.c audit,io_uring: io_uring openat triggers audit reference count underflow 2023-10-13 18:34:46 +02:00
backtracetest.c backtracetest: Convert from tasklet to BH workqueue 2024-02-05 13:22:34 -10:00
bounds.c bounds: Use the right number of bits for power-of-two CONFIG_NR_CPUS 2024-04-29 08:29:29 -07:00
capability.c lsm: constify the 'target' parameter in security_capget() 2023-08-08 16:48:47 -04:00
cfi.c
compat.c
configs.c
context_tracking.c context_tracking: Make context_tracking_key __ro_after_init 2024-03-22 11:18:18 +01:00
cpu_pm.c
cpu.c cgroup: Changes for v6.10 2024-05-15 17:06:08 -07:00
crash_core.c Mainly singleton patches, documented in their respective changelogs. 2024-05-19 14:02:03 -07:00
crash_reserve.c crash: add prefix for crash dumping messages 2024-05-08 08:41:26 -07:00
cred.c cred: Use KMEM_CACHE() instead of kmem_cache_create() 2024-02-23 17:33:31 -05:00
delayacct.c delayacct: Remove the now superfluous sentinel elements from ctl_table array 2024-04-24 09:43:54 +02:00
dma.c
elfcorehdr.c crash: remove dependency of FA_DUMP on CRASH_DUMP 2024-02-23 17:48:22 -08:00
exec_domain.c
exit.c kernel misc: Remove the now superfluous sentinel elements from ctl_table array 2024-04-24 09:43:53 +02:00
exit.h exit: add internal include file with helpers 2023-09-21 12:03:50 -06:00
extable.c
fail_function.c
fork.c mm: rename mm_put_huge_zero_page to mm_put_huge_zero_folio 2024-04-25 20:56:20 -07:00
freezer.c Linux 6.7-rc6 2023-12-23 15:52:13 +01:00
gen_kheaders.sh
groups.c groups: Convert group_info.usage to refcount_t 2023-09-29 11:28:39 -07:00
hung_task.c kernel misc: Remove the now superfluous sentinel elements from ctl_table array 2024-04-24 09:43:53 +02:00
iomem.c kernel/iomem.c: remove __weak ioremap_cache helper 2023-08-21 13:37:28 -07:00
irq_work.c
jump_label.c jump_label,module: Don't alloc static_key_mod for __ro_after_init keys 2024-03-22 11:18:16 +01:00
kallsyms_internal.h kallsyms: Avoid weak references for kallsyms symbols 2024-05-02 19:48:26 +09:00
kallsyms_selftest.c mm: vmalloc: enable memory allocation profiling 2024-04-25 20:55:57 -07:00
kallsyms_selftest.h
kallsyms.c kallsyms: Avoid weak references for kallsyms symbols 2024-05-02 19:48:26 +09:00
kcmp.c file: convert to SLAB_TYPESAFE_BY_RCU 2023-10-19 11:02:48 +02:00
Kconfig.freezer
Kconfig.hz
Kconfig.kexec crash: clean up kdump related config items 2024-02-23 17:48:22 -08:00
Kconfig.locks
Kconfig.preempt
kcov.c kcov: avoid clang out-of-range warning 2024-04-25 21:07:04 -07:00
kexec_core.c kernel misc: Remove the now superfluous sentinel elements from ctl_table array 2024-04-24 09:43:53 +02:00
kexec_elf.c
kexec_file.c crash: add a new kexec flag for hotplug support 2024-04-23 14:59:01 +10:00
kexec_internal.h crash: remove dependency of FA_DUMP on CRASH_DUMP 2024-02-23 17:48:22 -08:00
kexec.c crash: add a new kexec flag for hotplug support 2024-04-23 14:59:01 +10:00
kheaders.c
kprobes.c kprobe/ftrace: fix build error due to bad function definition 2024-05-17 19:17:55 -07:00
ksyms_common.c
ksysfs.c vmlinux: Avoid weak reference to notes section 2024-05-02 19:48:26 +09:00
kthread.c kunit: Handle test faults 2024-05-06 14:22:02 -06:00
latencytop.c kernel misc: Remove the now superfluous sentinel elements from ctl_table array 2024-04-24 09:43:53 +02:00
Makefile crash: split crash dumping code out from kexec_core.c 2024-02-23 17:48:22 -08:00
module_signature.c
notifier.c
nsproxy.c pidfd: add pidfs 2024-03-01 12:23:37 +01:00
numa.c kernel/numa.c: Move logging out of numa.h 2023-12-20 19:26:30 -05:00
padata.c padata: Disable BH when taking works lock on MT path 2024-04-12 15:07:51 +08:00
panic.c kernel misc: Remove the now superfluous sentinel elements from ctl_table array 2024-04-24 09:43:53 +02:00
params.c params: Fix multi-line comment style 2023-12-01 09:51:44 -08:00
pid_namespace.c kernel misc: Remove the now superfluous sentinel elements from ctl_table array 2024-04-24 09:43:53 +02:00
pid_sysctl.h kernel misc: Remove the now superfluous sentinel elements from ctl_table array 2024-04-24 09:43:53 +02:00
pid.c pidfs: remove config option 2024-03-13 12:53:53 -07:00
profile.c profiling: Remove create_prof_cpu_mask(). 2024-04-27 11:17:48 -07:00
ptrace.c ptrace_attach: shift send(SIGSTOP) into ptrace_set_stopped() 2024-02-22 15:38:52 -08:00
range.c
reboot.c kernel misc: Remove the now superfluous sentinel elements from ctl_table array 2024-04-24 09:43:53 +02:00
regset.c regset: use kvzalloc() for regset_get_alloc() 2024-04-25 21:07:03 -07:00
relay.c kernel: relay: remove relay_file_splice_read dead code, doesn't work 2023-12-29 12:22:27 -08:00
resource_kunit.c
resource.c Quite a lot of kexec work this time around. Many singleton patches in 2024-01-09 11:46:20 -08:00
rseq.c
scftorture.c scftorture: Pause testing after memory-allocation failure 2023-07-14 15:02:57 -07:00
scs.c
seccomp.c sysctl changes for v6.10-rc1 2024-05-17 17:31:24 -07:00
signal.c kernel misc: Remove the now superfluous sentinel elements from ctl_table array 2024-04-24 09:43:53 +02:00
smp.c CSD lock commits for v6.7 2023-10-30 17:56:53 -10:00
smpboot.c kthread: add kthread_stop_put 2023-10-04 10:41:57 -07:00
smpboot.h
softirq.c softirq: Fix suspicious RCU usage in __do_softirq() 2024-04-29 05:03:51 +02:00
stackleak.c sysctl changes for v6.10-rc1 2024-05-17 17:31:24 -07:00
stacktrace.c stacktrace: fix kernel-doc typo 2023-12-29 12:22:29 -08:00
static_call_inline.c
static_call.c
stop_machine.c
sys_ni.c lsm/stable-6.8 PR 20240105 2024-01-09 12:57:46 -08:00
sys.c RISC-V Patches for the 6.10 Merge Window, Part 1 2024-05-22 09:56:00 -07:00
sysctl-test.c
sysctl.c kernel misc: Remove the now superfluous sentinel elements from ctl_table array 2024-04-24 09:43:53 +02:00
task_work.c task_work: add kerneldoc annotation for 'data' argument 2023-09-19 13:21:32 -07:00
taskstats.c taskstats: fill_stats_for_tgid: use for_each_thread() 2023-10-04 10:41:57 -07:00
torture.c torture: Print out torture module parameters 2023-09-24 17:24:01 +02:00
tracepoint.c
tsacct.c
ucount.c sysctl changes for v6.10-rc1 2024-05-17 17:31:24 -07:00
uid16.c
uid16.h
umh.c umh: Remove the now superfluous sentinel elements from ctl_table array 2024-04-24 09:43:53 +02:00
up.c smp: Change function signatures to use call_single_data_t 2023-09-13 14:59:24 +02:00
user_namespace.c user_namespace: remove unnecessary NULL values from kbuf 2024-02-22 15:38:52 -08:00
user-return-notifier.c
user.c printk: Change type of CONFIG_BASE_SMALL to bool 2024-05-06 17:39:09 +02:00
usermode_driver.c
utsname_sysctl.c kernel misc: Remove the now superfluous sentinel elements from ctl_table array 2024-04-24 09:43:53 +02:00
utsname.c
vhost_task.c
vmcore_info.c mm: free up PG_slab 2024-04-25 20:56:00 -07:00
watch_queue.c watch_queue: fix kcalloc() arguments order 2023-12-21 13:17:54 +01:00
watchdog_buddy.c watchdog/hardlockup: move SMP barriers from common code to buddy code 2023-06-19 16:25:28 -07:00
watchdog_perf.c kernel/watchdog_perf.c: tidy up kerneldoc 2024-05-08 08:41:29 -07:00
watchdog.c Mainly singleton patches, documented in their respective changelogs. 2024-05-19 14:02:03 -07:00
workqueue_internal.h workqueue: Drop the special locking rule for worker->flags and worker_pool->flags 2023-08-07 15:57:22 -10:00
workqueue.c Merge branch 'for-6.10' into test-merge-for-6.10 2024-05-15 11:40:33 -10:00