linux/kernel/irq
Thomas Gleixner e43b3b5854 genirq/cpuhotplug: Enforce affinity setting on startup of managed irqs
Managed interrupts can end up in a stale state on CPU hotplug. If the
interrupt is not targeting a single CPU, i.e. the affinity mask spawns
multiple CPUs then the following can happen:

After boot:

dstate:   0x01601200
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_SINGLE_TARGET
            IRQD_AFFINITY_SET
            IRQD_AFFINITY_MANAGED
node:     0
affinity: 24-31
effectiv: 24
pending:  0

After offlining CPU 31 - 24

dstate:   0x01a31000
            IRQD_IRQ_DISABLED
            IRQD_IRQ_MASKED
            IRQD_SINGLE_TARGET
            IRQD_AFFINITY_SET
            IRQD_AFFINITY_MANAGED
            IRQD_MANAGED_SHUTDOWN
node:     0
affinity: 24-31
effectiv: 24
pending:  0

Now CPU 25 gets onlined again, so it should get the effective interrupt
affinity for this interruopt, but due to the x86 interrupt affinity setter
restrictions this ends up after restarting the interrupt with:

dstate:   0x01601300
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_SINGLE_TARGET
            IRQD_AFFINITY_SET
            IRQD_SETAFFINITY_PENDING
            IRQD_AFFINITY_MANAGED
node:     0
affinity: 24-31
effectiv: 24
pending:  24-31

So the interrupt is still affine to CPU 24, which was the last CPU to go
offline of that affinity set and the move to an online CPU within 24-31,
in this case 25, is pending. This mechanism is x86/ia64 specific as those
architectures cannot move interrupts from thread context and do this when
an interrupt is actually handled. So the move is set to pending.

Whats worse is that offlining CPU 25 again results in:

dstate:   0x01601300
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_SINGLE_TARGET
            IRQD_AFFINITY_SET
            IRQD_SETAFFINITY_PENDING
            IRQD_AFFINITY_MANAGED
node:     0
affinity: 24-31
effectiv: 24
pending:  24-31

This means the interrupt has not been shut down, because the outgoing CPU
is not in the effective affinity mask, but of course nothing notices that
the effective affinity mask is pointing at an offline CPU.

In the case of restarting a managed interrupt the move restriction does not
apply, so the affinity setting can be made unconditional. This needs to be
done _before_ the interrupt is started up as otherwise the condition for
moving it from thread context would not longer be fulfilled.

With that change applied onlining CPU 25 after offlining 31-24 results in:

dstate:   0x01600200
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_SINGLE_TARGET
            IRQD_AFFINITY_MANAGED
node:     0
affinity: 24-31
effectiv: 25
pending:  

And after offlining CPU 25:

dstate:   0x01a30000
            IRQD_IRQ_DISABLED
            IRQD_IRQ_MASKED
            IRQD_SINGLE_TARGET
            IRQD_AFFINITY_MANAGED
            IRQD_MANAGED_SHUTDOWN
node:     0
affinity: 24-31
effectiv: 25
pending:  

which is the correct and expected result.

Fixes: 761ea388e8 ("genirq: Handle managed irqs gracefully in irq_startup()")
Reported-by: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: axboe@kernel.dk
Cc: linux-scsi@vger.kernel.org
Cc: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: mpe@ellerman.id.au
Cc: Shivasharan Srikanteshwara <shivasharan.srikanteshwara@broadcom.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: keith.busch@intel.com
Cc: peterz@infradead.org
Cc: Hannes Reinecke <hare@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1710042208400.2406@nanos
2017-10-09 13:26:48 +02:00
..
affinity.c pci-v4.13-changes 2017-07-08 15:51:57 -07:00
autoprobe.c genirq: Add force argument to irq_startup() 2017-06-22 18:21:24 +02:00
chip.c genirq/cpuhotplug: Enforce affinity setting on startup of managed irqs 2017-10-09 13:26:48 +02:00
cpuhotplug.c genirq/cpuhotplug: Add sanity check for effective affinity mask 2017-10-09 13:26:48 +02:00
debug.h
debugfs.c genirq/debugfs: Triggering of interrupts from userspace 2017-08-18 10:36:24 +02:00
devres.c irq/generic-chip: Provide devm_irq_setup_generic_chip() 2017-06-21 15:53:11 +02:00
dummychip.c Merge branch 'linus' into irq/core 2015-06-05 22:25:01 +02:00
generic-chip.c irq/generic-chip: Don't replace domain's name 2017-09-28 12:18:59 +02:00
handle.c There has been a fair amount of activity in the docs tree this time 2017-07-03 21:13:25 -07:00
internals.h genirq: Fix for_each_action_of_desc() macro 2017-08-14 12:10:37 +02:00
ipi.c genirq/ipi: Fixup checks against nr_cpu_ids 2017-08-20 10:49:05 +02:00
irq_sim.c genirq/irq_sim: Add a devres variant of irq_sim_init() 2017-08-16 16:40:02 +02:00
irqdesc.c genirq: Make sparse_irq_lock protect what it should protect 2017-09-07 09:30:38 +02:00
irqdomain.c irqdomain: Add __rcu annotations to radix tree accessors 2017-09-25 21:23:44 +02:00
Kconfig genirq: Add handle_fasteoi_{level,edge}_irq flow handlers 2017-08-18 11:21:41 +02:00
Makefile genirq/irq_sim: Add a simple interrupt simulator framework 2017-08-16 16:40:02 +02:00
manage.c genirq/cpuhotplug: Enforce affinity setting on startup of managed irqs 2017-10-09 13:26:48 +02:00
migration.c genirq: Provide irq_fixup_move_pending() 2017-06-22 18:21:13 +02:00
msi.c genirq/msi: Fix populating multiple interrupts 2017-09-06 11:41:20 +02:00
pm.c genirq/PM: Properly pretend disabled state when force resuming interrupts 2017-07-17 22:32:20 +02:00
proc.c genirq/proc: Avoid uninitalized variable warning 2017-08-25 22:40:26 +02:00
resend.c genirq: Remove irq argument from irq flow handlers 2015-09-16 15:47:51 +02:00
settings.h genirq: Add flag to force mask in disable_irq[_nosync]() 2015-10-11 11:33:42 +02:00
spurious.c genirq: Clarify logic calculating bogus irqreturn_t values 2017-02-16 15:32:19 +01:00
timings.c genirq/timings: Add infrastructure for estimating the next interrupt arrival time 2017-06-24 11:44:39 +02:00