
Re: [PATCH v5 2/5] arm/irq: Migrate IRQs during CPU up/down operations



On Thu, 5 Feb 2026, Bertrand Marquis wrote:
> > On 5 Feb 2026, at 14:23, Mykyta Poturai <Mykyta_Poturai@xxxxxxxx> wrote:
> > 
> > On 04.02.26 16:20, Bertrand Marquis wrote:
> >> Hi Mykyta.
> >> 
> >>> On 13 Jan 2026, at 09:45, Mykyta Poturai <Mykyta_Poturai@xxxxxxxx> wrote:
> >>> 
> >>> Move IRQs from the dying CPU to the online ones when a CPU is being
> >>> offlined. When onlining, rebalance all IRQs in a round-robin fashion.
> >>> Guest-bound IRQs are already handled by the scheduler in the process of
> >>> moving vCPUs to active pCPUs, so we only need to handle IRQs used by Xen
> >>> itself.
> >>> 
> >>> Signed-off-by: Mykyta Poturai <mykyta_poturai@xxxxxxxx>
> >>> ---
> >>> v4->v5:
> >>> * handle CPU onlining as well
> >>> * more comments
> >>> * fix crash when ESPI is disabled
> >>> * don't assume CPU 0 is a boot CPU
> >>> * use unsigned int for irq number
> >>> * remove assumption that all irqs are bound to CPU 0 by default from the
> >>>  commit message
> >>> 
> >>> v3->v4:
> >>> * patch introduced
> >>> ---
> >>> xen/arch/arm/include/asm/irq.h |  2 ++
> >>> xen/arch/arm/irq.c             | 54 ++++++++++++++++++++++++++++++++++
> >>> xen/arch/arm/smpboot.c         |  6 ++++
> >>> 3 files changed, 62 insertions(+)
> >>> 
> >>> diff --git a/xen/arch/arm/include/asm/irq.h b/xen/arch/arm/include/asm/irq.h
> >>> index 09788dbfeb..a0250bac85 100644
> >>> --- a/xen/arch/arm/include/asm/irq.h
> >>> +++ b/xen/arch/arm/include/asm/irq.h
> >>> @@ -126,6 +126,8 @@ bool irq_type_set_by_domain(const struct domain *d);
> >>> void irq_end_none(struct irq_desc *irq);
> >>> #define irq_end_none irq_end_none
> >>> 
> >>> +void rebalance_irqs(unsigned int from, bool up);
> >>> +
> >>> #endif /* _ASM_HW_IRQ_H */
> >>> /*
> >>>  * Local variables:
> >>> diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
> >>> index 7204bc2b68..a32dc729f8 100644
> >>> --- a/xen/arch/arm/irq.c
> >>> +++ b/xen/arch/arm/irq.c
> >>> @@ -158,6 +158,58 @@ static int init_local_irq_data(unsigned int cpu)
> >>>     return 0;
> >>> }
> >>> 
> >>> +static int cpu_next;
> >>> +
> >>> +static void balance_irq(int irq, unsigned int from, bool up)
> >>> +{
> >>> +    struct irq_desc *desc = irq_to_desc(irq);
> >>> +    unsigned long flags;
> >>> +
> >>> +    ASSERT(!cpumask_empty(&cpu_online_map));
> >>> +
> >>> +    spin_lock_irqsave(&desc->lock, flags);
> >>> +    if ( likely(!desc->action) )
> >>> +        goto out;
> >>> +
> >>> +    if ( likely(test_bit(_IRQ_GUEST, &desc->status) ||
> >>> +                test_bit(_IRQ_MOVE_PENDING, &desc->status)) )
> >>> +        goto out;
> >>> +
> >>> +    /*
> >>> +     * Setting affinity to a mask of multiple CPUs causes the GIC drivers to
> >>> +     * select one CPU from that mask. If the dying CPU was included in the IRQ's
> >>> +     * affinity mask, we cannot determine exactly which CPU the interrupt is
> >>> +     * currently routed to, as GIC drivers lack a concrete get_affinity API. So
> >>> +     * to be safe we must reroute it to a new, definitely online, CPU. In the
> >>> +     * case of a CPU going down, we move only the interrupt that could reside on
> >>> +     * it. Otherwise, we rearrange all interrupts in a round-robin fashion.
> >>> +     */
> >>> +    if ( !up && !cpumask_test_cpu(from, desc->affinity) )
> >>> +        goto out;
> >> 
> >> I am a bit lost here on what you are trying to do in the case where
> >> a cpu is coming up here, it feels like you are trying to change the
> >> affinity of all interrupts in this case to cycle everything.
> >> Is it really what is expected ?
> >> If affinity was set by a VM on its interrupts, I would not expect
> >> Xen to round-robin everything each time a cpu comes up.
> >> 
> > 
> > The idea is to evenly spread interrupts between CPUs when the new ones 
> > are being brought online. This is needed to prevent Xen-bound IRQs from 
> > piling up on CPU 0 when other cores are being offlined and then onlined 
> > back. It shouldn’t mess with guest affinities, as the code skips 
> > everything that is assigned to guests and leaves it to be handled by the 
> > scheduler/vgic. Performance-wise, it should also be okay, as from what 
> > I’ve seen, there are not many interrupts used by Xen, and I expect CPU 
> > hotplug operations to be fairly infrequent.
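> 
> For what it's worth, the round-robin selection being discussed can be
> modeled in isolation roughly like this (a standalone sketch, not Xen
> code: NR_CPUS, pick_next_cpu and the "online" bitmask are stand-ins
> for the cpumask machinery; only cpu_next mirrors the patch):
> 
> ```c
> #include <assert.h>
> 
> #define NR_CPUS 8
> 
> /* Cursor mirroring the static cpu_next in the patch. */
> static unsigned int cpu_next;
> 
> /*
>  * Pick the next online CPU after cpu_next, wrapping around.
>  * "online" is a bitmask standing in for cpu_online_map.
>  */
> static unsigned int pick_next_cpu(unsigned int online)
> {
>     unsigned int cpu = cpu_next;
> 
>     /* Mirrors ASSERT(!cpumask_empty(&cpu_online_map)). */
>     assert(online != 0);
> 
>     do {
>         cpu = (cpu + 1) % NR_CPUS;
>     } while ( !(online & (1u << cpu)) );
> 
>     cpu_next = cpu;
>     return cpu;
> }
> ```
> 
> With CPUs 0, 1 and 3 online (mask 0x0b) and cpu_next starting at 0,
> successive calls would route IRQs to CPUs 1, 3, 0, 1, ... — i.e. each
> rebalanced IRQ lands on a different online CPU rather than piling up
> on one.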
> 
> My fear here is that by removing and then re-adding a CPU we will
> completely change IRQ affinities. I am not sure that kind of random
> assignment is compatible with embedded or safety use cases, and there
> is no way to configure this here.
> 
> @Julien, Stefano and Michal: What do you think here ?

I think Bertrand's concern is valid. We need to make sure that
balance_irq is not called in "normal" or safety use-cases. I am going by
the assumption that pCPU offline/online is not going to happen in safety
use-cases. So as long as balance_irq is only called when a CPU goes
offline or online, I think we are good. It should definitely not happen
on normal boot.
 
There is also the other thought that, while I do not like the idea of
spreading the IRQs evenly, I cannot suggest something better (unless
we want to keep track of which IRQ used to belong to which dead CPU, but
I think that is undesirable).
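
For concreteness, the bookkeeping alternative mentioned above (and
considered undesirable) would amount to something like the following.
This is purely a hypothetical sketch; none of these names exist in Xen,
and the per-IRQ array stands in for whatever per-desc field would
actually hold the state:

```c
#include <assert.h>

#define NR_IRQS  32
#define CPU_NONE (~0u)

/* Last CPU each IRQ was evicted from on offline; CPU_NONE if none. */
static unsigned int saved_home[NR_IRQS];

static void init_homes(void)
{
    for ( unsigned int i = 0; i < NR_IRQS; i++ )
        saved_home[i] = CPU_NONE;
}

/* On CPU offline: remember where the IRQ used to live. */
static void record_eviction(unsigned int irq, unsigned int from)
{
    saved_home[irq] = from;
}

/*
 * On CPU online: if this IRQ was evicted from exactly this CPU,
 * return it (and forget the record); otherwise return CPU_NONE.
 */
static unsigned int restore_home(unsigned int irq, unsigned int cpu)
{
    if ( saved_home[irq] == cpu )
    {
        saved_home[irq] = CPU_NONE;
        return cpu;
    }
    return CPU_NONE;
}
```

This would make affinities deterministic across an offline/online
cycle, at the cost of extra per-IRQ state that has to stay coherent
with the GIC's actual routing — which is presumably why it was deemed
undesirable.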
