
Re: [PATCH v5 2/5] arm/irq: Migrate IRQs during CPU up/down operations



On Thu, 5 Feb 2026, Bertrand Marquis wrote:
> > On 5 Feb 2026, at 14:23, Mykyta Poturai <Mykyta_Poturai@xxxxxxxx> wrote:
> > 
> > On 04.02.26 16:20, Bertrand Marquis wrote:
> >> Hi Mykyta.
> >> 
> >>> On 13 Jan 2026, at 09:45, Mykyta Poturai <Mykyta_Poturai@xxxxxxxx> wrote:
> >>> 
> >>> Move IRQs from the dying CPU to the online ones when a CPU is being
> >>> offlined. When onlining, rebalance all IRQs in a round-robin fashion.
> >>> Guest-bound IRQs are already handled by the scheduler in the process of
> >>> moving vCPUs to active pCPUs, so we only need to handle IRQs used by Xen
> >>> itself.
> >>> 
> >>> Signed-off-by: Mykyta Poturai <mykyta_poturai@xxxxxxxx>
> >>> ---
> >>> v4->v5:
> >>> * handle CPU onlining as well
> >>> * more comments
> >>> * fix crash when ESPI is disabled
> >>> * don't assume CPU 0 is a boot CPU
> >>> * use unsigned int for irq number
> >>> * remove assumption that all irqs are bound to CPU 0 by default from the
> >>>  commit message
> >>> 
> >>> v3->v4:
> >>> * patch introduced
> >>> ---
> >>> xen/arch/arm/include/asm/irq.h |  2 ++
> >>> xen/arch/arm/irq.c             | 54 ++++++++++++++++++++++++++++++++++
> >>> xen/arch/arm/smpboot.c         |  6 ++++
> >>> 3 files changed, 62 insertions(+)
> >>> 
> >>> diff --git a/xen/arch/arm/include/asm/irq.h b/xen/arch/arm/include/asm/irq.h
> >>> index 09788dbfeb..a0250bac85 100644
> >>> --- a/xen/arch/arm/include/asm/irq.h
> >>> +++ b/xen/arch/arm/include/asm/irq.h
> >>> @@ -126,6 +126,8 @@ bool irq_type_set_by_domain(const struct domain *d);
> >>> void irq_end_none(struct irq_desc *irq);
> >>> #define irq_end_none irq_end_none
> >>> 
> >>> +void rebalance_irqs(unsigned int from, bool up);
> >>> +
> >>> #endif /* _ASM_HW_IRQ_H */
> >>> /*
> >>>  * Local variables:
> >>> diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
> >>> index 7204bc2b68..a32dc729f8 100644
> >>> --- a/xen/arch/arm/irq.c
> >>> +++ b/xen/arch/arm/irq.c
> >>> @@ -158,6 +158,58 @@ static int init_local_irq_data(unsigned int cpu)
> >>>     return 0;
> >>> }
> >>> 
> >>> +static int cpu_next;
> >>> +
> >>> +static void balance_irq(int irq, unsigned int from, bool up)
> >>> +{
> >>> +    struct irq_desc *desc = irq_to_desc(irq);
> >>> +    unsigned long flags;
> >>> +
> >>> +    ASSERT(!cpumask_empty(&cpu_online_map));
> >>> +
> >>> +    spin_lock_irqsave(&desc->lock, flags);
> >>> +    if ( likely(!desc->action) )
> >>> +        goto out;
> >>> +
> >>> +    if ( likely(test_bit(_IRQ_GUEST, &desc->status) ||
> >>> +                test_bit(_IRQ_MOVE_PENDING, &desc->status)) )
> >>> +        goto out;
> >>> +
> >>> +    /*
> >>> +     * Setting affinity to a mask of multiple CPUs causes the GIC drivers to
> >>> +     * select one CPU from that mask. If the dying CPU was included in the IRQ's
> >>> +     * affinity mask, we cannot determine exactly which CPU the interrupt is
> >>> +     * currently routed to, as GIC drivers lack a concrete get_affinity API. So
> >>> +     * to be safe we must reroute it to a new, definitely online, CPU. In the
> >>> +     * case of a CPU going down, we move only the interrupt that could reside on
> >>> +     * it. Otherwise, we rearrange all interrupts in a round-robin fashion.
> >>> +     */
> >>> +    if ( !up && !cpumask_test_cpu(from, desc->affinity) )
> >>> +        goto out;
> >> 
> >> I am a bit lost here on what you are trying to do in the case where
> >> a cpu is coming up here, it feels like you are trying to change the
> >> affinity of all interrupts in this case to cycle everything.
> >> Is it really what is expected ?
> >> If affinity was set by a VM on its interrupts, I would not expect
> >> Xen to round-robin everything each time a cpu comes up.
> >> 
> > 
> > The idea is to evenly spread interrupts between CPUs when the new ones 
> > are being brought online. This is needed to prevent Xen-bound IRQs from 
> > piling up on CPU 0 when other cores are being offlined and then onlined 
> > back. It shouldn’t mess with guest affinities, as the code skips 
> > everything that is assigned to guests and leaves it to be handled by the 
> > scheduler/vgic. Performance-wise, it should also be okay, as from what 
> > I’ve seen, there are not many interrupts used by Xen, and I expect CPU 
> > hotplug operations to be fairly infrequent.
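> 
> For what it's worth, the round-robin selection being discussed can be
> modeled in isolation roughly like this (a standalone sketch, not Xen
> code: NR_CPUS, pick_next_cpu and the "online" bitmask are stand-ins
> for the cpumask machinery; only cpu_next mirrors the patch):
> 
> ```c
> #include <assert.h>
> 
> #define NR_CPUS 8
> 
> /* Cursor mirroring the static cpu_next in the patch. */
> static unsigned int cpu_next;
> 
> /*
>  * Pick the next online CPU after cpu_next, wrapping around.
>  * "online" is a bitmask standing in for cpu_online_map.
>  */
> static unsigned int pick_next_cpu(unsigned int online)
> {
>     unsigned int cpu = cpu_next;
> 
>     /* Mirrors ASSERT(!cpumask_empty(&cpu_online_map)). */
>     assert(online != 0);
> 
>     do {
>         cpu = (cpu + 1) % NR_CPUS;
>     } while ( !(online & (1u << cpu)) );
> 
>     cpu_next = cpu;
>     return cpu;
> }
> ```
> 
> With CPUs 0, 1 and 3 online (mask 0x0b) and cpu_next starting at 0,
> successive calls would route IRQs to CPUs 1, 3, 0, 1, ... — i.e. each
> rebalanced IRQ lands on a different online CPU rather than piling up
> on one.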
> 
> My fear here is that by removing and then re-adding a CPU we will
> completely change IRQ affinities. I am not sure that kind of random
> assignment is compatible with embedded or safety use cases, and there
> is no way to configure this here.
> 
> @Julien, Stefano and Michal: What do you think here ?

I think Bertrand's concern is valid. We need to make sure that
balance_irq is not called in "normal" or safety use-cases. I am going by
the assumption that pCPU offline/online is not going to happen in safety
use-cases. So as long as balance_irq is only called when a CPU goes
offline or online, I think we are good. It should definitely not happen
on normal boot.
 
There is also the other thought that, while I do not like the idea of
spreading the IRQs evenly, I cannot suggest something better (unless
we want to keep track of which IRQ used to belong to which dead CPU, but
I think that is undesirable).
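
For concreteness, the bookkeeping alternative mentioned above (and
considered undesirable) would amount to something like the following.
This is purely a hypothetical sketch; none of these names exist in Xen,
and the per-IRQ array stands in for whatever per-desc field would
actually hold the state:

```c
#include <assert.h>

#define NR_IRQS  32
#define CPU_NONE (~0u)

/* Last CPU each IRQ was evicted from on offline; CPU_NONE if none. */
static unsigned int saved_home[NR_IRQS];

static void init_homes(void)
{
    for ( unsigned int i = 0; i < NR_IRQS; i++ )
        saved_home[i] = CPU_NONE;
}

/* On CPU offline: remember where the IRQ used to live. */
static void record_eviction(unsigned int irq, unsigned int from)
{
    saved_home[irq] = from;
}

/*
 * On CPU online: if this IRQ was evicted from exactly this CPU,
 * return it (and forget the record); otherwise return CPU_NONE.
 */
static unsigned int restore_home(unsigned int irq, unsigned int cpu)
{
    if ( saved_home[irq] == cpu )
    {
        saved_home[irq] = CPU_NONE;
        return cpu;
    }
    return CPU_NONE;
}
```

This would make affinities deterministic across an offline/online
cycle, at the cost of extra per-IRQ state that has to stay coherent
with the GIC's actual routing — which is presumably why it was deemed
undesirable.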
