Re: [PATCH v4 4/8] arm/irq: Migrate IRQs from dying CPUs
On 16.11.25 13:24, Julien Grall wrote:
> Hi,
>
> On 12/11/2025 10:51, Mykyta Poturai wrote:
>> Move IRQs from dying CPU to the online ones.
>> Guest-bound IRQs are already handled by the scheduler in the process of
>> moving vCPUs to active pCPUs, so we only need to handle IRQs used by Xen
>> itself.
>>
>> If an IRQ is to be migrated, its affinity is set to a mask of all online
>> CPUs. With the current GIC implementation, this means it is routed to a
>> random online CPU. This may cause extra moves if multiple cores are
>> disabled in sequence, but should prevent all interrupts from piling up
>> on CPU0 in case of repeated up-down cycles on different cores.
>
> Wouldn't they eventually all move to CPU0 in the case of suspend/resume
> or if all the CPUs but CPU0 are turned off and then back on? If so,
> shouldn't we try to rebalance the interrupts?
>
In the case of disabling/enabling all cores in sequence, yes. This was
designed with the idea of achieving some balancing when cores are
enabled/disabled for power-saving reasons. I agree that proper balancing
should be implemented, but it is a complex task in its own right and
requires a substantial amount of testing on different hardware to prove
it is close to optimal, so I think it is out of scope for what I’m
trying to do here.
If that is okay, I can implement a relatively simple solution for now:
adding newly onlined CPUs back to the affinity mask. I think that should
improve the situation for the “switching all cores” case.
>>
>> IRQs from CPU 0 are never migrated, as dying CPU 0 means we are either
>> shutting down completely or entering system suspend.
>
> I can't find a place where __cpu_disable() is called on CPU0. Do you
> have any pointer? In any case, I am not sure I want to bake that
> assumption in more places of the code.
>
I assume it would be called once suspend is implemented. In any case, I
will rework this to replace the hard-coded check for CPU 0 with an “is
this the last online CPU” check, roughly as sketched below.
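
Untested sketch (this assumes the dying CPU has already been removed
from cpu_online_map by the time evacuate_irqs() runs, as the existing
ASSERTs expect; it would also move the check out of the per-IRQ path,
per your comment further down):

void evacuate_irqs(unsigned int from)
{
    int irq;

    /* If no other CPU is left online we are shutting down or suspending,
     * so there is nowhere to move the IRQs to. */
    if ( cpumask_empty(&cpu_online_map) )
        return;

    /* eSPI range would be handled the same way, omitted here for brevity. */
    for ( irq = NR_LOCAL_IRQS; irq < NR_IRQS; irq++ )
        evacuate_irq(irq, from);
}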
>>
>> All Xen-used IRQs are currently allocated during init on CPU 0, and
>> setup_irq() uses smp_processor_id() for the initial affinity.
>
> Looking at the SMMU driver, we seem to request IRQs at the time the
> device is attached. So are you sure about this?
>
Indeed, I have missed that one. I will remove those statements then.
>> This change is not strictly required for correct operation for now, but
>> it should future-proof CPU hotplug and system suspend support in case
>> some kind of IRQ balancing is implemented later.
>>
>> Signed-off-by: Mykyta Poturai <mykyta_poturai@xxxxxxxx>
>>
>> v3->v4:
>> * patch introduced
>> ---
>> xen/arch/arm/include/asm/irq.h | 2 ++
>> xen/arch/arm/irq.c | 39 ++++++++++++++++++++++++++++++++++
>> xen/arch/arm/smpboot.c | 2 ++
>> 3 files changed, 43 insertions(+)
>>
>> diff --git a/xen/arch/arm/include/asm/irq.h b/xen/arch/arm/include/asm/irq.h
>> index 09788dbfeb..6e6e27bb80 100644
>> --- a/xen/arch/arm/include/asm/irq.h
>> +++ b/xen/arch/arm/include/asm/irq.h
>> @@ -126,6 +126,8 @@ bool irq_type_set_by_domain(const struct domain *d);
>> void irq_end_none(struct irq_desc *irq);
>> #define irq_end_none irq_end_none
>> +void evacuate_irqs(unsigned int from);
>> +
>> #endif /* _ASM_HW_IRQ_H */
>> /*
>> * Local variables:
>> diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
>> index 28b40331f7..b383d71930 100644
>> --- a/xen/arch/arm/irq.c
>> +++ b/xen/arch/arm/irq.c
>> @@ -158,6 +158,45 @@ static int init_local_irq_data(unsigned int cpu)
>> return 0;
>> }
>> +static void evacuate_irq(int irq, unsigned int from)
>
> Any reason why the 'irq' is signed?
>
>> +{
>> + struct irq_desc *desc = irq_to_desc(irq);
>> + unsigned long flags;
>> +
>> + /* Don't move irqs from CPU 0 as it is always last to be disabled */
>
> Per above, I am not convinced that we should special case CPU 0. But if
> we do, then shouldn't this be part of evacuate_irqs() so we don't
> pointlessly go through all the IRQs?
>
>> + if ( from == 0 )
>> + return;
>> +
>> + ASSERT(!cpumask_empty(&cpu_online_map));
>> + ASSERT(!cpumask_test_cpu(from, &cpu_online_map));
>> +
>> + spin_lock_irqsave(&desc->lock, flags);
>> + if ( likely(!desc->action) )
>> + goto out;
>> +
>> + if ( likely(test_bit(_IRQ_GUEST, &desc->status) ||
>> + test_bit(_IRQ_MOVE_PENDING, &desc->status)) )
>> + goto out;
>> +
>> + if ( cpumask_test_cpu(from, desc->affinity) )
>> + irq_set_affinity(desc, &cpu_online_map);
>
> I think it would be worth explaining why we are routing to any CPU
> online rather than checking whether the affinity has other online CPUs.
>
> Just to note, I don't have strong opinion either way. It mainly needs to
> be documented.
>
>> +
>> +out:
>> + spin_unlock_irqrestore(&desc->lock, flags);
>> + return;
>> +}
>> +
>> +void evacuate_irqs(unsigned int from)
>> +{
>> + int irq;
>> +
>> + for ( irq = NR_LOCAL_IRQS; irq < NR_IRQS; irq++ )
>> + evacuate_irq(irq, from);
>> +
>> + for ( irq = ESPI_BASE_INTID; irq < ESPI_MAX_INTID; irq++ )
>
> AFAICT, irq_to_desc() would not be able to cope with ESPI interrupts
> when CONFIG_GICV3_ESPI is not set. Has this been tested?
>
>> + evacuate_irq(irq, from);
>> +}
>> +
>> static int cpu_callback(struct notifier_block *nfb, unsigned long action,
>> void *hcpu)
>> {
>> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
>> index 7f3cfa812e..46b24783dd 100644
>> --- a/xen/arch/arm/smpboot.c
>> +++ b/xen/arch/arm/smpboot.c
>> @@ -425,6 +425,8 @@ void __cpu_disable(void)
>> smp_mb();
>> + evacuate_irqs(cpu);
>
> I think it would be worth explaining why evacuate_irqs() is called this
> late in the process.
>
>> +
>> /* Return to caller; eventually the IPI mechanism will unwind and the
>> * scheduler will drop to the idle loop, which will call stop_cpu(). */
>> }
>
> Cheers,
>
--
Mykyta