Re: [PATCH v4 4/8] arm/irq: Migrate IRQs from dying CPUs
On 16.11.25 13:24, Julien Grall wrote:
> Hi,
>
> On 12/11/2025 10:51, Mykyta Poturai wrote:
>> Move IRQs from dying CPU to the online ones.
>> Guest-bound IRQs are already handled by the scheduler in the process of
>> moving vCPUs to active pCPUs, so we only need to handle IRQs used by Xen
>> itself.
>>
>> If an IRQ is to be migrated, its affinity is set to a mask of all online
>> CPUs. With the current GIC implementation, this means it is routed to a
>> random online CPU. This may cause extra moves if multiple cores are
>> disabled in sequence, but should prevent all interrupts from piling up
>> on CPU0 in case of repeated up-down cycles on different cores.
>
> Wouldn't they eventually all move to CPU0 in the case of suspend/resume
> or if all the CPUs but CPU0 are turned off and then back on? If so,
> shouldn't we try to rebalance the interrupts?
>
In the case of disabling/enabling all cores in sequence, yes. This was
designed with the idea of achieving some balancing when cores are
enabled/disabled for power-saving reasons. I agree that proper balancing
should be implemented, but it is a complex task in its own right and
requires a substantial amount of testing on different hardware to prove
it is close to optimal, so I think it is out of scope for what I’m
trying to do here.
If that is okay, I can implement a relatively simple solution for now:
adding newly onlined CPUs back to the affinity mask. I think that should
improve the situation for the “switching all cores” case.
>>
>> IRQs from CPU 0 are never migrated, as dying CPU 0 means we are either
>> shutting down completely or entering system suspend.
>
> I can't find a place where __cpu_disable() is called on CPU0. Do you
> have any pointer? In any case, I am not sure I want to bake that
> assumption in more places of the code.
>
I assume it would be called once suspend is implemented. In any case, I
will rework this to replace the hard-coded check for CPU 0 with an “is
this the last online CPU” check, roughly as sketched below.
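
Untested sketch (this assumes the dying CPU has already been removed
from cpu_online_map by the time evacuate_irqs() runs, as the existing
ASSERTs expect; it would also move the check out of the per-IRQ path,
per your comment further down):

void evacuate_irqs(unsigned int from)
{
    int irq;

    /* If no other CPU is left online we are shutting down or suspending,
     * so there is nowhere to move the IRQs to. */
    if ( cpumask_empty(&cpu_online_map) )
        return;

    /* eSPI range would be handled the same way, omitted here for brevity. */
    for ( irq = NR_LOCAL_IRQS; irq < NR_IRQS; irq++ )
        evacuate_irq(irq, from);
}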
>>
>> All Xen-used IRQs are currently allocated during init on CPU 0, and
>> setup_irq() uses smp_processor_id() for the initial affinity.
>
> Looking at the SMMU driver, we seem to request IRQs at the time the
> device is attached. So are you sure about this?
>
Indeed, I have missed that one. I will remove those statements then.
>> This change is not strictly required for correct operation for now, but
>> it should future-proof CPU hotplug and system suspend support in case
>> some kind of IRQ balancing is implemented later.
>>
>> Signed-off-by: Mykyta Poturai <mykyta_poturai@xxxxxxxx>
>>
>> v3->v4:
>> * patch introduced
>> ---
>> xen/arch/arm/include/asm/irq.h | 2 ++
>> xen/arch/arm/irq.c | 39 ++++++++++++++++++++++++++++++++++
>> xen/arch/arm/smpboot.c | 2 ++
>> 3 files changed, 43 insertions(+)
>>
>> diff --git a/xen/arch/arm/include/asm/irq.h b/xen/arch/arm/include/asm/irq.h
>> index 09788dbfeb..6e6e27bb80 100644
>> --- a/xen/arch/arm/include/asm/irq.h
>> +++ b/xen/arch/arm/include/asm/irq.h
>> @@ -126,6 +126,8 @@ bool irq_type_set_by_domain(const struct domain *d);
>> void irq_end_none(struct irq_desc *irq);
>> #define irq_end_none irq_end_none
>> +void evacuate_irqs(unsigned int from);
>> +
>> #endif /* _ASM_HW_IRQ_H */
>> /*
>> * Local variables:
>> diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
>> index 28b40331f7..b383d71930 100644
>> --- a/xen/arch/arm/irq.c
>> +++ b/xen/arch/arm/irq.c
>> @@ -158,6 +158,45 @@ static int init_local_irq_data(unsigned int cpu)
>> return 0;
>> }
>> +static void evacuate_irq(int irq, unsigned int from)
>
> Any reason why the 'irq' is signed?
>
>> +{
>> + struct irq_desc *desc = irq_to_desc(irq);
>> + unsigned long flags;
>> +
>> + /* Don't move irqs from CPU 0 as it is always last to be disabled */
>
> Per above, I am not convinced that we should special case CPU 0. But if
> we do, then shouldn't this be part of evacuate_irqs() so we don't
> pointlessly go through all the IRQs?
>
>> + if ( from == 0 )
>> + return;
>> +
>> + ASSERT(!cpumask_empty(&cpu_online_map));
>> + ASSERT(!cpumask_test_cpu(from, &cpu_online_map));
>> +
>> + spin_lock_irqsave(&desc->lock, flags);
>> + if ( likely(!desc->action) )
>> + goto out;
>> +
>> + if ( likely(test_bit(_IRQ_GUEST, &desc->status) ||
>> + test_bit(_IRQ_MOVE_PENDING, &desc->status)) )
>> + goto out;
>> +
>> + if ( cpumask_test_cpu(from, desc->affinity) )
>> + irq_set_affinity(desc, &cpu_online_map);
>
> I think it would be worth explaining why we are routing to any CPU
> online rather than checking whether the affinity has other online CPUs.
>
> Just to note, I don't have strong opinion either way. It mainly needs to
> be documented.
>
>> +
>> +out:
>> + spin_unlock_irqrestore(&desc->lock, flags);
>> + return;
>> +}
>> +
>> +void evacuate_irqs(unsigned int from)
>> +{
>> + int irq;
>> +
>> + for ( irq = NR_LOCAL_IRQS; irq < NR_IRQS; irq++ )
>> + evacuate_irq(irq, from);
>> +
>> + for ( irq = ESPI_BASE_INTID; irq < ESPI_MAX_INTID; irq++ )
>
> AFAICT, irq_to_desc() would not be able to cope with ESPI interrupts
> when CONFIG_GICV3_ESPI is not set. Has this been tested?
>
>> + evacuate_irq(irq, from);
>> +}
>> +
>> static int cpu_callback(struct notifier_block *nfb, unsigned long action,
>> void *hcpu)
>> {
>> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
>> index 7f3cfa812e..46b24783dd 100644
>> --- a/xen/arch/arm/smpboot.c
>> +++ b/xen/arch/arm/smpboot.c
>> @@ -425,6 +425,8 @@ void __cpu_disable(void)
>> smp_mb();
>> + evacuate_irqs(cpu);
>
> I think it would be worth explaining why evacuate_irqs() is called this
> late in the process.
>
>> +
>> /* Return to caller; eventually the IPI mechanism will unwind and the
>> * scheduler will drop to the idle loop, which will call stop_cpu(). */
>> }
>
> Cheers,
>
--
Mykyta