Xen project Mailing List

Re: [PATCH 04/16] xen/cpu: prevent disable_nonboot_cpus crash on ARM64

From: Mykola Kvach <xakep.amatop@xxxxxxxxx>

Date: Fri, 21 Mar 2025 11:48:32 +0200

Cc: Julien Grall <julien@xxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Mykola Kvach <mykola_kvach@xxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx

Delivery-date: Fri, 21 Mar 2025 09:48:52 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu, Mar 13, 2025 at 5:43 PM Jan Beulich <jbeulich@xxxxxxxx> wrote: > > On 11.03.2025 21:47, Julien Grall wrote: > > Hi Mykola, > > > > On 05/03/2025 09:11, Mykola Kvach wrote: > >> If we call disable_nonboot_cpus on ARM64 with system_state set > >> to SYS_STATE_suspend, the following assertion will be triggered: > >> > >> ``` > >> (XEN) [ 25.582712] Disabling non-boot CPUs ... > >> (XEN) [ 25.587032] Assertion '!in_irq() && (local_irq_is_enabled() || > >> num_online_cpus() <= 1)' failed at common/xmalloc_tlsf.c:714 > >> [...] > >> (XEN) [ 25.975069] Xen call trace: > >> (XEN) [ 25.978353] [<00000a000022e098>] xfree+0x130/0x1a4 (PC) > >> (XEN) [ 25.984314] [<00000a000022e08c>] xfree+0x124/0x1a4 (LR) > >> (XEN) [ 25.990276] [<00000a00002747d4>] release_irq+0xe4/0xe8 > >> (XEN) [ 25.996152] [<00000a0000278588>] > >> time.c#cpu_time_callback+0x44/0x60 > >> (XEN) [ 26.003150] [<00000a000021d678>] notifier_call_chain+0x7c/0xa0 > >> (XEN) [ 26.009717] [<00000a00002018e0>] > >> cpu.c#cpu_notifier_call_chain+0x24/0x48 > >> (XEN) [ 26.017148] [<00000a000020192c>] cpu.c#_take_cpu_down+0x28/0x34 > >> (XEN) [ 26.023801] [<00000a0000201944>] cpu.c#take_cpu_down+0xc/0x18 > >> (XEN) [ 26.030281] [<00000a0000225c5c>] > >> stop_machine.c#stopmachine_action+0xbc/0xe4 > >> (XEN) [ 26.038057] [<00000a00002264bc>] > >> tasklet.c#do_tasklet_work+0xb8/0x100 > >> (XEN) [ 26.045229] [<00000a00002268a4>] do_tasklet+0x68/0xb0 > >> (XEN) [ 26.051018] [<00000a000026e120>] domain.c#idle_loop+0x7c/0x194 > >> (XEN) [ 26.057585] [<00000a0000277e30>] start_secondary+0x21c/0x220 > >> (XEN) [ 26.063978] [<00000a0000361258>] 00000a0000361258 > >> ``` > >> > >> This happens because before invoking take_cpu_down via the stop_machine_run > >> function on the target CPU, stop_machine_run requests > >> the STOPMACHINE_DISABLE_IRQ state on that CPU. Releasing memory in > >> the release_irq function then triggers the assertion: > >> > >> /* > >> * Heap allocations may need TLB flushes which may require IRQs to be > >> * enabled (except when only 1 PCPU is online). > >> */ > >> #define ASSERT_ALLOC_CONTEXT() > >> > >> This patch introduces a new tasklet to perform the CPU_DYING call chain for > >> a particular CPU. However, we cannot call take_cpu_down from the tasklet > >> because the __cpu_disable function disables local IRQs, causing the system > >> to crash inside spin_lock_irq, which is called after the tasklet function > >> invocation inside do_tasklet_work: > >> > >> void _spin_lock_irq(spinlock_t *lock) > >> { > >> ASSERT(local_irq_is_enabled()); > >> > >> To resolve this, take_cpu_down is split into two parts. The first part > >> triggers > >> the CPU_DYING call chain, while the second part, __cpu_disable, is invoked > >> from > >> stop_machine_run. > > > > Rather than modifying common code, have you considered allocating from > > the IRQ action from the percpu area? This would also reduce the number > > of possible failure when bringup up a pCPU. > > I'd go further and question whether release_irq() really wants calling when > suspending. At least on x86, a requirement is that upon resume the same > number and kinds of CPUs will come back up. Hence the system will look the > same, including all the interrupts that are in use. Plus resume will be > faster if things are left set up during suspend. I tried that approach and encountered another issue. - in the case of the hardware domain, it triggered a domain watchdog; - in the case of domU, it caused a crash inside the Linux kernel due to an RCU stall. Both scenarios suggest that something is wrong with IRQ delivery to the guest. It might be necessary to revisit the entire logic related to GIC resume/suspend instead. > > Jan Best regards, Mykola

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.