[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [patch 22/37] arm64: smp: Switch to hotplug core state synchronization



On Sat, Apr 15, 2023 at 01:44:49AM +0200, Thomas Gleixner wrote:
> Switch to the CPU hotplug core state tracking and synchronization
> mechanim. No functional change intended.
> 
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> Cc: Will Deacon <will@xxxxxxxxxx>
> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx

I gave this a spin on arm64 (in a 64-vCPU VM on an M1 host), and it seems to
work fine with a bunch of vCPUs being hotplugged off and on again randomly.

FWIW:

Tested-by: Mark Rutland <mark.rutland@xxxxxxx>

I also hacked the code to have the dying CPU spin forever before the call to
cpuhp_ap_report_dead(). In that case I see a warning, and that we don't call
arch_cpuhp_cleanup_dead_cpu(), and that the CPU is marked as offline (per
/sys/devices/system/cpu/$N/online).

As a tangent/aside, we might need to improve that for confidential compute
architectures, and we might want to generically track cpus which might still be
using kernel text/data. On arm64 we ensure that via our cpu_kill() callback
(which'll use PSCI CPU_AFFINITY_INFO), but I'm not sure if TDX and/or SEV-SNP
have a similar mechanism.

Otherwise, a malicious hypervisor can pause a vCPU just before it leaves the
kernel (e.g. immediately after the arch_cpuhp_cleanup_dead_cpu() call), wait
for a kexec (or resuse of stack memroy), and unpause the vCPU to cause things
to blow up.

Thanks,
Mark.

> ---
>  arch/arm64/Kconfig           |    1 +
>  arch/arm64/include/asm/smp.h |    2 +-
>  arch/arm64/kernel/smp.c      |   14 +++++---------
>  3 files changed, 7 insertions(+), 10 deletions(-)
> 
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -216,6 +216,7 @@ config ARM64
>       select HAVE_KPROBES
>       select HAVE_KRETPROBES
>       select HAVE_GENERIC_VDSO
> +     select HOTPLUG_CORE_SYNC_DEAD if HOTPLUG_CPU
>       select IRQ_DOMAIN
>       select IRQ_FORCED_THREADING
>       select KASAN_VMALLOC if KASAN
> --- a/arch/arm64/include/asm/smp.h
> +++ b/arch/arm64/include/asm/smp.h
> @@ -99,7 +99,7 @@ static inline void arch_send_wakeup_ipi_
>  
>  extern int __cpu_disable(void);
>  
> -extern void __cpu_die(unsigned int cpu);
> +static inline void __cpu_die(unsigned int cpu) { }
>  extern void cpu_die(void);
>  extern void cpu_die_early(void);
>  
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -333,17 +333,13 @@ static int op_cpu_kill(unsigned int cpu)
>  }
>  
>  /*
> - * called on the thread which is asking for a CPU to be shutdown -
> - * waits until shutdown has completed, or it is timed out.
> + * Called on the thread which is asking for a CPU to be shutdown after the
> + * shutdown completed.
>   */
> -void __cpu_die(unsigned int cpu)
> +void arch_cpuhp_cleanup_dead_cpu(unsigned int cpu)
>  {
>       int err;
>  
> -     if (!cpu_wait_death(cpu, 5)) {
> -             pr_crit("CPU%u: cpu didn't die\n", cpu);
> -             return;
> -     }
>       pr_debug("CPU%u: shutdown\n", cpu);
>  
>       /*
> @@ -370,8 +366,8 @@ void cpu_die(void)
>  
>       local_daif_mask();
>  
> -     /* Tell __cpu_die() that this CPU is now safe to dispose of */
> -     (void)cpu_report_death();
> +     /* Tell cpuhp_bp_sync_dead() that this CPU is now safe to dispose of */
> +     cpuhp_ap_report_dead();
>  
>       /*
>        * Actually shutdown the CPU. This must never fail. The specific hotplug
> 



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.