[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [Xen-devel] RE: [PATCH] Fix cache flush bug of cpu offline
BTW, some history coding here at kernel side: ----------------------------------- commit ce5f68246bf2385d6174856708d0b746dc378f20 Author: H. Peter Anvin <hpa@xxxxxxxxxxxxxxx> Date: Mon Sep 20 13:04:45 2010 -0700 x86, hotplug: In the MWAIT case of play_dead, CLFLUSH the cache line When we're using MWAIT for play_dead, explicitly CLFLUSH the cache line before executing MONITOR. This is a potential workaround for the Xeon 7400 erratum AAI65 after having a spurious wakeup and returning around the loop. "Potential" here because it is not certain that that erratum could actually trigger; however, the CLFLUSH should be harmless. Signed-off-by: H. Peter Anvin <hpa@xxxxxxxxxxxxxxx> Acked-by: Venkatesh Pallipadi <venki@xxxxxxxxxx> Cc: Asit Mallick <asit.k.mallick@xxxxxxxxx> Cc: Arjan van de Ven <arjan@xxxxxxxxxxxxxxxx> Cc: Len Brown <lenb@xxxxxxxxxx> --------------------------------------- commit a68e5c94f7d3dd64fef34dd5d97e365cae4bb42a Author: H. Peter Anvin <hpa@xxxxxxxxxxxxxxx> Date: Fri Sep 17 17:06:46 2010 -0700 x86, hotplug: Move WBINVD back outside the play_dead loop On processors with hyperthreading, when only one thread is offlined the other thread can cause a spurious wakeup on the idled thread. We do not want to re-WBINVD when that happens. Ideally, we should simply skip WBINVD unless we're the last thread on a particular core to shut down, but there might be similar issues elsewhere in the system. Thus, revert to previous behavior of only WBINVD outside the loop. Partly as a result, remove the mb()'s around it: they are not necessary since wbinvd() is a serializing instruction, but they were intended to make sure the compiler didn't do any funny loop optimizations. Reported-by: Asit Mallick <asit.k.mallick@xxxxxxxxx> Signed-off-by: H. Peter Anvin <hpa@xxxxxxxxxxxxxxx> Cc: Arjan van de Ven <arjan@xxxxxxxxxxxxxxxx> Cc: Len Brown <lenb@xxxxxxxxxx> Cc: Venkatesh Pallipadi <venki@xxxxxxxxxx> Cc: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> -------------------------------------- commit ea53069231f9317062910d6e772cca4ce93de8c8 Author: H. Peter Anvin <hpa@xxxxxxxxxxxxxxx> Date: Fri Sep 17 15:39:11 2010 -0700 x86, hotplug: Use mwait to offline a processor, fix the legacy case The code in native_play_dead() has a number of problems: 1. We should use MWAIT when available, to put ourselves into a deeper sleep state. 2. We use the existence of CLFLUSH to determine if WBINVD is safe, but that is totally bogus -- WBINVD is 486+, whereas CLFLUSH is a much later addition. 3. We should do WBINVD inside the loop, just in case of something like setting an A bit on page tables. Pointed out by Arjan van de Ven. This code is based in part of a previous patch by Venki Pallipadi, but unlike that patch this one keeps all the detection code local instead of pre-caching a bunch of information. We're shutting down the CPU; there is absolutely no hurry. This patch moves all the code to C and deletes the global wbinvd_halt() which is broken anyway. Originally-by: Venkatesh Pallipadi <venkatesh.pallipadi@xxxxxxxxx> Signed-off-by: H. Peter Anvin <hpa@xxxxxxxxxxxxxxx> Reviewed-by: Arjan van de Ven <arjan@xxxxxxxxxxxxxxx> Cc: Len Brown <lenb@xxxxxxxxxx> Cc: Venkatesh Pallipadi <venki@xxxxxxxxxx> Cc: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> --------------------------------------- Thanks, Jinsong Keir Fraser wrote: > On 11/03/2011 14:34, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote: > >> Fix cache flush bug of cpu offline >> >> Current xen cpu offline logic flush cache too early, which >> potentially break cache coherency. wbinvd should be the last ops >> before cpu going into dead, otherwise cache may be dirty, i.e, >> something like setting an A bit on page tables. Pointed out by Arjan >> van de Ven. > > The position still seems a bit arbitrary. In the first hunk below, > why is it safe to wbinvd() outside the for-loop and before reading > cx->entry_method, but not before reading from processor_powers[]? It > would be neater if we could put the wbinvd() in a wrapper function > for calling *dead_idle. > > -- Keir > >> Signed-off-by: Liu, Jinsong <jinsong.liu@xxxxxxxxx> >> >> diff -r 2dc3c1cc1bba xen/arch/x86/acpi/cpu_idle.c >> --- a/xen/arch/x86/acpi/cpu_idle.c Mon Mar 07 05:31:46 2022 +0800 >> +++ b/xen/arch/x86/acpi/cpu_idle.c Thu Mar 10 23:40:51 2022 +0800 >> @@ -562,11 +562,14 @@ static void acpi_dead_idle(void) >> if ( (cx = &power->states[power->count-1]) == NULL ) >> goto default_halt; >> >> + /* >> + * cache must be flashed as the last ops before cpu going into >> dead, + * otherwise, cpu may dead with dirty data breaking cache >> coherency, + * leading to strange errors. >> + */ >> + wbinvd(); >> for ( ; ; ) >> { >> - if ( !power->flags.bm_check && cx->type == ACPI_STATE_C3 ) >> - ACPI_FLUSH_CPU_CACHE(); >> - >> switch ( cx->entry_method ) >> { >> case ACPI_CSTATE_EM_FFH: >> @@ -584,6 +587,7 @@ static void acpi_dead_idle(void) } >> >> default_halt: >> + wbinvd(); >> for ( ; ; ) >> halt(); >> } >> diff -r 2dc3c1cc1bba xen/arch/x86/domain.c >> --- a/xen/arch/x86/domain.c Mon Mar 07 05:31:46 2022 +0800 >> +++ b/xen/arch/x86/domain.c Thu Mar 10 23:40:51 2022 +0800 >> @@ -93,6 +93,12 @@ static void default_idle(void) >> >> static void default_dead_idle(void) >> { >> + /* >> + * cache must be flashed as the last ops before cpu going into >> dead, + * otherwise, cpu may dead with dirty data breaking cache >> coherency, + * leading to strange errors. >> + */ >> + wbinvd(); >> for ( ; ; ) >> halt(); >> } >> @@ -100,7 +106,6 @@ static void play_dead(void) >> static void play_dead(void) >> { >> local_irq_disable(); >> - wbinvd(); >> >> /* >> * NOTE: After cpu_exit_clear, per-cpu variables are no longer >> accessible, _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxx http://lists.xensource.com/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |