Re: [Xen-devel] Xen4.2 S3 regression?
>>> On 21.09.12 at 20:42, Keir Fraser <keir@xxxxxxx> wrote:
> On 21/09/2012 19:20, "Ben Guthro" <ben@xxxxxxxxxx> wrote:
>
>> On Fri, Sep 21, 2012 at 2:47 AM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>>>
>>> That's because CPU1 is stuck in cpu_init() (in the infinite loop after
>>> printing "CPU#1 already initialized!"), as Keir pointed out yesterday.
>>>
>>
>> I've done some more tracing on this, and instrumented cpu_init(),
>> cpu_uninit() - and found something I cannot quite explain.
>> I was most interested in the cpu_initialized mask, set just above these two
>> functions (and only used in those two functions)
>>
>> I convert cpu_initialized to a string, using cpumask_scnprintf - and print it
>> out when it is read, or written in these two functions.
>>
>> When CPU1 is being torn down, the cpumask bit gets cleared for CPU1, and I am
>> able to print this to the console to verify.
>> However, when the machine is returning from S3, and going through cpu_init -
>> the bit is set again.
>>
>> Could this be an issue of caches not being flushed?
>>
>> I see that the last thing done before acpi_enter_sleep_state actually
>> writes PM1A_CONTROL / PM1B_CONTROL to enter S3 is a ACPI_FLUSH_CPU_CACHE()
>>
>> This analysis seems unlikely, at this point...but I'm not sure what to make of
>> the data other than a cache issue.
>>
>> Am I "barking up the wrong tree" here?
>
> Perhaps not. Try dumping it immediately before and after the actual S3
> sleep. Since you probably can't print to serial line at that point, you
> could just take a copy of the bitmap and print them both shortly after S3
> resume. Then if it still looks bad, or the problem magically resolves with
> the extra printing, you can suspect cache flush a bit more strongly.
> However, WBINVD (which is what ACPI_FLUSH_CPU_CACHE() is) should be enough.
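
The snapshot approach suggested in the quoted text (copy the bitmap right
before and right after the S3 sleep, print both once the console works again)
might look roughly like the following. This is only a user-space sketch: the
type mask_sketch_t and the names cpu_initialized_sketch, snap_before,
snap_after, snapshot_mask() and report_snapshots() are hypothetical stand-ins,
not Xen's cpumask_t or any real Xen helper.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for Xen's cpumask_t; the real type differs. */
typedef struct { unsigned long bits; } mask_sketch_t;

/* Hypothetical stand-in for the cpu_initialized mask under suspicion. */
static mask_sketch_t cpu_initialized_sketch;

/* Copies taken immediately before and after the sleep. */
static mask_sketch_t snap_before, snap_after;

/* Copy the mask without printing, for points where serial output is unusable. */
static void snapshot_mask(mask_sketch_t *dst, const mask_sketch_t *src)
{
    memcpy(dst, src, sizeof(*dst));
}

/* Print both snapshots later on; returns 1 if they differ, 0 otherwise. */
static int report_snapshots(void)
{
    printf("mask before S3: %#lx\n", snap_before.bits);
    printf("mask after  S3: %#lx\n", snap_after.bits);
    return snap_before.bits != snap_after.bits;
}
```

In real code the two snapshot_mask() calls would sit on either side of the
point where the sleep is entered, and report_snapshots() would run shortly
after resume; where exactly those points are in Xen is not shown here.
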

CPU0 issuing WBINVD might not be enough; other CPUs should probably
also do so unconditionally (currently they do this only when using one
of the advanced halt forms in acpi_dead_idle()). While one would think
that a halted CPU would not only continue to keep its cache up-to-date,
but also eventually write back its dirty cache lines, I don't think the
latter is actually guaranteed, so if the CPU ends up getting the INIT
before the line was written back, the modification could get lost.

But of course this theory depends on Ben's system actually using the
default halt mechanism rather than one of the advanced ones.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
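
The hypothesis in this message (a dirty cache line on a halted CPU being lost
if it is not written back before INIT) can be illustrated with a toy model.
This is not a statement about real hardware behaviour and not Xen code: struct
toy_line and the toy_* functions are invented for illustration, and the "lose
the cache" step simply encodes the assumption being discussed.

```c
#include <stdbool.h>

/* Toy model: one memory word plus one CPU-private write-back cache line. */
struct toy_line {
    unsigned long mem;     /* value currently in main memory */
    unsigned long cached;  /* value held in the CPU's cache line */
    bool dirty;            /* true if cached has not been written back */
};

/* A store that only hits the cache; memory is updated later, if ever. */
static void toy_store(struct toy_line *l, unsigned long v)
{
    l->cached = v;
    l->dirty = true;
}

/* Models a write-back-and-invalidate: push the dirty value to memory. */
static void toy_wbinvd(struct toy_line *l)
{
    if (l->dirty)
        l->mem = l->cached;
    l->dirty = false;
}

/*
 * Models the ASSUMED failure mode: cache contents are discarded without
 * write-back, so a later load sees only what had reached memory.
 */
static unsigned long toy_lose_cache_then_load(struct toy_line *l)
{
    l->dirty = false;
    l->cached = l->mem;
    return l->mem;
}
```

In this model, clearing a bit with toy_store() and then losing the cache
returns the old value (the bit looks set again), while calling toy_wbinvd()
first makes the cleared bit survive. That matches the symptom described
earlier in the thread, but only under the assumption built into the model.
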