
RE: [Xen-devel] [PATCH] small adjustment to asm constraints for c/s19400



On Monday, March 30, 2009 5:13 PM Jan Beulich wrote:

>>>> "Lu, Guanqun" <guanqun.lu@xxxxxxxxx> 30.03.09 10:56 >>>
>> On Monday, March 30, 2009 4:45 PM Keir Fraser wrote:
>> 
>>> On 30/03/2009 05:45, "Lu, Guanqun" <guanqun.lu@xxxxxxxxx> wrote:
>>> 
>>>> With this adjustment or previous patch 19400, S3 still fails on 64
>>>> xen / 32 dom0. Do you have any idea what will cause this problem?
>>> 
>>> Does it still fail at the LTR? Does the alternative fix of setting
>>> the type to 9 rather than 11 work?
>>> 
>>>  -- Keir
>> 
>> It doesn't fail at the LTR.
>> Xen resumes, but dom0 hangs (technically speaking, it doesn't hang:
>> eip changes, but it never comes back to the shell console). I used
>> the Xen serial line to dump the dom0 registers;
>> the eip it shows from time to time is 'general_protection' in dom0.
> 
> Since under normal circumstances GP faults are rare, why don't you
> just print out the faulting vCPU's state from Xen at the point where
> the fault gets forwarded to the guest? That'll tell you exactly what
> the guest is dying on.
> 
> Jan

Hi Jan,

At the point where S3 resumes, it's running in compat mode.
But the code doesn't handle that case differently, e.g. in
xen/arch/x86/acpi/wakeup_prot.S.
Maybe we lose some context during the switches, which may cause the problem.

After it resumes, pressing '0' on the Xen serial console gives us:

(XEN) '0' pressed -> dumping Dom0's registers
(XEN) *** Dumping Dom0 vcpu#0 state: ***
(XEN) ----[ Xen-3.4-unstable  x86_64  debug=y  Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    0061:[<00000000c02a9208>]
(XEN) RFLAGS: 0000000000200246   EM: 0   CONTEXT: pv guest
(XEN) rax: 00000000000000af   rbx: 0000000000000000   rcx: 00000000bfe7d8e4
(XEN) rdx: 00000000bfe7d864   rsi: 0000000000000008   rdi: 0000000000d3cff4
(XEN) rbp: 00000000bfe7d790   rsp: 00000000c6aa3fe0   r8:  0000000000000000
(XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
(XEN) r12: 0000000000000000   r13: 0000000000000000   r14: 0000000000000000
(XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 00000000000026f0
(XEN) cr3: 00000000bd6e4000   cr2: 00000000b7fbe000
(XEN) ds: 007b   es: 007b   fs: 0000   gs: 0033   ss: 0069   cs: 0061
(XEN) Guest stack trace from esp=c6aa3fe0:
(XEN)   00000000 b7f9d405 00000073 00210246 bfe7d790 0000007b 00000000 00000000
(XEN)   c6aa1080 c02fd6a0 00000000 00000000 00000000 00000000 ffffffff 00000000
(XEN)   c0126a47 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(XEN)   ffffffff ffffffbf ffffffff ffffffff 00000000 00000000 ffffffff ffffffff
(XEN)   ffffffff ffffffff 00000000 00000000 00000000 00000000 00000000 00000000
(XEN)   ffffffff ffffffff ffffffff ffffffff 00000000 00000000 ffffffff ffffffff
(XEN)   ffbfffff ffffffff 00000000 00000000 00000000 00000000 00000000 00000000
(XEN)   fffffeff ffffffff ffffffff ffffffff 00000000 00000000 ffffffff ffffffff
(XEN)   ffffffff ffffffff 00000000 00000000 00000000 00000000 00000000 00000000
(XEN)   ffffffff ffffffff ffffffff ffffffff 00000000 00000000 ffffffff ffffffff
(XEN)   ffffffff ffffffff 00000000 00000000 00000000 00000000 00000000 00000000
(XEN)   ffffffff ffffffff ffffffff ffffffff 00000000 00000000 ffffffff ffffffff
(XEN)   bfffffff ffffffff 00000000 00000000 00000000 00000000 00000000 00000000
(XEN)   fffbffff ffffffff ffefffff ffffffff 00000000 00000000 ffffffff ffffffff
(XEN)   ff7fffff ffffffff 00000000 00000000 00000000 00000000 00000000 00000000
(XEN)   ffffffff ffffffff ffffffff ffffffff 00000000 00000000 ffffffff ffffffff
(XEN)   ffffffff ffffffff 00000000 00000000 00000000 00000000 00000000 00000000
(XEN)   fff7dfff ffffffff ffffffff ffffffff 00000000 00000000 ffffffff ffffffff
(XEN)   ffffffff fffffff7 00000000 00000000 00000000 00000000 00000000 00000000
(XEN)   bfffffff ffffffff ffffffff ffffffff 00000000 00000000 ffffffff ffffffff

Looking up the eip and the stack values in the dom0 symbol table:
0xc02a9208 -> general_protection
0xc02fd6a0 -> default_exec_domain
0xc0126a47 -> do_no_restart_syscall
From the stack, this invocation chain can be observed.
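
As a rough aid for reading the dump above, here is a small standalone C
sketch (not Xen code) that splits the selectors into index / table / RPL
and reads the first six words at esp as an exception frame. The selector
layout is architectural. The frame layout (error code, EIP, CS, EFLAGS,
ESP, SS) is only an assumption about what Xen pushed for the 32-bit
guest, and the names decode_selector, read_fault_frame and
print_fault_frame are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Fields of an x86 segment selector (architectural layout). */
struct selector {
    unsigned index; /* bits 15:3, descriptor table index */
    unsigned ti;    /* bit 2, 0 = GDT, 1 = LDT */
    unsigned rpl;   /* bits 1:0, requested privilege level */
};

struct selector decode_selector(uint16_t sel)
{
    struct selector s;

    s.index = sel >> 3;
    s.ti = (sel >> 2) & 1;
    s.rpl = sel & 3;
    return s;
}

/*
 * ASSUMPTION (not confirmed in this thread): the words at the guest esp
 * form an exception frame for a fault that carries an error code, in
 * this order: error code, EIP, CS, EFLAGS, ESP, SS.
 */
struct fault_frame {
    uint32_t error_code, eip, cs, eflags, esp, ss;
};

struct fault_frame read_fault_frame(const uint32_t *stack, size_t words)
{
    struct fault_frame f;

    assert(stack != NULL && words >= 6);
    f.error_code = stack[0];
    f.eip = stack[1];
    f.cs = stack[2];
    f.eflags = stack[3];
    f.esp = stack[4];
    f.ss = stack[5];
    return f;
}

/* Print the frame together with the decoded saved CS. */
void print_fault_frame(const struct fault_frame *f)
{
    struct selector cs = decode_selector((uint16_t)f->cs);

    printf("error code %#x at eip %#x, cs %#x (index %u, %s, rpl %u)\n",
           (unsigned)f->error_code, (unsigned)f->eip, (unsigned)f->cs,
           cs.index, cs.ti ? "LDT" : "GDT", cs.rpl);
    printf("eflags %#x, esp %#x, ss %#x\n",
           (unsigned)f->eflags, (unsigned)f->esp, (unsigned)f->ss);
}
```

Fed with the first stack line (00000000 b7f9d405 00000073 00210246
bfe7d790 0000007b), the saved CS 0x73 decodes to RPL 3 while the current
CS 0x61 decodes to RPL 1. If the assumed frame layout is right, that
would mean the fault was taken from user space at 0xb7f9d405 with error
code 0; the saved esp also matches the rbp value in the register dump.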

Any idea what's happening there?

I also dumped the MSRs just before and just after S3; there is no difference.
The patch:

diff -r d5ddc782bc49 xen/arch/x86/acpi/power.c
--- a/xen/arch/x86/acpi/power.c Mon Mar 30 16:48:26 2009 +0100
+++ b/xen/arch/x86/acpi/power.c Tue Mar 31 12:26:12 2009 +0800
@@ -42,6 +42,28 @@ struct acpi_sleep_info acpi_sinfo;

 void do_suspend_lowlevel(void);

+static void dump_msr_registers(void)
+{
+    unsigned long msr;
+
+    rdmsrl(MSR_EFER, msr);
+    printk("<0> EFER: %lx", msr);
+    rdmsrl(MSR_STAR, msr);
+    printk("<0> STAR: %lx", msr);
+    rdmsrl(MSR_LSTAR, msr);
+    printk("<0> LSTAR: %lx", msr);
+    rdmsrl(MSR_CSTAR, msr);
+    printk("<0> CSTAR: %lx", msr);
+    rdmsrl(MSR_SYSCALL_MASK, msr);
+    printk("<0> SYSCALL_MASK: %lx", msr);
+    rdmsrl(MSR_FS_BASE, msr);
+    printk("<0> FS_BASE: %lx", msr);
+    rdmsrl(MSR_GS_BASE, msr);
+    printk("<0> GS_BASE: %lx", msr);
+    rdmsrl(MSR_SHADOW_GS_BASE, msr);
+    printk("<0> SH_GS_BASE: %lx", msr);
+}
+
 static int device_power_down(void)
 {
     console_suspend();
@@ -173,6 +195,7 @@ static int enter_state(u32 state)

     console_start_sync();
     printk("Entering ACPI S%d state.\n", state);
+    dump_msr_registers();

     local_irq_save(flags);
     spin_debug_disable();
@@ -208,6 +231,7 @@ static int enter_state(u32 state)
     device_power_up();

     printk(XENLOG_INFO "Finishing wakeup from ACPI S%d state.\n", state);
+    dump_msr_registers();

     if ( (state == ACPI_STATE_S3) && error )
         panic("Memory integrity was lost on resume (%d)\n", error);

The result is:
[before] (XEN)  EFER: d01<0> STAR: e023e00800000000<0> LSTAR: ffff828c8029b000<0> CSTAR: ffff828c8029b080<0> SYSCALL_MASK: 34700<0> FS_BASE: 0<0> GS_BASE: b7f4a6c0<0> SH_GS_BASE: 0
[after] (XEN)  EFER: d01<0> STAR: e023e00800000000<0> LSTAR: ffff828c8029b000<0> CSTAR: ffff828c8029b080<0> SYSCALL_MASK: 34700<0> FS_BASE: 0<0> GS_BASE: b7f4a6c0<0> SH_GS_BASE: 0
 
They're the same.
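
For reference, the EFER and STAR values above can be broken into their
architectural fields, and the two snapshots compared mechanically rather
than by eye. The following is a minimal standalone C sketch, not Xen
code; the helper names decode_star and count_msr_diffs are hypothetical,
and the snapshot arrays are assumed to hold the eight MSRs in the order
they are printed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural EFER bits. */
#define EFER_SCE (1ULL << 0)  /* SYSCALL/SYSRET enable */
#define EFER_LME (1ULL << 8)  /* long mode enable */
#define EFER_LMA (1ULL << 10) /* long mode active */
#define EFER_NXE (1ULL << 11) /* no-execute enable */

/* Fields of the STAR MSR. */
struct star_fields {
    uint16_t sysret_sel;  /* bits 63:48, selector base used by SYSRET */
    uint16_t syscall_sel; /* bits 47:32, selector base used by SYSCALL */
    uint32_t target32;    /* bits 31:0, legacy-mode SYSCALL target (AMD) */
};

struct star_fields decode_star(uint64_t star)
{
    struct star_fields s;

    s.sysret_sel = (uint16_t)(star >> 48);
    s.syscall_sel = (uint16_t)(star >> 32);
    s.target32 = (uint32_t)star;
    return s;
}

/* Count the positions at which two MSR snapshots of length n differ. */
size_t count_msr_diffs(const uint64_t *before, const uint64_t *after,
                       size_t n)
{
    size_t i, diffs = 0;

    assert(before != NULL && after != NULL);
    for (i = 0; i < n; i++)
        if (before[i] != after[i])
            diffs++;
    return diffs;
}
```

With the values printed above, EFER d01 is exactly SCE | LME | LMA | NXE,
and STAR e023e00800000000 splits into 0xe023 (SYSRET), 0xe008 (SYSCALL)
and a zero low half. Identical snapshots give a difference count of 0.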

I also dumped the GDT, IDT, LDT, the TR register and the TSS struct;
no differences are seen there either.


Can you advise which other registers should be dumped?
Or maybe our S3 code never took a 32-bit dom0 into consideration in the
first place; is there any way to fix this?



Thanks!

-- 
Guanqun