
Re: [Xen-devel] Xen master hangs



On 30/07/2015 22:32, Doug Goldstein wrote:
> On Tue, Jul 28, 2015 at 4:22 PM, Konrad Rzeszutek Wilk
> <konrad.wilk@xxxxxxxxxx> wrote:
>> On Tue, Jul 28, 2015 at 11:27:57AM -0500, Doug Goldstein wrote:
>>> On Tue, Jul 28, 2015 at 10:01 AM, Konrad Rzeszutek Wilk
>>> <konrad.wilk@xxxxxxxxxx> wrote:
>>>> On Tue, Jul 28, 2015 at 09:30:59AM -0500, Doug Goldstein wrote:
>>>>> On Mon, Jul 27, 2015 at 4:11 PM, Doug Goldstein <cardoe@xxxxxxxxxx> wrote:
>>>>>> On Mon, Jul 27, 2015 at 4:55 AM, Andrew Cooper
>>>>>> <andrew.cooper3@xxxxxxxxxx> wrote:
>>>>>>> On 24/07/15 17:52, Doug Goldstein wrote:
>>>>>>>
>>>>> <snip>
>>>>>
>>>>> (XEN) ----[ Xen-4.6-unstable  x86_64  debug=y  Not tainted ]----
>>>>> (XEN) CPU:    3
>>>>> (XEN) RIP:    e008:[<00000000cea9727b>] 00000000cea9727b
>>>>> (XEN) RFLAGS: 0000000000010286   CONTEXT: hypervisor (d0v3)
>>>>> (XEN) rax: 00000000cea9727b   rbx: ffff830216c7fe48   rcx: 000000000000001f
>>>>> (XEN) rdx: 00000000d674ded0   rsi: 000000319697ec80   rdi: ffff83021df64080
>>>>> (XEN) rbp: ffff830216c7fda8   rsp: ffff830216c7fd20   r8:  ffff830216c7fe68
>>>>> (XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 00000000db002700
>>>>> (XEN) r12: ffff830216c7fe68   r13: 0000000000000000   r14: ffff83021df64080
>>>>> (XEN) r15: 0000000211c13000   cr0: 0000000080050033   cr4: 00000000001526e0
>>>>> (XEN) cr3: 0000000216cb7000   cr2: 00000000cea9727b
>>>>> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
>>>>> (XEN) Xen stack trace from rsp=ffff830216c7fd20:
>>>>> (XEN)    00000000d674cd77 0000000211c13000 ffffffff81ce0a20 ffff82d0802aabc4
>>>>> (XEN)    ffff83021df64080 8000000000000013 00000000ffffffa1 0000000216cb7000
>>>>> (XEN)    ffff82d08023f5a1 ffff830216c7fe48 ffff830216c7fde8 000000319697ec80
>>>>> (XEN)    ffff82d08023f56c ffff830216cc8340 0000000000000003 ffff830216c7fde8
>>>>> (XEN)    0000000000000206 0000000000000400 0000000000000292 ffff830128e0dd80
>>>>> (XEN)    ffff830216c7ff18 ffffffff81ce0a20 ffff82d0802aabc4 ffff830216c78000
>>>>> (XEN)    ffff82d080320080 ffff830216c7fef8 ffff82d0801673ba ffff830216cc8108
>>>>> (XEN)    0027b02880108237 0000000000000000 ffff880209de7d18 0000000000000002
>>>>> (XEN)    ffff830216c7fe38 ffff82d08012ce6a ffff830216c7fe58 ffff82d08018ad0c
>>>>> (XEN)    0300000100000031 0000000000000008 0000000000000000 0000000000000400
>>>>> (XEN)    ffff8802038c0c00 0000000000000000 0000000000000000 0000000000000000
>>>>> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>>>> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>>>> (XEN)    0000000000000000 ffff830216c7fef8 ffff8300d65fd000 ffffffff81ce0a20
>>>>> (XEN)    0000000000000000 ffffffff8167e290 0000000000000000 ffff880209de7da8
>>>>> (XEN)    ffff82d08023abda ffffffff810010ea 0000000000000007 ffff8802038a85bc
>>>>> (XEN)    ffff8802038a83dc ffff8802038a85be 0000000000000700 ffff880209de7da8
>>>>> (XEN)    ffff8802038c0c00 0000000000000246 ffff88020f0d77c0 ffff880209de7de8
>>>>> (XEN)    00000000000175a0 0000000000000007 ffffffff810010ea 0000000000000000
>>>>> (XEN)    ffff8802038c0c00 ffff880209de7d18 0001010000000000 ffffffff810010ea
>>>>> (XEN) Xen call trace:
>>>>> (XEN)    [<00000000cea9727b>] 00000000cea9727b
>>>>> (XEN)    [<ffff82d08023f5a1>] efi_runtime_call+0x64e/0x80a
>>>>> (XEN)    [<ffff82d08023f56c>] efi_runtime_call+0x619/0x80a
>>>>> (XEN)    [<ffff82d0801673ba>] do_platform_op+0xb76/0x1a14
>>>>> (XEN)    [<ffff82d08012ce6a>] _spin_lock+0x11/0x52
>>>>> (XEN)    [<ffff82d08018ad0c>] stime2tsc+0x78/0x82
>>>>> (XEN)    [<ffff82d08023abda>] lstar_enter+0xda/0x134
>>>>> (XEN)
>>>>> (XEN) Pagetable walk from 00000000cea9727b:
>>>>> (XEN)  L4[0x000] = 0000000216cb6063 ffffffffffffffff
>>>>> (XEN)  L3[0x003] = 00000000cfca4063 ffffffffffffffff
>>>>> (XEN)  L2[0x075] = 00000000ce9ff063 ffffffffffffffff
>>>>> (XEN)  L1[0x097] = 80000000cea97163 00000000000d15e5
>>>>> (XEN)
>>>>> (XEN) ****************************************
>>>>> (XEN) Panic on CPU 3:
>>>>> (XEN) FATAL PAGE FAULT
>>>>> (XEN) [error_code=0011]
>>>>> (XEN) Faulting linear address: 00000000cea9727b
>>>>> (XEN) ****************************************
>>>>> (XEN)
>>>>> (XEN) Reboot in five seconds...
>>>> We added a bunch of overrides that may help.
>>>>
>>>> The address looks to be for:
>>>>  (XEN)  00000cea97000-00000ceaacfff type=4 attr=000000000000000f
>>>>
>>>> Which is of EfiBootServicesData
>>>>
>>>> Try on the Xen.efi command line to use /mapbs
>>>>
>>>> You may have to install first EFI Shell Manager and then from there
>>>> execute the xen.efi /mapbs
>>>>
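
The fault details quoted above can be cross-checked mechanically. The
following is a minimal C sketch, not Xen code: the error-code and NX bit
positions are the architectural x86 ones, while struct efi_range and both
helper names are hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 page-fault error code bits. */
#define PFEC_PRESENT    (1u << 0)  /* fault hit a present page */
#define PFEC_WRITE      (1u << 1)  /* access was a write */
#define PFEC_USER       (1u << 2)  /* access came from user mode */
#define PFEC_RESERVED   (1u << 3)  /* reserved bit set in a paging entry */
#define PFEC_INSN_FETCH (1u << 4)  /* access was an instruction fetch */

/* Bit 63 of a page-table entry is the no-execute (NX) bit. */
#define PTE_NX (UINT64_C(1) << 63)

/* Hypothetical: one EFI memory map range as printed in the log,
 * with inclusive start and end addresses. */
struct efi_range {
    uint64_t start;
    uint64_t end;
    uint32_t type;  /* 4 is EfiBootServicesData in the UEFI memory types */
};

/* Hypothetical helper: true if addr lies inside the range. */
static bool range_contains(const struct efi_range *r, uint64_t addr)
{
    return addr >= r->start && addr <= r->end;
}

/* Hypothetical helper: true if the fault is an instruction fetch from a
 * present page whose L1 entry is marked no-execute. */
static bool is_nx_exec_fault(uint32_t error_code, uint64_t l1e)
{
    return (error_code & PFEC_PRESENT) &&
           (error_code & PFEC_INSN_FETCH) &&
           (l1e & PTE_NX);
}
```

With the values from the dump (error_code 0x0011, L1 entry
0x80000000cea97163, faulting address 0xcea9727b, range
0xcea97000-0xceaacfff), both helpers return true: the CPU tried to fetch
an instruction from a present, no-execute page inside the type=4 range.
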
>>> So I installed ShellBinPkg from TianoCore and got their UEFI Shell
>>> 2.1. I've kicked it off with both "xen-4.6-unstable.efi /mapbs" and
>>> "xen-4.6-unstable.efi -mapbs" but it doesn't make a difference. It
>>> fails the same way as above.
>> OK, there is one more thing which I sadly have to use on my Lenovo
>> T420. See the inline patch and attached patch.
>>
>> I end up doing xen.efi -basevideo -mapbs -noexitboot
>>
>>
>> From 938d98b7a7e4b3b6c7a05b2f20046e457750dec1 Mon Sep 17 00:00:00 2001
>> From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
>> Date: Tue, 27 Jan 2015 14:04:30 -0500
>> Subject: [PATCH 2/2] EFI/early: Implement GetNextVariableName and /query and
>>  /noexitboot parameters
>>
>> In the early EFI boot we will enumerate up to five EFI variables
>> to make sure it works.
>>
>> The /query will just enumerate them and then quit. Helps in
>> troubleshooting and redirecting the output to a file (xen.efi /query > q).
>>
>> The /noexitboot will inhibit Xen from calling ExitBootServices.
>> This allows on Lenovo ThinkPad x230, T420 to use GetNextVariableName
>> in 1-1 mapping mode.
> <snip>
>
> Would it be acceptable to create a quirks table and then we can add
> different systems to the quirk?
>
> e.g.
>
> struct efi_quirks {
>     /* something identifying to the system */
>     uint32_t quirks;
> } efi_quirks_table[] = {
>  { /* Lenovo T420 / T430 v2.64 through v2.68 */, EFI_QUIRK_MAPBS |
> EFI_QUIRK_NOEXITBS },
>  { 0, 0 }
> };
>
> #define EFI_QUIRK_MAPBS 0x1
> #define EFI_QUIRK_NOEXITBS 0x2
>
> I'm not sure what I can use to identify the system early on. I see the
> EFI FW vendor and EFI FW revision. I'm not sure if that would be
> enough. In the case of using that it would likely have to be a range
> of revisions.
>
> Thoughts?

I am very much in favour of this.  Based on XenServer testing, far more
systems are buggy than working, and turning on every workaround by
default is unlikely to end well.

At the end of the day, users of Xen want it to JustWork.
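
To make the proposal concrete, here is a minimal C sketch of what such a
table and its lookup could look like. It rests on several assumptions:
that the firmware vendor string plus a revision range is enough to
identify a system (the open question above), that the vendor is a plain
char string (real EFI vendor strings are CHAR16), and that revisions
compare as simple integers. The struct layout, efi_lookup_quirks() and
the "EXAMPLE-VENDOR" entry are all hypothetical placeholders.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EFI_QUIRK_MAPBS    0x1u  /* behave as if /mapbs had been given */
#define EFI_QUIRK_NOEXITBS 0x2u  /* behave as if /noexitboot had been given */

struct efi_quirk {
    const char *fw_vendor;  /* assumed identifier: firmware vendor string */
    uint32_t fw_rev_min;    /* assumed: inclusive firmware revision range */
    uint32_t fw_rev_max;
    uint32_t quirks;        /* bitmask of EFI_QUIRK_* flags */
};

static const struct efi_quirk efi_quirks_table[] = {
    /* Placeholder entry: the vendor string and revision encoding are
     * made up for illustration, not taken from real firmware. */
    { "EXAMPLE-VENDOR", 264, 268, EFI_QUIRK_MAPBS | EFI_QUIRK_NOEXITBS },
};

/* Return the union of quirk flags of every table entry that matches the
 * given vendor and revision, or 0 if nothing matches. */
static uint32_t efi_lookup_quirks(const char *vendor, uint32_t rev)
{
    uint32_t quirks = 0;
    size_t i;

    for (i = 0; i < sizeof(efi_quirks_table) / sizeof(efi_quirks_table[0]); i++) {
        const struct efi_quirk *q = &efi_quirks_table[i];

        if (strcmp(q->fw_vendor, vendor) == 0 &&
            rev >= q->fw_rev_min && rev <= q->fw_rev_max)
            quirks |= q->quirks;
    }

    return quirks;
}
```

Sizing the loop from the array avoids the { 0, 0 } sentinel, and a
firmware that matches nothing gets no workarounds at all, which keeps
them off by default.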

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel