
Re: [Xen-devel] Resend: Linux 4.11-rc7: kernel BUG at drivers/xen/events/events_base.c:1221



On 25/04/17 13:28, Sander Eikelenboom wrote:
> On 25/04/17 13:00, Juergen Gross wrote:
>> On 25/04/17 12:33, Sander Eikelenboom wrote:
>>> On 25/04/17 09:01, Juergen Gross wrote:
>>>> On 25/04/17 08:57, Sander Eikelenboom wrote:
>>>>> On 25/04/17 08:42, Juergen Gross wrote:
>>>>>> On 25/04/17 08:35, Sander Eikelenboom wrote:
>>>>>>> (XEN) [2017-04-24 21:20:53.203] d0v0 Unhandled invalid opcode fault/trap [#6, ec=ffffffff]
>>>>>>> (XEN) [2017-04-24 21:20:53.203] domain_crash_sync called from entry.S: fault at ffff82d080358f70 entry.o#create_bounce_frame+0x145/0x154
>>>>>>> (XEN) [2017-04-24 21:20:53.203] Domain 0 (vcpu#0) crashed on cpu#0:
>>>>>>> (XEN) [2017-04-24 21:20:53.203] ----[ Xen-4.9-unstable  x86_64  debug=y   Not tainted ]----
>>>>>>> (XEN) [2017-04-24 21:20:53.203] CPU:    0
>>>>>>> (XEN) [2017-04-24 21:20:53.203] RIP:    e033:[<ffffffff8255a485>]
>>>>>>
>>>>>> Can you please tell us symbol+offset for RIP?
>>>>>>
>>>>>> Juergen
>>>>>>
>>>>>
>>>>> Sure:
>>>>> # addr2line -e vmlinux-4.11.0-rc8-20170424-linus-doflr-xennext-boris+ ffffffff8255a485
>>>>> linux-linus/arch/x86/xen/enlighten_pv.c:288
>>>>>
>>>>> Which is:
>>>>> static bool __init xen_check_xsave(void)
>>>>> {
>>>>>         unsigned int err, eax, edx;
>>>>>
>>>>>         /*
>>>>>          * Xen 4.0 and older accidentally leaked the host XSAVE flag into guest
>>>>>          * view, despite not being able to support guests using the
>>>>>          * functionality. Probe for the actual availability of XSAVE by seeing
>>>>>          * whether xgetbv executes successfully or raises #UD.
>>>>>          */
>>>>> HERE -->    asm volatile("1: .byte 0x0f,0x01,0xd0\n\t" /* xgetbv */    
>>>>>                      "xor %[err], %[err]\n"
>>>>>                      "2:\n\t"
>>>>>                      ".pushsection .fixup,\"ax\"\n\t"
>>>>>                      "3: movl $1,%[err]\n\t"
>>>>>                      "jmp 2b\n\t"
>>>>>                      ".popsection\n\t"
>>>>>                      _ASM_EXTABLE(1b, 3b)
>>>>>                      : [err] "=r" (err), "=a" (eax), "=d" (edx)
>>>>>                      : "c" (0));
>>>>>
>>>>>         return err == 0;
>>>>
>>>> I hoped so. :-)
>>>>
>>>> I posted a patch to repair this some minutes ago. Would you mind trying
>>>> it? See:
>>>>
>>>> https://lists.xen.org/archives/html/xen-devel/2017-04/msg02925.html
>>>>
>>>>
>>>> Juergen
>>>
>>> Hmm, next up seems to be a dom0 kernel hang somewhat later during boot,
>>> with not too many clues.
>>> (Is there any output from the Xen debug-keys that could be of interest?)
>>>
>> ...
>>> [    0.000000] ACPI: Early table checksum verification disabled
>>> [    0.000000] ACPI: RSDP 0x00000000000FB100 000014 (v00 ACPIAM)
>>> [    0.000000] ACPI: RSDT 0x00000000AFF90000 000048 (v01 MSI    OEMSLIC  20100913 MSFT 00000097)
>>> [    0.000000] ACPI: FACP 0x00000000AFF90200 000084 (v01 7640MS A7640100 20100913 MSFT 00000097)
>>> [    0.000000] ACPI: DSDT 0x00000000AFF905E0 009427 (v01 A7640  A7640100 00000100 INTL 20051117)
>>> [    0.000000] ACPI: FACS 0x00000000AFF9E000 000040
>>> [    0.000000] ACPI: APIC 0x00000000AFF90390 000088 (v01 7640MS A7640100 20100913 MSFT 00000097)
>>> [    0.000000] ACPI: MCFG 0x00000000AFF90420 00003C (v01 7640MS OEMMCFG  20100913 MSFT 00000097)
>>> [    0.000000] ACPI: SLIC 0x00000000AFF90460 000176 (v01 MSI    OEMSLIC  20100913 MSFT 00000097)
>>> [    0.000000] ACPI: OEMB 0x00000000AFF9E040 000072 (v01 7640MS A7640100 20100913 MSFT 00000097)
>>> [    0.000000] ACPI: SRAT 0x00000000AFF9A5E0 000108 (v03 AMD    FAM_F_10 00000002 AMD  00000001)
>>> [    0.000000] ACPI: HPET 0x00000000AFF9A6F0 000038 (v01 7640MS OEMHPET  20100913 MSFT 00000097)
>>> [    0.000000] ACPI: IVRS 0x00000000AFF9A730 000110 (v01 AMD    RD890S   00202031 AMD  00000000)
>>> [    0.000000] ACPI: SSDT 0x00000000AFF9A840 000DA4 (v01 A M I  POWERNOW 00000001 AMD  00000001)
>>> [    0.000000] ACPI: Local APIC address 0xfee00000
>>> [    0.000000] Setting APIC ro
>>
>> Hmm, this seems to be only the first part of a message.
>>
>> Could you try debug-key "0" (probably multiple times) and have a look
>> where dom0 vcpu 0 is spending its time?
>>
>>
>> Juergen
>>
> 
> Here you are:
> 
> [    0.000000] ACPI: Early table checksum verification disabled
> [    0.000000] ACPI: RSDP 0x00000000000FB100 000014 (v00 ACPIAM)
> [    0.000000] ACPI: RSDT 0x00000000AFF90000 000048 (v01 MSI    OEMSLIC  20100913 MSFT 00000097)
> [    0.000000] ACPI: FACP 0x00000000AFF90200 000084 (v01 7640MS A7640100 20100913 MSFT 00000097)
> [    0.000000] ACPI: DSDT 0x00000000AFF905E0 009427 (v01 A7640  A7640100 00000100 INTL 20051117)
> [    0.000000] ACPI: FACS 0x00000000AFF9E000 000040
> [    0.000000] ACPI: APIC 0x00000000AFF90390 000088 (v01 7640MS A7640100 20100913 MSFT 00000097)
> [    0.000000] ACPI: MCFG 0x00000000AFF90420 00003C (v01 7640MS OEMMCFG  20100913 MSFT 00000097)
> [    0.000000] ACPI: SLIC 0x00000000AFF90460 000176 (v01 MSI    OEMSLIC  20100913 MSFT 00000097)
> [    0.000000] ACPI: OEMB 0x00000000AFF9E040 000072 (v01 7640MS A7640100 20100913 MSFT 00000097)
> [    0.000000] ACPI: SRAT 0x00000000AFF9A5E0 000108 (v03 AMD    FAM_F_10 00000002 AMD  00000001)
> [    0.000000] ACPI: HPET 0x00000000AFF9A6F0 000038 (v01 7640MS OEMHPET  20100913 MSFT 00000097)
> [    0.000000] ACPI: IVRS 0x00000000AFF9A730 000110 (v01 AMD    RD890S   00202031 AMD  00000000)
> [    0.000000] ACPI: SSDT 0x00000000AFF9A840 000DA4 (v01 A M I  POWERNOW 00000001 AMD  00000001)
> [    0.000000] ACPI: Local APIC address 0xfee00000
> [    0.000000] Setting AP(XEN) [2017-04-25 11:11:35.568] *** Serial input -> Xen (type 'CTRL-a' three times to switch input to DOM0)
> (XEN) [2017-04-25 11:11:37.000] '0' pressed -> dumping Dom0's registers
> (XEN) [2017-04-25 11:11:37.001] *** Dumping Dom0 vcpu#0 state: ***
> (XEN) [2017-04-25 11:11:37.001] RIP:    e033:[<ffffffff81cccde9>]
> (XEN) [2017-04-25 11:11:45.422] RIP:    e033:[<ffffffff81cccde9>]
> (XEN) [2017-04-25 11:11:56.132] RIP:    e033:[<ffffffff81ccc740>]
> (XEN) [2017-04-25 11:12:02.474] RIP:    e033:[<ffffffff81cccde9>]
> (XEN) [2017-04-25 11:12:06.224] RIP:    e033:[<ffffffff81ccc740>]

Thanks. Could you translate those addresses to symbol+offset again, please?
And the suspicious addresses on the stack as well:

ffffffff81023bfd
ffffffff81cc4620
ffffffff81ccb060
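
In case it helps, here is a minimal sketch of how such a symbol+offset lookup works, assuming a System.map-style symbol table ("<hex address> <type> <name>" per line) that matches the running kernel. The helper names (load_symbols, symbolize) and every symbol and address in SAMPLE_MAP are hypothetical placeholders for illustration only; they are not taken from this kernel build and say nothing about where the addresses above actually resolve.

```python
import bisect


def load_symbols(system_map_text):
    """Parse System.map-style lines into parallel, address-sorted lists."""
    entries = []
    for line in system_map_text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        entries.append((int(parts[0], 16), parts[2]))
    entries.sort()
    addrs = [addr for addr, _ in entries]
    names = [name for _, name in entries]
    return addrs, names


def symbolize(addr, addrs, names):
    """Return 'symbol+0xoffset' for the nearest symbol at or below addr."""
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None  # address lies below the first known symbol
    return f"{names[i]}+0x{addr - addrs[i]:x}"


# Hypothetical sample table, NOT from the kernel discussed in this thread.
SAMPLE_MAP = """\
ffffffff81000000 T example_start
ffffffff81000100 T example_func_a
ffffffff81000400 t example_func_b
"""

addrs, names = load_symbols(SAMPLE_MAP)
print(symbolize(0xFFFFFFFF81000145, addrs, names))
```

With the matching vmlinux at hand, "addr2line -f -e vmlinux <address>" or a look at the sorted output of "nm -n vmlinux" should give the same kind of answer without any scripting.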


Juergen
