
Re: [Xen-devel] [xen-unstable test] 104131: regressions - FAIL



On January 16, 2017 1:26 PM, Tian, Kevin wrote:
>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>> Sent: Thursday, January 12, 2017 8:26 PM
>>
>> >>> On 12.01.17 at 13:15, <andrew.cooper3@xxxxxxxxxx> wrote:
>> > On 12/01/17 12:07, Xuquan (Quan Xu) wrote:
>> >> On January 12, 2017 5:14 PM, Andrew Cooper wrote:
>> >>> On 12/01/2017 06:46, osstest service owner wrote:
>> >>>> flight 104131 xen-unstable real [real]
>> >>>> http://logs.test-lab.xenproject.org/osstest/logs/104131/
>> >>>>
>> >>>> Regressions :-(
>> >>>>
>> >>>> Tests which did not succeed and are blocking, including tests
>> >>>> which could not be run:
>> >>>>  test-amd64-i386-xl-qemuu-debianhvm-amd64 16 guest-stop   fail REGR. vs. 104119
>> >>>
>> >>> Jan 12 01:25:17.397607 (XEN) Assertion 'intack.vector >= pt_vector' failed at intr.c:321
>> >>> Jan 12 01:25:37.133596 (XEN) ----[ Xen-4.9-unstable  x86_64 debug=y Not tainted ]----
>> >>> Jan 12 01:25:37.141577 (XEN) CPU:    14
>> >>> Jan 12 01:25:37.141607 (XEN) RIP:    e008:[<ffff82d0801ef7fc>] vmx_intr_assist+0x35e/0x51d
>> >>> Jan 12 01:25:37.149617 (XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor (d15v0)
>> >>> Jan 12 01:25:37.149655 (XEN) rax: 0000000000000038   rbx: ffff830079e1e000   rcx: 0000000000000030
>> >>> Jan 12 01:25:37.157582 (XEN) rdx: 0000000000000000   rsi: 0000000000000030   rdi: ffff830079e1e000
>> >>> Jan 12 01:25:37.165584 (XEN) rbp: ffff83047de2ff08   rsp: ffff83047de2fea8   r8:  ffff82c00022f000
>> >>> Jan 12 01:25:37.173579 (XEN) r9:  ffff8301b63ede80   r10: ffff830176386560   r11: 000001955ee79bd0
>> >>> Jan 12 01:25:37.181582 (XEN) r12: 0000000000003002   r13: 0000000000003002   r14: 0000000000000030
>> >>> Jan 12 01:25:37.189584 (XEN) r15: ffff83023fec2000   cr0: 0000000080050033   cr4: 00000000003526e0
>> >>> Jan 12 01:25:37.197572 (XEN) cr3: 0000000232edb000   cr2: 0000000002487034
>> >>> Jan 12 01:25:37.205569 (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
>> >>> Jan 12 01:25:37.205606 (XEN) Xen code around <ffff82d0801ef7fc> (vmx_intr_assist+0x35e/0x51d):
>> >>> Jan 12 01:25:37.213575 (XEN)  41 0f b6 f6 39 f0 7e 02 <0f> 0b 48 89 df e8 51 20 00 00 b8 10 08 00 00 0f
>> >>> Jan 12 01:25:37.221561 (XEN) Xen stack trace from rsp=ffff83047de2fea8:
>> >>> Jan 12 01:25:37.229600 (XEN)    ffff82d08031aa80 00000038ffffffff ffff83047de2ffff ffff83023fec2000
>> >>> Jan 12 01:25:37.237594 (XEN)    ffff83047de2fef8 ffff82d080130cb6 ffff830079e1e000 ffff830079e1e000
>> >>> Jan 12 01:25:37.245588 (XEN)    ffff83007bae2000 000000000000000e ffff830233117000 ffff83023fec2000
>> >>> Jan 12 01:25:37.253594 (XEN)    ffff83047de2fdc0 ffff82d0801fdeb1 0000000000000004 00000000000000c2
>> >>> Jan 12 01:25:37.261584 (XEN)    0000000000000020 0000000000000007 ffff8800e8d28000 ffffffff81add0a0
>> >>> Jan 12 01:25:37.269607 (XEN)    0000000000000246 0000000000000000 ffff880142400008 0000000000000004
>> >>> Jan 12 01:25:37.277580 (XEN)    0000000000000036 0000000000000000 00000000000003f8 00000000000003f8
>> >>> Jan 12 01:25:37.285584 (XEN)    ffffffff81add0a0 0000beef0000beef ffffffff813899a4 000000bf0000beef
>> >>> Jan 12 01:25:37.293567 (XEN)    0000000000000002 ffff880147c03e08 000000000000beef 1cec835356e5beef
>> >>> Jan 12 01:25:37.293606 (XEN)    085d8b002674beef 01dcb38b000cbeef 8914458d3174beef 2444c7100000000e
>> >>> Jan 12 01:25:37.301586 (XEN)    ffff830079e1e000 00000031bfc37600 00000000003526e0
>> >>> Jan 12 01:25:37.309607 (XEN) Xen call trace:
>> >>> Jan 12 01:25:37.309639 (XEN)    [<ffff82d0801ef7fc>] vmx_intr_assist+0x35e/0x51d
>> >>> Jan 12 01:25:37.317591 (XEN)    [<ffff82d0801fdeb1>] vmx_asm_vmexit_handler+0x41/0x120
>> >>> Jan 12 01:25:37.325598 (XEN)
>> >>> Jan 12 01:25:37.325624 (XEN)
>> >>> Jan 12 01:25:37.325647 (XEN) ****************************************
>> >>> Jan 12 01:25:37.333653 (XEN) Panic on CPU 14:
>> >>> Jan 12 01:25:37.333684 (XEN) Assertion 'intack.vector >= pt_vector' failed at intr.c:321
>> >>> Jan 12 01:25:37.341571 (XEN) ****************************************
>> >>> Jan 12 01:25:37.341603 (XEN)
>> >>> Jan 12 01:25:37.341626 (XEN) Reboot in five seconds...
>> >>> Jan 12 01:25:37.349566 (XEN) Resetting with ACPI MEMORY or I/O RESET_REG.
>> >>>
>> >>> This is caused by "x86/apicv: fix RTC periodic timer and apicv
>> >>> issue".  It is not a deterministic issue, as it appears to have
>> >>> survived a week of testing already, but there is clearly something
>> >>> still problematic with the code.
>> >>>
>> >>
>> >> Andrew,
>> >> If you have it, could you share more information?
>> >
>> > No further information, sorry.  This was found by the automated test
>> > system.
>>
>> But some can be gathered:
>>
>> > Full logs are available from
>> > http://logs.test-lab.xenproject.org/osstest/logs/104131/test-amd64-i386-xl-qemuu-debianhvm-amd64/
>> > but I doubt any of them will help in diagnosing the issue any further.
>> >
>> >> Such as the value of intack.vector / pt_vector..
>>
>> At least one of the two values is likely to live in a register, and
>> hence its value would be available in the dump. Just takes looking at
>> the disassembly.
>>
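
As a rough illustration of reading the two values out of the dump: my hand decode of the code bytes before the faulting instruction (41 0f b6 f6 / 39 f0 / 7e 02 / <0f> 0b) is movzbl %r14b,%esi; cmp %esi,%eax; jle; ud2. That decode is an assumption on my part, not compiler output. If it is right, eax holds pt_vector and r14b holds intack.vector. The sketch below only parses register lines copied from the log; parse_registers is a hypothetical helper, not an existing tool.

```python
import re

# Register lines copied from the crash dump quoted in this thread.
DUMP = """\
(XEN) rax: 0000000000000038   rbx: ffff830079e1e000   rcx: 0000000000000030
(XEN) rdx: 0000000000000000   rsi: 0000000000000030   rdi: ffff830079e1e000
(XEN) r12: 0000000000003002   r13: 0000000000003002   r14: 0000000000000030
"""


def parse_registers(text):
    """Return {register_name: value} for every 'name: 16-hex-digits' pair."""
    pairs = re.findall(r"\b([a-z][a-z0-9]{1,2}):\s+([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


regs = parse_registers(DUMP)

# Assumed register mapping, taken from the hand decode described above.
pt_vector = regs["rax"] & 0xFFFFFFFF   # eax, compared as a 32-bit int
intack_vector = regs["r14"] & 0xFF     # r14b, the uint8_t vector

print(f"pt_vector={pt_vector:#x} intack.vector={intack_vector:#x}")
print("assertion holds:", intack_vector >= pt_vector)
```

Under that assumed mapping the dump gives pt_vector 0x38 and intack.vector 0x30, so the asserted condition is false.
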
>> >> I guess the reason may be that intack.vector is 'uint8_t' while
>> >> pt_vector is 'int'.
>>
>> That would be odd.
>>
>> >> Or there is a corner case where intack.vector is __not__ the highest
>> >> priority vector.
>>
>> That's what I'm afraid of, and why I had asked to add the ASSERT().
>>
>
>I cannot come up with a valid reason for such a situation (intack.vector is
>0x30 while pt_vector is 0x38, from Chao's data). pt_update_irq is invoked
>before checking the highest pending IRRs, so pt_vector should be honored
>anyway. One possible reason is that, for some reason, pt_vector is not in
>vIRR at that point (due to some bug in the path from PIR to vIRR). However,
>I didn't catch such a bug simply by looking at the code. We need to
>reproduce this problem on the developer side to find the actual reason.
>Andrew, it'd be helpful if you could help Quan/Chao find out more about the
>test environment.
>
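
To make the reasoning above concrete, here is a toy model of a 256-bit IRR held in a Python integer. It is not Xen code: highest_pending and set_vector are hypothetical names, and the model ignores ISR, TPR/PPR masking, and the PIR-to-vIRR sync path. It only shows that if 0x38 had been pending, the highest pending vector could not have been 0x30.

```python
def set_vector(irr, vector):
    """Return a copy of the IRR bitmap with the given vector marked pending."""
    return irr | (1 << vector)


def highest_pending(irr):
    """Return the highest pending vector in the bitmap, or -1 if none is set."""
    return irr.bit_length() - 1


PT_VECTOR = 0x38      # periodic timer vector reported in this thread
OTHER_VECTOR = 0x30   # value seen as intack.vector in the failing run

# Case 1: pt_vector reached the IRR, so the highest pending vector is >= 0x38.
irr_with_pt = set_vector(set_vector(0, OTHER_VECTOR), PT_VECTOR)
print(hex(highest_pending(irr_with_pt)))

# Case 2 (hypothesis): pt_vector never reached the IRR, so 0x30 wins.
irr_without_pt = set_vector(0, OTHER_VECTOR)
print(hex(highest_pending(irr_without_pt)))
```

Only the second case reproduces the observed pair of values (0x30 against 0x38), which is consistent with the suspicion that pt_vector was missing from vIRR when the highest pending vector was read. It does not prove it.
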

I'll continue to follow up on this issue.
However, I don't have enough CPU-v3 machines to test it (they are occupied by
another project). I hope Chao can set up a test environment.


Quan 




>One thing to note, though: the original patch from Quan is actually
>orthogonal to this ASSERT. Regardless of whether intack.vector is larger or
>smaller than pt_vector, we always require the trick as long as pt_vector is
>not the one currently being programmed to RVI. So do we want to revert
>the whole commit until the problem is finally fixed, or is it OK to just
>remove the ASSERT (or replace it with a WARN_ON with more debug info) to
>unblock the test system before the fix is ready?
>
>Thanks
>Kevin
