[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [xen-unstable test] 106504: regressions - FAIL
On Tue, Mar 07, 2017 at 07:11:22AM -0700, Jan Beulich wrote: >>>> On 07.03.17 at 05:24, <chao.gao@xxxxxxxxx> wrote: >> On Tue, Mar 07, 2017 at 02:16:50AM -0700, Jan Beulich wrote: >>>>>> On 07.03.17 at 06:52, <osstest-admin@xxxxxxxxxxxxxx> wrote: >>>> flight 106504 xen-unstable real [real] >>>> http://logs.test-lab.xenproject.org/osstest/logs/106504/ >>>> >>>> Regressions :-( >>>> >>>> Tests which did not succeed and are blocking, >>>> including tests which could not be run: >>>> [...] >>>> test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 16 guest-stop fail REGR. >>>> vs. >>>> 106482 >>> >>>Here we go: >>> >>>(XEN) d15v0: intack: 02:48 pt: 38 >>>(XEN) vIRR: 00000000 00000000 00000000 00000000 00000000 00000000 00010000 >> 00000000 >>>(XEN) PIR: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 >> 00000000 >>>(XEN) Assertion 'intack.vector >= pt_vector' failed at intr.c:360 >>>(XEN) ----[ Xen-4.9-unstable x86_64 debug=y Not tainted ]---- >>>(XEN) CPU: 0 >>>(XEN) RIP: e008:[<ffff82d0802039e8>] vmx_intr_assist+0x5fa/0x61a >>>(XEN) RFLAGS: 0000000000010292 CONTEXT: hypervisor (d15v0) >>>(XEN) rax: ffff82d0804754a8 rbx: ffff83007f375680 rcx: 0000000000000000 >>>(XEN) rdx: ffff83007cd3ffff rsi: 000000000000000a rdi: ffff82d0803316d8 >>>(XEN) rbp: ffff83007cd3ff08 rsp: ffff83007cd3fea8 r8: ffff830277db8000 >>>(XEN) r9: 0000000000000001 r10: 0000000000000000 r11: 0000000000000001 >>>(XEN) r12: 00000000ffffffff r13: ffff82d0802b5b02 r14: ffff82d0802b5b02 >>>(XEN) r15: ffff83027d82e000 cr0: 0000000080050033 cr4: 00000000001526e0 >>>(XEN) cr3: 0000000259135000 cr2: 000000000164f034 >>>(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008 >>>(XEN) Xen code around <ffff82d0802039e8> (vmx_intr_assist+0x5fa/0x61a): >>>(XEN) fb ff ff e9 49 fc ff ff <0f> 0b 89 ce 48 89 df e8 2a 21 00 00 e9 49 >>>fe >> ff >>>(XEN) Xen stack trace from rsp=ffff83007cd3fea8: >>>(XEN) ffff82d08044ab00 00000038ffffffff ffff83007cd3ffff ffff83027d82e000 >>>(XEN) ffff83007cd3fef8 ffff82d080133a3d ffff83007f375000 ffff83007f375000 >>>(XEN) ffff83007f7fc000 ffff83026df78000 0000000000000000 ffff83027d82e000 >>>(XEN) ffff83007cd3fdb0 ffff82d080213191 0000000000000004 00000000000000c2 >>>(XEN) 0000000000000020 0000000000000002 ffff880029994000 ffffffff81ade0a0 >>>(XEN) 0000000000000246 0000000000000000 ffff88002d000008 0000000000000004 >>>(XEN) 000000000000006c 0000000000000000 00000000000003f8 00000000000003f8 >>>(XEN) ffffffff81ade0a0 0000beef0000beef ffffffff81389ac4 000000bf0000beef >>>(XEN) 0000000000000002 ffff88002f403e08 000000000000beef 000000000000beef >>>(XEN) 000000000000beef 000000000000beef 000000000000beef 0000000000000000 >>>(XEN) ffff83007f375000 0000000000000000 00000000001526e0 >>>(XEN) Xen call trace: >>>(XEN) [<ffff82d0802039e8>] vmx_intr_assist+0x5fa/0x61a >>>(XEN) [<ffff82d080213191>] vmx_asm_vmexit_handler+0x41/0x120 >>>(XEN) >>>(XEN) >>>(XEN) **************************************** >>>(XEN) Panic on CPU 0: >>>(XEN) Assertion 'intack.vector >= pt_vector' failed at intr.c:360 >>>(XEN) **************************************** >>> >>>I didn't make an attempt at interpreting this yet, but I wonder if it >>>is more than coincidence that - just like the first time the ASSERT() >>>triggered - this is again a guest-stop of a qemuu-debianhvm. >>> >> >> Cc: xuquan. >> >> Exciting! I have been monitoring osstest for about one months through >> a python script. But I always crawl the flights one time a day. >> >> From the output, the pt_vector is 0x38 and the intack.vector is >> 0x30. these two values are same with they were in the first time. >> And only one bit 0x30 is set in vIRR. PIR is NULL. So maybe >> our suspicion that PIR is not synced to vIRR is wrong. The 0x38 bit >> is not present in vIRR is strange. Is it possible that we clear the 0x38 bit >> just after we return from pt_update_irq()? > >That would be done how? > >> Or, just like I suspected that >> it is caused by pt_update_irq() sets 0x30 but wrongly returns 0x38. > >Same here, and as expressed earlier: I'm lacking a plausible theory >on how this could be happening. In particular ... > >> Do you think it worths a try to disable guest's LAPIC timer and >> force it use IRQ0 along with changing RTE very frequently? > >... if this is the LAPIC timer, then the RTE isn't being read afaics >(pt_irq_vector() should be taking its very first return path in that >case). Nor am I aware that any Linux version would move around >one of its timer interrupts very frequently. But then again 0x30 >or 0x38 wouldn't be use for the LAPIC timer anyway, but rather >a vector in the fixed range (0xEF on 4.10). So I think part of the >problem is to understand which timer's vector we're dealing with >here. > I have written a xtf test case (many codes are from hvmloader) to trigger this assertion. The test case is in attachments. Bottom is the output of this test. This test initializes PIT channel0 to generate periodic timer interrupt at 1000hz per second. The timer interrupt is delivered to vCPU0. And vCPU1 is used to change IOAPIC RTE 2 frequently. The assertion can be triggered by guest. To fix assertion failure, I propose to remove this assertion for the reason below: 1. Operations in this test case are very intrusive and abnormal. It updates RTE frequently without disabling interrupt source. In this case, I think software can't assume hardware works correctly. 2. If we remove this assertion(means we admit pt_vector may be different from (or bigger than) the vector we set in vIRR in a rare case), the side effect is that we won't decrease the counter pt->ending_intr_nr in pt_intr_post() and one more timer interrupt in number is injected to guest. 3. We read RTE 3 times. 1st happens when we set vIRR. 2nd happens when pt_update_irq() returns. 3rd happens in pt_intr_post(). If guest changes the vector in RTE during the window, it will also incur losing or getting more periodic timer interrupt. (d1) [ 1409.741660] --- Xen Test Framework --- (d1) [ 1409.741869] Environment: HVM 32bit (No paging) (d1) [ 1409.741964] Test periodic-timer (d1) [ 1409.742077] activate cpu1 (XEN) [ 1423.581228] d1v0: intack: 02:48 pt: 38 (XEN) [ 1423.581234] vIRR: 00000000 00000000 00000000 00000000 00000000 00000000 00010000 00000000 (XEN) [ 1423.581246] PIR: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 (XEN) [ 1423.581286] Assertion 'intack.vector >= pt_vector' failed at intr.c:360 (XEN) [ 1423.581294] ----[ Xen-4.9-unstable x86_64 debug=y Not tainted ]---- (XEN) [ 1423.581370] CPU: 58 (XEN) [ 1423.581375] RIP: e008:[<ffff82d0801fe405>] vmx_intr_assist+0x605/0x625 (XEN) [ 1423.581389] RFLAGS: 0000000000010296 CONTEXT: hypervisor (d1v0) (XEN) [ 1423.581398] rax: ffff830837e0402c rbx: ffff83006a093680 rcx: 0000000000000000 (XEN) [ 1423.581404] rdx: ffff831075e17fff rsi: 000000000000000a rdi: ffff82d08032f6b8 (XEN) [ 1423.581410] rbp: ffff831075e17f08 rsp: ffff831075e17e98 r8: ffff83083e000000 (XEN) [ 1423.581416] r9: 0000000000000001 r10: 0000000000000000 r11: 0000000000000001 (XEN) [ 1423.581422] r12: 00000000ffffffff r13: ffff82d0802a4c31 r14: ffff82c000408200 (XEN) [ 1423.581427] r15: 0000000000004016 cr0: 000000008005003b cr4: 00000000003526e0 (XEN) [ 1423.581432] cr3: 000000081e2bf000 cr2: 0000000000000000 (XEN) [ 1423.581437] ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008 (XEN) [ 1423.581446] Xen code around <ffff82d0801fe405> (vmx_intr_assist+0x605/0x625): (XEN) [ 1423.581450] fb ff ff e9 5e fc ff ff <0f> 0b 89 ce 48 89 df e8 03 21 00 00 e9 62 fe ff (XEN) [ 1423.581470] Xen stack trace from rsp=ffff831075e17e98: (XEN) [ 1423.581473] ffff831075e17f08 ffff82d08034c700 ffff82d000000030 ffffffffffffffff (XEN) [ 1423.581483] ffff831075e17fff 0000000000000000 ffff831075e17ef8 ffff82d0801340ff (XEN) [ 1423.581491] ffff83006a093000 ffff83006a093000 ffff83006a093000 ffff830837e04148 (XEN) [ 1423.581500] 0000014b740caab6 0000000001c9c380 ffff831075e17e28 ffff82d08020da51 (XEN) [ 1423.581509] 0000000000000000 0000000000000000 0000000000000000 0000000000000000 (XEN) [ 1423.581515] 0000000000000000 00000000fee00000 0000000000000000 0000000000000000 (XEN) [ 1423.581522] 0000000000000000 0000000000000000 0000000000000004 000000000010260d (XEN) [ 1423.581529] 0000000000000001 0000000000000000 0000000000000000 0000beef0000beef (XEN) [ 1423.581536] 0000000000102928 000000bf0000beef 0000000000000206 0000000000115fa0 (XEN) [ 1423.581544] 000000000000beef 000000000000beef 000000000000beef 000000000000beef (XEN) [ 1423.581551] 000000000000beef 000000000000003a ffff83006a093000 00000037b7a91900 (XEN) [ 1423.581559] 00000000003526e0 (XEN) [ 1423.581564] Xen call trace: (XEN) [ 1423.581570] [<ffff82d0801fe405>] vmx_intr_assist+0x605/0x625 (XEN) [ 1423.581580] [<ffff82d08020da51>] vmx_asm_vmexit_handler+0x41/0x120 (XEN) [ 1423.581584] (XEN) [ 1423.827761] (XEN) [ 1423.829753] **************************************** (XEN) [ 1423.835210] Panic on CPU 58: (XEN) [ 1423.838591] Assertion 'intack.vector >= pt_vector' failed at intr.c:360 (XEN) [ 1423.845698] **************************************** Attachment:
periodic-timer-test _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |