Re: [Xen-devel] [xen-unstable test] 106504: regressions - FAIL
On Wed, Mar 22, 2017 at 06:47:33AM -0600, Jan Beulich wrote:
>>>> On 22.03.17 at 05:53, <chao.gao@xxxxxxxxx> wrote:
>> I have written an XTF test case (much of the code is from hvmloader) to
>> trigger this assertion. The test case is in the attachments.
>
>Thanks for doing this.
>
>> The output of this test is at the bottom. The test initializes PIT
>> channel 0 to generate a periodic timer interrupt at 1000 Hz. The timer
>> interrupt is delivered to vCPU0, and vCPU1 is used to change IOAPIC
>> RTE 2 frequently.
>
>Well, this is certainly helpful (due to some of the conclusions you
>draw below), but it is very likely not what has caused the assertion
>to trigger in osstest. So by removing the assertion (as you suggest
>below) we then will have a silent, non-understood misbehavior.

Agree.

>
>> The assertion can be triggered by the guest. To fix the assertion
>> failure, I propose to remove this assertion for the reasons below:
>
>Of course I agree that a guest triggerable assertion is bad, and
>hence needs a correction somewhere.
>
>> 1. The operations in this test case are very intrusive and abnormal.
>> It updates the RTE frequently without disabling the interrupt source.
>> In this case, I think software can't assume hardware works correctly.
>
>I guess hardware behavior simply is unspecified in such a case, so
>it's hard to judge whether it works "correctly".

Agree.

>
>> 2. If we remove this assertion (which means we admit pt_vector may be
>> different from (or bigger than) the vector we set in vIRR in a rare
>> case), the side effect is that we won't decrease the counter
>> pt->pending_intr_nr in pt_intr_post(), and one extra timer interrupt
>> is injected into the guest.
>
>Which is clearly wrong, afaict, as that may drive the guest clock
>off (depending on how the guest OS does its accounting).

Yes.

>
>> 3. We read the RTE three times. The 1st read happens when we set vIRR.
>> The 2nd happens when pt_update_irq() returns. The 3rd happens in
>> pt_intr_post(). If the guest changes the vector in the RTE during this
>> window, periodic timer interrupts will also be lost or extra ones
>> injected.
>
>Which raises the question whether latching the value read the first
>time would address the issue you demonstrate with the test case.
>Or alternatively deferring writes to take effect only once readers
>are done with their perhaps multiple accesses?

I think your solution is better.

>
>Can you get in touch with your chipset folks to find out whether
>hardware has cases where multiple reads occur during the
>processing of a single event?

Yes, I will come back once I find out how they handle similar cases.

>
>> (d1) [ 1409.741660] --- Xen Test Framework ---
>> (d1) [ 1409.741869] Environment: HVM 32bit (No paging)
>> (d1) [ 1409.741964] Test periodic-timer
>> (d1) [ 1409.742077] activate cpu1
>> (XEN) [ 1423.581228] d1v0: intack: 02:48 pt: 38
>
>I keep getting confused by my own mistake of getting the format
>string wrong here (the above should be intack: 2:30 pt: 38). I.e.
>I was about to complain that there's no use of vector 48 in your
>test code, when I remembered that it's being wrongly printed in
>decimal.

Sorry, that was my fault.

>
>Jan
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
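
To make the "latch the value read the first time" idea from the thread concrete, here is a minimal sketch in C. It is not Xen code: struct rte_model, struct timer_model and all three functions are hypothetical stand-ins, and the only things taken from the thread are the two vectors seen in the log (0x30 at intack, 0x38 in the RTE afterwards) and the idea of a pending-interrupt counter. It contrasts re-reading the RTE when the interrupt is acknowledged with comparing against a copy captured at injection time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one IOAPIC redirection table entry (RTE). */
struct rte_model {
    volatile uint8_t vector;      /* the guest may rewrite this at any time */
};

/* Hypothetical model of a periodic timer's bookkeeping. */
struct timer_model {
    unsigned int pending;         /* ticks injected but not yet accounted for */
    uint8_t latched_vector;       /* vector captured when the tick was injected */
};

/* Inject one tick: read the RTE exactly once and remember what was read. */
static uint8_t inject(struct timer_model *t, const struct rte_model *rte)
{
    t->latched_vector = rte->vector;
    t->pending++;
    return t->latched_vector;
}

/*
 * Post-processing that re-reads the RTE. If the guest changed the vector
 * after injection, the comparison fails and the tick stays pending.
 */
static int post_rereading(struct timer_model *t, const struct rte_model *rte,
                          uint8_t acked_vector)
{
    if (rte->vector != acked_vector)
        return 0;
    assert(t->pending > 0);
    t->pending--;
    return 1;
}

/* Post-processing that only consults the value latched at injection time. */
static int post_latched(struct timer_model *t, uint8_t acked_vector)
{
    if (t->latched_vector != acked_vector)
        return 0;
    assert(t->pending > 0);
    t->pending--;
    return 1;
}
```

In this model, injecting with the RTE vector at 0x30 and then rewriting it to 0x38 before the acknowledgement makes post_rereading() leave the tick pending, while post_latched() still accounts for it. Whether latching is the right fix for the real code depends on the answer to the hardware question raised above.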