Re: [Xen-devel] [osstest test] 110909: tolerable FAIL - PUSHED
On 21/06/2017 23:59, Ian Jackson wrote:
> osstest service owner writes ("[osstest test] 110909: tolerable FAIL - PUSHED"):
>> flight 110909 osstest real [real]
>> http://logs.test-lab.xenproject.org/osstest/logs/110909/
>>
>> Failures :-/ but no regressions.
> ...
>> Tests which did not succeed, but are not blocking:
> ...
>> test-amd64-i386-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail like 110373
> This guest had ~31G of disk and 1.5G of RAM.
>
> The logfile
>
> http://logs.test-lab.xenproject.org/osstest/logs/110909/test-amd64-i386-xl-qemuu-win7-amd64/15.ts-guest-localmigrate.log
>
> seems to show that the guest is paused (state "p") following the 9th
> migration. This is weird, given that xl seems to say earlier
> "migration target: Domain started successsfully", which message
> follows the call to libxl_domain_unpause.
>
> I wonder if it is possible that the domain still appears paused
> briefly after xl/libxlq tries to unpause it. That is, that
> XEN_DOMINF_paused might be set in the return from
> xc_domain_getinfolist even after the unpause domctl returns.
>
> By the time log collection runs, the domain seems unpaused.

XEN_DOMINF_paused is a straight reflection of d->controller_pause_count.

A domain is created with 1 reference count, requiring the toolstack to call DOMCTL_unpause_domain once to cause it to start executing. Other than that, it is strictly reference counted based on pause and unpause hypercalls from toolstack components (in this case, all in dom0).

One issue which XenServer has found in combination with Introspection is that any toolstack entity which can call pause/unpause (even for a short period of time) can result in XEN_DOMINF_paused being sampled as being set.

The fix ^W utterly gross hack for XenServer's purposes is https://github.com/xenserver/xen-4.7.pg/blob/master/master/xen-introspection-pause.patch but I don't yet have a sensible plan for how to fix this in general.

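To make the reference counting concrete, here is a toy model in Python. It is only a sketch of the behaviour described above, not Xen code: the Domain class and the dominf_paused property are made-up names for illustration, and the error raised on an unbalanced unpause is an assumption of the model.

```python
class Domain:
    """Toy model of the toolstack-controlled pause count (not Xen code)."""

    def __init__(self):
        # A new domain starts with one pause reference already held.
        self.controller_pause_count = 1

    def pause(self):
        # Any toolstack component may take an extra pause reference.
        self.controller_pause_count += 1

    def unpause(self):
        # Modelling assumption: an unbalanced unpause is rejected.
        if self.controller_pause_count == 0:
            raise RuntimeError("unpause without matching pause")
        self.controller_pause_count -= 1

    @property
    def dominf_paused(self):
        # Stand-in for XEN_DOMINF_paused: set whenever the count is non-zero.
        return self.controller_pause_count > 0


dom = Domain()
samples = [dom.dominf_paused]      # freshly created, still paused
dom.unpause()                      # the toolstack's initial unpause
samples.append(dom.dominf_paused)  # now running
dom.pause()                        # some other dom0 component pauses briefly
samples.append(dom.dominf_paused)  # an observer sampling here sees "paused"
dom.unpause()
samples.append(dom.dominf_paused)
print(samples)
```

The third sample is the interesting one: the component which did the initial unpause has done nothing wrong, yet anything sampling the flag at that moment sees the domain as paused.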
One option would be to introduce hypercall pairs per toolstack component, but that doesn't scale sensibly.

In this case, what condition causes the failure? Is it simply seeing the domain as paused (in which case, there will definitely be a low-probability false negative rate if anything else in dom0 uses domain pause), or is it some other failure which prompts for the paused state check?

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel