Re: [Xen-devel] [xen-unstable test] 110009: regressions - FAIL
On 06/06/17 13:59, Jan Beulich wrote:
>>>> On 05.06.17 at 18:55, <osstest-admin@xxxxxxxxxxxxxx> wrote:
>> flight 110009 xen-unstable real [real]
>> http://logs.test-lab.xenproject.org/osstest/logs/110009/
>>
>> Regressions :-(
>>
>> Tests which did not succeed and are blocking,
>> including tests which could not be run:
>> test-amd64-amd64-xl-qemut-win7-amd64 15 guest-localmigrate/x10 fail REGR.
>> vs. 109841
> So finally we have some output from the debugging code added by
> 933f966bcd ("x86/mm: add temporary debugging code to
> get_page_from_gfn_p2m()"), i.e. the migration heisenbug we hope
> to hunt down:
>
> (XEN) d0v2: d7 dying (looking up 3e000)
> ...
> (XEN) Xen call trace:
> (XEN) [<ffff82d0803150ef>] get_page_from_gfn_p2m+0x7b/0x416
> (XEN) [<ffff82d080268e88>] arch_do_domctl+0x51a/0x2535
> (XEN) [<ffff82d080206cf9>] do_domctl+0x17e4/0x1baf
> (XEN) [<ffff82d080355896>] pv_hypercall+0x1ef/0x42d
> (XEN) [<ffff82d0803594c6>] entry.o#test_all_events+0/0x30
>
> which points at XEN_DOMCTL_getpageframeinfo3 handling code.
> What business would the tool stack have invoking this domctl for
> a dying domain? I'd expect all of these operations to be done
> while the domain is still alive (perhaps paused), but none of them
> to occur once domain death was initiated.

http://logs.test-lab.xenproject.org/osstest/logs/110009/test-amd64-amd64-xl-qemut-win7-amd64/15.ts-guest-localmigrate.log
is rather curious. Unfortunately, libxl doesn't annotate the source and
destination logging lines when it merges them back together, and doesn't
include the progress markers. I've manually rearranged them back to a
logical order.

libxl-save-helper: debug: starting save: Success
xc: detail: fd 10, dom 7, max_iters 0, max_factor 0, flags 5, hvm 1
xc: info: Saving domain 7, type x86 HVM
xc: error: Failed to get types for pfn batch (3 = No such process): Internal error
xc: error: Save failed (3 = No such process): Internal error
xc: error: Couldn't disable qemu log-dirty mode (3 = No such process): Internal error
xc: error: Failed to clean up (3 = No such process): Internal error

The first -ESRCH here is the result of XEN_DOMCTL_getpageframeinfo3
encountering a dying domain. The qemu logdirty error is because the
libxl callback found that the qemu process it was expecting to talk to
doesn't exist.

From
http://logs.test-lab.xenproject.org/osstest/logs/110009/test-amd64-amd64-xl-qemut-win7-amd64/elbling1---var-log-xen-xl-win.guest.osstest.log

libxl: debug: libxl_domain.c:747:domain_death_xswatch_callback: Domain 7:[evg=0x11f5af0] got=domaininfos[0] got->domain=7
libxl: debug: libxl_domain.c:773:domain_death_xswatch_callback: Domain 7:Exists shutdown_reported=1 dominf.flags=1010f
libxl: debug: libxl_domain.c:693:domain_death_occurred: Domain 7:dying
libxl: debug: libxl_domain.c:740:domain_death_xswatch_callback: [evg=0] all reported
libxl: debug: libxl_domain.c:802:domain_death_xswatch_callback: domain death search done
libxl: debug: libxl_event.c:1869:libxl__ao_complete: ao 0x11f8220: complete, rc=0
libxl: debug: libxl_event.c:1838:libxl__ao__destroy: ao 0x11f8220: destroy

So it appears that the domain died while it was being migrated. I
expect the daemonised xl process then proceeded to clean it up under the
feet of the ongoing migration.

http://logs.test-lab.xenproject.org/osstest/logs/110009/test-amd64-amd64-xl-qemut-win7-amd64/elbling1---var-log-xen-qemu-dm-win.guest.osstest.log.1
says

Log-dirty: no command yet.
reset requested in cpu_handle_ioreq.
Issued domain 7 reboot

So actually it looks like a reboot might have been going on, which also
explains why the guest was booting as domain 9 while domain 7 was having
problems during migrate.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
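
For anyone wanting to cross-check the two numbers quoted from the logs, here is a minimal sketch in Python. It is not part of libxl or libxc. The XEN_DOMINF_* bit positions and the SHUTDOWN_* codes are assumptions recalled from Xen's public headers (domctl.h and sched.h) and should be verified against the tree under test; the helper names are hypothetical. Only the low four flag bits and the shutdown code are interpreted, and any other set bits are deliberately left alone.

```python
import errno
import re

# ASSUMPTION: bit positions as recalled from xen/include/public/domctl.h.
XEN_DOMINF_DYING = 1 << 0
XEN_DOMINF_HVM_GUEST = 1 << 1
XEN_DOMINF_SHUTDOWN = 1 << 2
XEN_DOMINF_PAUSED = 1 << 3
XEN_DOMINF_SHUTDOWNSHIFT = 16
XEN_DOMINF_SHUTDOWNMASK = 0xFF

# ASSUMPTION: shutdown codes as recalled from xen/include/public/sched.h.
SHUTDOWN_NAMES = {0: "poweroff", 1: "reboot", 2: "suspend", 3: "crash", 4: "watchdog"}


def decode_dominf_flags(flags):
    """Split a dominf.flags value into the few fields this sketch knows about."""
    code = (flags >> XEN_DOMINF_SHUTDOWNSHIFT) & XEN_DOMINF_SHUTDOWNMASK
    return {
        "dying": bool(flags & XEN_DOMINF_DYING),
        "hvm_guest": bool(flags & XEN_DOMINF_HVM_GUEST),
        "shutdown": bool(flags & XEN_DOMINF_SHUTDOWN),
        "paused": bool(flags & XEN_DOMINF_PAUSED),
        "shutdown_code": code,
        "shutdown_name": SHUTDOWN_NAMES.get(code, "unknown"),
    }


def errno_from_xc_line(line):
    """Pull the numeric errno out of an 'xc: error: ... (N = text)' log line."""
    match = re.search(r"\((\d+) = [^)]*\)", line)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Values copied from the log excerpts quoted in this mail.
    print(decode_dominf_flags(0x1010F))
    err = errno_from_xc_line(
        "xc: error: Save failed (3 = No such process): Internal error"
    )
    print(err, errno.errorcode.get(err))
```

Under those assumptions, dominf.flags=1010f decodes as dying, shut down and paused with a shutdown code of "reboot", which would be consistent with the qemu log, and errno 3 maps to ESRCH on Linux.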