
Re: [Xen-devel] [xen-unstable test] 110009: regressions - FAIL



On 06/06/17 13:59, Jan Beulich wrote:
>>>> On 05.06.17 at 18:55, <osstest-admin@xxxxxxxxxxxxxx> wrote:
>> flight 110009 xen-unstable real [real]
>> http://logs.test-lab.xenproject.org/osstest/logs/110009/ 
>>
>> Regressions :-(
>>
>> Tests which did not succeed and are blocking,
>> including tests which could not be run:
>>  test-amd64-amd64-xl-qemut-win7-amd64 15 guest-localmigrate/x10 fail REGR. 
>> vs. 109841
> So finally we have some output from the debugging code added by
> 933f966bcd ("x86/mm: add temporary debugging code to
> get_page_from_gfn_p2m()"), i.e. the migration heisenbug we hope
> to hunt down:
>
> (XEN) d0v2: d7 dying (looking up 3e000)
> ...
> (XEN) Xen call trace:
> (XEN)    [<ffff82d0803150ef>] get_page_from_gfn_p2m+0x7b/0x416
> (XEN)    [<ffff82d080268e88>] arch_do_domctl+0x51a/0x2535
> (XEN)    [<ffff82d080206cf9>] do_domctl+0x17e4/0x1baf
> (XEN)    [<ffff82d080355896>] pv_hypercall+0x1ef/0x42d
> (XEN)    [<ffff82d0803594c6>] entry.o#test_all_events+0/0x30
>
> which points at XEN_DOMCTL_getpageframeinfo3 handling code.
> What business would the tool stack have invoking this domctl for
> a dying domain? I'd expect all of these operations to be done
> while the domain is still alive (perhaps paused), but none of them
> to occur once domain death was initiated.

http://logs.test-lab.xenproject.org/osstest/logs/110009/test-amd64-amd64-xl-qemut-win7-amd64/15.ts-guest-localmigrate.log
is rather curious.  Unfortunately, libxl doesn't annotate the source and
destination logging lines when it merges them back together, and doesn't
include the progress markers.  I've manually rearranged them back to a
logical order.

libxl-save-helper: debug: starting save: Success
xc: detail: fd 10, dom 7, max_iters 0, max_factor 0, flags 5, hvm 1
xc: info: Saving domain 7, type x86 HVM
xc: error: Failed to get types for pfn batch (3 = No such process):
Internal error
xc: error: Save failed (3 = No such process): Internal error
xc: error: Couldn't disable qemu log-dirty mode (3 = No such process):
Internal error
xc: error: Failed to clean up (3 = No such process): Internal error

The first -ESRCH here is the result of XEN_DOMCTL_getpageframeinfo3
encountering a dying domain.  The qemu logdirty error is because the
libxl callback found that the qemu process it was expecting to talk to
doesn't exist.

From
http://logs.test-lab.xenproject.org/osstest/logs/110009/test-amd64-amd64-xl-qemut-win7-amd64/elbling1---var-log-xen-xl-win.guest.osstest.log

libxl: debug: libxl_domain.c:747:domain_death_xswatch_callback: Domain
7:[evg=0x11f5af0]   got=domaininfos[0] got->domain=7
libxl: debug: libxl_domain.c:773:domain_death_xswatch_callback: Domain
7:Exists shutdown_reported=1 dominf.flags=1010f
libxl: debug: libxl_domain.c:693:domain_death_occurred: Domain 7:dying
libxl: debug: libxl_domain.c:740:domain_death_xswatch_callback: [evg=0]
all reported
libxl: debug: libxl_domain.c:802:domain_death_xswatch_callback: domain
death search done
libxl: debug: libxl_event.c:1869:libxl__ao_complete: ao 0x11f8220:
complete, rc=0
libxl: debug: libxl_event.c:1838:libxl__ao__destroy: ao 0x11f8220: destroy

So it appears that the domain died while it was being migrated.  I
expect the daemonised xl process then proceeded to clean it up under the
feet of the ongoing migration.

http://logs.test-lab.xenproject.org/osstest/logs/110009/test-amd64-amd64-xl-qemut-win7-amd64/elbling1---var-log-xen-qemu-dm-win.guest.osstest.log.1
says

Log-dirty: no command yet.
reset requested in cpu_handle_ioreq.
Issued domain 7 reboot

So it actually looks like a reboot was in progress, which also explains
why the guest was booting as domain 9 while domain 7 was still having
problems during migration.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
