
Re: [Xen-devel] [xen-unstable test] 108068: regressions - FAIL

Jan Beulich writes ("Re: [Xen-devel] [xen-unstable test] 108068: regressions - FAIL"):
> On 01.05.17 at 20:49, <osstest-admin@xxxxxxxxxxxxxx> wrote:
> This has been recurring for the last few flights, but I wonder whether
> 2017-05-01 13:18:52 Z executing ssh ... root@ readlink 
> /dev/italia0-vg/win.guest.osstest-disk 
> 2017-05-01 13:18:52 Z executing ssh ... root@ lvdisplay --colon 
> /dev/italia0-vg/win.guest.osstest-disk 
> 2017-05-01 13:18:53 Z lvdisplay output says device is still open: 
> /dev/italia0-vg/win.guest.osstest-disk:italia0-vg:3:1:-1:2:20480000:2500:-1:0:-1:253:2
> 2017-05-01 13:18:53 Z executing ssh ... root@ umount 
> /dev/italia0-vg/win.guest.osstest-disk 
> umount: /dev/italia0-vg/win.guest.osstest-disk: not mounted
> 2017-05-01 13:18:53 Z command nonzero waitstatus 8192: timeout 60 ssh -o 
> StrictHostKeyChecking=no -o BatchMode=yes -o ConnectTimeout=100 -o 
> ServerAliveInterval=100 -o PasswordAuthentication=no -o 
> ChallengeResponseAuthentication=no -o 
> UserKnownHostsFile=tmp/t.known_hosts_108068.test-amd64-i386-xl-qemut-winxpsp3-vcpus1
>  root@ umount /dev/italia0-vg/win.guest.osstest-disk 
> status 8192 at Osstest/TestSupport.pm line 442.
> indicates an environmental problem rather than a
> software-under-test one (the more that the single commit
> being tested can't possibly influence host or guest behavior).

This is almost certainly not an environmental problem.  What seems to
be happening is that the guest shutdown/teardown is going wrong


shows this:

2017-05-01 13:18:27 Z executing ssh ... root@ xl shutdown -wF 
Shutting down domain 17
PV control interface not available: sending ACPI power button event.
Waiting for 1 domains
Domain 17 has been shut down, reason code 1
2017-05-01 13:18:36 Z executing ssh ... root@ xl list 
2017-05-01 13:18:36 Z guest win.guest.osstest state is psr 

So the guest has been shut down in the sense that xl shutdown -w
has exited (-w means to wait for the shutdown), but not in the sense
that the domain has been destroyed.

osstest spends 14 seconds checking that the guest doesn't respond to
ping (this is probably a bit pointless, TBH):

2017-05-01 13:18:50 Z ping down 

Then the next step tries to start the guest.  But it finds that the
backing block device is in use.  The command that fails is there so
that this test script can be re-run in certain ad-hoc by-hand tests:
it is trying to unmount the block device, on the theory that if it is
shown as open in LVM, that is probably because it's mounted.  The
unmount fails.
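
As a rough illustration of the two numbers in the quoted log, the sketch below reads the open count out of the `lvdisplay --colon` line and splits the wait status 8192 into an exit code and a signal number. It assumes the colon-separated field order documented in lvdisplay(8), where the open count is the sixth field, and the usual POSIX wait status layout. The helper names are hypothetical and are not taken from osstest.

```python
from typing import Tuple

# Sketch only: field positions follow the colon-separated layout documented
# in lvdisplay(8); the helper names are hypothetical, not osstest's.


def lv_open_count(colon_line: str) -> int:
    """Return the open count (6th field) from one `lvdisplay --colon` line."""
    fields = colon_line.strip().split(":")
    if len(fields) < 6:
        raise ValueError(f"unexpected lvdisplay output: {colon_line!r}")
    return int(fields[5])


def decode_waitstatus(status: int) -> Tuple[int, int]:
    """Split a POSIX-style wait status into (exit_code, signal_number)."""
    return (status >> 8) & 0xFF, status & 0x7F


# The line reported in the quoted log.
line = ("/dev/italia0-vg/win.guest.osstest-disk:italia0-vg:3:1:-1:2:"
        "20480000:2500:-1:0:-1:253:2")

print(lv_open_count(line))      # 2
print(decode_waitstatus(8192))  # (32, 0)
```

An open count of 2 matches the "device is still open" message. A wait status of 8192 decodes to exit code 32 with no signal, which suggests that the remote umount ran and reported failure, rather than the ssh connection or the 60-second timeout being the thing that failed.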

The underlying problem is that the block backend still has the guest
block device open.  Indeed, during the logs capture we see


the guest is still there:

Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0   511     4     r-----     913.9
win.guest.osstest                           18  1536     1     r-----      16.0

(that's at 2017-05-01 13:18:56)

I think the guest that was shut down was domid 17 and this new one is
domid 18.  This logfile


shows domid 17 shutting down and then this message

 Done. Rebooting now

and then it seems to start the domain again.
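
One way to check for this kind of reboot mechanically is to compare the domid that was shut down against the domid the same guest name has afterwards in `xl list`. The following is a minimal sketch under the assumption that the output has the column layout shown above (Name, ID, Mem, VCPUs, State, Time(s)). The function names are hypothetical and this is not how osstest does it.

```python
# Sketch only: assumes the `xl list` column layout shown in this message.
# parse_xl_list and was_recreated are hypothetical helper names.

SAMPLE = """\
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0   511     4     r-----     913.9
win.guest.osstest                           18  1536     1     r-----      16.0
"""


def parse_xl_list(text):
    """Map domain name -> (domid, state) from `xl list` output."""
    domains = {}
    for row in text.strip().splitlines()[1:]:  # skip the header row
        cols = row.split()
        if len(cols) < 6:
            continue  # ignore rows that do not have all six columns
        domains[cols[0]] = (int(cols[1]), cols[4])
    return domains


def was_recreated(name, old_domid, xl_list_text):
    """True if `name` still exists but under a different domid."""
    entry = parse_xl_list(xl_list_text).get(name)
    return entry is not None and entry[0] != old_domid


print(parse_xl_list(SAMPLE)["win.guest.osstest"])     # (18, 'r-----')
print(was_recreated("win.guest.osstest", 17, SAMPLE))  # True
```

With the listing above, the guest name is present with domid 18 rather than 17, which is consistent with the domain having been recreated after the shutdown request.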

Is it possible that something has changed which means that Windows
(sometimes?) responds to an ACPI power button event not by shutting
down, but by rebooting?

