Re: [Xen-devel] Commit moratorium to staging
> -----Original Message-----
> From: Roger Pau Monne
> Sent: 02 November 2017 09:42
> To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>
> Cc: Ian Jackson <Ian.Jackson@xxxxxxxxxx>; Lars Kurth
> <lars.kurth@xxxxxxxxxx>; Wei Liu <wei.liu2@xxxxxxxxxx>; Julien Grall
> <julien.grall@xxxxxxxxxx>; committers@xxxxxxxxxxxxxx;
> xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
> Subject: Re: [Xen-devel] Commit moratorium to staging
>
> On Thu, Nov 02, 2017 at 09:20:10AM +0000, Paul Durrant wrote:
> > > -----Original Message-----
> > > From: Roger Pau Monne
> > > Sent: 02 November 2017 09:15
> > > To: Roger Pau Monne <roger.pau@xxxxxxxxxx>
> > > Cc: Ian Jackson <Ian.Jackson@xxxxxxxxxx>; Lars Kurth
> > > <lars.kurth@xxxxxxxxxx>; Wei Liu <wei.liu2@xxxxxxxxxx>; Julien Grall
> > > <julien.grall@xxxxxxxxxx>; Paul Durrant <Paul.Durrant@xxxxxxxxxx>;
> > > committers@xxxxxxxxxxxxxx;
> > > xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
> > > Subject: Re: [Xen-devel] Commit moratorium to staging
> > >
> > > On Wed, Nov 01, 2017 at 04:17:10PM +0000, Roger Pau Monné wrote:
> > > > On Wed, Nov 01, 2017 at 02:07:48PM +0000, Ian Jackson wrote:
> > > > > * Affected hosts differ from unaffected hosts according to cpuid.
> > > > > Roger has repro'd the bug on an unaffected host by masking out
> > > > > certain cpuid bits. There are 6 implicated bits and he is working
> > > > > to narrow that down.
> > > >
> > > > I'm currently trying to narrow this down and make sure the above is
> > > > accurate.
> > >
> > > So I was wrong with this, I guess I've run the tests on the wrong
> > > host. Even when masking the different cpuid bits in the guest the
> > > tests still succeeds.
> > >
> > > AFAICT the test fail or succeed reliably depending on the host
> > > hardware. I don't really have many ideas about what to do next, but I
> > > think it would be useful to create a manual osstest flight that runs
> > > the win16 job in all the different hosts in the colo. I would also
> > > capture the normal information that Xen collects after each test (xl
> > > info, /proc/cpuid, serial logs...).
> > >
> > > Is there anything else not captured by ts-logs-capture that would be
> > > interesting in order to help debug the issue?
> >
> > Does the shutdown reliably complete prior to migrate and then only fail
> > intermittently after a localhost migrate?
>
> AFAICT yes, but it can also be added to the test in order to be sure.
>
> > It might be useful to know what cpuid info is seen by the guest before and
> > after migrate.
>
> Is there anyway to get that from windows in an automatic way? If not I
> could test that with a Debian guest. In fact it might even be a good
> thing for Linux based guest to be added to the regular migration tests
> in order to make sure cpuid bits don't change across migrations.
>

I found this for Windows:

https://www.cpuid.com/downloads/cpu-z/cpu-z_1.81-en.exe

It can generate a text or HTML report as well as being run interactively.
But you may get more mileage from using a Debian HVM guest. I guess it may
also be useful if we can get a scan of the available MSRs and their
contents before and after migrate too.

> > Another datapoint... does the shutdown fail if you insert a delay of a
> > couple of minutes between the migrate and the shutdown?
>
> Sometimes, after a variable number of calls to xl shutdown ... the
> guest usually ends up shutting down.
>

Hmm. I wonder whether the guest is actually healthy after the migrate. One
could imagine a situation where the storage device model (IDE in our case,
I guess) gets stuck in some way but recovers after a timeout in the guest
storage stack. Thus, if you happen to try to shut down while it is still
stuck, Windows starts trying to shut down but can't. Try after the timeout,
though, and it can.

In the past we did make attempts to support Windows without PV drivers in
XenServer, but xenrt would never reliably pass VM lifecycle tests using
emulated devices. That was with qemu trad, but I wonder whether upstream
qemu is actually any better, particularly if using older device models such
as IDE and RTL8139 (which are probably largely unmodified from trad).

  Paul

> Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
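
For the Debian guest variant of the cpuid check discussed above, a minimal sketch of how the before/after comparison might be done. It assumes the contents of the guest's /proc/cpuinfo have been saved once before and once after the localhost migrate (how they are collected is left open here), and it only compares the "flags" lines, not raw cpuid leaves or MSRs. The function names parse_flags and diff_flags are made up for this illustration and are not part of osstest or xl.

```python
def parse_flags(cpuinfo_text):
    """Return {processor_index: set_of_flag_names} parsed from the text
    of a Linux guest's /proc/cpuinfo."""
    flags = {}
    cpu = None
    for line in cpuinfo_text.splitlines():
        # Lines look like "key<tabs>: value"; blank lines separate CPUs.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key = key.strip()
        value = value.strip()
        if key == "processor":
            cpu = int(value)
        elif key == "flags" and cpu is not None:
            flags[cpu] = set(value.split())
    return flags


def diff_flags(before_text, after_text):
    """Compare two /proc/cpuinfo snapshots (hypothetically taken before
    and after a migrate). Returns {cpu: {"lost": [...], "gained": [...]}}
    containing only the vCPUs whose flag set changed."""
    before = parse_flags(before_text)
    after = parse_flags(after_text)
    changes = {}
    for cpu in sorted(set(before) | set(after)):
        b = before.get(cpu, set())
        a = after.get(cpu, set())
        if a != b:
            changes[cpu] = {"lost": sorted(b - a), "gained": sorted(a - b)}
    return changes
```

An empty result from diff_flags would suggest the feature flags visible to the guest kernel did not change across the migrate. It would not rule out differences in cpuid leaves or MSRs that Linux does not report in the flags line.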