 
	
Re: [Xen-devel] HVM domains crash after upgrade from XEN 4.5.1 to 4.5.2
>>> On 13.11.15 at 00:00, <ariel.atom2@xxxxxxxxxx> wrote:
> On 12.11.15 at 17:43, Andrew Cooper wrote:
>> On 12/11/15 14:29, Atom2 wrote:
>>> Hi Andrew,
>>> thanks for your reply. Answers are inline further down.
>>>
>>> On 12.11.15 at 14:01, Andrew Cooper wrote:
>>>> On 12/11/15 12:52, Jan Beulich wrote:
>>>>>>>> On 12.11.15 at 02:08, <ariel.atom2@xxxxxxxxxx> wrote:
>>>>>> After the upgrade HVM domUs appear to no longer work - regardless
>>>>>> of the dom0 kernel (tested with both 3.18.9 and 4.1.7 as the dom0
>>>>>> kernel); PV domUs, however, work just fine as before on both dom0
>>>>>> kernels.
>>>>>>
>>>>>> xl dmesg shows the following information after the first crashed HVM
>>>>>> domU which is started as part of the machine booting up:
>>>>>> [...]
>>>>>> (XEN) Failed vm entry (exit reason 0x80000021) caused by invalid guest state (0).
>>>>>> (XEN) ************* VMCS Area **************
>>>>>> (XEN) *** Guest State ***
>>>>>> (XEN) CR0: actual=0x0000000000000039, shadow=0x0000000000000011, gh_mask=ffffffffffffffff
>>>>>> (XEN) CR4: actual=0x0000000000002050, shadow=0x0000000000000000, gh_mask=ffffffffffffffff
>>>>>> (XEN) CR3: actual=0x0000000000800000, target_count=0
>>>>>> (XEN) target0=0000000000000000, target1=0000000000000000
>>>>>> (XEN) target2=0000000000000000, target3=0000000000000000
>>>>>> (XEN) RSP = 0x0000000000006fdc (0x0000000000006fdc) RIP = 0x0000000100000000 (0x0000000100000000)
>>>>> Other than RIP looking odd for a guest still in non-paged protected
>>>>> mode I can't seem to spot anything wrong with guest state.
>>>> Odd? That will be the source of the failure.
>>>>
>>>> Out of long mode, the upper 32 bits of %rip should all be zero, and it
>>>> should not be possible to set any of them.
>>>>
>>>> I suspect that the guest has exited for emulation, and there has been a
>>>> bad update to %rip. The alternative (which I hope is not the case) is
>>>> that there is a hardware erratum which allows the guest to accidentally
>>>> get itself into this condition.
>>>>
>>>> Are you able to rerun with a debug build of the hypervisor?
>>> [snip]
>>> Another question is whether prior to enabling the debug USE flag it
>>> might make sense to re-compile with gcc-4.8.5 (please see my previous
>>> list reply) to rule out any compiler related issues. Jan, Andrew -
>>> what are your thoughts?
>> First of all, check whether the compiler makes a difference on 4.5.2
> Hi Andrew,
> I changed the compiler and there was no change for the better:
> Unfortunately the HVM domU is still crashing with a similar error
> message as soon as it is being started.
>> If both compiles result in a guest crashing in that manner, test a debug
>> Xen to see if any assertions/errors are encountered just before the
>> guest crashes.
>>
> As the compiler did not make any difference, I enabled the debug USE
> flag, re-compiled (using gcc-4.9.3), and rebooted using a serial console
> to capture output. Unfortunately I did not get very far and things
> became even stranger: This time the system did not even finish the boot
> process, but rather hard-stopped pretty early with a message reading
> "Panic on CPU 3: DOUBLE FAULT -- system shutdown". The captured logfile
> is attached as "serial log.txt".
>
> As this happened immediately after the CPU microcode update, I thought
> there might be a connection and disabled the microcode update. After the
> next reboot it seemed as if the boot process got a bit further as
> evidenced by a few more lines in the log file (those between lines 136
> and 197 in the second log file named "serial log no ucode.txt"), but in
> the end it finished off with an identical error message (only the CPU #
> was different this time, but that number seems to change between boots
> anyway).
>
> I hope that makes some sense to you.

Not really, other than now even more suspecting bad hardware or
something fundamentally wrong with your build. Did you retry with a
freshly built 4.5.1? Could you alternatively try with a known good
build of 4.5.2 (e.g. from osstest)?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
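
For readers following the RIP discussion quoted above, here is a minimal sketch of the consistency check Andrew describes, applied to the CR0 and RIP values from the VMCS dump. It is an illustration only and not Xen code: the function names are hypothetical, and it uses CR0.PG as a simplified stand-in for "not in long mode" (long mode requires paging, so a guest with paging disabled cannot be in it). The hardware's real VM-entry check is more involved than this.

```python
# Illustrative sketch only; names are hypothetical and not taken from Xen.
CR0_PE = 1 << 0    # protection enable
CR0_PG = 1 << 31   # paging enable

def rip_has_upper_bits(rip: int) -> bool:
    """Return True if any of bits 63:32 of RIP are set."""
    return (rip >> 32) != 0

def rip_plausible(cr0: int, rip: int) -> bool:
    """Simplified plausibility check for a guest RIP.

    Assumption: with CR0.PG clear the guest cannot be in long mode,
    so RIP must fit in 32 bits. With paging enabled this sketch cannot
    decide from CR0 alone and reports the value as plausible.
    """
    if not (cr0 & CR0_PG):
        return not rip_has_upper_bits(rip)
    return True

# Values copied from the VMCS dump in the quoted report.
cr0 = 0x0000000000000039
rip = 0x0000000100000000

print("protected mode:", bool(cr0 & CR0_PE))
print("paging enabled:", bool(cr0 & CR0_PG))
print("RIP plausible: ", rip_plausible(cr0, rip))
```

With the dumped values, CR0 has PE set and PG clear (non-paged protected mode, as Jan notes), while RIP has bit 32 set, so the check fails. That matches the "invalid guest state" VM entry failure in the log.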
 
 