Re: [Xen-devel] Xen 4.6.1 crash with altp2m enabledbydefault
Hi,

The reproduction should be pretty simple.

Apply the patch to enable altp2m unconditionally:

     d->arch.hvm_domain.params[HVM_PARAM_HPET_ENABLED] = 1;
     d->arch.hvm_domain.params[HVM_PARAM_TRIPLE_FAULT_REASON] = SHUTDOWN_reboot;
+    d->arch.hvm_domain.params[HVM_PARAM_ALTP2M] = 1;
+
     vpic_init(d);
     rc = vioapic_init(d);

For the guests we use a single state file (Windows 10), from which all guests are restored with libvirt.
Simply restore and destroy several guests (5-7 in our current setup) in quick succession (each guest runs for about 1-2 minutes).
The number of guest VMs seems to correlate with the time until the crash occurs, but other, random factors seem to be more important.
More VMs => the crash happens sooner.

Is the following debug setup possible?
L0: Xen / VMware
L1: Xen with altp2m enabled
L2: Several guest VMs being constantly restored / destroyed
Then periodically take snapshots until the hypervisor panics, and try to debug from the latest snapshot onwards.

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@xxxxxxxxxx]
> Sent: Monday, 22 August 2016 13:58
> To: Mayer, Kevin <Kevin.Mayer@xxxxxxxx>; JBeulich@xxxxxxxx
> Cc: xen-devel@xxxxxxxxxxxxx
> Subject: Re: AW: [Xen-devel] Xen 4.6.1 crash with altp2m enabledbydefault
>
> On 19/08/16 11:01, Kevin.Mayer@xxxxxxxx wrote:
> > Hi
> >
> > I took another look at Xen and a new crashdump.
> > The last successful __vmwrite should be in
> > static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
> > [...]
> >     __vmwrite(SECONDARY_VM_EXEC_CONTROL,
> >               v->arch.hvm_vmx.secondary_exec_control);
> > [...]
> > After this the altp2m_vcpu_destroy wakes up the vcpu and is then
> > finished.
> >
> > In nestedhvm_vcpu_destroy (nvmx_vcpu_destroy) the vmcs can be
> > overwritten (but this is not reached in our case as far as I can see):
> >     if ( nvcpu->nv_n1vmcx )
> >         v->arch.hvm_vmx.vmcs = nvcpu->nv_n1vmcx;
> >
> > In conclusion:
> > When destroying a domain, the altp2m_vcpu_destroy(v); path seems to
> > mess up the vmcs, which (only) sometimes leads to a failed __vmwrite in
> > vmx_fpu_leave.
> > That is as far as I can get with my understanding of the Xen code.
> >
> > Do you guys have any additional ideas what I could test / analyse?
>
> Do you have easy reproduction instructions you could share? Sadly, this is
> looking like an issue which isn't viable to debug over email.
>
> ~Andrew

____________
Virus checked by G Data MailSecurity
Version: AVA 25.7981 dated 22.08.2016
Virus news: www.antiviruslab.com

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
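
The restore/destroy loop described in the reproduction steps could be scripted roughly as follows. This is only a minimal sketch, not the setup actually used in the thread: it assumes the `virsh restore` and `virsh destroy` commands are available on the host, and the guest names and state-file paths are hypothetical placeholders. How the single Windows 10 state file is mapped to several guests is not described in the thread, so the sketch simply takes (name, state file) pairs. It defaults to a dry run that only returns the commands.

```python
import subprocess
import time


def plan_round(guests):
    """Build the virsh commands for one round.

    guests: list of (domain_name, state_file) pairs. Both values are
    hypothetical placeholders, not names taken from the thread.
    """
    restores = [["virsh", "restore", state] for _, state in guests]
    destroys = [["virsh", "destroy", name] for name, _ in guests]
    return restores, destroys


def run_round(guests, runtime_s=90, dry_run=True):
    """Restore all guests, let them run, then destroy them.

    With dry_run=True nothing is executed; the commands are only returned.
    """
    restores, destroys = plan_round(guests)
    if dry_run:
        return restores + destroys
    for cmd in restores:
        subprocess.run(cmd, check=True)
    # The thread mentions roughly 1-2 minutes of runtime per guest.
    time.sleep(runtime_s)
    for cmd in destroys:
        # Do not abort the round if a guest is already gone.
        subprocess.run(cmd, check=False)
    return restores + destroys


if __name__ == "__main__":
    # Five guests, matching the lower end of the 5-7 mentioned above.
    guests = [(f"win10-{i}", f"/path/to/win10-{i}.state") for i in range(5)]
    for cmd in run_round(guests, dry_run=True):
        print(" ".join(cmd))
```

Calling `run_round(guests, dry_run=False)` in an outer loop until the hypervisor panics would approximate the described workload; the 90-second default is an assumption within the stated 1-2 minute range.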