Re: [Xen-devel] Nested virtualization off VMware vSphere 6.0 with EL6 guests crashes on Xen 4.6
>>> On 15.01.16 at 22:39, <konrad.wilk@xxxxxxxxxx> wrote:
> On Tue, Jan 12, 2016 at 02:22:03AM -0700, Jan Beulich wrote:
>> Since we can (I hope) pretty much exclude a paging type, the
>> ASSERT() must have triggered because of vapic_pg being NULL.
>> That might be verifiable without extra printk()s, just by checking
>> the disassembly (assuming the value sits in a register). In which
>> case vapic_gpfn would be of interest too.
>
> The vapic_gpfn is 0xffffffffffff.
>
> To be exact:
>
> nvmx_update_virtual_apic_address:vCPU0 0xffffffffffffffff(vAPIC) 0x0(APIC), 0x0(TPR) ctrl=b5b9effe
>
> Based on this:
>
> diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
> index cb6f9b8..8a0abfc 100644
> --- a/xen/arch/x86/hvm/vmx/vvmx.c
> +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> @@ -695,7 +695,15 @@ static void nvmx_update_virtual_apic_address(struct vcpu *v)
>
>          vapic_gpfn = __get_vvmcs(nvcpu->nv_vvmcx, VIRTUAL_APIC_PAGE_ADDR) >> PAGE_SHIFT;
>          vapic_pg = get_page_from_gfn(v->domain, vapic_gpfn, &p2mt, P2M_ALLOC);
> -        ASSERT(vapic_pg && !p2m_is_paging(p2mt));
> +        if ( !vapic_pg ) {
> +            printk("%s:vCPU%d 0x%lx(vAPIC) 0x%lx(APIC), 0x%lx(TPR) ctrl=%x\n", __func__,v->vcpu_id,
> +                   __get_vvmcs(nvcpu->nv_vvmcx, VIRTUAL_APIC_PAGE_ADDR),
> +                   __get_vvmcs(nvcpu->nv_vvmcx, APIC_ACCESS_ADDR),
> +                   __get_vvmcs(nvcpu->nv_vvmcx, TPR_THRESHOLD),
> +                   ctrl);
> +        }
> +        ASSERT(vapic_pg);
> +        ASSERT(vapic_pg && !p2m_is_paging(p2mt));
>          __vmwrite(VIRTUAL_APIC_PAGE_ADDR, page_to_maddr(vapic_pg));
>          put_page(vapic_pg);
>      }

Interesting: I can't see VIRTUAL_APIC_PAGE_ADDR to be written
with all ones anywhere, neither for the real VMCS nor for the virtual
one (page_to_maddr() can't, afaict, return such a value). Could you
check where the L1 guest itself is writing that value, or whether it
fails to initialize that field and it happens to start out as all ones?
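For reference, the arithmetic behind the logged field value can be sketched in a few lines of standalone C. This is only an illustration, not Xen code: it assumes 4 KiB pages (a PAGE_SHIFT of 12) and a 52-bit upper bound on physical address width, and both `vapic_addr_to_gpfn()` and `vapic_addr_plausible()` are hypothetical helpers introduced here.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT          12  /* assumption: 4 KiB pages */
#define ASSUMED_PADDR_BITS  52  /* assumption: upper bound on physical address width */

/* Hypothetical helper: same shift the quoted code applies to the vVMCS field. */
static uint64_t vapic_addr_to_gpfn(uint64_t addr)
{
    return addr >> PAGE_SHIFT;
}

/*
 * Hypothetical helper: a virtual-APIC page address is only plausible if it
 * is page aligned and fits within the assumed physical address width.
 */
static bool vapic_addr_plausible(uint64_t addr)
{
    uint64_t offset_mask = (UINT64_C(1) << PAGE_SHIFT) - 1;

    if ( addr & offset_mask )
        return false;                       /* not page aligned */

    return (addr >> ASSUMED_PADDR_BITS) == 0; /* no bits above the assumed width */
}
```

Under these assumptions, `vapic_addr_plausible(UINT64_MAX)` returns false for the all-ones value seen in the log, which fits the observation that the frame lookup came back NULL.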

>> What looks odd to me is the connection between
>> CPU_BASED_TPR_SHADOW being set and the use of a (valid)
>> virtual APIC page: Wouldn't this rather need to depend on
>> SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES, just like in
>> nvmx_update_apic_access_address()?
>
> Could be. I added in an read for the secondary control:
>
> nvmx_update_virtual_apic_address:vCPU2 0xffffffffffffffff(vAPIC) 0x0(APIC), 0x0(TPR) ctrl=b5b9effe sec=0
>
> So trying your recommendation:
> diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
> index cb6f9b8..d291c91 100644
> --- a/xen/arch/x86/hvm/vmx/vvmx.c
> +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> @@ -686,8 +686,8 @@ static void nvmx_update_virtual_apic_address(struct vcpu *v)
>      struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
>      u32 ctrl;
>
> -    ctrl = __n2_exec_control(v);
> -    if ( ctrl & CPU_BASED_TPR_SHADOW )
> +    ctrl = __n2_secondary_exec_control(v);
> +    if ( ctrl & SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES )
>      {
>          p2m_type_t p2mt;
>          unsigned long vapic_gpfn;
>
>
> Got me:
> (XEN) stdvga.c:151:d1v0 leaving stdvga mode
> (XEN) stdvga.c:147:d1v0 entering stdvga and caching modes
> (XEN) stdvga.c:520:d1v0 leaving caching mode
> (XEN) vvmx.c:2491:d1v0 Unknown nested vmexit reason 80000021.
> (XEN) Failed vm entry (exit reason 0x80000021) caused by invalid guest state

Interesting. I've just noticed that a similar odd looking (to me)
dependency exists in construct_vmcs(). Perhaps I've overlooked
something in the SDM. In any event I think some words from the
VMX maintainers would be quite nice here.

Sadly the VMCS dump doesn't include the two APIC related
addresses...

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
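The two gating conditions discussed above can be compared against the logged values (ctrl=b5b9effe, sec=0) with a small standalone sketch. The bit masks are the ones the Intel SDM assigns to "use TPR shadow" (primary controls, bit 21) and "virtualize APIC accesses" (secondary controls, bit 0). The `gate_*()` helpers are hypothetical names used only for this illustration and are not part of Xen.

```c
#include <stdbool.h>
#include <stdint.h>

/* VM-execution control bits as laid out in the Intel SDM. */
#define CPU_BASED_TPR_SHADOW                    UINT32_C(0x00200000)
#define SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES UINT32_C(0x00000001)

/* Hypothetical helper: condition used by the unpatched function. */
static bool gate_tpr_shadow(uint32_t primary_ctrl)
{
    return (primary_ctrl & CPU_BASED_TPR_SHADOW) != 0;
}

/* Hypothetical helper: condition tried in the second patch. */
static bool gate_virt_apic_accesses(uint32_t secondary_ctrl)
{
    return (secondary_ctrl & SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES) != 0;
}
```

For the logged values, `gate_tpr_shadow(0xb5b9effe)` is true while `gate_virt_apic_accesses(0)` is false. So the unpatched condition goes on to look up the all-ones virtual APIC address, whereas the patched condition skips that lookup entirely.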