Re: [PATCH V4 03/15] x86/pv: Rewrite how building PV dom0 handles domheap mappings
Hi,

I've been trying to run this series for a while, but it crashes very
frequently starting from the patch that generalizes the mapcache. I
think I've tracked it down to this patch.

On Mon Nov 11, 2024 at 1:11 PM GMT, Elias El Yandouzi wrote:
> From: Hongyan Xia <hongyxia@xxxxxxxxxx>
>
> Building a PV dom0 is allocating from the domheap but uses it like the
> xenheap. Use the pages as they should be.
>
> Signed-off-by: Hongyan Xia <hongyxia@xxxxxxxxxx>
> Signed-off-by: Julien Grall <jgrall@xxxxxxxxxx>
> Signed-off-by: Elias El Yandouzi <eliasely@xxxxxxxxxx>
>
> ----
>     Changes in V4:
>         * Reduce the scope of l{1,2,4}start_mfn variables
>         * Make the macro `UNMAP_MAP_AND_ADVANCE` return the new virtual
>           address
>
>     Changes in V3:
>         * Fold following patch 'x86/pv: Map L4 page table for shim domain'
>
>     Changes in V2:
>         * Clarify the commit message
>         * Break the patch in two parts
>
>     Changes since Hongyan's version:
>         * Rebase
>         * Remove spurious newline
>
> diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c
> index 18b7a3e4e025..b03df609cadb 100644
> --- a/xen/arch/x86/pv/dom0_build.c
> +++ b/xen/arch/x86/pv/dom0_build.c
> @@ -382,6 +382,7 @@ static int __init dom0_construct(struct domain *d,
>      l3_pgentry_t *l3tab = NULL, *l3start = NULL;
>      l2_pgentry_t *l2tab = NULL, *l2start = NULL;
>      l1_pgentry_t *l1tab = NULL, *l1start = NULL;
> +    mfn_t l3start_mfn = INVALID_MFN;
>
>      /*
>       * This fully describes the memory layout of the initial domain. All
> @@ -719,22 +720,34 @@ static int __init dom0_construct(struct domain *d,
>              v->arch.pv.event_callback_cs = FLAT_COMPAT_KERNEL_CS;
>      }
>
> +#define UNMAP_MAP_AND_ADVANCE(mfn_var, virt_var, maddr) ({    \
> +    do {                                                      \
> +        unmap_domain_page(virt_var);                          \
> +        mfn_var = maddr_to_mfn(maddr);                        \
> +        maddr += PAGE_SIZE;                                   \
> +        virt_var = map_domain_page(mfn_var);                  \
> +    } while ( false );                                        \
> +    virt_var;                                                 \
> +})
> +
>      if ( !compat )
>      {
> +        mfn_t l4start_mfn;
>          maddr_to_page(mpt_alloc)->u.inuse.type_info = PGT_l4_page_table;
> -        l4start = l4tab = __va(mpt_alloc); mpt_alloc += PAGE_SIZE;
> +        l4tab = UNMAP_MAP_AND_ADVANCE(l4start_mfn, l4start, mpt_alloc);

In here l4start is mapped on the idle domain's perdomain area, but...
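(As an aside, the lifetime rule at play here: a pointer returned by
map_domain_page() is only valid while the mapcache it was allocated
from is the active one. A rough sketch of the hazard, not the literal
code:)

    void *virt = map_domain_page(mfn);  /* slot in the idle domain's
                                         * mapcache at this point */
    /* ... build dom0's page tables through virt ... */
    /* Once dom0_construct() later switches onto dom0's page tables,
     * the slot backing virt is no longer the active one;
     * dereferencing it or calling unmap_domain_page(virt) then acts
     * on the wrong mapcache. So this must happen before the switch: */
    unmap_domain_page(virt);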
>          clear_page(l4tab);
> -        init_xen_l4_slots(l4tab, _mfn(virt_to_mfn(l4start)),
> -                          d, INVALID_MFN, true);
> -        v->arch.guest_table = pagetable_from_paddr(__pa(l4start));
> +        init_xen_l4_slots(l4tab, l4start_mfn, d, INVALID_MFN, true);
> +        v->arch.guest_table = pagetable_from_mfn(l4start_mfn);
>      }
>      else
>      {
>          /* Monitor table already created by switch_compat(). */
> -        l4start = l4tab = __va(pagetable_get_paddr(v->arch.guest_table));
> +        mfn_t l4start_mfn = pagetable_get_mfn(v->arch.guest_table);
> +        l4start = l4tab = map_domain_page(l4start_mfn);
>          /* See public/xen.h on why the following is needed. */
>          maddr_to_page(mpt_alloc)->u.inuse.type_info = PGT_l3_page_table;
> -        l3start = __va(mpt_alloc); mpt_alloc += PAGE_SIZE;
> +        UNMAP_MAP_AND_ADVANCE(l3start_mfn, l3start, mpt_alloc);
>      }
>
>      l4tab += l4_table_offset(v_start);
> @@ -743,15 +756,17 @@ static int __init dom0_construct(struct domain *d,
>      {
>          if ( !((unsigned long)l1tab & (PAGE_SIZE-1)) )
>          {
> +            mfn_t l1start_mfn;
>              maddr_to_page(mpt_alloc)->u.inuse.type_info = PGT_l1_page_table;
> -            l1start = l1tab = __va(mpt_alloc); mpt_alloc += PAGE_SIZE;
> +            l1tab = UNMAP_MAP_AND_ADVANCE(l1start_mfn, l1start, mpt_alloc);
>              clear_page(l1tab);
>              if ( count == 0 )
>                  l1tab += l1_table_offset(v_start);
>              if ( !((unsigned long)l2tab & (PAGE_SIZE-1)) )
>              {
> +                mfn_t l2start_mfn;
>                  maddr_to_page(mpt_alloc)->u.inuse.type_info =
>                      PGT_l2_page_table;
> -                l2start = l2tab = __va(mpt_alloc); mpt_alloc += PAGE_SIZE;
> +                l2tab = UNMAP_MAP_AND_ADVANCE(l2start_mfn, l2start,
> +                                              mpt_alloc);
>                  clear_page(l2tab);
>                  if ( count == 0 )
>                      l2tab += l2_table_offset(v_start);
> @@ -761,19 +776,19 @@ static int __init dom0_construct(struct domain *d,
>                      {
>                          maddr_to_page(mpt_alloc)->u.inuse.type_info =
>                              PGT_l3_page_table;
> -                        l3start = __va(mpt_alloc); mpt_alloc += PAGE_SIZE;
> +                        UNMAP_MAP_AND_ADVANCE(l3start_mfn, l3start,
> +                                              mpt_alloc);
>                      }
>                      l3tab = l3start;
>                      clear_page(l3tab);
>                      if ( count == 0 )
>                          l3tab += l3_table_offset(v_start);
> -                    *l4tab = l4e_from_paddr(__pa(l3start), L4_PROT);
> +                    *l4tab = l4e_from_mfn(l3start_mfn, L4_PROT);
>                      l4tab++;
>                  }
> -                *l3tab = l3e_from_paddr(__pa(l2start), L3_PROT);
> +                *l3tab = l3e_from_mfn(l2start_mfn, L3_PROT);
>                  l3tab++;
>              }
> -            *l2tab = l2e_from_paddr(__pa(l1start), L2_PROT);
> +            *l2tab = l2e_from_mfn(l1start_mfn, L2_PROT);
>              l2tab++;
>          }
>          if ( count < initrd_pfn || count >= initrd_pfn + PFN_UP(initrd_len) )
> @@ -792,27 +807,32 @@ static int __init dom0_construct(struct domain *d,
>
>      if ( compat )
>      {
> -        l2_pgentry_t *l2t;
> -
>          /* Ensure the first four L3 entries are all populated. */
>          for ( i = 0, l3tab = l3start; i < 4; ++i, ++l3tab )
>          {
>              if ( !l3e_get_intpte(*l3tab) )
>              {
> +                mfn_t l2start_mfn;
>                  maddr_to_page(mpt_alloc)->u.inuse.type_info =
>                      PGT_l2_page_table;
> -                l2tab = __va(mpt_alloc); mpt_alloc += PAGE_SIZE;
> -                clear_page(l2tab);
> -                *l3tab = l3e_from_paddr(__pa(l2tab), L3_PROT);
> +                UNMAP_MAP_AND_ADVANCE(l2start_mfn, l2start, mpt_alloc);
> +                clear_page(l2start);
> +                *l3tab = l3e_from_mfn(l2start_mfn, L3_PROT);
>              }
>              if ( i == 3 )
>                  l3e_get_page(*l3tab)->u.inuse.type_info |= PGT_pae_xen_l2;
>          }
>
> -        l2t = map_l2t_from_l3e(l3start[3]);
> -        init_xen_pae_l2_slots(l2t, d);
> -        unmap_domain_page(l2t);
> +        UNMAP_DOMAIN_PAGE(l2start);
> +        l2start = map_l2t_from_l3e(l3start[3]);
> +        init_xen_pae_l2_slots(l2start, d);
>      }
>
> +#undef UNMAP_MAP_AND_ADVANCE
> +
> +    UNMAP_DOMAIN_PAGE(l1start);
> +    UNMAP_DOMAIN_PAGE(l2start);
> +    UNMAP_DOMAIN_PAGE(l3start);

... l4start is not unmapped here. This is a problem, because we're
about to change the page tables into dom0's and start using its
mapcache. IMO, we should be unmapping here and remapping in dom0's
context; otherwise l4start becomes a transiently stale pointer. Any
remaining pointer obtained via map_domain_page() is a dangling pointer
after the mapcache+pagetable switch.
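Concretely, something like this (sketch) once we're done with the
tables in the idle context, extending the unmaps the patch already
adds:

    UNMAP_DOMAIN_PAGE(l1start);
    UNMAP_DOMAIN_PAGE(l2start);
    UNMAP_DOMAIN_PAGE(l3start);
    UNMAP_DOMAIN_PAGE(l4start); /* drop the idle-mapcache mapping too */

while keeping l4start_mfn around so the L4 can be re-mapped from dom0's
context wherever it's still needed.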
> +
>      /* Pages that are part of page tables must be read only. */
>      mark_pv_pt_pages_rdonly(d, l4start, vpt_start, nr_pt_pages,
>                              &flush_flags);
>
> @@ -987,6 +1007,8 @@ static int __init dom0_construct(struct domain *d,
>          pv_shim_setup_dom(d, l4start, v_start, vxenstore_start,
>                            vconsole_start, vphysmap_start, si);
>
> +    UNMAP_DOMAIN_PAGE(l4start);

As it is, this unmap is operating on the wrong mapcache, I think. I
don't quite understand why I see intermittent boot crashes rather than
constant ones, but this seems like a bug.

What we want, I think, is:

  1. Increase the scope of l4start_mfn to be function-level.
  2. Do UNMAP_DOMAIN_PAGE(l4start) along with l1start, l2start and
     l3start.
  3. Include a pair of map_domain_page() and UNMAP_DOMAIN_PAGE() within
     the conditional, surrounding pv_shim_setup_dom() (see the sketch
     after my signature).

> +
>  #ifdef CONFIG_COMPAT
>      if ( compat )
>          xlat_start_info(si, pv_shim ? XLAT_start_info_console_domU

I'll keep testing it in case I missed something, but this seems to work.

Cheers,
Alejandro
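P.S. For concreteness, point 3 is roughly the following (a sketch of
what I've been booting with; it assumes l4start_mfn was hoisted to
function scope as per point 1):

    if ( pv_shim )
    {
        /* Re-map the L4 from its MFN, this time in dom0's mapcache. */
        l4start = map_domain_page(l4start_mfn);
        pv_shim_setup_dom(d, l4start, v_start, vxenstore_start,
                          vconsole_start, vphysmap_start, si);
        UNMAP_DOMAIN_PAGE(l4start);
    }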