
Re: [PATCH for-4.14] x86/hap: use get_gfn_type in hap_update_paging_modes



On Wed, Jun 17, 2020 at 2:25 AM Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
>
> On Tue, Jun 16, 2020 at 12:31:06PM -0700, Tamas K Lengyel wrote:
> > While forking VMs running a small RTOS system (Zephyr), a Xen crash has been
> > observed due to a mm-lock order violation while copying the HVM CPU context
> > from the parent. This issue has been identified to be due to
> > hap_update_paging_modes getting a lock on the gfn using get_gfn. This call
> > also creates a shared entry in the fork's memory map for the cr3 gfn. The
> > function later calls hap_update_cr3 while holding the paging_lock, which
> > results in the lock-order violation in vmx_load_pdptrs when it tries to
> > unshare the above entry.
> >
> > This issue has not affected VMs running other OSes, as a call to
> > vmx_load_pdptrs is benign if PAE is not enabled or if EFER_LMA is set, and
> > it returns before trying to unshare and map the page.
> >
> > Using get_gfn_type to get a lock on the gfn avoids this problem, as we can
> > populate the fork's gfn with a copied page instead of a shared entry if
> > needed, thus avoiding the lock-order violation while holding the paging_lock.
> >
> > Signed-off-by: Tamas K Lengyel <tamas.lengyel@xxxxxxxxx>
> > ---
> > The bug seems to have been present since commit 4cb6c4f4941, only discovered
> > now due to the heavy use of mem_sharing with VM forks. As this is a simple
> > bug-fix it would be nice to include it in the 4.14 release.
>
> I agree it seems like a candidate bugfix to be included. I've added
> Paul to the Cc so he is aware.
>
> > ---
> >  xen/arch/x86/mm/hap/hap.c | 17 ++++++++++++-----
> >  1 file changed, 12 insertions(+), 5 deletions(-)
> >
> > diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
> > index 7f84d0c6ea..9ae4c3ae6e 100644
> > --- a/xen/arch/x86/mm/hap/hap.c
> > +++ b/xen/arch/x86/mm/hap/hap.c
> > @@ -748,12 +748,19 @@ static void hap_update_paging_modes(struct vcpu *v)
> >      struct domain *d = v->domain;
> >      unsigned long cr3_gfn = v->arch.hvm.guest_cr[3] >> PAGE_SHIFT;
> >      p2m_type_t t;
> > +    p2m_query_t q = P2M_ALLOC;
> >
> > -    /* We hold onto the cr3 as it may be modified later, and
> > -     * we need to respect lock ordering. No need for
> > -     * checks here as they are performed by vmx_load_pdptrs
> > -     * (the potential user of the cr3) */
> > -    (void)get_gfn(d, cr3_gfn, &t);
> > +    /*
> > +     * We hold onto the cr3 as it may be modified later, and
> > +     * we need to respect lock ordering. Unshare here if we have to, so as
> > +     * to avoid a lock-order violation later while we are holding the
> > +     * paging_lock. Further checks are performed by vmx_load_pdptrs (the
> > +     * potential user of the cr3).
> > +     */
> > +    if ( hvm_pae_enabled(v) && !hvm_long_mode_active(v) )
> > +        q |= P2M_UNSHARE;
> > +
> > +    (void)get_gfn_type(d, cr3_gfn, &t, q);
>
> While there I think you can drop the cast to void.

Sure.

>
> In order for this to be more resilient, maybe it would be better to
> just use get_gfn_unshare directly and avoid checking what paging mode
> the guest is currently using?
>
> Or would that be too expensive in terms of performance for the not
> affected case?

That's what I originally considered sending, but yes: in the fuzzing
case it would mean a full-page copy on every iteration, even in
unaffected cases, instead of a one-time shared-entry setup. That would
be a considerable waste.

>
> I feel like relying on the internals of vmx_load_pdptrs here is
> fragile.

I agree it's not ideal.

Tamas



 

