Re: [PATCH 04/12] libxenguest: avoid allocating unused deferred-pages bitmap
 On 25/06/2021 14:19, Jan Beulich wrote:
> Like for the dirty bitmap, it is unnecessary to allocate the deferred-
> pages bitmap when all that's ever going to happen is a single all-dirty
> run.
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> The clearing of the bitmap at the end of suspend_and_send_dirty() also
> looks unnecessary - am I overlooking anything?
Yes. Remus and COLO.  You don't want to accumulate successfully-sent
deferred pages over checkpoints, otherwise you'll eventually be sending
the entire VM every checkpoint.
Answering out of patch order...
> @@ -791,24 +797,31 @@ static int setup(struct xc_sr_context *c
>  {
>      xc_interface *xch = ctx->xch;
>      int rc;
> -    DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
> -                                    &ctx->save.dirty_bitmap_hbuf);
>  
>      rc = ctx->save.ops.setup(ctx);
>      if ( rc )
>          goto err;
>  
> -    dirty_bitmap = ctx->save.live || ctx->stream_type != XC_STREAM_PLAIN
> -        ? xc_hypercall_buffer_alloc_pages(
> -              xch, dirty_bitmap, NRPAGES(bitmap_size(ctx->save.p2m_size)))
> -        : (void *)-1L;
> +    if ( ctx->save.live || ctx->stream_type != XC_STREAM_PLAIN )
> +    {
> +        DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
> +                                        &ctx->save.dirty_bitmap_hbuf);
> +
> +        dirty_bitmap =
> +            xc_hypercall_buffer_alloc_pages(
> +                xch, dirty_bitmap, NRPAGES(bitmap_size(ctx->save.p2m_size)));
> +        ctx->save.deferred_pages = bitmap_alloc(ctx->save.p2m_size);
> +
> +        if ( !dirty_bitmap || !ctx->save.deferred_pages )
> +            goto enomem;
> +    }
So this is better than the previous patch.  At least we've got a clean
NULL pointer now.
I could in principle get on board with the optimisation, except it's not
safe (see below).
> --- a/tools/libs/guest/xg_sr_save.c
> +++ b/tools/libs/guest/xg_sr_save.c
> @@ -130,7 +130,7 @@ static int write_batch(struct xc_sr_cont
>          types[i] = mfns[i] = ctx->save.ops.pfn_to_gfn(ctx,
>                                                        ctx->save.batch_pfns[i]);
>  
>          /* Likely a ballooned page. */
> -        if ( mfns[i] == INVALID_MFN )
> +        if ( mfns[i] == INVALID_MFN && ctx->save.deferred_pages )
>          {
>              set_bit(ctx->save.batch_pfns[i], ctx->save.deferred_pages);
>              ++ctx->save.nr_deferred_pages;
> @@ -196,8 +196,12 @@ static int write_batch(struct xc_sr_cont
>              {
>                  if ( rc == -1 && errno == EAGAIN )
>                  {
> -                    set_bit(ctx->save.batch_pfns[i], ctx->save.deferred_pages);
> -                    ++ctx->save.nr_deferred_pages;
> +                    if ( ctx->save.deferred_pages )
> +                    {
> +                        set_bit(ctx->save.batch_pfns[i],
> +                                ctx->save.deferred_pages);
> +                        ++ctx->save.nr_deferred_pages;
> +                    }
These two blocks are the only two which modify deferred_pages.
It occurs to me that this means deferred_pages is PV-only, because of
the stub implementations of x86_hvm_pfn_to_gfn() and
x86_hvm_normalise_page().  Furthermore, this is likely to be true for
any HVM-like domains even on other architectures.
If these instead were hard errors when !deferred_pages, then that would
at least get the logic into an acceptable state.
However, the first hunk demonstrates that deferred_pages gets used even
in the non-live case.  In particular, it is sensitive to errors in the
guest's handling of its own P2M.  Also, I can't spot anything which will
correctly fail the migration if deferred pages survive the final
iteration.
~Andrew
 
 