
Re: [Xen-devel] [PATCH v3 07/10] xen/arm: Add handling write fault for dirty-page tracing



> -----Original Message-----
> From: xen-devel-bounces@xxxxxxxxxxxxx [mailto:xen-devel-
> bounces@xxxxxxxxxxxxx] On Behalf Of Ian Campbell
> Sent: Sunday, August 18, 2013 7:16 AM
> To: Jaeyong Yoo
> Cc: 'Stefano Stabellini'; xen-devel@xxxxxxxxxxxxx
> Subject: Re: [Xen-devel] [PATCH v3 07/10] xen/arm: Add handling write
> fault for dirty-page tracing
> 
> On Thu, 2013-08-15 at 13:24 +0900, Jaeyong Yoo wrote:
> 
> > > Why don't we just context switch the slots for now, only for domains
> > > where log dirty is enabled, and then we can measure and see how bad it
> is etc.
> >
> >
> > Here goes the measurement results:
> 
> Wow, that was quick, thanks.

Your explanation with ascii art does help a lot. Thanks again!

> 
> > For a better understanding of the trade-off between vlpt and page-table
> > walk in dirty-page handling, let's consider the following two cases:
> >  - Migrating a single domain at a time:
> >  - Migrating multiple domains concurrently:
> >
> > For each case, the metrics that we are going to see is the following:
> >  - page-table walk overhead: for handling a single dirty page, a
> >     page-table walk requires 6us and vlpt (improved version) requires
> >     1.5us. From this, we consider 4.5us of pure overhead compared to
> >     vlpt, and it is incurred for every dirty page.
> 
> map_domain_page uses a hash table structure in which the PTE entries are
> reference counted; however, we don't clear the PTE when the refcount
> reaches 0, so if we immediately use it again we don't need to flush. But
> we may need to flush if there is a hash table collision. So in practice
> there will be a bit more overhead; I'm not sure how significant that will
> be. I suppose the chance of collision depends on the size of the guest.

Yes, right. The overhead of unmap_domain_page may be under-estimated.
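To make the collision/flush behaviour concrete, here is a toy model (NOT Xen code; the slot count, hash function, and names are invented for illustration) of a reference-counted map cache where the PTE is left in place at refcount 0 and a flush is only needed when a different page hashes to an already-used slot:

```python
# Toy model of a map_domain_page-style cache (not Xen code: the slot
# count, hashing, and names here are invented for illustration).
NR_SLOTS = 8

class MapCache:
    def __init__(self):
        self.slot_mfn = [None] * NR_SLOTS  # which mfn each slot's PTE maps
        self.refcnt = [0] * NR_SLOTS
        self.flushes = 0                   # TLB flushes we had to issue

    def map(self, mfn):
        slot = mfn % NR_SLOTS              # toy hash: mfn -> slot
        if self.slot_mfn[slot] not in (None, mfn):
            # Hash collision with a stale entry: repoint the PTE and
            # flush the old translation from the TLB.
            self.flushes += 1
        self.slot_mfn[slot] = mfn
        self.refcnt[slot] += 1
        return slot

    def unmap(self, slot):
        # The PTE is deliberately left in place at refcount 0, so an
        # immediate re-map of the same mfn needs no flush.
        self.refcnt[slot] -= 1

c = MapCache()
s = c.map(42)
c.unmap(s)
c.map(42)       # same mfn again: PTE still valid, no flush needed
print(c.flushes)  # 0
c.map(50)       # 50 % 8 == 42 % 8: collision forces a flush
print(c.flushes)  # 1
```

This is only meant to show why reuse of the same page is cheap while collisions add flush cost, which is the "bit more overhead" mentioned above.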

> 
> >  - vlpt overhead: the only vlpt overhead is the flushes at context
> >     switch. Flushing a 34MB virtual address range (which is needed to
> >     support a 16GB domU) requires 130us, and it happens whenever two
> >     migrating domUs are context switched.
> >
> > Here goes the results:
> >
> >  - Migrating a domain at a time:
> >     * page-table walk overhead: 4.5us * 611 times = 2.7ms
> >     * vlpt overhead: 0 (no flush required)
> >
> >  - Migrating two domains concurrently:
> >     * page-table walk overhead: 4.5us * 8653 times = 39 ms
> >     * vlpt overhead: 130us * 357 times = 46 ms
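As a sanity check on the figures above, the arithmetic can be sketched as a quick back-of-envelope script (the constants are the measured numbers from this thread; nothing else is assumed):

```python
# Back-of-envelope check of the overhead figures quoted above.
# All constants come from the measurements in this thread.
WALK_OVERHEAD_US = 4.5     # extra cost per dirty page: 6us walk - 1.5us vlpt
FLUSH_OVERHEAD_US = 130.0  # flushing the 34MB vlpt range at context switch

def walk_total_ms(dirty_pages):
    """Total page-table-walk overhead for a migration run, in ms."""
    return dirty_pages * WALK_OVERHEAD_US / 1000.0

def vlpt_total_ms(context_switches):
    """Total vlpt flush overhead for a migration run, in ms."""
    return context_switches * FLUSH_OVERHEAD_US / 1000.0

# Single domain: 611 dirty pages, no flushes needed.
print(walk_total_ms(611))   # ~2.7 ms
print(vlpt_total_ms(0))     # 0.0 ms

# Two domains concurrently: 8653 dirty pages vs 357 flushes.
print(walk_total_ms(8653))  # ~39 ms
print(vlpt_total_ms(357))   # ~46 ms
```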
> 
> The 611, 8653 and 357's in here are from an actual test, right?
> 
> Out of interest what was the total time for each case?
> 
> > Although the page-table walk gives slightly better performance when
> > migrating two domains, I think it is better to choose vlpt for the
> > following reasons:
> >  - In the above tests, I did not run any workload in the migrating
> >     domU, and IIRC, when I run gzip or bonnie++ in the domU, the
> >     number of dirty pages grows to a few thousand. Then the page-table
> >     walk overhead becomes a few hundred milliseconds even when
> >     migrating a single domain.
> >  - I would expect that migrating a single domain would be used more
> >     frequently than migrating multiple domains at a time.
> 
> Both of those seem like sound arguments to me.
> 
> > One more thing: regarding your comments about tlb lockdown, which is:
> > > It occurs to me now that with 16 slots changing on context switch
> > > and a further 16 aliasing them (and hence requiring maintenance too)
> > > for the super pages it is possible that the TLB maintenance at
> > > context switch might get prohibitively expensive. We could address
> > > this by firstly only doing it when switching to/from domains which
> > > have log dirty mode enabled and then secondly by seeing if we can
> > > make use of global or locked down mappings for the static Xen
> > > .text/.data/.xenheap mappings and therefore allow us to use a bigger
> global flush.
> >
> > Unfortunately, the Cortex-A15 does not appear to support TLB lockdown:
> > http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0438d/CHDGEDAE.html
> 
> Oh well.
> 
> > And I am not sure whether marking a page table entry as global
> > prevents it from being invalidated by a TLB flush operation.
> > If it does, we could decrease the vlpt overhead a lot.
> 
> yes, this is something to investigate, but not urgently I don't think.

Got it. Making it absolutely stable is more important, I think.

> 
> Ian.
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel

