Re: [Xen-devel] [PATCH v3 07/10] xen/arm: Add handling write fault for dirty-page tracing

On Thu, 2013-08-15 at 13:24 +0900, Jaeyong Yoo wrote:

> > Why don't we just context switch the slots for now, only for domains where
> > log dirty is enabled, and then we can measure and see how bad it is etc.
> Here are the measurement results:

Wow, that was quick, thanks.

> To better understand the trade-off between vlpt and page-table
> walk in dirty-page handling, let's consider the following two cases:
>  - Migrating a single domain at a time:
>  - Migrating multiple domains concurrently:
> For each case, the metrics we are going to look at are the following:
>  - page-table walk overhead: to handle a single dirty page, the
>     page-table walk requires 6us and vlpt (improved version) requires
>     1.5us. From this, we count 4.5us as pure overhead compared to vlpt,
>     and it is incurred on every dirty page.

map_domain_page has a hash table structure in which the PTE entries
are reference counted, however we don't clear the PTE when the ref
reaches 0 so if we immediately use it again we don't need to flush. But
we may need to flush if there is a hash table collision. So in practice
there will be a bit more overhead, I'm not sure how significant that
will be. I suppose the chance of collision depends on the size of the
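
To make the reuse-versus-collision point concrete, here is a toy model of a reference-counted slot cache. It is only a sketch and not Xen's map_domain_page: the names SlotCache, map_page, unmap_page and NUM_SLOTS are hypothetical, the hash is a plain modulo, and a "flush" is just a counter. It assumes, as described above, that a slot is left installed when its count drops to 0 and that a flush is needed only when a slot holding a different frame is taken over.

```python
# Toy model only; none of these names come from Xen.
NUM_SLOTS = 4  # deliberately tiny so collisions are easy to provoke


class SlotCache:
    def __init__(self, num_slots=NUM_SLOTS):
        self.frames = [None] * num_slots  # frame installed in each slot
        self.refs = [0] * num_slots       # reference count per slot
        self.flushes = 0                  # flushes we would have issued

    def map_page(self, frame):
        n = len(self.frames)
        start = frame % n  # hash the frame number to a starting slot
        # Pass 1: reuse a slot that still holds this frame, even at ref 0.
        for i in range(n):
            slot = (start + i) % n
            if self.frames[slot] == frame:
                self.refs[slot] += 1
                return slot
        # Pass 2: take over the first unreferenced slot in probe order.
        for i in range(n):
            slot = (start + i) % n
            if self.refs[slot] == 0:
                if self.frames[slot] is not None:
                    self.flushes += 1  # stale mapping replaced: flush needed
                self.frames[slot] = frame
                self.refs[slot] = 1
                return slot
        raise RuntimeError("all slots are in use")

    def unmap_page(self, slot):
        assert self.refs[slot] > 0
        self.refs[slot] -= 1  # the mapping itself is left in place


cache = SlotCache()
first = cache.map_page(5)
cache.unmap_page(first)
again = cache.map_page(5)       # same frame: slot reused, no flush
cache.unmap_page(again)
collided = cache.map_page(9)    # 9 hashes to the same slot as 5
print(first, again, collided, cache.flushes)
```

In this model, remapping the same frame straight away costs nothing extra, while a different frame that hashes to the same slot triggers the one flush.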

>  - vlpt overhead: the only vlpt overhead is the flush at context
>     switch. Flushing the 34MB virtual address range (needed to support
>     a 16GB domU) requires 130us, and it happens when two migrating
>     domUs are context switched.
> Here are the results:
>  - Migrating a domain at a time:
>     * page-table walk overhead: 4.5us * 611 times = 2.7ms
>     * vlpt overhead: 0 (no flush required)
>  - Migrating two domains concurrently:
>     * page-table walk overhead: 4.5us * 8653 times = 39 ms
>     * vlpt overhead: 130us * 357 times = 46 ms
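
For reference, the totals quoted above can be reproduced with a few lines of Python. This is just the arithmetic from the quoted figures; the names WALK_EXTRA_US, FLUSH_US and overhead_ms are made up for the sketch, and the event counts are taken as given.

```python
# Per-event costs as quoted above (microseconds).
WALK_EXTRA_US = 4.5  # 6us page-table walk minus 1.5us vlpt, per dirty page
FLUSH_US = 130.0     # one flush of the vlpt range at context switch


def overhead_ms(cost_us, count):
    """Total overhead in milliseconds for `count` events."""
    return cost_us * count / 1000.0


single_walk_ms = overhead_ms(WALK_EXTRA_US, 611)   # one domain
dual_walk_ms = overhead_ms(WALK_EXTRA_US, 8653)    # two domains
dual_vlpt_ms = overhead_ms(FLUSH_US, 357)          # two domains

print(f"single, walk: {single_walk_ms:.1f} ms")
print(f"dual, walk:   {dual_walk_ms:.1f} ms")
print(f"dual, vlpt:   {dual_vlpt_ms:.1f} ms")
```

Unrounded, the three figures come out at about 2.7 ms, 38.9 ms and 46.4 ms, so the gap in the two-domain case is roughly 7.5 ms.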

The 611, 8653 and 357's in here are from an actual test, right?

Out of interest what was the total time for each case?

> Although the page-table walk gives slightly better performance when
> migrating two domains, I think it is better to choose vlpt for
> the following reasons:
>  - In the above tests, I did not run any workload in the migrating domU,
>     and IIRC, when I run gzip or bonnie++ in the domU, the dirty pages grow
>     to a few thousand. Then the page-table walk overhead becomes a few
>     hundred milliseconds even when migrating a single domain.
>  - I would expect that migrating a single domain would be used more
>     frequently than migrating multiple domains at a time.

Both of those seem like sound arguments to me.
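
One way to look at the trade-off is as a break-even ratio: vlpt wins once there are enough dirty-page faults per context-switch flush to pay for the flush. The sketch below uses only the 4.5us and 130us figures quoted earlier and assumes the fault and flush counts come from the same run; the function names are hypothetical.

```python
WALK_EXTRA_US = 4.5  # extra cost per dirty-page fault with the page-table walk
FLUSH_US = 130.0     # cost of one vlpt flush at context switch


def break_even_faults_per_flush():
    """Faults per flush above which vlpt is the cheaper scheme."""
    return FLUSH_US / WALK_EXTRA_US


def cheaper_scheme(faults, flushes):
    """Compare total overhead of the two schemes for one run."""
    walk_us = faults * WALK_EXTRA_US
    vlpt_us = flushes * FLUSH_US
    return "vlpt" if vlpt_us < walk_us else "page-table walk"


threshold = break_even_faults_per_flush()
measured = 8653 / 357  # faults per flush in the two-domain test

print(f"break-even: {threshold:.1f} faults per flush")
print(f"measured:   {measured:.1f} faults per flush")
print(cheaper_scheme(8653, 357))
print(cheaper_scheme(611, 0))
```

With these numbers the threshold is about 29 faults per flush and the idle two-domain test sits at about 24, which is why the walk comes out slightly ahead there. A guest that dirties more pages between context switches would push the ratio over the threshold.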

> One more thing: regarding your comments about tlb lockdown, which is:
> > It occurs to me now that with 16 slots changing on context switch and 
> > a further 16 aliasing them (and hence requiring maintenance too) for 
> > the super pages it is possible that the TLB maintenance at context 
> > switch might get prohibitively expensive. We could address this by 
> > firstly only doing it when switching to/from domains which have log 
> > dirty mode enabled and then secondly by seeing if we can make use of 
> > global or locked down mappings for the static Xen .text/.data/.xenheap 
> > mappings and therefore allow us to use a bigger global flush.
> Unfortunately, the Cortex-A15 does not appear to support TLB lockdown.
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0438d/CHDGEDAE.html

Oh well.

> Also, I am not sure whether marking a page table entry as global
> prevents it from being flushed by a TLB flush operation.
> If it does, we may be able to decrease the vlpt overhead a lot.

Yes, this is something to investigate, but not urgently, I don't think.

