
Re: [Xen-devel] Possible Xen grant table locking improvements

On Mon, Oct 20, 2014 at 06:35:47PM +0100, David Vrabel wrote:
> A while ago Matt Wilson posted a patch refactoring some of the grant
> table locking[1].  We (XenServer) have found they significantly improve
> performance.

Fantastic! Thank you for having a look at this work. We definitely
intended to get back to splitting the patch up for the 4.5 cycle, but
other projects were competing for the time. So at this point we were
expecting to come back to this when 4.6 development starts in earnest.

> These patches split the single grant table lock into three.
> 1. The grant table lock protecting grant table resizes.
> 2. The maptrack lock, protecting the domain's maptrack table.
> 3. Per-active entry locks.
> But they still have a performance bottleneck, which we believe is the
> maptrack lock.

I vaguely remember this being a problem as well. Unfortunately it's
been about a year since I was deeply in the code.

> ### Solution
> This is a proposal for removing this bottleneck.  This proposal has a
> number of drawbacks (see the end), which I would appreciate feedback on.
> Most guests do not map a grant reference more than twice (Linux, for
> example, will typically map a gref once in the kernel address space, or
> twice if a userspace mapping is required).  The maptrack entries for
> these two mappings can be stored in the active entry (the "fast"
> entries).  If more than two mappings are required, the existing maptrack
> table can be used (the "slow" entries).
> A maptrack handle for a "fast" entry is encoded as:
>     31 30          16  15            0
>   +---+---------------+---------------+
>   | F | domid         | gref          |
>   +---+---------------+---------------+
> F is set for a "fast" entry, and clear for a "slow" one. Grant
> references above 2^16 will have to be tracked with "slow" entries.
> We can omit taking the grant table lock to check the validity of a grant
> ref or maptrack handle since these tables only grow and do not shrink.
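
For concreteness, here is how I read the proposed handle encoding -- a
sketch with helper names of my own choosing, not code from the patches:

```c
#include <stdint.h>

#define MAPTRACK_FAST_FLAG   (1u << 31)   /* F bit: set for "fast" entries */
#define MAPTRACK_DOMID_SHIFT 16
#define MAPTRACK_DOMID_MASK  0x7fffu      /* bits 30..16 per the diagram */
#define MAPTRACK_GREF_MASK   0xffffu      /* bits 15..0: gref must be < 2^16 */

/* Build a "fast" handle from a (domid, gref) pair. */
static inline uint32_t fast_handle(uint16_t domid, uint32_t gref)
{
    return MAPTRACK_FAST_FLAG |
           ((uint32_t)(domid & MAPTRACK_DOMID_MASK) << MAPTRACK_DOMID_SHIFT) |
           (gref & MAPTRACK_GREF_MASK);
}

static inline int handle_is_fast(uint32_t handle)
{
    return !!(handle & MAPTRACK_FAST_FLAG);
}

static inline uint16_t fast_handle_domid(uint32_t handle)
{
    return (handle >> MAPTRACK_DOMID_SHIFT) & MAPTRACK_DOMID_MASK;
}

static inline uint32_t fast_handle_gref(uint32_t handle)
{
    return handle & MAPTRACK_GREF_MASK;
}
```

One consequence of the 15-bit domid field is worth noting explicitly,
if I'm reading the diagram right.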
> If strict IOMMU mode is used, IOMMU mappings are updated on every grant
> map/unmap.  These are currently set up such that BFN == MFN, which
> requires reference counting the IOMMU mappings so they are only torn
> down when all grefs for that MFN are unmapped.  This requires an
> expensive mapcount() operation that iterates over the whole maptrack table.
> There is no requirement for BFN == MFN so each grant map can create its
> own IOMMU mapping.  This will require a region of bus address space that
> does not overlap with RAM.
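
If I follow, once each grant map gets its own BFN there is nothing to
reference count, so mapcount() goes away entirely. The allocator for
that non-RAM bus region could be something as simple as the following
-- entirely my own illustration, with invented names and sizes:

```c
#include <stdint.h>

#define BFN_REGION_BASE  0x100000000ull  /* hypothetical bus base above RAM */
#define BFN_REGION_PAGES 1024            /* pages in the example region */

static uint8_t bfn_bitmap[BFN_REGION_PAGES / 8];

/* Allocate one bus frame from the region; returns 0 on exhaustion. */
static uint64_t bfn_alloc(void)
{
    for (unsigned i = 0; i < BFN_REGION_PAGES; i++) {
        if (!(bfn_bitmap[i / 8] & (1u << (i % 8)))) {
            bfn_bitmap[i / 8] |= 1u << (i % 8);
            return BFN_REGION_BASE + i;
        }
    }
    return 0;
}

/* Release a bus frame previously returned by bfn_alloc(). */
static void bfn_free(uint64_t bfn)
{
    unsigned i = (unsigned)(bfn - BFN_REGION_BASE);
    bfn_bitmap[i / 8] &= ~(1u << (i % 8));
}
```

A real implementation would of course need to be lock-free or
finer-grained than a linear bitmap scan to avoid reintroducing the
contention this proposal is trying to remove.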
> ### Possible Problems
> 1. The "fast" maptrack entries cause a problem when destroying domains.
> A domain being destroyed needs to tear down all active grant maps it
> has.  It currently does this with an O(M) operation -- iterating over
> all the maptrack entries.
> With the "fast" maptrack entries being stored in the remote domain's
> active grant entry tables, a walk of all maptrack entries must iterate
> over /every/ domain's active grant entry tables /and/ the local maptrack
> table.  This is O(D * G + M), which is maybe too expensive?
> (D = number of domains, G = mean grant table size, M = number of local
> maptrack entries).

Hmmmm, that doesn't sound ideal. I suppose we would also need to walk
all of the active grant entry tables under a lock, which might make
the overall runtime of this operation unpredictable under various loads.
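
To make sure I've understood the shape of the walk, it would be roughly
(pseudocode, with invented names rather than actual Xen interfaces):

```c
/* Rough pseudocode -- invented names, not real Xen code. */
void teardown_grant_maps(struct domain *d)
{
    struct domain *rd;

    /* O(D * G): scan every remote domain's active entry table for
     * "fast" entries recording d as the mapping domain, taking each
     * per-active-entry lock as we go. */
    for_each_domain ( rd )
        for ( gref = 0; gref < nr_active_entries(rd); gref++ )
            if ( active_entry_maps_for(rd, gref, d) )
                unmap_fast_entry(rd, gref, d);

    /* O(M): the local "slow" maptrack table, as today. */
    for ( handle = 0; handle < d->maptrack_limit; handle++ )
        if ( maptrack_entry_in_use(d, handle) )
            unmap_slow_entry(d, handle);
}
```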

> 2. The "fast" maptrack entries mean we cannot[2] limit the total number
> of grant maps a domain can make.
> However, I view this as a positive since it removes a hard scalability
> limit (the maptrack table limit).

I'm a little less worried about this point.

It's probably time for me to revisit the proposed patches so that I
can provide more comprehensive feedback on this approach.


> David
> [1] http://lists.xen.org/archives/html/xen-devel/2013-11/msg01517.html
> [2] We could, but that would require synchronizing access to a
> per-domain map count, which I'm trying to avoid.

Xen-devel mailing list


