
Re: [Xen-devel] Possible Xen grant table locking improvements


At 18:35 +0100 on 20 Oct (1413826547), David Vrabel wrote:
> Most guests do not map a grant reference more than twice (Linux, for
> example, will typically map a gref once in the kernel address space, or
> twice if a userspace mapping is required).  The maptrack entries for
> these two mappings can be stored in the active entry (the "fast"
> entries).  If more than two mappings are required, the existing maptrack
> table can be used (the "slow" entries).

Sounds good, as long as the hit rate is indeed high.  Do you know if
the BSD/windows client code behaves this way too?

> A maptrack handle for a "fast" entry is encoded as:
>     31 30          16  15            0
>   +---+---------------+---------------+
>   | F | domid         | gref          |
>   +---+---------------+---------------+
> F is set for a "fast" entry, and clear for a "slow" one. Grant
> references above 2^16 will have to be tracked with "slow" entries.

How restrictive is that limit?  Would halving it to 2^15 grefs, and
using the freed bit to encode which of the two fast entries to look
at, be good?
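To make that concrete, here's a rough sketch of what I mean (names and
helpers are purely illustrative, not Xen's): F in bit 31, a 15-bit
domid, one slot-select bit, and 15 bits of gref.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MT_FAST_BIT (1u << 31)   /* F: handle refers to a "fast" entry */
#define MT_SLOT_BIT (1u << 15)   /* which of the two fast slots to use */

/* Hypothetical encoder: F | 15-bit domid | 1-bit slot | 15-bit gref. */
static inline uint32_t mt_fast_handle(uint16_t domid, unsigned int slot,
                                      uint32_t gref)
{
    return MT_FAST_BIT | ((uint32_t)(domid & 0x7fff) << 16) |
           (slot ? MT_SLOT_BIT : 0) | (gref & 0x7fff);
}

static inline bool mt_is_fast(uint32_t h)      { return h & MT_FAST_BIT; }
static inline unsigned int mt_slot(uint32_t h) { return !!(h & MT_SLOT_BIT); }
static inline uint16_t mt_domid(uint32_t h)    { return (h >> 16) & 0x7fff; }
static inline uint32_t mt_gref(uint32_t h)     { return h & 0x7fff; }
```

That keeps the fast/slow distinction in one bit and avoids having to
search both slots on unmap.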

> We can omit taking the grant table lock to check the validity of a grant
> ref or maptrack handle since these tables only grow and do not shrink.

Can you also avoid the lock for accessing the entry itself, with a bit
of RCU magic?  Maybe that's overengineering things.
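FWIW, the grow-only property already buys the lockless validity check
with just an acquire/release pair; a sketch of what I have in mind
(illustrative names, C11 atomics standing in for Xen's primitives):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative grant table: only the entry count matters here. */
struct gtab {
    _Atomic uint32_t nr_entries;
};

/*
 * Reader: no lock needed.  Because the table only ever grows, an
 * acquire-load of the count is enough -- any gref below it refers to
 * an entry that has been fully initialised.
 */
static bool gref_is_valid(struct gtab *gt, uint32_t gref)
{
    return gref < atomic_load_explicit(&gt->nr_entries,
                                       memory_order_acquire);
}

/*
 * Grower: initialise the new entries first, then publish them by
 * bumping the count with a release store.
 */
static void gtab_grow(struct gtab *gt, uint32_t new_nr)
{
    /* ... initialise entries [nr_entries, new_nr) here ... */
    atomic_store_explicit(&gt->nr_entries, new_nr,
                          memory_order_release);
}
```

Accessing the entry contents without the lock is the part that would
need the RCU-style care, since concurrent map/unmap can mutate them.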

> If strict IOMMU mode is used, IOMMU mappings are updated on every grant
> map/unmap.  These are currently set up such that BFN == MFN, which
> requires reference counting the IOMMU mappings so they are only torn
> down when all grefs for that MFN are unmapped.  This requires an
> expensive mapcount() operation that iterates over the whole maptrack table.
> There is no requirement for BFN == MFN so each grant map can create its
> own IOMMU mapping.  This will require a region of bus address space that
> does not overlap with RAM.

Hrmn.  That could be tricky to arrange.  And the reference counting
might end up being cheaper than the extra IOMMU flush operations.
(Also, how much would you bet that clients actually use the returned
BFN correctly?)

Would it be enough to optimise mapcount() a bit?  We could organise the
in-use maptrack entries as a hash table instead of (or as well as) a
single linked list.
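E.g. something like this (purely illustrative, not Xen's actual
structures): hash the in-use entries by the MFN they map, so
mapcount() walks one chain instead of the whole maptrack table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MT_HASH_BUCKETS 256

/* Illustrative maptrack entry: just the MFN and a hash chain link. */
struct mt_entry {
    uint64_t mfn;
    struct mt_entry *next;
};

static struct mt_entry *mt_hash[MT_HASH_BUCKETS];

static size_t mt_bucket(uint64_t mfn)
{
    return mfn % MT_HASH_BUCKETS;
}

static void mt_insert(struct mt_entry *e)
{
    size_t b = mt_bucket(e->mfn);

    e->next = mt_hash[b];
    mt_hash[b] = e;
}

/* O(chain length) rather than O(maptrack size). */
static unsigned int mapcount(uint64_t mfn)
{
    unsigned int n = 0;

    for ( struct mt_entry *e = mt_hash[mt_bucket(mfn)]; e; e = e->next )
        n += (e->mfn == mfn);
    return n;
}
```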

On similar lines, would it be worth fragmenting the maptrack itself
(e.g. with per-page locks) to reduce locking contention instead of
moving maptrack entries into the active entry?  It might be Good
Enough[tm], and simpler to build/maintain than this proposal.
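i.e. stripe the lock per page of entries, roughly like this (an
illustrative sketch with a toy test-and-set lock; sizes made up):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define MT_PER_PAGE 128   /* maptrack entries per page (illustrative) */
#define MT_PAGES     64

/* One tiny spinlock per maptrack page instead of one global lock. */
static _Atomic int mt_page_lock[MT_PAGES];

static size_t mt_lock_idx(uint32_t handle)
{
    return (handle / MT_PER_PAGE) % MT_PAGES;
}

static void mt_lock(uint32_t handle)
{
    while ( atomic_exchange_explicit(&mt_page_lock[mt_lock_idx(handle)],
                                     1, memory_order_acquire) )
        ;   /* spin: page already locked */
}

static void mt_unlock(uint32_t handle)
{
    atomic_store_explicit(&mt_page_lock[mt_lock_idx(handle)],
                          0, memory_order_release);
}
```

Handles on different pages then never contend, which may claw back
most of the win without relocating entries at all.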

> ### Possible Problems
> 1. The "fast" maptrack entries cause a problem when destroying domains.
> A domain being destroyed needs to tear down all active grant maps it has.
> It currently does this with a O(M) operation -- iterating over all the
> maptrack entries.
> With the "fast" maptrack entries being stored in the remote domain's
> active grant entry tables, a walk of all maptrack entries must iterate
> over /every/ domain's active grant entry tables /and/ the local
> maptrack table.  This is O(D * G + M), which is maybe too expensive?

It seems OK to me, since it's trivially interruptible for softirqs &c.
(with maybe a bit of extra plumbing around the domain hash table).

> (D = number of domains, G = mean grant table size, M = number of local
> maptrack entries).
> 2. The "fast" maptrack entries means we cannot[2] limit the total number
> of grant maps a domain can make.

Yeah, I think that's fine. :)



Xen-devel mailing list


