Re: Design session "grant v3"
On 22.09.2022 15:42, Marek Marczykowski-Górecki wrote:
> Jürgen: today two grant formats, v1 supports addresses only up to 16TB
> v2 solves 16TB issue, introduces several more features^Wbugs
> v2 is 16 bytes per entry, v1 is 8 bytes per entry, v2 more
> complicated interface to the hypervisor
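For reference, the size difference mentioned above comes from the width of the
frame field. A simplified sketch of the two entry layouts, modelled on Xen's
public grant_table.h (field names are abbreviated here, not a verbatim copy of
the header):

#include <stdint.h>

/* v1: 8 bytes per entry.  The 32-bit frame number limits grantable
 * memory to 2^32 frames * 4KiB = 16TiB, which is the limit referred
 * to above. */
struct grant_entry_v1 {
    uint16_t flags;   /* GTF_* permission/type bits */
    uint16_t domid;   /* domain allowed to map/transfer the page */
    uint32_t frame;   /* frame number being granted */
};

/* v2 "full page" variant: 16 bytes per entry, with a 64-bit frame
 * number, so no 16TiB limit - at the price of a separate shared
 * status array and a more complex hypervisor interface. */
struct grant_entry_v2_full_page {
    uint16_t flags;
    uint16_t domid;
    uint32_t pad;
    uint64_t frame;
};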
> virtio could use per-device grant table, currently virtio iommu
> device, slow interface
> v3 could be a grant tree (like iommu page tables), not a flat array,
> separate trees for each grantee
> could support sharing large pages too
> easier to have more grants, contiguous grant numbers etc
> two options to distinguish trees (from HV PoV):
> - sharing guest ensures distinct grant ids between (multiple) trees
> - hv tells guest the index under which the tree got registered
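Purely to visualise the idea being floated here (nothing below exists in Xen;
all names and the layout are made up for illustration), such a per-grantee tree
could look much like an IOMMU page table, with leaf entries carrying the frame,
access rights and an order field for large-page grants:

#include <stdint.h>

/* Hypothetical v3 leaf entry - one grant.  The order field would let
 * a single entry grant a large page (e.g. order 9 == 2MiB). */
struct grant_v3_leaf {
    uint64_t frame    : 40;  /* frame number being granted */
    uint64_t order    :  5;  /* grant covers 4KiB << order */
    uint64_t readonly :  1;
    uint64_t in_use   :  1;
    uint64_t reserved : 17;
};

/* Hypothetical inner node - indexed by a slice of the grant reference,
 * like a page table level; this is what would make contiguous grant
 * numbers and growing the number of grants cheap. */
struct grant_v3_node {
    uint64_t child_frame[512];  /* assuming 4KiB nodes of 8-byte entries */
};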
> v3 can be an addition to v1/v2, old ones used for simpler cases where a tree
> is overkill
> hypervisor needs extra memory to keep refcounts - resource allocation
> discussion
How would refcounts be different from today? Perhaps I don't have a clear
enough picture yet how you envision the tree-like structure(s) to be used.
> hv could have TLB to speed up mapping
> issue with v1/v2 - granter cannot revoke pages from uncooperative
> backend
> tree could have special page for revoking grants (redirect to that
> page)
> special domids, local to the guest, toolstack restarting backend could
> request to keep the same virtual domid
> Marek: that requires stateless (or recoverable) protocol, reusing domid
> currently causes issues
> Andrei: how revoking could work
> Jürgen: there needs to be hypercall, replacing and invalidating mapping (scan
> page tables?), possibly adjusting IOMMU etc; may fail, problematic for PV
Why would this be problematic for PV only? In principle any
number of mappings of a grant are possible also for PVH/HVM. So
all of them would need finding and replacing. Because of the
multiple mappings, the M2P is of no use here.
While thinking about this I started wondering to what extent things
are actually working correctly right now for backends in PVH/HVM:
Any mapping of a grant is handed to p2m_add_page(), which insists
on there being exactly one mapping of any particular MFN, unless
the page is a foreign one. But how does that allow a domain to
map its own grants, e.g. when block-attaching a device locally in
Dom0? Afaict the grant-map would succeed, but the page would be
unmapped from its original GFN.
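To spell out the revoke flow sketched above: everything below is hypothetical
(the helper names do not exist in Xen and stand in for whatever reverse-mapping
information the hypervisor would need); as noted, without such information
every mapping of the MFN would have to be found by scanning, and any of the
steps may fail.

#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>

typedef uint32_t grant_ref_t;
typedef uint64_t mfn_t;                 /* illustrative placeholder types */

struct mapping { void *pte; };          /* location of one mapping to rewrite */

/* Assumed helpers, purely illustrative. */
extern size_t lookup_mappings(grant_ref_t ref, struct mapping *out, size_t max);
extern bool   replace_mapping(struct mapping *m, mfn_t scratch_mfn);
extern void   flush_tlb_and_iommu(void);

/* Hypothetical revoke: block new mappings of the grant, swap every
 * existing mapping (CPU and IOMMU) over to a scratch page, flush,
 * then the page is safe to hand back to the granter. */
int revoke_grant(grant_ref_t ref, mfn_t scratch_mfn)
{
    struct mapping maps[64];
    size_t n = lookup_mappings(ref, maps, 64);

    for (size_t i = 0; i < n; i++)
        if (!replace_mapping(&maps[i], scratch_mfn))
            return -1;                  /* may fail, e.g. for PV page tables */

    flush_tlb_and_iommu();
    return 0;
}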
> Yann: can backend refuse revoking?
> Jürgen: it shouldn't be this way, but revoke could be controlled by feature
> flag; revoke could pass scratch page per revoke call (more flexible control)
A single scratch page comes with the risk of data corruption, as all
I/O would be directed there. A sink page (for memory writes) would
likely be okay, but device writes (memory reads) can't be done from
a surrogate page.
> Marek: what about unmap notification?
> Jürgen: revoke could even be async; ring page for unmap notifications
>
> Marek: downgrading mappings (rw -> ro)
> Jürgen: must be careful not to allow crashing the backend
>
> Jürgen: we should consider interface to mapping large pages ("map this area
> as a large page if backend shared it as large page")
s/backend/frontend/ I guess?
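Just to make the interface idea concrete (purely hypothetical; neither the flag
nor the structure below exists), the map request could carry a "try large page"
hint, with the hypervisor free to fall back to 4KiB mappings if the area cannot
be mapped as a large page:

#include <stdint.h>

#define MAPFLAG_TRY_LARGE_PAGE  (1u << 0)   /* made-up flag */

/* Hypothetical map-grant request covering a whole area at once. */
struct map_grant_request {
    uint32_t first_ref;   /* first grant reference of the area */
    uint32_t count;       /* number of 4KiB frames to map */
    uint64_t guest_addr;  /* destination, suitably aligned for a large page */
    uint32_t flags;       /* e.g. MAPFLAG_TRY_LARGE_PAGE */
};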
> Edwin: what happens when shattering that large page?
> Jürgen: on live migration pages are rebuilt anyway, can reconstruct large
> pages
If only we did already rebuild large pages ...
Jan