Re: [Xen-devel] [PATCH 00/11] Alternate p2m: support multiple copies of host p2m
On 13/01/15 20:02, Ed White wrote:
> On 01/13/2015 11:01 AM, Andrew Cooper wrote:
>> On 09/01/15 21:26, Ed White wrote:
>>> This set of patches adds support to hvm domains for EPTP switching by
>>> creating multiple copies of the host p2m (currently limited to 10 copies).
>>>
>>> The primary use of this capability is expected to be in scenarios where
>>> access to memory needs to be monitored and/or restricted below the level
>>> at which the guest OS page tables operate. Two examples that were
>>> discussed at the 2014 Xen developer summit are:
>>>
>>> VM introspection:
>>> http://www.slideshare.net/xen_com_mgr/zero-footprint-guest-memory-introspection-from-xen
>>>
>>> Secure inter-VM communication:
>>> http://www.slideshare.net/xen_com_mgr/nakajima-nvf
>>>
>>> Each p2m copy is populated lazily on EPT violations, and only contains
>>> entries for ram p2m types. Permissions for pages in alternate p2ms can
>>> be changed in a similar way to the existing memory access interface,
>>> and gfn->mfn mappings can be changed.
>>>
>>> All this is done through extra HVMOP types.
>>>
>>> The cross-domain HVMOP code has been compile-tested only. Also, the
>>> cross-domain code is hypervisor-only; the toolstack has not been modified.
>>>
>>> The intra-domain code has been tested. Violation notifications can only
>>> be received for pages that have been modified (access permissions and/or
>>> gfn->mfn mapping) intra-domain, and only on VCPUs that have enabled
>>> notification.
>>>
>>> VMFUNC and #VE will both be emulated on hardware without native support.
>>>
>>> This code is not compatible with nested hvm functionality and will
>>> refuse to work with nested hvm active. It is also not compatible with
>>> migration. It should be considered experimental.
>> Having reviewed most of the series, I believe I now have a feeling for
>> what you are trying to achieve, but I would like to discuss some of the
>> design implications.
>>
>> The following is my understanding of the situation. Please correct me
>> if I have made a mistake.
>>
> Thanks for investing the time to do this. Maybe the first couple of days
> would have gone more smoothly if something like this had been in the
> cover letter.

No problem. (I tend to find that things like this save time in the long run.)

> With the exception of a couple of minor points, you are spot on.

Cool!

>> Currently, a domain has a single host p2m. This contains the guest
>> physical address mappings and a combination of p2m types which are used
>> by existing components to allow certain actions to happen. All vcpus
>> run with the same host p2m.
>>
>> A domain may have a number of nested p2ms (currently an arbitrary limit
>> of 10). These are used for nested-virt and are translated by the host
>> p2m. Vcpus in guest mode run under a nested p2m.
>>
>> This new altp2m infrastructure adds the ability to use a different set
>> of tables in place of the host p2m. This, in practice, allows for
>> different translations, different p2m types, and different access
>> permissions.
>>
>> One usecase of alternate p2ms is to provide introspection information to
>> out-of-guest entities (via the mem_event interface) or to in-guest
>> entities (via #VE).
>>
>> Now for some observations and assumptions.
>>
>> It occurs to me that the altp2m mechanism is generic. From the look of
>> the series, it is mostly implemented in a generic way, which is great.
>> The only Intel-specific bits appear to be the ept handling itself,
>> 'vmfunc' instruction support and #VE injection to in-guest entities.
>>
> That was my intention. I don't know enough about the state of AMD
> virtualization to know if it can support these patches by emulating
> vmfunc and #VE, but that was my target.

As far as I am aware, AMD SVM has no similar concept to vmfunc, nor #VE.
However, the same kinds of introspection are certainly possible by playing
with the read/write bits on the NPT tables and causing a vmexit.
>> I can't think of any reasonable case where the alternate p2m would want
>> mappings different to the host p2m. That is to say, an altp2m will map
>> the same set of mfns to make a guest physical address space, but may
>> differ in page permissions and possibly p2m types.
>>
> The set of mfns is the same, but I do allow gfn->mfn mappings to be
> modified under certain circumstances. One use of this is to point the
> same VA to different physical pages (with different access permissions)
> in different p2ms to hide memory changes.

What is the practical use of being able to play paging tricks like this
behind a VM's back?

>> Given the above restriction, I believe a lot of the existing features
>> can continue to work and coexist. For generating mem_events, the
>> permissions can be altered in the altp2m. For injecting #VE, the altp2m
>> type can change to the new p2m_ram_rw, so long as the host p2m type is
>> compatible. For both, a vmexit can occur. Xen can do the appropriate
>> action and also inject a #VE on its way back into the guest.
>>
>> One thing I have noticed while looking at the #VE stuff is that EPT also
>> supports A/D tracking, which might be quite a nice optimisation and
>> forgo the need for p2m_ram_logdirty, but I think this should be treated
>> as an orthogonal item.
>>
> This is far from my area of expertise, but I believe there is code in Xen
> to use EPT D bits in migration.

Not that I can spot, although I seem to remember some talk about it. All
logdirty code still appears to rely on the logdirty bitmap being filled,
which is done from vmexits for p2m_ram_logdirty regions.

~Andrew

> Ed
>
>> When shared ept/iommu is not in use, altp2m can safely be used by vcpus,
>> as this will not interfere with the IOMMU permissions.
>>
>> Furthermore, I can't conceptually think of an issue with the idea of
>> nestedp2m alternatives, following the same rule that the mapped mfns
>> match up. That should allow all existing nestedvirt infrastructure to
>> continue to work.
>>
>> Does the above look sensible, or have I overlooked something?
>>
>> ~Andrew
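[As an illustration of the gfn->mfn remapping trick Ed describes above —
pointing the same guest-physical address at different frames in different
views to hide memory changes — here is a minimal hypothetical sketch.
altp2m_change_gfn and view_t are made-up names, not the series' actual
interface.]

    /* Hypothetical sketch of hiding a modified page behind a view.
     * Two physical copies of the page exist:
     *  - orig_gfn:   the unmodified contents (what the guest should see)
     *  - shadow_gfn: a copy containing e.g. introspection hooks
     * Remapping orig_gfn in the alternate view makes vcpus running under
     * that view hit the shadow copy, while accesses through the host p2m
     * (or any other view) still hit the original, so in-guest integrity
     * checks do not notice the modification. */

    #include <stdint.h>

    typedef uint16_t view_t;

    /* Assumed wrapper around the series' gfn-remapping HVMOP (made up):
     * remap old_gfn in the given view to the frame backing new_gfn. */
    int altp2m_change_gfn(int domid, view_t view,
                          uint64_t old_gfn, uint64_t new_gfn);

    static int hide_page(int domid, view_t view,
                         uint64_t orig_gfn, uint64_t shadow_gfn)
    {
        return altp2m_change_gfn(domid, view, orig_gfn, shadow_gfn);
    }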