Xen project Mailing List

Re: [Xen-devel] [PATCH v4] altp2m: Allow the hostp2m entries to be of type p2m_ram_shared

To: George Dunlap <george.dunlap@xxxxxxxxxx>

From: Tamas K Lengyel <tamas@xxxxxxxxxxxxx>

Date: Tue, 19 Jul 2016 13:12:50 -0600

Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Ian Jackson <Ian.Jackson@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Tue, 19 Jul 2016 19:13:12 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On Tue, Jul 19, 2016 at 11:11 AM, George Dunlap <george.dunlap@xxxxxxxxxx> wrote: > On 18/07/16 18:06, Tamas K Lengyel wrote: >>> Incremental improvements are welcome; but they must not cause >>> regressions in existing functionality. >> >> Existing functionality does not get impaired by this as what happens >> right now is a hypervisor crash. I don't see how things can get any >> worst than that. > > Also from another thread: >> If anyone else would have been interested in getting the two systems >> working together othen then me they probably would have complained >> already that hey this crashes the hypervisor. My point being that at >> this point the impact of this patch is likely really low. > > From a user perspective, "failing intermittently in some strange and > unpredictable way" is definitely worse than a hypervisor crash. :-) > > My concern is that someone will start using guests which use the altp2m > interface internally, and that will all work; and then maybe separately > they will start doing some sort of memory sharing between guests, and > that will all work; and then at some point they'll do memory sharing on > a guest using the altp2m functionality internally, and suddenly they'll > get strange intermittent errors where things don't behave the way they > expect and they don't know why. A hypervisor crash that tells them > exactly what code has the problem is definitely preferable. Well, IMHO that's where documenting the expected use-case and the known corner-cases comes into play. > >>> The code as it is in the tree right now was intended to allow both >>> sharing and altp2m to be enabled on the same domain, just not over the >>> same gfn range. And it was intended to be robust -- that is, the >>> sharing code and the altp2m code don't need to be aware of each other >>> and try not to step on each other's toes; each can just do its own thing >>> and Xen will make sure that nothing bad happens (by preventing pages >>> with an altp2m entry from being shared, and unsharing pages for which an >>> altp2m entry is created). >>> >>> It sounds like that's broken right now; it should therefore be fixed. >>> When it is fixed, you'll be able to use both altp2m and sharing on the >>> same domain; Xen will simply prevent sharing from happening on gfn >>> ranges with altp2m entries. >> >> No, that's incorrect, it's the other way around. If you were to try to >> share pages for which you have altp2m entries it will happily oblige. >> It will just fail to do the altp2m actions for entries of shared >> entries (copying the mapping to the altp2m view, mem_access, etc). > > It's quite possible I missed something, but that's not how I read the > code. Before sharing a page you have to have to call > mem_sharing_nominate_page(), which calls page_make_sharable(). > page_make_sharable() will make sure that it has exactly the expected > number of references; which for gfns is 2 and for grant references is 4. > > When you map an mfn into an altp2m of a different gfn, you'll increase > the reference count. So it appears to me that if you attempt to share a > page which is mapped in an altp2m, then the nominate operation will fail > (with -E2BIG, of all things). > > Am I mistaken about that? Hm, no you may be right. I was thinking of the type checking only. If the reference count prevents pages with alt2pm entries from being shared - going from p2m_ram_rw -> p2m_ram_shared - then from my perspective that is fine and I'm not planning on changing that. What I'm trying to get working is if the type is already p2m_ram_shared and is going from p2m_ram_shared -> p2m_ram_rw. I would also like to be able to do mem_access settings for the p2m_ram_shared type pte in an altp2m view. > > (BTB this would probably still be the case even after your patch.) > > Also, as far as I can tell, "It will just fail to do the altp2m actions > for entries of shared entries" is not true; instead, the page will be > un-shared and the altp2m action will then take place. Is this not the case? So right now when the entry is p2m_ram_shared it will crash the hypervisor because of the lock ordering issue during unsharing. If the lock ordering issue is fixed, the unsharing event will result in the altp2m propagate change taking the p2m setting from the hostp2m and copying it to all affected altp2m views, overwriting any setting that may have been stored there. This is the situation that can be monitored with mem_access so that the user can perform the unsharing and recreating the necessary altp2m settings manually. What I mean in the quoted sentence is that the altp2m ops do a type-check right now, so if you shared a page before, the type check will make all altp2m ops fail on that entry. So for example if you have a shared pte, and then try to do altp2m mem_access setting on it, it will fail. > >>> An even bigger improvement would be to allow the same gfns to be subject >>> both to altp2m and sharing at the same time. But this requires thinking >>> carefully about all the corner cases and making sure that they all work >>> correctly. >> >> And this is exactly what this patch allows you to do. An entry can now >> be both shared, get properly copied to altp2m views, allow setting >> mem_access in altp2m views, etc. The only situation you have to take >> core of is when the type of the entry changes from shared to unshared >> as that resets the altp2m views. > > I described another situation you have to be careful of in an earlier > e-mail: > - host gfn A is marked "shared" > - altp2m gfn O is mapped to gfn A (thus also marked as 'shared') > - Guest writes to gfn O, Xen attempts to unshare the page. > > In this circumstance, the fault will ends up in > __mem_sharing_unshare_page(), which will calls rmap_retrieve(d, O, mA). > This returns NULL because gfn O was never put in the reverse map, and > you BUG(). > > Again, am I misreading what would happen? Yes, remapping gfns to point to other shared entries can get you in trouble because the default unsharing op in Xen doesn't have a way to handle that. However, it can also be worked with as I suggested by making gfn O in the altp2m view read/execute-only with mem_access. At that point the external user has to option to switch to another view where gfn O is not pointing to a shared entry, or change the mapping again, or do a number of other things to prevent triggering the BUG(). So the question is, how much hand-holding and restrictions we want to apply to prevent users from experimenting with exotic setups here? > > I'm pretty sure if I went looking I could find some more situations you > need to avoid to prevent problems. > > So the next big missing piece of information in this discussion is > exactly what you do need from this system. You're using altp2m and > mem_sharing (and mem_access) on the same domain, that's obvious. Which > features of altp2m are you using -- are you mainly using the mem_access > changes, or are you also using the gfn mapping functionality? I'm using both mem_access and remapping in altp2m views. However, in my setup the remapped entries are always execute-only (have hidden breakpoints injected). I already have to trap when the hostp2m entries get write events as I need to update the shadow copy and reapply all the injected breakpoints. So from my perspective the sharing will just reduce the memory footprint of the domain as the hostp2m entry can remain shared as long as possible, the unsharing itself have no negative impact at all as I already do the copy/update of the shadow page manually. > > Also, how important is it that pages using altp2m functionality not be > un-shared -- i.e., what proportion of a guest's pages do you expect to > be shared, and what proportion do you need to perform altp2m operations on? I'm doing full VM deduplication so close to all pages start with p2m_ram_shared. Thus pretty much all pages need to pass the altp2m lazy copy operation with a shared page. For the pages that do have mem_access/remapped entries, I already trap write events on the hostp2m so I can do the unsharing manually (mmaping the hostp2m page that will unshare + overwrite all affected altp2m views). From there I can reapply any necessary mem_access/remapping without any problems. > > So there are several things we could do here: > > 1. Allow mem_sharing and altp2m functionality co-existing by avoiding > each other: the same domain can use both, but the same gfn cannot be > both shared and have altp2m entries. This is essentially fixing the bug > in the current code: Adding an altp2m to a gfn unshares it; and gfns > which have altp2m entries cannot be shared. > > This should, I *think*, be fairly straightforward; it might even only > require the lock-reordering part of your patch. > > If your use-case only uses altp2m functionality on a small number of > pages, then this might be a not-too-difficult option that would solve > your problem while not introducing any pitfalls for future users to all > into. If your use case requires large amounts of sharing *and* large > amounts of altp2m entries, this is probably not going to work for you. Well, this is not as simple as that. The altp2m lazy-copy operation automatically populates the currently active altp2m view by copying the pte from the hostp2m. This operation must succeed for my usecase with shared pages. As for applying mem_access permissions in altp2m views resulting in automatic unsharing, this is going to result in a good chunk of pages being unshared for no reason other then the fact that the external user has to be prepared to handle the unsharing manually as I described, but I can live with that, it's still better then nothing. Preventing remapping pointing to a shared page I can live with, this is not part of my usecase. > 2. Allow mem_sharing on altp2m entries with limited functionality. > Specifically: Do not unshare if an altp2m entry only has mem_access > restrictions set; only unshare it if the gfn is remapped. > > This would require some very careful thinking about the different corner > cases that might happen, but I don't think it would be quite the morass > that full altp2m support (with gfn remapping) would involve. > > If your use-case only uses the mem_access altp2m functionality, this > might be a solution that's a reasonable amount of work. This is not likely to be a workable setup for my usecase. > > 3. Allow mem_sharing on altp2m entries with full functionality in a > robust manner, making sure all the corner cases are handled correctly. > > This would obviously be quite a bit more work. I don't think it would > actually be wasted, since it should allow you to have full sharing and > full altp2m access without your dom0 code having to worry about all the > corner cases and interactions. But I can certainly see why you're not > keen on the prospect of doing so. Right, this would be the ultimate goal. However, realistically this is not going to happen due to how much time and effort it involves. > > 4. Allow mem_sharing on altp2m entries with full functionality, but with > corner cases the user of those interfaces must avoid or mitigate; and no > restrictions. > > This is what your current patch does; and as I've said, I don't think > this is suitable because it lays a trap for future people to fall into > if they end up enabling both. > > But after a discussion with IanJ, I've got a couple of variations of > this one. The variations all involve different ways of making sure that > the "full functionality" is only available when there is a caller that > knows about the risks and will actively try to avoid them. > > 4a. Add a per-domain flag to decide whether to allow "unsafe" altp2m / > mem_sharing interactions. If it's clear, then only "safe" operations > are allowed (i.e., it would start with #1, but if someone wanted they > could implement #2 or #3). If it's set, then things behave as in #4. > (We probably should try to prevent hypervisor crashes due to unsharing > events, however.) > > 4b. In the altp2m code, as #4; but add code somewhere else to prevent > the entire features from being turned on in a domain unless the "unsafe" > option is requested. That is, by default a guest with mem_sharing > enabled would not be able to use altp2m at all; and a guest with altp2m > enabled would not be able to use mem_sharing at all. Only with the > "unsafe" flag set would be allowed to have both altp2m and mem_sharing > enabled at the same time. > > Both 4a and 4b seem like they would be relatively straightforward to > implement; like they would fit your use case, but avoid laying a trap > for people in the future. > > What do you think about those options? > > If #1 is acceptable to you, I continue to think that's probably the > combination of most robust and simplest to achieve. I would recommend 2 > above 4a and 4b if feasible; and if not, I would probably choose 4a over > 4b because it allows for incremental improvement. But any of {1, 2, 4a, > 4b} would work for me. > Let's try to figure out if we can get #1 in an acceptable shape. The only point where completely avoiding each other is not possible is during the lazy copy operation where altp2m copies the settings to the currently active view. As long as there are no mem_access/remapping going on for the pte, the default unsharing operation and copying to the altp2m view "just works". As for the other altp2m cases, I can live with unsharing the gfn's and also preventing sharing to be performed on them subsequently. Thanks, Tamas _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.