[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4] altp2m: Allow the hostp2m entries to be of type p2m_ram_shared



On Mon, Jul 18, 2016 at 10:10 AM, George Dunlap
<george.dunlap@xxxxxxxxxx> wrote:
> On 18/07/16 16:18, Tamas K Lengyel wrote:
>> On Mon, Jul 18, 2016 at 5:04 AM, George Dunlap <george.dunlap@xxxxxxxxxx> 
>> wrote:
>>> On Fri, Jul 15, 2016 at 3:59 PM, Tamas K Lengyel <tamas@xxxxxxxxxxxxx> 
>>> wrote:
>>>>> I could go on in the analysis, but the point is that there's a morass
>>>>> of interactions here all of which need to be correct, which this patch
>>>>> does not address.  You have a long way to go before sharing and altp2m
>>>>> can be safely used on the same gfn ranges.
>>>>>
>>>>
>>>> Hi George,
>>>> certainly there are cornercases where if the user does not setup things in
>>>> the right order things can still go out of whack. My goal with this patch 
>>>> is
>>>> not to address all of them. The goal of this patch is to not crash the
>>>> hypervisor when the user at least tries to experiment with it, which is the
>>>> current state. So this patch improves the status quo. Also, both 
>>>> mem_sharing
>>>> and altp2m is considered experimental/tech_preview, so the fact that their
>>>> combination is also experimental should be assumed as well. As I explained,
>>>> with this patch in place there is at least one way they can be safely used
>>>> together if the user tracks unsharing requirements through mem_access and
>>>> does unsharing and fixup of the altp2m views manually. There are other ways
>>>> where it would not be safe as after unsharing we don't know how the user
>>>> would want things to look in altp2ms. I don't think we want to start
>>>> guessing about that either so I will not be looking to implement that. So I
>>>> don't agree with this reasoning being grounds for rejecting this patch that
>>>> does incrementally improve the current state.
>>>
>>> So you keep saying "user"; I assume you mean whatever program is
>>> sitting in domain 0, which is going to be doing memsharing, altp2m,
>>> and memaccess stuff all at once?
>>>
>>> The altp2m code was not written for that purpose.  It was written for
>>> *guest* administrators to use within the guest.
>>
>> That's simply not true. It was written specifically to allow both
>> usecases - both internal *and* external. Mixing the use-cases was not
>> envisioned.
>
> Right, well in any case I certainly think that having external users of
> the feature is a good thing to pursue.
>
>>
>>   There's no problem
>>> with finding additional uses for it, as long as the *original* purpose
>>> doesn't get broken; and this patch most certainly does break things
>>> for that purpose.
>>
>> This patch most certainly *does not break* the in-guest usecase by
>> itself. If the in-guest usecase is used without any mem_sharing going
>> on, this patch does not have any effect on that usecase.
>>
>>   Guests using altp2m should not have to know that
>>> sharing is happening behind their backs; nor should they be required
>>> to use mem_access to manually fix up what the hypervisor has allowed
>>> to be broken.
>>
>> And they are not required. This requirement only applies if the user
>> mixes in in-guest and external usecases.
>
> I keep saying "dom0" and "guest administrator", and you keep saying
> "user", as though these are always going to be one and the same person.
> That is a valid use case, but it is not the normal use case for Xen.  We
> need to make it possible for host administrators to be able to decide to
> enable sharing without having to know whether the guest administrator is
> using altp2m; and the for the guest administrator to be able to decide
> to use altp2m without having to know whether the host administrator is
> going to enable sharing.
>
> And even if they are the same person, I think code that Just Works is
> better than code that has a lot of corner cases you have to avoid.
>
>>> If at the moment altp2m + memsharing just plain crashes, then that
>>> should be fixed; and if the lock-ordering parts of the patch fix that,
>>> then they should be applied.
>>
>> Yes, that would be a start at least.
>>
>>   But the code which make sure that the
>>> same gfn range cannot both be shared and subject to altp2m must remain
>>> until they interact properly, with all the corner cases carefully
>>> thought out.
>>
>> Well, I don't see that what you suggest is going to happen if
>> incremental improvements are not allowed in.
>
> Incremental improvements are welcome; but they must not cause
> regressions in existing functionality.

Existing functionality does not get impaired by this as what happens
right now is a hypervisor crash. I don't see how things can get any
worst than that.

>
> The code as it is in the tree right now was intended to allow both
> sharing and altp2m to be enabled on the same domain, just not over the
> same gfn range.  And it was intended to be robust -- that is, the
> sharing code and the altp2m code don't need to be aware of each other
> and try not to step on each other's toes; each can just do its own thing
> and Xen will make sure that nothing bad happens (by preventing pages
> with an altp2m entry from being shared, and unsharing pages for which an
> altp2m entry is created).
>
> It sounds like that's broken right now; it should therefore be fixed.
> When it is fixed, you'll be able to use both altp2m and sharing on the
> same domain; Xen will simply prevent sharing from happening on gfn
> ranges with altp2m entries.

No, that's incorrect, it's the other way around. If you were to try to
share pages for which you have altp2m entries it will happily oblige.
It will just fail to do the altp2m actions for entries of shared
entries (copying the mapping to the altp2m view, mem_access, etc).

>
> An even bigger improvement would be to allow the same gfns to be subject
> both to altp2m and sharing at the same time.  But this requires thinking
> carefully about all the corner cases and making sure that they all work
> correctly.

And this is exactly what this patch allows you to do. An entry can now
be both shared, get properly copied to altp2m views, allow setting
mem_access in altp2m views, etc. The only situation you have to take
core of is when the type of the entry changes from shared to unshared
as that resets the altp2m views. This is a behavior altp2m already has
when for example you remove pages altogether from the hostp2m. The
altp2m system will just flush all affected altp2m views altogether
regardless if there were remapped gfns and/or mem_access settings,
without sending any notification to the user about it. In this regard
the unsharing event is actually better as we can catch that with
mem_access. So again, I don't see this as being a regression by any
means.

Also, just as the external user could subscribe to events that lead to
unsharing the internal user could as well and preventatively fix
things up. How the internal user becomes aware which entries are
shared and which are not is beyond the scope of what I'm interested in
achieving here but there certainly could be channels where they can
communicate that information to each other if needed.

>
>> Anyhow, at this point I'm
>> going to start carrying out-of-tree patches for Xen in my project and
>> just resign from my mem_sharing maintainership as I feel like it's
>> pretty pointless.
>
> I'm sorry that you're discouraged; all I can say is that I hope you
> reconsider.  I'm not trying to block you, and I'm not ignoring your use
> case; it's the job of a maintainer to look at *everyone's* use cases and
> try to make sure that they are all accommodated in so far as it is
> possible.
>
> I'm also trying to make sure that the code you end up using in your
> project is robust and reliable.  It seems to me like if the current
> implementation was fixed, your life would be a lot easier than if we
> accept your patch as it is -- your sharing code could just worry about
> sharing, your altp2m code could just worry about whatever it's trying to
> do, without having to carefully avoid corner cases or manually fix
> things up when corner cases happen.  A bit less sharing would happen,
> because fewer pages would be eligible to be shared, but overall you'd
> have a lot less bugs and headache.
>
> I invested a lot of my very limited time carefully going through both
> sets of code before I answered your e-mail, and I spent a lot of time
> trying to explain the kinds of interactions I think will be a problem.
> I could have just acked the patch without doing that; but I think that
> would have been both less good for you, and less good for the project as
> a whole.

I certainly appreciate your time spent on this. However, I don't see
the point of being maintainer if my opinion on what constitutes an
improvement of the system just gets overruled. As pretty much my
project is the only use-case where these two systems would be used
together at this point, and since I already require my users to
compile Xen from source it is just easier to go this route then what
you suggest and exploring and remedying all possible ways the two
systems could be misused when setup in ways they were not intended. If
these were considered stable features and not experimental I would
agree, but that's just not the case. So I think both of our time is
better spent doing other things then arguing. I would like to hear the
other maintainers opinion on this matter as well but I'm not
interested in arguing endlessly or initiating or vote, so if the patch
is not allowed in I will accept that decision but I will see no point
in continuing as maintainer of the system.

Thanks,
Tamas

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.