Re: [PATCH 2/2] x86/hap: Resolve mm-lock order violations when forking VMs with nested p2m

On 06.01.2021 17:26, Tamas K Lengyel wrote:
> On Wed, Jan 6, 2021 at 11:11 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>> On 06.01.2021 16:29, Tamas K Lengyel wrote:
>>> On Wed, Jan 6, 2021 at 7:03 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>>>> On 04.01.2021 18:41, Tamas K Lengyel wrote:
>>>>> +        for ( i = 0; i < MAX_NESTEDP2M; i++ )
>>>>> +            p2m_lock(d->arch.nested_p2m[i]);
>>>> From a brief scan, this is the first instance of acquiring all
>>>> nested p2m locks in one go. Ordering these by index is perhaps
>>>> fine, but I think this wants spelling out in e.g. mm-locks.h. Of
>>>> course the question is if you really need to go this far, i.e.
>>>> whether really all of the locks need holding. This is even more
>>>> so with p2m_flush_table_locked() not really looking to be a
>>>> quick operation, when many pages have accumulated for it.
>>>> I.e. the overall lock holding time may turn out even more
>>>> excessive this way than it apparently already is.
>>> I agree this is not ideal but it gets things working without Xen
>>> crashing. I would prefer if we could get rid of the mm lock ordering
>>> altogether in this context.
>> How would this do any good? You'd then be at risk of actually
>> hitting a lock order violation. These are often quite hard to
>> debug.
> The whole lock ordering is just a pain and it gets us into situations
> like this where we are forced to take a bunch of locks to just change
> one thing. I don't have a better solution but I'm also not 100%
> convinced that this lock ordering setup is even sane. Sometimes it
> really ought to be enough to just take one "mm master lock" without
> having to chase down all of them individually.

The concept of a "master lock" would imply all parties wanting to
make _any_ change anywhere (or even just not being certain whether
a change will need making) would need to hold it for writing. This
is clearly against the overall goal of reducing lock contention,
in particular on the p2m lock.

>>>>> --- a/xen/arch/x86/mm/p2m.c
>>>>> +++ b/xen/arch/x86/mm/p2m.c
>>>>> @@ -1598,8 +1598,17 @@ void
>>>>>  p2m_flush_nestedp2m(struct domain *d)
>>>>>  {
>>>>>      int i;
>>>>> +    struct p2m_domain *p2m;
>>>>> +
>>>>>      for ( i = 0; i < MAX_NESTEDP2M; i++ )
>>>>> -        p2m_flush_table(d->arch.nested_p2m[i]);
>>>>> +    {
>>>>> +        p2m = d->arch.nested_p2m[i];
>>>> Please move the declaration here, making this the variable's
>>>> initializer (unless line length constraints make the latter
>>>> undesirable).
>>> I really don't get what difference this would make.
>> Both choice of (generally) inappropriate types (further up)
>> and placement of declarations (here) (and of course also
>> other style violations) can set bad precedents even if in a
>> specific case it may not matter much. So yes, it may be
>> good enough here, but it would violate our desire to
>> - use unsigned types when a variable will hold only non-
>>   negative values (which in the general case may improve
>>   generated code in particular on x86-64),
>> - limit the scopes of variables as much as possible, to
>>   more easily spot inappropriate uses (like bypassing
>>   initialization).
>> This code here actually demonstrates such a bad precedent,
>> using plain int for the loop induction variable. While I
>> can't be any way near sure, there's a certain chance you
>> actually took it and copied it to
>> __mem_sharing_unshare_page(). The chance of such happening
>> is what we'd like to reduce over time.
> Yes, I copied it from p2m.c. All I meant was that such minor changes
> are generally speaking not worth a round-trip of sending new patches.
> I obviously don't care whether this is signed or unsigned. Minor stuff
> like that could be changed on commit and is not even worth having a
> discussion about.

I'm sorry, but no. Committing ought to be a purely mechanical
thing. It is a _courtesy_ of the committer if they offer to
make adjustments. If our offering this regularly gets taken as
"expected behavior", I'm afraid I personally would stop making
such an offer, and instead insist on further round trips.
