Re: [Xen-devel] [PATCH for-4.12] x86/altp2m: fix HVMOP_altp2m_set_domain_state race
>>> On 08.02.19 at 12:58, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
> On 2/8/19 1:13 PM, Razvan Cojocaru wrote:
>> On 2/8/19 12:51 PM, Jan Beulich wrote:
>>>>>> On 08.02.19 at 10:56, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>> HVMOP_altp2m_set_domain_state does not domain_pause(), presumably
>>>> on purpose (as it was originally supposed to cater to an in-guest
>>>> agent, and a domain pausing itself is not a good idea).
>>>>
>>>> This can lead to domain crashes in the vmx_vmexit_handler() code
>>>> that checks if the guest has the ability to switch EPTP without an
>>>> exit. That code can __vmread() the host p2m's EPT_POINTER
>>>> (before HVMOP_altp2m_set_domain_state "for_each_vcpu()" has a
>>>> chance to run altp2m_vcpu_initialise(), but after
>>>> d->arch.altp2m_active is set).
>>>>
>>>> While the in-guest scenario continues to pose problems, this
>>>> patch fixes the "external" case.
>>>
>>> IOW you're papering over the problem rather than fixing it. Why
>>> does altp2m_active get set to true before actually having set up
>>> everything? Shouldn't it get cleared early, but set late?
>> Well, yes, that would have been my second attempt: set the "altp2m
>> enabled" bool after the init, and before the uninit and no longer
>> domain_pause() explicitly; however I thought that was a brittle
>> solution, relying on comments / programmer attention to the code
>> sequence rather than taking a proper lock.
>>
>> I'll test that scenario then and return with the results / possibly
>> another patch.
>
> Actually, your suggestion does not work, because the way the code has
> been designed, altp2m_vcpu_initialise() calls altp2m_vcpu_update_p2m(),
> which does the proper work that's interesting to us here, like this:
>
> 2153 static void vmx_vcpu_update_eptp(struct vcpu *v)
> 2154 {
> 2155     struct domain *d = v->domain;
> 2156     struct p2m_domain *p2m = NULL;
> 2157     struct ept_data *ept;
> 2158
> 2159     if ( altp2m_active(d) )
> 2160         p2m = p2m_get_altp2m(v);
> 2161     if ( !p2m )
> 2162         p2m = p2m_get_hostp2m(d);
> 2163
> 2164     ept = &p2m->ept;
> 2165     ept->mfn = pagetable_get_pfn(p2m_get_pagetable(p2m));
> 2166
> 2167     vmx_vmcs_enter(v);
> 2168
> 2169     __vmwrite(EPT_POINTER, ept->eptp);
> 2170
> 2171     if ( v->arch.hvm.vmx.secondary_exec_control &
> 2172          SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
> 2173         __vmwrite(EPTP_INDEX, vcpu_altp2m(v).p2midx);
> 2174
> 2175     vmx_vmcs_exit(v);
> 2176 }
>
> So please note that on line 2159 it checks if altp2m is active, and only
> then does it do the right thing. So setting the d->arch.altp2m_active
> bool _after_ calling altp2m_vcpu_initialise() will fail to work
> correctly - turning this into a chicken-and-egg problem, or perhaps more
> interestingly, another discussion about whether in-guest-only altp2m
> agents make any sense fundamentally.
Well, to be honest I expected dependencies like this to be there,
and hence I didn't expect it would be a straightforward change.
Just like we do e.g. for the IOMMU enabling, I guess the boolean
wants to become a tristate then (off -> enabling -> enabled),
which interested sites then can use to distinguish what they
want/need to do.
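
To make the tristate idea concrete, here is a minimal standalone sketch. It is
not Xen code: the enum, the toy_domain structure and both predicates are
hypothetical names invented for illustration, and the sketch ignores the
memory ordering and locking a real implementation would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tristate standing in for the d->arch.altp2m_active boolean. */
enum altp2m_state {
    ALTP2M_OFF,       /* altp2m not in use */
    ALTP2M_ENABLING,  /* per-vCPU initialisation still in progress */
    ALTP2M_ENABLED,   /* every vCPU has been initialised */
};

/* Toy stand-in for the relevant part of struct domain (assumption). */
struct toy_domain {
    enum altp2m_state altp2m_state;
};

/*
 * What an EPTP update path could ask: it has to pick the altp2m view
 * already while initialisation is running, so "enabling" counts.
 */
static bool altp2m_setup_or_active(const struct toy_domain *d)
{
    return d->altp2m_state != ALTP2M_OFF;
}

/*
 * What a VM exit path could ask: it only relies on per-vCPU altp2m
 * state once initialisation has completed for all vCPUs.
 */
static bool altp2m_fully_active(const struct toy_domain *d)
{
    return d->altp2m_state == ALTP2M_ENABLED;
}

/* Enable sequence: off -> enabling -> (initialise vCPUs) -> enabled. */
static void toy_altp2m_enable(struct toy_domain *d,
                              void (*init_vcpus)(struct toy_domain *))
{
    assert(d->altp2m_state == ALTP2M_OFF);

    d->altp2m_state = ALTP2M_ENABLING;
    if ( init_vcpus )
        init_vcpus(d);          /* would run the per-vCPU initialisation */
    d->altp2m_state = ALTP2M_ENABLED;
}
```

The point of the split is that the two kinds of call sites stop sharing one
predicate: the update path sees altp2m as soon as setup starts, while the
exit path only sees it after setup has finished.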

Another relatively obvious solution would be to add a boolean
parameter to altp2m_vcpu_update_p2m() such that
altp2m_vcpu_initialise() can guide it properly. But this of course
depends to a certain degree on how widespread the problem is.
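
A rough standalone sketch of the boolean-parameter variant, modelled on the
p2m selection at the top of the quoted vmx_vcpu_update_eptp(). All toy_*
names and the "initialising" parameter are hypothetical, the structures are
simplified stand-ins, and the VMCS writes are left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct p2m_domain (assumption). */
struct toy_p2m {
    int id;
};

/* Toy stand-in for the vCPU/domain state the selection depends on. */
struct toy_vcpu {
    bool domain_altp2m_active;  /* models altp2m_active(v->domain) */
    struct toy_p2m *altp2m;     /* models p2m_get_altp2m(v), may be NULL */
    struct toy_p2m *hostp2m;    /* models p2m_get_hostp2m(d) */
};

/*
 * Select the p2m whose EPTP would be loaded. When the caller is the
 * initialisation path it passes initialising = true, so the altp2m view
 * is considered even though the domain-wide flag has not been set yet.
 */
static struct toy_p2m *toy_select_p2m(const struct toy_vcpu *v,
                                      bool initialising)
{
    struct toy_p2m *p2m = NULL;

    if ( initialising || v->domain_altp2m_active )
        p2m = v->altp2m;
    if ( !p2m )
        p2m = v->hostp2m;       /* same fallback as in the quoted code */

    return p2m;
}
```

With something along these lines the domain-wide flag could be set only after
all vCPUs have been initialised, with the initialisation path passing true and
every other caller passing false.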

Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel