
Re: [Xen-devel] (Reluctant) request to revert several changes, due to regressing VM migration



On 04/06/14 17:00, Jan Beulich wrote:
>>>> On 04.06.14 at 17:45, <andrew.cooper3@xxxxxxxxxx> wrote:
>> Changeset 31ee951a3 "x86/HVM: correct the SMEP logic for
>> HVM_CR0_GUEST_RESERVED_BITS" breaks migration for VMs using SMEP.
>>
>> For migration, the architectural state is restored before the cpuid
>> policy is written.  This appears to be the behaviour in libxl, and is
>> certainly the behaviour in Xapi.
>>
>> As a result, a VM using SMEP will fail the CR4 check in
>> hvm_load_cpu_ctxt().  This is easy to observe by performing a localhost
>> migration of a modern HVM Linux VM which enables SMEP.
>>
>> Changeset 58658992 performs an equivalent action for SMAP, and as such
>> will be equivalently broken on supporting hardware.
>>
>>
>> Specifically, c/s f952f9c7f0e which is the backport of 31ee951a3 into
>> staging-4.4 is the problematic change which is causing regressions in
>> XenServer testing.
>>
>>
>> This is a reluctant request as pragmatically the changeset is correct.
> So as already hinted at on irc - what's wrong with using
> cpu_has_smep as long as the guest's d->arch.cpuid[] is blank?
> If the incoming guest didn't see SMEP available, all its CR4.SMEP
> would necessarily be clear (or if they weren't, this would sooner
> or later result in a guest crash).
>
> But then again - isn't there another problem here: hvm_cpuid()
> assumes to be on the subject vCPU, which hardly can be the
> case for the hvm_load_cpu_ctxt() code path using the macro
> in question. So perhaps it even needs to be further relaxed to
> use cpu_has_smep when current != v. Which of course
> would require care by eventual future users of this macro.
>
> Jan
>
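
To make the failure and the suggested relaxation concrete, here is a
minimal toy model in C.  It is not the Xen code: toy_domain, toy_vcpu,
toy_current, host_has_smep and the helper functions are hypothetical
stand-ins, the cpuid policy is reduced to two booleans, and only the
SMEP bit of CR4 is modelled.  It assumes the restore path checks CR4
before the toolstack has written the policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define X86_CR4_SMEP (UINT64_C(1) << 20)   /* CR4 bit 20 */

/* Hypothetical, heavily simplified stand-ins for the real structures. */
struct toy_domain {
    bool cpuid_policy_set;   /* false until the toolstack writes the policy */
    bool policy_has_smep;    /* SMEP as advertised by the guest's policy */
};

struct toy_vcpu {
    struct toy_domain *d;
};

static bool host_has_smep = true;            /* models cpu_has_smep */
static const struct toy_vcpu *toy_current;   /* models 'current' */

/* Strict check: SMEP is only valid if the guest policy advertises it. */
static bool guest_smep_allowed_strict(const struct toy_vcpu *v)
{
    assert(v != NULL && v->d != NULL);
    return v->d->cpuid_policy_set && v->d->policy_has_smep;
}

/*
 * Relaxed check, following the suggestion quoted above: fall back to
 * the host capability while the policy is blank or when v is not the
 * currently running vCPU.
 */
static bool guest_smep_allowed_relaxed(const struct toy_vcpu *v)
{
    assert(v != NULL && v->d != NULL);
    if (v != toy_current || !v->d->cpuid_policy_set)
        return host_has_smep;
    return v->d->policy_has_smep;
}

/* Reserved-bits mask; only the SMEP bit is modelled here. */
static uint64_t cr4_reserved_bits(bool smep_allowed)
{
    return smep_allowed ? 0 : X86_CR4_SMEP;
}

/* Models the CR4 sanity check made when loading saved vCPU state. */
static bool load_ctxt_cr4_ok(uint64_t cr4, bool smep_allowed)
{
    return (cr4 & cr4_reserved_bits(smep_allowed)) == 0;
}
```

With a blank policy and CR4.SMEP set in the incoming state, the strict
variant rejects the load and the relaxed variant accepts it.  Once the
policy is set and v is the current vCPU, both variants follow the policy.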

As I have said, both previously on the list, and at the hackathon, the
cpuid handling and domain cpuid policy infrastructure is a massive
stinking swamp which gets worse every time I find a new bit of it.

For XenServer and our VM feature levelling support, I am going to have
to fix it somehow.  That said, fixing the 32/64-bit migration issue is
still the more important problem.


I am tempted to suggest reverting the changeset and leaving the code in
its previous state until the other bits of the infrastructure are
actually working.  This certainly isn't the only place where this code
is fragile.

~Andrew
