
Re: [Xen-devel] [PATCH 2/2] x86/vMCE: save/restore MCA capabilities



>>> On 23.03.12 at 09:55, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote:
> Liu, Jinsong wrote:
>> Jan Beulich wrote:
>>>>>> On 06.03.12 at 12:55, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx>
>>>>>> wrote: 
>>>> Jan Beulich wrote:
>>>>>>>> On 06.03.12 at 10:28, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx>
>>>>>>>> wrote:
>>>>>> Jan Beulich wrote:
>>>>>>> But we're getting all the farther away from the actual question:
>>>>>>> Do we need to provide for saving/restoring of any of the _CTL
>>>>>>> registers? 
>>>>>>> 
>>>>>> 
>>>>>> Per Tony's explanation of what _CTL means in hardware, my
>>>>>> understanding is that these registers are model specific, mainly
>>>>>> used for debugging, and that the OS sets them to all 1's by default
>>>>>> (please correct me if I have misunderstood). So how about
>>>>>> decoupling _CTL from the host (i.e. a purely software-emulated MSR
>>>>>> that does not involve h_mcg_ctl/h_mci_ctrl[bank])? Then we would
>>>>>> not need to save/restore _CTL. After all, they are model specific,
>>>>>> and emulating them as all 1's for the guest seems reasonable.
>>>>> 
>>>>> If the guest OS considers a particular CPU model to require an
>>>>> adjustment to any of these, any such adjustment would be lost over
>>>>> migration. I'm simply uncertain whether all OSes will tolerate that
>>>>> (in which case ignoring the writes in the first place would
>>>>> probably be better). 
>>>>> 
>>>> 
>>>> I'm unsure how big the risk is, but if OS tolerance is a concern, it
>>>> would be better to avoid such an inconsistency. An updated approach
>>>> would be purely s/w emulated _CTL plus save/restore, which would get
>>>> rid of h/w heterogeneity and keep things consistent across migration.
>>>> Does that make sense?
>>> 
>>> That would be an option, but again only if OSes don't make
>>> assumptions on the number of banks for certain CPU models.
>>> 
>> 
>> AFAICT the only way the SDM recommends to get the bank count is via
>> mcg_cap, so if an OS infers the bank count from the CPU model, it
>> would either get the same number as via mcg_cap, or get a wrong
>> number, which is an OS problem, not a Xen one.
> 
> Jan, any more concerns about this thread (_CTL)?

While I accept your statement above as valid from a theoretical pov,
it's not going to work in practice if someone comes up with a case
where an OS works flawlessly on real hardware, yet has a problem
when virtualized - it will be the virtualization software that gets
blamed, not the OS.

That said, in this matter I'm fine with not doing anything until we get
an actual report of a problem (at which point working around it by
enforcing the bank count via config setting is probably the most viable
option).
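
For illustration only, a minimal sketch of what enforcing the bank count
could look like, assuming the count lives in bits 7:0 of MCG_CAP as the
SDM describes. The helper names and the "0 means no override" convention
are hypothetical and not taken from the Xen tree.

```c
#include <stdint.h>

/* IA32_MCG_CAP: bits 7:0 hold the number of reporting banks (per the SDM). */
#define MCG_CAP_COUNT_MASK 0xffULL

/* Hypothetical helper: the bank count a guest would read from MCG_CAP. */
static unsigned int mcg_cap_bank_count(uint64_t mcg_cap)
{
    return (unsigned int)(mcg_cap & MCG_CAP_COUNT_MASK);
}

/*
 * Hypothetical helper: build the guest-visible MCG_CAP with the count
 * field replaced by a configured value. cfg_banks == 0 is assumed to
 * mean "no override"; values that do not fit the 8-bit field are
 * ignored rather than truncated.
 */
static uint64_t mcg_cap_with_bank_count(uint64_t mcg_cap,
                                        unsigned int cfg_banks)
{
    if (cfg_banks == 0 || cfg_banks > MCG_CAP_COUNT_MASK)
        return mcg_cap;

    /* Keep all capability bits above the count field untouched. */
    return (mcg_cap & ~MCG_CAP_COUNT_MASK) | cfg_banks;
}
```

With an example value of 0x109 (MCG_CTL_P set, 9 banks), overriding the
count to 6 would yield 0x106, so the guest sees the same bank count on
either side of a migration regardless of the host's own value.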

> And as for vMCE live migration, there are indeed some issues when
> migrating. We are currently discussing them internally, and will
> present an approach/patches when available.

Thanks, this is the part that we really need to deal with (and at least
settle on the migration _interface_ before 4.2 gets out).

Jan




 

