
Re: [Xen-devel] Xen mce bugfix


  • To: Jan Beulich <JBeulich@xxxxxxxx>
  • From: "Liu, Jinsong" <jinsong.liu@xxxxxxxxx>
  • Date: Wed, 27 Feb 2013 16:41:25 +0000
  • Accept-language: en-US
  • Cc: "Ren, Yongjie" <yongjie.ren@xxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxx>
  • Delivery-date: Wed, 27 Feb 2013 16:41:55 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>
  • Thread-index: AQHOFN5kqZTiEd/FTjyR1sPJXJqZr5iNml7wgABN1fA=
  • Thread-topic: Xen mce bugfix

Liu, Jinsong wrote:
> Jan Beulich wrote:
>>>>> On 27.02.13 at 12:08, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx>
>>>>> wrote: 
>>> Jan Beulich wrote:
>>>>>>> On 27.02.13 at 11:37, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx>
>>>>>>> wrote:
>>>>> The former patch clears MCi_ADDR/MISC because the Intel SDM
>>>>> recommends it:
>>>>>           LOG MCA REGISTER:
>>>>>           SAVE IA32_MCi_STATUS;
>>>>>           IF MISCV in IA32_MCi_STATUS
>>>>>           THEN
>>>>>                   SAVE IA32_MCi_MISC;
>>>>>           FI;
>>>>>           IF ADDRV in IA32_MCi_STATUS
>>>>>           THEN
>>>>>                   SAVE IA32_MCi_ADDR;
>>>>>           FI;
>>>>>           IF CLEAR_MC_BANK = TRUE
>>>>>           THEN
>>>>>                   SET all 0 to IA32_MCi_STATUS;
>>>>>                   IF MISCV in IA32_MCi_STATUS
>>>>>                   THEN
>>>>>                           SET all 0 to IA32_MCi_MISC;
>>>>>                   FI;
>>>>>                   IF ADDRV in IA32_MCi_STATUS
>>>>>                   THEN
>>>>>                           SET all 0 to IA32_MCi_ADDR;
>>>>>                   FI;
>>>>>           FI;
>>>>> 
>>>>> For Xen mce, reading MCi_ADDR/MISC is meaningful only when a real
>>>>> error occurs (as indicated by MCi_STATUS), so clearing only
>>>>> MCi_STATUS in the mce handler is an acceptable workaround. After
>>>>> all, reading MCi_ADDR/MISC is pointless if MCi_STATUS is 0.
>>>> 
>>>> So then what - revert your original patch (and ignore the SDM)?
>>>> I'm not in favor of this...
>>> 
>>> Not a revert of all of 23327; this patch only reverts the
>>> MCi_ADDR/MISC clearing.
>>> 
>>> I also agree it's not good, but currently we don't seem to have a
>>> simple and clean way to fix it, unless we spend a lot of time
>>> updating the xen-mceinj *tools* -- and even then, isn't that
>>> low-priority?
>> 
>> No, fixing the tool seems unnecessary for this problem; all we
>> need is a way to avoid the problematic MSR writes when finishing
>> an injected MCE. That's fully contained in the hypervisor.
>> 
>> Jan
> 
> The problem comes from the xen-mceinj tools simulating *some* banks
> for *some* cpus (the intpose_arr array). The tools sometimes access
> the simulated value and sometimes the real hardware --> that is the
> problematic semantics, and what really needs fixing.
> 
> Thanks,
> Jinsong

OK, I will update the bugfix patch, which is better than dropping the
MCi_ADDR/MISC clearing in this patch, and send it out later.
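
For reference, the sequence under discussion can be sketched in
standalone C. This is only an illustration, not the pending patch and
not Xen code: the MSRs are simulated with an array (fake_msr), and the
names log_mca_bank, msr_read, msr_write, struct mc_log and the
"injected" flag are all hypothetical. Only the MCi_STATUS bit positions
come from the SDM. The sketch saves STATUS, then MISC/ADDR when
MISCV/ADDRV are set, clears STATUS, and skips the MCi_ADDR/MISC writes
when finishing an injected MCE, which is one possible reading of the
hypervisor-only approach suggested in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IA32_MCi_STATUS flag bits as defined in the Intel SDM. */
#define MCi_STATUS_VAL   (1ULL << 63)
#define MCi_STATUS_MISCV (1ULL << 59)
#define MCi_STATUS_ADDRV (1ULL << 58)

/* Hypothetical stand-in for the per-bank MSRs; no real rdmsr/wrmsr. */
#define NR_BANKS 4
enum { REG_STATUS, REG_ADDR, REG_MISC, NR_REGS };
static uint64_t fake_msr[NR_BANKS][NR_REGS];
static unsigned int hw_writes;   /* counts simulated MSR writes */

static uint64_t msr_read(unsigned int bank, int reg)
{
    assert(bank < NR_BANKS && reg >= 0 && reg < NR_REGS);
    return fake_msr[bank][reg];
}

static void msr_write(unsigned int bank, int reg, uint64_t val)
{
    assert(bank < NR_BANKS && reg >= 0 && reg < NR_REGS);
    fake_msr[bank][reg] = val;
    hw_writes++;
}

/* Values saved by the "LOG MCA REGISTER" step. */
struct mc_log {
    uint64_t status, addr, misc;
};

/*
 * Save one bank and optionally clear it.
 * "injected" is an assumption of this sketch: when true, only
 * MCi_STATUS is cleared and the MCi_ADDR/MISC writes are skipped.
 */
static void log_mca_bank(unsigned int bank, bool clear_bank,
                         bool injected, struct mc_log *log)
{
    uint64_t status = msr_read(bank, REG_STATUS);

    log->status = status;
    log->misc = (status & MCi_STATUS_MISCV) ? msr_read(bank, REG_MISC) : 0;
    log->addr = (status & MCi_STATUS_ADDRV) ? msr_read(bank, REG_ADDR) : 0;

    if (!clear_bank)
        return;

    msr_write(bank, REG_STATUS, 0);

    if (injected)
        return;

    /* Test the saved copy: the register itself now reads as 0. */
    if (status & MCi_STATUS_MISCV)
        msr_write(bank, REG_MISC, 0);
    if (status & MCi_STATUS_ADDRV)
        msr_write(bank, REG_ADDR, 0);
}
```

With clear_bank set and injected false this follows the SDM sequence;
with injected true it performs a single write, to MCi_STATUS only.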

Thanks,
Jinsong
