[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] MCHECK, AMD: Fix code to allow calls to vmce_amd_rdmsr and vmce_amd_wrmsr



>>> On 05.11.13 at 16:39, Aravind Gopalakrioshnan 
>>> <aravind.gopalakrishnan@xxxxxxx> wrote:
>   
> -    switch ( msr & (MSR_IA32_MC0_CTL | 3) )
> +    switch ( msr )
> 
>> You can't drop the masking altogether here - that way you're
>> preventing banks other than bank 0 to be handled here.
> 
> When I drop the mask, I am allowing all MCA MSR's here.. It is when I 
> apply the mask
> that I am allowing only MC0 bank MSR's alone. Here is an example: (All 
> values in hexadecimal form)
> 
> (MSR_IA32_MC0_CTL | 3) = 0x403; Therefore -
> 
> 
> Msr val = 400
> val & 0x403 = 400
> 
> Msr val = 401
> val & 0x403 = 401
> 
> Msr val = 402
> val & 0x403 = 402
> 
> Msr val = 403
> val & 0x403 = 403
> 
> Msr val = 404
> val & 0x403 = 400

So you contradict yourself here: MSR 404 is being allowed in, and
is being handled just like MSR 400, just that "bank" is 1 in that
case.

> We need to route the MC4 MSR's (0x413 for DRAM errors,

MSR 413 is really MSR_IA32_MCx_MISC(4) if I'm not mistaken,
i.e. visible or invisible depending on the number of MCE MSR
banks a guest gets to see.

> 0xc0000408 for 
> Link errors, 0xc0000409 for L3 errors)
> to be handled by vmce_amd_wrmsr and vmce_amd_rdmsr. Since by removing 
> the mask altogether, I am also
> allowing MSR's 0x404, 0x405, 0x406 and 0x407 (which is harmless still as 
> they fall to 'default' in vmce_amd_rdmsr or
> vmce_amd_wrmsr functions);

No, they all end up at the default case all of the sudden, while
they're supposed to be (and are currently) handled by the
respective non-default cases.

The only thing I agree to is that for the C00004xxx range
we're wrongly masking off the top two bits, which was an
oversight when support for these got added.

>>> --- a/xen/include/asm-x86/msr-index.h
>>> +++ b/xen/include/asm-x86/msr-index.h
>>> @@ -218,9 +218,9 @@
>>>   #define AMD64_NB_CFG_CF8_EXT_ENABLE_BIT   46
>>>   
>>>   /* AMD Family10h machine check MSRs */
>>> -#define MSR_F10_MC4_MISC1          0xc0000408
>>> -#define MSR_F10_MC4_MISC2          0xc0000409
>>> -#define MSR_F10_MC4_MISC3          0xc000040A
>>> +#define MSR_F10_MC4_MISC1          0x00000413
>>> +#define MSR_F10_MC4_MISC2          0xc0000408
>>> +#define MSR_F10_MC4_MISC3          0xc0000409
>> Fam10 BIOS and Kernel Developer's Guide disagrees with this.
>> Fam15 model 0x also doesn't list C0000413 (but indeed marks
>> C000040A [as well as subsequent ones] as reserved). Fam15
>> model 1x even lists everything from C0000409 onwards as
>> reserved.
> It is MSR 0x413 and not 0xc0000413. MSR 0x413 *is* listed as DRAM
> thresholding register.

So it doesn't belong into this group in the first place then. As
above - as a "standard" or "architectural" MCE MSR, it shouldn't
get its own #define, but be referenced through
MSR_IA32_MCx_MISC().

> And you are right about Link Thresholding (MSRC0000408) and
> L3 Thresholding(MSRC0000409). One way to resolve this can be to add
> quirks in 'mce_amd_quirks' structure and check for it before we emulate
> Link/L3 thresholding MSR's to the guest.
> 
> Thoughts?

That's a possibility. But you first of all need to explain how a
bank 4 MSR would be seen by a guest at all, when we only
emulate 1 or 2 banks these days.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.