[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 1 of 2] x86/x2apic: Sandy-Bridge BT98 Erratum



On 29/04/13 10:04, Jan Beulich wrote:
>>>> On 26.04.13 at 20:11, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>> +/* Xeon E5 Family processors (Sandy-Bridge) suffer from erratum BT98, which
>> + * affects Stepping C-1, but is reported fixed in Stepping C-2.
>> + *
>> + * This causes system instability when using x2apic and VT-d queued
>> + * invalidation.  The workaround is to disable x2apic and VT-d.
>> + */
>> +static void __init smb_bt98_erratum(void)
> snb_bt98_erratum() perhaps? And is this really limited to just one
> specific product line (usually the same microarchitecture gets used
> for multiple ones, and then the BT98 reference would become
> confusing)?

Oops - I did mean snb.  However, I would request guidance from Intel as
to what the best naming policy would be.

>
>> +{
>> +    const struct cpuinfo_x86 *c = &boot_cpu_data;
>> +
>> +    if (!(c->x86_vendor == X86_VENDOR_INTEL &&
>> +          c->x86 == 6 &&
>> +          c->x86_model == 0x2d &&
>> +          c->x86_mask == 0x6))
>> +            return;
>> +
>> +    printk(KERN_WARNING
>> +           "Disabling x2apic due to Sandy-Bridge BT98 erratum\n");
>> +
>> +    clear_bit(X86_FEATURE_X2APIC, boot_cpu_data.x86_capability);
>> +    x2apic_enabled = 0;
>> +
> Up here it's fine.
>
>> +    /* If the BIOS started us in x2apic mode, switch back to xapic. */
>> +    if (apic_boot_mode == APIC_MODE_X2APIC) {
>> +            uint64_t msr;
>> +
>> +            rdmsrl(MSR_IA32_APICBASE, msr);
>> +            msr &= ~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD);
>> +            wrmsrl(MSR_IA32_APICBASE, msr);
>> +            msr |= MSR_IA32_APICBASE_ENABLE;
>> +            wrmsrl(MSR_IA32_APICBASE, msr);
>> +
>> +            apic_boot_mode = APIC_MODE_XAPIC;
>> +    }
> But how good are our chances that this won't cause subsequent
> problems? The BIOS may e.g. have assigned APIC IDs wider than
> what xAPIC can deal with. I think we ought to consider denying
> to boot in that case altogether. How does Linux deal with the
> erratum?

I guess "Works on my affected silicon" is not a good enough indicator,
although in that case it was booting in xapic mode not x2apic mode.

A quick look over linux mainline suggests that there is no workaround
present for this erratum.

As for APIC IDs, the CPU IDs are fixed, and the IO-APIC id on the PCH is
4 bits wide.  Unhelpfully, the SDM states that the BIOS should read all
the LAPIC IDs, and if any single one is greater than 255, the BIOS must
hand over to the OS in x2apic mode.

The code in this patch runs before the APs are booted, so querying their
LAPIC IDs is somewhat hard.  Alternatively, doing a synchronised switch
from x2apic to xapic once the APs are running is likely to also be
problematic, but might be our only option.

>
>> +}
>> +
>>  void __init generic_apic_probe(void) 
>>  { 
>>      int i, changed;
>>  
>>      record_boot_APIC_mode();
>>  
>> +        /* Must be before check_x2apic_preenabled() */
> Indentation (in case you need to resubmit anyway).
>
>> +    smb_bt98_erratum();
> Perhaps worth moving into check_x2apic_preenabled() instead,
> keeping x2apic related code as much as possible in x2apic.c?

Ok.

>
>> +
>>      check_x2apic_preenabled();
>>      cmdline_apic = changed = (genapic != NULL);
>>  
>


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.