[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2] AMD/IO-APIC: Use old IO-APIC ack method if AMD 813{1, 2} PCI-X tunnel is present
On 04/06/13 17:29, Jan Beulich wrote: >>>> On 04.06.13 at 12:13, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote: >> References at time of patch: >> >> AMD 8132 PCI-X HyperTransport Tunnel >> http://support.amd.com/us/ChipsetMotherboard_TechDocs/30801.pdf >> >> AMD 8131 PCI-X HyperTransport Tunnel >> http://support.amd.com/us/ChipsetMotherboard_TechDocs/26310_AMD-8131_HyperTra >> >> nsport_PCI-X_Tunnel_Revision_Guide_rev_3_18.pdf >> >> The IO-APICs on the above two PCI-X HyperTransport Tunnels will issue illegal >> HT packets if an interrupt gets masked while active. >> >> This is sadly exactly what the "new" IO-APIC ack mode does. The "old" ack >> mode does however avoid this bad behaviour, and stabilises an affected >> system belonging to a customer. > Reading the errata descriptions, I'm getting the impression that > your patch only tries to deal with some portion of the issue, and > even that only on the basis that devices (and perhaps even their > drivers) are well behaved. That's because there's no silencing of > interrupt sources being done here, and I also can't see how we > would from the hypervisor. > > If that's correct, is adding the workaround code really worth it, > rather than documenting that on those systems "ioapic_ack=old" > will make them a little more reliable? > > If it's not correct, would you mind explaining in some more detail > how you see your change dealing with all aspects of the erratum, > namely for edge triggered IRQs as well as for the masking that's > done with the old ack mode for level triggered ones (or why you > don't have to deal with those cases)? The observation we have from the customer is that using "new" mode will result in host hangs/crashes about once every two hours. With "old", the system has been rock solid for a week. Unfortunately I do not have the hardware at my disposal. > >> +void __init amd813x_errata_quirks(void) >> +{ >> + unsigned i, found = 0; >> + uint32_t bus, dev, vendor_id; >> + >> + if ( boot_cpu_data.x86_vendor != X86_VENDOR_AMD ) >> + return; >> + >> + /* Check for the affected IO-APIC version (0x11) to save scanning the >> PCI >> + * bus on most systems. */ >> + for ( i = 0; i < nr_ioapics; ++i ) >> + { >> + if ( mp_ioapics[i].mpc_apicver == 0x11 ) >> + { >> + found = 1; >> + break; >> + } >> + } >> + >> + if ( !found ) >> + return; >> + >> + for ( bus = 0; bus < 256; ++bus ) >> + { >> + for ( dev = 0; dev < 32; ++dev ) >> + { >> + /* IO-APIC is always Function 1. Check for Multifunction. */ >> + if ( !( pci_conf_read8(0, bus, dev, 0, PCI_HEADER_TYPE ) & 0x80 >> ) ) >> + continue; >> + >> + vendor_id = pci_conf_read32(0, bus, dev, 1, PCI_VENDOR_ID); >> + >> + /* 0x7451 is AMD-8131 PCI-X IO-APIC */ >> + if ( vendor_id == 0x74511022 ) >> + printk(XENLOG_WARNING "Found AMD 8131 PCI-X tunnel, >> erraturm #63: "); > "erratum" Doh. > >> + /* 0x7459 is AMD-8132 PCI-X IO-APIC */ >> + else if ( vendor_id == 0x74591022 ) >> + printk(XENLOG_WARNING "Found AMD 8132 PCI-X tunnel, >> erraturm #81: "); > Again. > >> + else >> + continue; > And overall this is the kind of thing that I think a switch statement > is dealing with in a more legible way, at once avoiding the need for > the "vendor_id" local variable. Ok > >> + >> + if ( ioapic_ack_forced ) >> + { > Inverting the condition would allow you to make this a "if/else-if/else" > sequence, with less deep indentation... > >> + if ( ioapic_ack_new != 0 ) > != 0 on a boolean variable? Two lines earlier you didn't do so, > please be consistent. Sorry - I thought I checked this. > >> + printk("Not overriding command line. System will be >> unstable. " > Is "will" really right here? These chipsets are rather old, and not > having seen reports of instabilities makes me imply that many > such systems just happen to work fine. Therefore, "may" might > be the better term... Yes - see description above. > >> + "Use ioapic_ack_mode=old\n"); > "Don't use \"ioapic_ack=new\"\n" would seem better. > > Jan No - this is more informative to someone who doesn't know the Xen command line arguments inside/out. ~Andrew > >> + else >> + printk("Command line is ok for system stability\n"); >> + } >> + else >> + { >> + printk("Forcing 'old' IO-APIC ack method\n"); >> + ioapic_ack_new = 0; >> + } >> + return; >> + } >> + } >> +} >> + >> void __init setup_IO_APIC(void) >> { >> + amd813x_errata_quirks(); >> + >> enable_IO_APIC(); >> >> if (acpi_ioapic) > _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |