Re: [Xen-devel] [xen-unstable] Commit 2ca9fbd739b8a72b16dd790d0fff7b75f5488fb8 AMD IOMMU: allocate IRTE entries instead of using a static mapping, makes dom0 boot process stall several times.

Friday, August 23, 2013, 12:51:28 AM, you wrote:

> Friday, August 16, 2013, 3:15:50 PM, you wrote:

>>>>> On 16.08.13 at 12:44, Sander Eikelenboom <linux@xxxxxxxxxxxxxx> wrote:
>>> Friday, August 16, 2013, 11:18:56 AM, you wrote:
>>>>>>> On 16.08.13 at 10:40, Sander Eikelenboom <linux@xxxxxxxxxxxxxx> wrote:
>>>>> Hmm, only the "no-cpuidle" is needed (cpufreq=xen can stay) to make
>>>>> the stalls disappear, but that makes me wonder how it is related to
>>>>> the commit the bisection found .. the machine has been running with
>>>>> cpuidle enabled for ages ..
>>>> That's odd indeed. If you're up to do a little bit of debugging here,
>>>> why don't you log the sequence of interrupts arriving both with
>>>> and without said commit. This might end up being a lot of data, so
>>>> you may want to filter out uninteresting stuff and/or log only to
>>>> a memory buffer which then gets dumped upon some debug key
>>>> press.
>>> Hmm, making said debug patch is probably getting a bit out of my league,
>>> since the generated interrupts will probably outpace flushing to the
>>> console. And I'm not sure which things around the irq flow you are
>>> actually interested in (probably the hpet msi ones?).

>> No, much more the ones from the devices whose drivers you say cause
>> the stalls while initializing. The question mainly is whether the
>> distribution of interrupts between CPUs changed in a way that made
>> the system more susceptible to missing wakeups via HPET MSIs.

>> Along those lines was also the question regarding interrupt counts
>> for the devices in question, which, if I'm not mistaken, you didn't
>> answer yet.
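
The "log only to a memory buffer which then gets dumped" idea suggested above could look roughly like the following. This is only a minimal, single-threaded sketch and not code from Xen: the names irq_log, irq_log_add and irq_log_dump and the choice of fields are assumptions made for illustration, and a real hypervisor version would need per-CPU buffers or atomic updates plus a debug-key handler to trigger the dump.

```c
#include <assert.h>
#include <stddef.h>

#define IRQ_LOG_SIZE 1024u  /* power of two, so the index wraps with a mask */

/* Hypothetical log record; the field choice is an assumption. */
struct irq_log_entry {
    unsigned int irq;          /* interrupt number */
    unsigned int cpu;          /* CPU the interrupt arrived on */
    unsigned long long stamp;  /* caller-supplied timestamp */
};

struct irq_log {
    struct irq_log_entry slot[IRQ_LOG_SIZE];
    unsigned long long next;   /* total number of records ever written */
};

/* Record one interrupt, overwriting the oldest entry once the buffer is full. */
static void irq_log_add(struct irq_log *log, unsigned int irq,
                        unsigned int cpu, unsigned long long stamp)
{
    struct irq_log_entry *e;

    assert(log != NULL);
    e = &log->slot[log->next & (IRQ_LOG_SIZE - 1)];
    e->irq = irq;
    e->cpu = cpu;
    e->stamp = stamp;
    log->next++;
}

/* Copy the retained records, oldest first, into out (capacity max).
 * Returns the number of records copied. */
static size_t irq_log_dump(const struct irq_log *log,
                           struct irq_log_entry *out, size_t max)
{
    unsigned long long first =
        log->next > IRQ_LOG_SIZE ? log->next - IRQ_LOG_SIZE : 0;
    size_t n = 0;

    assert(log != NULL && out != NULL);
    for (unsigned long long i = first; i < log->next && n < max; i++)
        out[n++] = log->slot[i & (IRQ_LOG_SIZE - 1)];
    return n;
}
```

Because only the newest IRQ_LOG_SIZE records are kept, logging never blocks on the console, which addresses the concern about interrupts outpacing console flushing. Comparing the cpu field across a run with and without the commit would show whether the interrupt distribution changed.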

> Hi Jan,

> Got things running again, first on baremetal Linux.
> It seems I was having 2 separate problems:

> 1) The southbridge ioapic(6) isn't in the IVRS tables. The Linux kernel
>    has had a patch, "iommu/amd: Add ioapic and hpet ivrs override"
>    (https://lists.linux-foundation.org/pipermail/iommu/2013-April/005506.html)

>    That patch makes it possible to override the incorrect IVRS tables
>    on the command line.
>    Using ivrs_ioapic[6]=00:14.0 ivrs_ioapic[7]=00:00.1 ivrs_hpet[0]=00:14.0
>    made it boot correctly and enable the iommu and interrupt remapping.

>    That patch would probably be a good candidate to port to Xen too,
>    considering the iommu massacre on at least boards with an 890fx chipset.
>    (Or make it a quirk for these earlier chipsets, since the BDFs for the
>    northbridge and southbridge ioapic and hpet seem to be known fixed
>    values, from what I read in earlier mailing list posts and code
>    comments. But perhaps Suravee could comment on that ... he seems to
>    have tested the patch for Linux as well
>    (https://lists.linux-foundation.org/pipermail/iommu/2013-April/005528.html))

> 2) After that my sata controller gave read errors, but I found it was
>    still in "ide" mode instead of "ahci" in the bios. A separate but
>    perhaps related issue (probably something with the enabling of
>    multiple msi's which the driver can not handle in "ide" mode, will
>    sort that out later.)

> After I got things working on baremetal Linux, I adjusted Xen and
> hardcoded it to add a mapping for ioapic[6]=00:14.0.
> (The entries for ivrs_ioapic[7] and hpet[0] are actually correct in the
> bios tables, so they don't need correction for me at the moment.)

> And hey presto .. no stalls ..

Pfffrt, I shouldn't hit send when I'm tired and it's quite late .. I had a
stale "no-cpuidle" in my grub cfg .. no wonder ..

So still stalls, and now also a freeze when a passed-through device is
enabled in an HVM domain.

Sorry, I will continue investigating tomorrow ..


> So I think porting the override patch to Xen (or making it a quirk and
> ignoring the IVRS table for special devices on certain chipsets) could
> solve a lot of the reported iommu problems for AMD systems.

> --
> Sander

>> Jan
