
Re: [Xen-devel] [xen-unstable] Commit 2ca9fbd739b8a72b16dd790d0fff7b75f5488fb8 AMD IOMMU: allocate IRTE entries instead of using a static mapping, makes dom0 boot process stall several times.

Friday, August 16, 2013, 3:15:50 PM, you wrote:

>>>> On 16.08.13 at 12:44, Sander Eikelenboom <linux@xxxxxxxxxxxxxx> wrote:
>> Friday, August 16, 2013, 11:18:56 AM, you wrote:
>>>>>> On 16.08.13 at 10:40, Sander Eikelenboom <linux@xxxxxxxxxxxxxx> wrote:
>>>> Hmm only the "no-cpuidle" is needed (cpufreq=xen can stay) to make the 
>>>> stalls 
>>>> disappear,
>>>> but makes me wonder how that is related to the commit the bisection found 
>>>> ..
>>>> machine has been running with cpuidle enabled for ages ..
>>> That's odd indeed. If you're up to do a little bit of debugging here,
>>> why don't you log the sequence of interrupts arriving both with
>>> and without said commit. This might end up being a lot of data, so
>>> you may want to filter out uninteresting stuff and/or log only to
>>> a memory buffer which then gets dumped upon some debug key
>>> press.
>> Hmm making said debug patch is getting probably a bit out of my league ..
>> since the generated interrupts will probably outpace flushing to the 
>> console.
>> And i'm not sure in what things you are actually interested around the irq 
>> flow (probably the hpet msi ones ?).

> No, much more the ones from the devices that you say the drivers
> of which cause the stalls while initializing. The question mainly is
> whether the distribution of interrupts between CPUs changed in a
> way that made the system more susceptible to missing wakeups
> via HPET MSIs.

> Along that lines was also the question regarding interrupt counts
> for the devices in question, which if I'm not mistaken you didn't
> answer yet.

Hi Jan,

Got things running again, first on bare-metal Linux.
It seems i was having 2 separate problems:

1) The southbridge ioapic(6) isn't in the IVRS tables. The Linux kernel has had
   a patch for this, "iommu/amd: Add ioapic and hpet ivrs override".

   That patch makes it possible to override the incorrect IVRS tables on the
   command line.
   Using ivrs_ioapic[6]=00:14.0 ivrs_ioapic[7]=00:00.1 ivrs_hpet[0]=00:14.0
   made it boot correctly and enable the iommu and interrupt remapping.

   That patch would probably be a good candidate to port to Xen too,
   considering the iommu massacre on at least boards with a 890fx chipset.
   (Or make it a quirk for these earlier chipsets, since the BDFs for the
   northbridge and southbridge ioapic and hpet seem to be known fixed values,
   from what i read in earlier mailing list posts and code comments. But perhaps
   Suravee could comment on that ... he seems to have tested the patch for
   Linux as well.)

2) After that my sata controller gave read errors, but i found it was still in
   "ide" mode instead of "ahci" in the bios. A separate but perhaps related
   issue (probably something with the enabling of multiple msi's, which the
   driver can not handle in "ide" mode; will sort that out later).

After i got things working on bare-metal Linux, i adjusted Xen and hardcoded it
to add a mapping for ioapic[6]=00:14.0.
(The entries for ivrs_ioapic[7] and hpet[0] are actually correct in the bios
tables, so they don't need correction for me at the moment.)

And hey presto .. no stalls ..

So i think porting the override patch to Xen (or making it a quirk and ignoring
the IVRS table for special devices on certain chipsets) could solve a lot of the
reported iommu problems for AMD systems.
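
To make the idea concrete, here is a rough sketch of the override logic in
Python. It is only an illustration, not the Linux patch and not proposed Xen
code. It assumes the option syntax is exactly what the examples above show
(decimal id in brackets, then bus:dev.fn in hex); the names Override,
parse_override and apply_overrides are made up for this sketch.

```python
import re
from typing import Dict, Iterable, NamedTuple, Tuple


class Override(NamedTuple):
    kind: str    # "ioapic" or "hpet"
    dev_id: int  # IOAPIC or HPET id, the number in brackets
    bus: int
    dev: int
    fn: int


# Assumed syntax, taken from the examples in this mail:
#   ivrs_ioapic[6]=00:14.0   ivrs_hpet[0]=00:14.0
_PATTERN = re.compile(
    r"ivrs_(ioapic|hpet)\[(\d+)\]="
    r"([0-9a-fA-F]{1,2}):([0-9a-fA-F]{1,2})\.([0-7])"
)

Key = Tuple[str, int]
Bdf = Tuple[int, int, int]


def parse_override(arg: str) -> Override:
    """Parse one command-line override into its id and bus/dev/fn parts."""
    m = _PATTERN.fullmatch(arg)
    if m is None:
        raise ValueError(f"not a valid ivrs override: {arg!r}")
    kind, dev_id, bus, dev, fn = m.groups()
    dev_num = int(dev, 16)
    if dev_num > 0x1F:  # a PCI device number is only 5 bits wide
        raise ValueError(f"device number out of range: {arg!r}")
    return Override(kind, int(dev_id), int(bus, 16), dev_num, int(fn))


def apply_overrides(firmware: Dict[Key, Bdf],
                    overrides: Iterable[Override]) -> Dict[Key, Bdf]:
    """Return the firmware mapping with overrides added or replacing entries."""
    merged = dict(firmware)
    for o in overrides:
        merged[(o.kind, o.dev_id)] = (o.bus, o.dev, o.fn)
    return merged


# What the bios tables report on this board: ioapic 6 is missing.
firmware = {("ioapic", 7): (0x00, 0x00, 1), ("hpet", 0): (0x00, 0x14, 0)}
merged = apply_overrides(firmware, [parse_override("ivrs_ioapic[6]=00:14.0")])
```

With the single override for ioapic 6 applied, merged holds all three entries,
and the two that the bios already reported correctly are left untouched. A
chipset quirk would amount to the same merge step, just with the overrides
coming from a built-in table instead of the command line.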


> Jan
