[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 0/4] x86/IOMMU: multi-vector MSI prerequisites



On 04/11/2013 02:13 AM, Jan Beulich wrote:
On 11.04.13 at 03:51, Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx> 
wrote:
- From drivers/passthrough/amd/iommu_init.c: enable_iommu(), the
enabling code tries to set MSI affinity for IOMMU.
- From xen/arch/x86/msi.c: set_msi_affinity(), the code tries to
"read_msi_msg()" which eventually calls "amd_iommu_read_msi_from_ire()".
- Please note that the "amd_iommu_read_msi_from_ire()" was originally
empty prior the patch.
- From the patch, the new code try to call "get_intremap_entry()", which
is the function that was asserting due to table is NULL.
But even before the patch update_intremap_entry_from_msi_msg()
also necessarily called get_intremap_entry() for the IOMMU device
itself. And the respective write_msi_msg() call follows right after the
read_msi_msg() one, i.e. there's no room for other setup to be done
by that point. Plus the assertion was there already.

Here, the MSI read message was sent to read IOMMU registers.  Since the
message is targeting for IOMMU itself (bdf 0x2 in this case), it should
not be remapped.  That is why I believe the interrupt remapping table is
empty.
Whether or not the MSI for the IOMMU device itself needs to be
remapped is an independent question, but judging from the
current code as well as from the specification not saying otherwise
it needs to be. Confirmation on this would surely be appreciated
(because otherwise (a) I'd need to understand how this special
case works with the code before the patch and (b) new special
casing may need to be added).

In any case - I think I said this before already - seeing a full log up
to the crash with "iommu=debug" in place might help. After all the
ACPI tables ought to mention the IOMMU PCI device in some way,
and hence there ought to be an IVRS mapping as well as an
associated intremap table. Or else the only way this can work
currently would be by way of find_iommu_for_device() returning
NULL for the IOMMU PCI device (which would mean a similar check
would need to be added to amd_iommu_read_msi_from_ire(), but
it would then be bogus for the failure there to cause a log message
to be issued if iommu_debug - that should be trivial to verify on
an unpatched or older [say 4.2.x based] hypervisor passing
"iommu=debug").

Jan

I have attached the xl dmesg output from the machine which run the non-patched Xen (w/ iommu=debug). It is showing that there is no IVHD entry setting up for the bdf 0x2 (the IOMMU). Also, there is a line "AMD-Vi: Fail to find iommu for MSI device id = 0x2) which can be traced to xen/drivers/passthrough/amd/iommu_intr.c: amd_iommu_msi_msg_update_ire() which is called from xen/arch/x86/msi.c: write_msi_msg().

When comparing with other AMD systems in the past (i.e. with SR56XX chipset), the IVHD contains IVHD entries such as below:
(XEN) AMD-Vi: IVHD Device Entry: type 0x3 id 0 flags 0
(XEN) AMD-Vi:  Dev_Id Range: 0 -> 0x2

which will generate the mapping entry for bfd 0x0, 0x1, 0x2.

Suravee
PS: I could not get the serial console out from the trouble target system (Trinity based), so I could not get the crash message. But from the prior email I sent to first report the issue, you can see that it failed during "enable_iommu".

Attachment: tn.xldmesg
Description: tn.xldmesg

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.