
Re: How to express "externally managed" IOMMU domains for VFIO/IOMMUFD ?



Hello and thanks for your response.

On 23/04/2026 at 10:05, Tian, Kevin wrote:
>> From: Teddy Astie
>> Sent: Wednesday, April 22, 2026 11:59 PM
>>
>> Hello,
>>
>> On Xen, for PV-IOMMU [1], we have IOMMU support in Dom0, which in
>> particular allows using VFIO and IOMMUFD from Dom0.
>>
>> However, its interactions with PCI Passthrough are unclear, and it would
> 
> VFIO manages PCI passthrough. since it's already allowed which part of
> interaction is unclear?
> 

AIUI, VFIO has no real knowledge of what a "virtual machine" is (at 
least not in a way that would suffice for us), hence it doesn't really 
do PCI Passthrough on its own.

For instance, the DMA remapping aspect of QEMU's PCI Passthrough is 
implemented by keeping the VFIO DMA mappings in sync with "guest 
memory". We can't really do that in our case, as we don't have full 
control over, or a full view of, guest memory.
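
To illustrate what that sync roughly looks like from userspace, here is 
a minimal sketch using the legacy VFIO type1 container API. The helper 
names (guest_ram_to_dma_map, map_guest_ram) are made up for this mail 
and are not QEMU's actual code; the point is only that the VMM has to 
supply a host virtual address for every guest RAM region, which is 
exactly what we can't do from Dom0.

```c
/* Sketch only: mirroring one guest RAM region into a VFIO type1
 * container. Helper names are hypothetical, not taken from QEMU. */
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Describe one guest RAM region: the guest-physical address is used as
 * the IOVA, backed by memory the VMM has mapped at host_va. */
static struct vfio_iommu_type1_dma_map
guest_ram_to_dma_map(void *host_va, uint64_t guest_pa, uint64_t size)
{
    struct vfio_iommu_type1_dma_map map;

    assert(size != 0);
    memset(&map, 0, sizeof(map));
    map.argsz = sizeof(map);
    map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
    map.vaddr = (uint64_t)(uintptr_t)host_va; /* needs a host view of guest RAM */
    map.iova  = guest_pa;
    map.size  = size;
    return map;
}

/* Install the mapping. container_fd is assumed to be an open and
 * already configured /dev/vfio/vfio file descriptor.
 * Returns 0 on success, -1 with errno set on failure. */
static int map_guest_ram(int container_fd, void *host_va,
                         uint64_t guest_pa, uint64_t size)
{
    struct vfio_iommu_type1_dma_map map =
        guest_ram_to_dma_map(host_va, guest_pa, size);

    return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
}
```

The VMM would repeat this (and the matching unmap) whenever the guest 
memory layout changes.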

>> be preferable to let the kernel handle some of this logic. That would
>> for instance avoid situations where toolstack causes Xen and Linux to go
>> out of sync on where devices belong.
> 
> what is 'some of this logic' and what is the exact out-of-sync scenario?
> 

I mean letting the kernel handle the PCI Passthrough lifecycle.

For now, the userland ("toolstack") performs passthrough-related 
operations on behalf of the kernel, i.e. moving a device into the guest. 
As a result, the Linux IOMMU subsystem thinks the device is in a 
specific IOMMU domain while it actually isn't. In particular, this 
causes the Linux IOMMU logic to misbehave and the device to eventually 
DMA to the wrong places.

The idea isn't really to "fix" this specific case, but rather to provide 
an alternative where the kernel orchestrates PCI Passthrough instead, so 
that the logic lives in one place.

>>
>> On Xen, we have a dedicated hypercalls for moving a device into another
>> guest (so it no longer belongs in Dom0, at far as DMA is concerned).
>>
>> But it looks like there are no way to describe that idea of "attach that
>> device to this VM" nor "the device is in a VM"; which makes that
>> impracticable.
>>
>> There may be things that could be done with the vIOMMU objects, but
>> there would be no "parent domain" in such case, as said earlier it
>> doesn't exist in the IOMMU subsystem.
>>
>> What is expected to be done instead ?
>>
>> Teddy
>>
>> [1] https://www.youtube.com/watch?v=pLMGRgEJ-Eg
>>
> 
> It'd be much easier to collect comments if you can put plain words
> to explain the problem rather than expecting other folks to watch
> the video first...

The video is more for additional context; it's not really directly 
related to this issue.

Teddy


--
Teddy Astie | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

 

