Re: Proposal for virtual IOMMU binding b/w vIOMMU and passthrough devices
On 26/10/2022 15:33, Rahul Singh wrote:
> Hi Julien,

Hi Rahul,

> On 26 Oct 2022, at 2:36 pm, Julien Grall <julien@xxxxxxx> wrote:
>> On 26/10/2022 14:17, Rahul Singh wrote:
>>> Hi All,
>>
>> Hi Rahul,
>>
>>> At Arm, we started to implement the POC to support 2 levels of page tables/nested translation in SMMUv3. To support nested translation for a guest OS, Xen needs to expose the virtual IOMMU. If we pass through a device that is behind an IOMMU to the guest, and virtual IOMMU is enabled for the guest, there is a need to add an IOMMU binding for the device in the passthrough node as per [1]. This email is to get an agreement on how to add the IOMMU binding for the guest OS.
>>>
>>> Before I explain how to add the IOMMU binding, let me give a brief overview of how we will add support for virtual IOMMU on Arm. In order to implement virtual IOMMU, Xen needs SMMUv3 nested translation support. SMMUv3 hardware supports two stages of translation. Each stage of translation can be independently enabled. An incoming address is logically translated from VA to IPA in stage 1, then the IPA is input to stage 2, which translates the IPA to the output PA.
>>>
>>> Stage 1 is intended to be used by a software entity (guest OS) to provide isolation or translation to buffers within the entity, for example, DMA isolation within an OS. Stage 2 is intended to be available in systems supporting the Virtualization Extensions and is intended to virtualize device DMA to guest VM address spaces. When both stage 1 and stage 2 are enabled, the translation configuration is called nesting.
>>>
>>> Stage 1 translation support is required to provide isolation between different devices within the guest OS. Xen already supports Stage 2 translation, but there is no support for Stage 1 translation for guests. We will add support for guests to configure the Stage 1 translation via virtual IOMMU. Xen will emulate the SMMU hardware and expose the virtual SMMU to the guest. The guest can use the native SMMU driver to configure the stage 1 translation.
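As a rough illustration of the nesting described above (VA to IPA in stage 1, then IPA to PA in stage 2), here is a minimal Python sketch. It is a toy model, not SMMUv3 or Xen code: page tables are plain dictionaries, and every name and address in it (`translate`, `nested_translate`, the page numbers) is hypothetical.

```python
# Toy model only: each "page table" is a dict mapping page number -> page number.
PAGE_SHIFT = 12
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def translate(table, addr):
    """Translate addr through a single stage; fault if the page is unmapped."""
    page, offset = addr >> PAGE_SHIFT, addr & PAGE_MASK
    if page not in table:
        raise LookupError(f"translation fault at {addr:#x}")
    return (table[page] << PAGE_SHIFT) | offset


def nested_translate(stage1, stage2, va):
    """Stage 1 (guest-owned) maps VA -> IPA, stage 2 (Xen-owned) maps IPA -> PA."""
    ipa = translate(stage1, va)
    return translate(stage2, ipa)


# Hypothetical mappings, chosen only for the example.
stage1 = {0x10: 0x40}    # guest: VA page 0x10 -> IPA page 0x40
stage2 = {0x40: 0x9000}  # Xen:   IPA page 0x40 -> PA page 0x9000

pa = nested_translate(stage1, stage2, 0x10ABC)
print(hex(pa))
```

In this model the guest only ever edits `stage1`, while `stage2` stays under Xen's control, which is why a fault in either stage stops the DMA access.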
>>> When the guest configures the SMMU for Stage 1, Xen will trap the access and configure the hardware accordingly.
>>>
>>> Now back to the question of how we can add the IOMMU binding between the virtual IOMMU and the master devices so that guests can configure the IOMMU correctly. The solution that I am suggesting is as below:
>>>
>>> For dom0, while handling the DT node (handle_node()), Xen will replace the phandle in the "iommus" property with the virtual IOMMU node phandle.
>>
>> Below, you said that each IOMMU may have a different ID space. So shouldn't we expose one vIOMMU per pIOMMU? If not, how do you expect the user to specify the mapping?
>
> Yes, you are right: we need to create one vIOMMU per pIOMMU for dom0. This also helps in the ACPI case, where we don't need to modify the tables to delete the pIOMMU entries and create one vIOMMU. In this case there is no need to replace the phandle, as Xen creates the vIOMMU with the same phandle and the same base address as the pIOMMU.
>
> For domU guests, one vIOMMU per guest will be created.

IIRC, the SMMUv3 uses a ring like the GICv3 ITS. I think we need to be open here, because this may end up being tricky to security support (we have N guest rings that can write to M host rings).
>> I am confused. Below, you are making the virtual master ID optional. So shouldn't this be mandatory if you really need the mapping with the virtual ID?
>
> The device tree node will be used to assign the device to the guest and configure the Stage-2 translation. The guest will use the vMaster-ID to configure the vIOMMU during boot. Xen needs information to link the vMaster-ID to the pMaster-ID in order to configure the corresponding pIOMMU. As I mentioned, we need the vMaster-ID in case a system has two identical Master-IDs, each connected to a different SMMU, and both assigned to the guest.

I am afraid I still don't understand why this is a requirement. Libxl could have enough knowledge (which will be necessary for the PCI case) to know the IOMMU and pMasterID associated with a device. So libxl could allocate the vMasterID, tell Xen the corresponding mapping, and update the device tree.

IOW, it doesn't seem necessary to involve the user in the process here.
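To make the suggestion concrete, here is a minimal Python sketch of a toolstack-side allocator that hands out vMasterIDs and remembers which (pIOMMU, pMasterID) pair each one stands for. This is only an assumption about how such bookkeeping might look: `VMasterIdAllocator`, `assign`, `resolve`, and the IOMMU names are all hypothetical and are not part of libxl or Xen.

```python
class VMasterIdAllocator:
    """Hypothetical per-guest table: vMasterID <-> (pIOMMU, pMasterID)."""

    def __init__(self):
        self._by_physical = {}  # (piommu, pmaster_id) -> vmaster_id
        self._by_virtual = {}   # vmaster_id -> (piommu, pmaster_id)

    def assign(self, piommu, pmaster_id):
        """Return the vMasterID for a device, allocating one on first use."""
        key = (piommu, pmaster_id)
        if key not in self._by_physical:
            vmaster_id = len(self._by_virtual)  # next unused virtual ID
            self._by_physical[key] = vmaster_id
            self._by_virtual[vmaster_id] = key
        return self._by_physical[key]

    def resolve(self, vmaster_id):
        """Map a vMasterID programmed by the guest back to the physical pair."""
        return self._by_virtual[vmaster_id]


alloc = VMasterIdAllocator()
# Two devices with the same physical Master-ID behind different SMMUs.
vid_a = alloc.assign("smmu0", 0x10)
vid_b = alloc.assign("smmu1", 0x10)
print(vid_a, vid_b, alloc.resolve(vid_b))
```

Because the key is the (pIOMMU, pMasterID) pair, two devices that share a physical Master-ID behind different SMMUs still receive distinct virtual IDs, without the user having to choose them.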
This means that libxl will need to know the pMasterID associated with a PCI device. So, I don't understand why you can't do the same for platform devices.
Ok. I think it would be better to use very different phandles in your example so it doesn't look like a mistake.

Cheers,

--
Julien Grall