Re: Proposal for virtual IOMMU binding b/w vIOMMU and passthrough devices
Hi Michal,
> On 26 Oct 2022, at 6:17 pm, Michal Orzel <michal.orzel@xxxxxxx> wrote:
>
> Hi Rahul,
>
> On 26/10/2022 16:33, Rahul Singh wrote:
>>
>>
>> Hi Julien,
>>
>>> On 26 Oct 2022, at 2:36 pm, Julien Grall <julien@xxxxxxx> wrote:
>>>
>>>
>>>
>>> On 26/10/2022 14:17, Rahul Singh wrote:
>>>> Hi All,
>>>
>>> Hi Rahul,
>>>
>>>> At Arm, we started to implement the POC to support 2 levels of page
>>>> tables/nested translation in SMMUv3.
>>>> To support nested translation for guest OS Xen needs to expose the virtual
>>>> IOMMU. If we passthrough the
>>>> device to the guest that is behind an IOMMU and virtual IOMMU is enabled
>>>> for the guest there is a need to
>>>> add IOMMU binding for the device in the passthrough node as per [1]. This
>>>> email is to get an agreement on
>>>> how to add the IOMMU binding for guest OS.
>>>> Before I will explain how to add the IOMMU binding let me give a brief
>>>> overview of how we will add support for virtual
>>>> IOMMU on Arm. In order to implement virtual IOMMU Xen needs SMMUv3 Nested
>>>> translation support. SMMUv3 hardware
>>>> supports two stages of translation. Each stage of translation can be
>>>> independently enabled. An incoming address is logically
>>>> translated from VA to IPA in stage 1, then the IPA is input to stage 2
>>>> which translates the IPA to the output PA. Stage 1 is
>>>> intended to be used by a software entity (Guest OS) to provide isolation
>>>> or translation to buffers within the entity, for example,
>>>> DMA isolation within an OS. Stage 2 is intended to be available in systems
>>>> supporting the Virtualization Extensions and is
>>>> intended to virtualize device DMA to guest VM address spaces. When both
>>>> stage 1 and stage 2 are enabled, the translation
>>>> configuration is called nesting.
>>>> Stage 1 translation support is required to provide isolation between
>>>> different devices within the guest OS. XEN already supports
>>>> Stage 2 translation but there is no support for Stage 1 translation for
>>>> guests. We will add support for guests to configure
>>>> the Stage 1 translation via virtual IOMMU. XEN will emulate the SMMU
>>>> hardware and expose the virtual SMMU to the guest.
>>>> Guest can use the native SMMU driver to configure the stage 1 translation.
>>>> When the guest configures the SMMU for Stage 1,
>>>> XEN will trap the access and configure the hardware accordingly.
>>>> Now back to the question of how we can add the IOMMU binding between the
>>>> virtual IOMMU and the master devices so that
>>>> guests can configure the IOMMU correctly. The solution that I am
>>>> suggesting is as below:
>>>> For dom0, while handling the DT node(handle_node()) Xen will replace the
>>>> phandle in the "iommus" property with the virtual
>>>> IOMMU node phandle.
>>> Below, you said that each IOMMUs may have a different ID space. So
>>> shouldn't we expose one vIOMMU per pIOMMU? If not, how do you expect the
>>> user to specify the mapping?
>>
>> Yes you are right we need to create one vIOMMU per pIOMMU for dom0. This
>> also helps in the ACPI case
>> where we don’t need to modify the tables to delete the pIOMMU entries and
>> create one vIOMMU.
>> In this case, there is no need to replace the phandle as Xen creates the vIOMMU with
>> the same pIOMMU
>> phandle and same base address.
>>
>> For domU guests one vIOMMU per guest will be created.
>>
>>>
>>>> For domU guests, when passing through the device to the guest as per [2], add
>>>> the below property in the partial device tree
>>>> node that is required to describe the generic device tree binding for
>>>> IOMMUs and their master(s)
>>>> "iommus = < &magic_phandle 0xvMasterID>"
>>>> • magic_phandle will be the phandle ( vIOMMU phandle in xl) that will
>>>> be documented so that the user can set that in partial DT node (0xfdea).
>>>
>>> Does this mean only one IOMMU will be supported in the guest?
>>
>> Yes.
>>
>>>
>>>> • vMasterID will be the virtual master ID that the user will provide.
>>>> The partial device tree will look like this:
>>>> /dts-v1/;
>>>> / {
>>>>     /* #*cells are here to keep DTC happy */
>>>>     #address-cells = <2>;
>>>>     #size-cells = <2>;
>>>>     aliases {
>>>>         net = &mac0;
>>>>     };
>>>>     passthrough {
>>>>         compatible = "simple-bus";
>>>>         ranges;
>>>>         #address-cells = <2>;
>>>>         #size-cells = <2>;
>>>>         mac0: ethernet@10000000 {
>>>>             compatible = "calxeda,hb-xgmac";
>>>>             reg = <0 0x10000000 0 0x1000>;
>>>>             interrupts = <0 80 4 0 81 4 0 82 4>;
>>>>             iommus = <0xfdea 0x01>;
>>>>         };
>>>>     };
>>>> };
>>>> In xl.cfg we need to define a new option to inform Xen about the vMasterId to
>>>> pMasterId mapping and to which IOMMU device
>>>> the master device is connected so that Xen can configure the right IOMMU.
>>>> This is required if the system has devices that have
>>>> the same master ID but behind a different IOMMU.
>>>
>>> In xl.cfg, we already pass the device-tree node path to passthrough. So Xen
>>> should already have all the information about the IOMMU and Master-ID. So
>>> it doesn't seem necessary for Device-Tree.
>>>
>>> For ACPI, I would have expected the information to be found in the IOREQ.
>>>
>>> So can you add more context why this is necessary for everyone?
>>
>> We have information for IOMMU and Master-ID but we don’t have information
>> for linking vMaster-ID to pMaster-ID.
>> The device tree node will be used to assign the device to the guest and
>> configure the Stage-2 translation. Guest will use the
>> vMaster-ID to configure the vIOMMU during boot. Xen needs information to
>> link vMaster-ID to pMaster-ID to configure
>> the corresponding pIOMMU. As I mentioned, we need the vMaster-ID in case a system
>> has two identical Master-IDs, each one behind a different SMMU, and both
>> devices are assigned to the guest.
>
> I think the proposed solution would work and I would just like to clear some
> issues.
>
> Please correct me if I'm wrong:
>
> In the xl config file we already need to specify dtdev to point to the device
> path in host dtb.
> In the partial device tree we specify the vMasterId as well as magic phandle.
> Isn't it that we already have all the information necessary without the need
> for iommu_devid_map?
> For me it looks like the partial dtb provides vMasterID and dtdev provides
> pMasterID as well as physical phandle to SMMU.
>
> Having said that, I can also understand that specifying everything in one
> place using iommu_devid_map can be easier
> and reduces the need for device tree parsing.
>
> Apart from that, what is the reason for exposing only one vSMMU to guest
> instead of one vSMMU per pSMMU?
> In the latter solution, the whole issue with handling devices with the same
> stream ID but belonging to different SMMUs
> would be gone. It would also result in a more natural-looking device
> tree. Normally a guest would see
> e.g. both SMMUs and exposing only one can be misleading.
Please see my reply to Julien in the other email for the answer to the
above question.
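
To make the vMaster-ID to pMaster-ID link discussed above concrete, here is a
minimal Python sketch of the kind of per-guest table that would be needed. It
is only an illustration under stated assumptions: the names VIommuMap,
PhysicalStream, add and resolve, as well as the SMMU node paths, are
hypothetical and are not existing Xen or xl interfaces.

```python
from typing import Dict, NamedTuple


class PhysicalStream(NamedTuple):
    """Where a guest-visible master ID really lives on the host."""
    piommu: str      # host DT path of the physical IOMMU (hypothetical value)
    pmaster_id: int  # master ID as seen by that physical IOMMU


class VIommuMap:
    """Per-guest table: vMaster-ID -> (pIOMMU, pMaster-ID).

    Assumes a single vIOMMU per guest, so the vMaster-ID alone is the key.
    """

    def __init__(self) -> None:
        self._by_vid: Dict[int, PhysicalStream] = {}

    def add(self, vmaster_id: int, piommu: str, pmaster_id: int) -> None:
        # A vMaster-ID must be unique within the guest's single vIOMMU.
        if vmaster_id in self._by_vid:
            raise ValueError(f"vMaster-ID {vmaster_id:#x} already mapped")
        self._by_vid[vmaster_id] = PhysicalStream(piommu, pmaster_id)

    def resolve(self, vmaster_id: int) -> PhysicalStream:
        # Called when the guest programs the vIOMMU for this vMaster-ID;
        # raises KeyError if the guest uses an ID that was never assigned.
        return self._by_vid[vmaster_id]


# Two devices share pMaster-ID 0x10 but sit behind different SMMUs, so the
# guest sees them under distinct vMaster-IDs 0x01 and 0x02.
guest_map = VIommuMap()
guest_map.add(0x01, "/smmu@4f000000", 0x10)
guest_map.add(0x02, "/smmu@5f000000", 0x10)
target = guest_map.resolve(0x02)
```

With one vIOMMU per guest, the vMaster-ID is the only key the guest can give
us, which is why the physical IOMMU has to be stored alongside the pMaster-ID.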
Regards,
Rahul