Re: [Xen-devel] Xen virtual IOMMU high level design doc
Hi Jan:
Sorry for the late response. Thanks a lot for your comments.

On 2016-08-25 19:11, Jan Beulich wrote:
>>>> On 17.08.16 at 14:05, <tianyu.lan@xxxxxxxxx> wrote:
>> 1 Motivation for Xen vIOMMU
>> ============================================================================
>> ===
>> 1.1 Enable more than 255 vcpu support
>> HPC virtualization requires more than 255 vcpus support in a single VM
>> to meet parallel computing requirement. More than 255 vcpus support
>> requires interrupt remapping capability present on vIOMMU to deliver
>> interrupt to #vcpu >255 Otherwise Linux guest fails to boot up with >255
>> vcpus if interrupt remapping is absent.
>
> I continue to question this as a valid motivation at this point in
> time, for the reasons Andrew has been explaining.

If we want to support Linux guests with >255 vcpus, interrupt remapping
is necessary. The Linux commit that introduced x2apic and IR mode states
that IR is a prerequisite for enabling x2apic mode in the CPU.
https://lwn.net/Articles/289881/

So far we are not sure about the behavior of other OSes. We may check
Windows guest behavior on KVM later, but there is currently still a bug
in running Windows guests with the IR function on KVM.

>
>> 2. Xen vIOMMU Architecture
>> ============================================================================
>> ====
>>
>> * vIOMMU will be inside Xen hypervisor for following factors
>> 1) Avoid round trips between Qemu and Xen hypervisor
>> 2) Ease of integration with the rest of the hypervisor
>> 3) HVMlite/PVH doesn't use Qemu
>> * Dummy xen-vIOMMU in Qemu as a wrapper of new hypercall to create
>> /destory vIOMMU in hypervisor and deal with virtual PCI device's 2th
>> level translation.
>
> How does the create/destroy part of this match up with 3) right
> ahead of it?

The create/destroy hypercalls will work for both HVM and HVMlite.
We assume HVMlite has a tool stack (e.g. libxl) that can call the new
hypercalls to create or destroy the virtual IOMMU in the hypervisor.

>
>> 3 Xen hypervisor
>> ==========================================================================
>>
>> 3.1 New hypercall XEN_SYSCTL_viommu_op
>> 1) Definition of "struct xen_sysctl_viommu_op" as new hypercall parameter.
>>
>> struct xen_sysctl_viommu_op {
>>     u32 cmd;
>>     u32 domid;
>>     union {
>>         struct {
>>             u32 capabilities;
>>         } query_capabilities;
>>         struct {
>>             u32 capabilities;
>>             u64 base_address;
>>         } create_iommu;
>>         struct {
>>             u8 bus;
>>             u8 devfn;
>
> Please can we avoid introducing any new interfaces without segment/
> domain value, even if for now it'll be always zero?

Sure. Will add a segment field.

>
>>             u64 iova;
>>             u64 translated_addr;
>>             u64 addr_mask; /* Translation page size */
>>             IOMMUAccessFlags permisson;
>>         } 2th_level_translation;
>
> I suppose "translated_addr" is an output here, but for the following
> fields this already isn't clear. Please add IN and OUT annotations for
> clarity.
>
> Also, may I suggest to name this "l2_translation"? (But there are
> other implementation specific things to be considered here, which
> I guess don't belong into a design doc discussion.)

How about this?

    struct {
        /* IN parameters. */
        u8 segment;
        u8 bus;
        u8 devfn;
        u64 iova;
        /* OUT parameters. */
        u64 translated_addr;
        u64 addr_mask; /* Translation page size */
        IOMMUAccessFlags permission;
    } l2_translation;

>
>> };
>>
>> typedef enum {
>>     IOMMU_NONE = 0,
>>     IOMMU_RO = 1,
>>     IOMMU_WO = 2,
>>     IOMMU_RW = 3,
>> } IOMMUAccessFlags;
>>
>>
>> Definition of VIOMMU subops:
>> #define XEN_SYSCTL_viommu_query_capability 0
>> #define XEN_SYSCTL_viommu_create 1
>> #define XEN_SYSCTL_viommu_destroy 2
>> #define XEN_SYSCTL_viommu_dma_translation_for_vpdev 3
>>
>> Definition of VIOMMU capabilities
>> #define XEN_VIOMMU_CAPABILITY_1nd_level_translation (1 << 0)
>> #define XEN_VIOMMU_CAPABILITY_2nd_level_translation (1 << 1)
>
> l1 and l2 respectively again, please.

Will update.

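For reference, the review points on section 3.1 (segment field, IN/OUT annotations, l1/l2 naming) could be folded together roughly as below. This is only a sketch under stated assumptions, not the final interface: it uses <stdint.h> types so that it compiles outside the Xen tree, the union name "u", the capability macro names and the helper viommu_prepare_l2_translation() are hypothetical, and the segment is widened to 16 bits (an assumption that departs from the u8 proposed above, made because PCI segment group numbers are 16 bits wide in the ACPI MCFG table). Note also that "2th_level_translation" could not be used as a C identifier at all, since it starts with a digit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Access permissions reported for a translation (values from the proposal). */
typedef enum {
    IOMMU_NONE = 0,
    IOMMU_RO   = 1,
    IOMMU_WO   = 2,
    IOMMU_RW   = 3,
} IOMMUAccessFlags;

/* Sub-operations, as listed in the proposal. */
#define XEN_SYSCTL_viommu_query_capability          0
#define XEN_SYSCTL_viommu_create                    1
#define XEN_SYSCTL_viommu_destroy                   2
#define XEN_SYSCTL_viommu_dma_translation_for_vpdev 3

/* Capability bits with the l1/l2 naming asked for in review (hypothetical names). */
#define XEN_VIOMMU_CAPABILITY_l1_translation (1u << 0)
#define XEN_VIOMMU_CAPABILITY_l2_translation (1u << 1)

struct xen_sysctl_viommu_op {
    uint32_t cmd;
    uint32_t domid;
    union {
        struct {
            uint32_t capabilities;
        } query_capabilities;
        struct {
            uint32_t capabilities;
            uint64_t base_address;
        } create_iommu;
        struct {
            /* IN parameters. */
            uint16_t segment;   /* assumption: 16 bits rather than u8 */
            uint8_t  bus;
            uint8_t  devfn;
            uint64_t iova;
            /* OUT parameters. */
            uint64_t translated_addr;
            uint64_t addr_mask; /* Translation page size */
            IOMMUAccessFlags permission;
        } l2_translation;
    } u;                        /* union name "u" is an assumption */
};

/*
 * Hypothetical helper: fill in the IN fields of an l2 translation request
 * and clear the OUT fields so stale data is never mistaken for a result.
 */
static void viommu_prepare_l2_translation(struct xen_sysctl_viommu_op *op,
                                          uint32_t domid, uint16_t segment,
                                          uint8_t bus, uint8_t devfn,
                                          uint64_t iova)
{
    assert(op != NULL);
    memset(op, 0, sizeof(*op));
    op->cmd = XEN_SYSCTL_viommu_dma_translation_for_vpdev;
    op->domid = domid;
    op->u.l2_translation.segment = segment;
    op->u.l2_translation.bus = bus;
    op->u.l2_translation.devfn = devfn;
    op->u.l2_translation.iova = iova;
    op->u.l2_translation.permission = IOMMU_NONE;
}
```

A caller would prepare the request with this helper, issue the hypercall, and only then read translated_addr, addr_mask and permission.
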
>
>> 3.3 Interrupt remapping
>> Interrupts from virtual devices and physical devices will be delivered
>> to vlapic from vIOAPIC and vMSI. It needs to add interrupt remapping
>> hooks in the vmsi_deliver() and ioapic_deliver() to find target vlapic
>> according interrupt remapping table. The following diagram shows the logic.
>
> Missing diagram or stale sentence?

Sorry. It's a stale sentence; the diagram has moved to 2.2 Interrupt
remapping overview.

>
>> 3.5 Implementation consideration
>> Linux Intel IOMMU driver will fail to be loaded without 2th level
>> translation support even if interrupt remapping and 1th level
>> translation are available. This means it's needed to enable 2th level
>> translation first before other functions.
>
> Is there a reason for this? I.e. do they unconditionally need that
> functionality?

Yes, the Linux Intel IOMMU driver unconditionally needs l2 translation.
The driver checks whether there is a valid SAGAW (Supported Adjusted
Guest Address Widths) while initializing the IOMMU data structure and
returns an error if not.

-- 
Best regards
Tianyu Lan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
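
The SAGAW check described in the reply on section 3.5 can be illustrated with a small sketch. It is loosely modelled on that description and is not copied from the Linux driver; the names CAP_SAGAW, highest_supported_agaw() and agaw_to_width() are chosen here for illustration. It assumes the VT-d layout in which SAGAW occupies bits 12:8 of the capability register, with bit 1 meaning a 39-bit (3-level) and bit 2 a 48-bit (4-level) second-level page table. A vIOMMU that reports an empty SAGAW field leaves the guest driver with no usable width, which matches the failure described above.

```c
#include <assert.h>
#include <stdint.h>

/* SAGAW field: bits 12:8 of the VT-d capability register. */
#define CAP_SAGAW(cap) ((unsigned int)(((cap) >> 8) & 0x1f))

/*
 * Return the highest AGAW index advertised in SAGAW, or -1 if the field
 * is empty (no second-level translation width is supported).
 */
static int highest_supported_agaw(uint64_t cap)
{
    unsigned int sagaw = CAP_SAGAW(cap);

    for (int agaw = 4; agaw >= 0; agaw--) {
        if (sagaw & (1u << agaw))
            return agaw;
    }
    return -1;
}

/* Address width in bits for an AGAW index: 30 bits plus 9 per index step. */
static int agaw_to_width(int agaw)
{
    assert(agaw >= 0);
    return 30 + 9 * agaw;
}
```

With this model, a capability value whose SAGAW field is 0x4 yields AGAW index 2 (48-bit), while a value with SAGAW equal to zero yields -1, the case in which driver initialization would return an error.
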