
Re: [Xen-devel] [PATCH V2 4/25] Xen/doc: Add Xen virtual IOMMU doc



On Wed, Aug 09, 2017 at 04:34:05PM -0400, Lan Tianyu wrote:
> This patch is to add Xen virtual IOMMU doc to introduce motivation,
> framework, vIOMMU hypercall and xl configuration.
> 
> Signed-off-by: Lan Tianyu <tianyu.lan@xxxxxxxxx>
> ---
>  docs/misc/viommu.txt | 139 
> +++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 139 insertions(+)
>  create mode 100644 docs/misc/viommu.txt
> 
> diff --git a/docs/misc/viommu.txt b/docs/misc/viommu.txt
> new file mode 100644
> index 0000000..39455bb
> --- /dev/null
> +++ b/docs/misc/viommu.txt

IMHO, this should be the first patch in the series.

> @@ -0,0 +1,139 @@
> +Xen virtual IOMMU
> +
> +Motivation
> +==========
> +*) Enable more than 255 vcpu support

Seems like the "*)" is some kind of leftover?

> +HPC cloud service requires VM provides high performance parallel
> +computing and we hope to create a huge VM with >255 vcpu on one machine
> +to meet such requirement. Pin each vcpu to separate pcpus.

I would re-write this as:

Current HPC cloud services require VMs with a high number of vCPUs in
order to achieve high performance in parallel computing.

Also, this is needed in order to create VMs with > 128 vCPUs, not 255
vCPUs. That's because the APIC ID used by Xen is CPU ID * 2 (ie: CPU
127 has APIC ID 254, which is the last one available in xAPIC mode).
You should reword the paragraphs below in order to fix the mention of
255 vCPUs.
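To make the limit concrete, here is a minimal sketch (my own
illustration, not part of the patch) of the CPU ID to APIC ID mapping
described above:

```c
#include <stdint.h>

/*
 * Sketch of the mapping described above: Xen assigns
 * APIC ID = CPU ID * 2.  xAPIC encodes the APIC ID in 8 bits (with
 * 0xFF reserved for broadcast), so the last vCPU reachable without
 * x2APIC mode is CPU 127 (APIC ID 254); CPU 128 would need APIC ID
 * 256, which does not fit.
 */
static uint32_t xen_apic_id(uint32_t cpu_id)
{
    return cpu_id * 2;
}

static int fits_in_xapic(uint32_t apic_id)
{
    return apic_id < 0xFF;
}
```

So the real boundary for this series is 128 vCPUs, not 255.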

> +
> +To support >255 vcpus, X2APIC mode in guest is necessary because legacy
> +APIC(XAPIC) just supports 8-bit APIC ID and it only can support 255
> +vcpus at most. X2APIC mode supports 32-bit APIC ID and it requires
> +interrupt mapping function of vIOMMU.

Correct me if I'm wrong, but I don't think x2APIC requires vIOMMU. The
IOMMU is required so that you can route interrupts to all the possible
CPUs. One could imagine a setup where only CPUs with APIC IDs < 255 are
used as targets of external interrupts, and that doesn't require an
IOMMU.

> +The reason for this is that there is no modification to existing PCI MSI
> +and IOAPIC with the introduction of X2APIC. PCI MSI/IOAPIC can only send
> +interrupt message containing 8-bit APIC ID, which cannot address >255
> +cpus. Interrupt remapping supports 32-bit APIC ID and so it's necessary
> +to enable >255 cpus with x2apic mode.
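As a side note on the 8-bit limitation in the quoted paragraph: the
compatibility-format MSI address carries the destination APIC ID in
bits 19:12 only, which is easy to see in a sketch (my own illustration,
constants per the Intel SDM):

```c
#include <stdint.h>

/*
 * Compatibility-format MSI address: bits 31:20 are fixed at 0xFEE,
 * bits 19:12 carry the destination APIC ID.  Only 8 bits are
 * available, hence the >255-vCPU problem without interrupt remapping.
 */
#define MSI_ADDR_BASE       0xFEE00000u
#define MSI_ADDR_DEST_SHIFT 12
#define MSI_ADDR_DEST_MASK  0xFFu

static uint32_t msi_compat_address(uint8_t dest_apic_id)
{
    return MSI_ADDR_BASE |
           ((uint32_t)dest_apic_id << MSI_ADDR_DEST_SHIFT);
}

static uint8_t msi_compat_dest(uint32_t addr)
{
    return (addr >> MSI_ADDR_DEST_SHIFT) & MSI_ADDR_DEST_MASK;
}
```

Remappable-format interrupts instead carry an index into the remapping
table, which is where the 32-bit destination comes from.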
> +
> +
> +vIOMMU Architecture
> +===================
> +vIOMMU device model is inside Xen hypervisor for following factors
> +    1) Avoid round trips between Qemu and Xen hypervisor
> +    2) Ease of integration with the rest of hypervisor
> +    3) HVMlite/PVH doesn't use Qemu
> +
> +* Interrupt remapping overview.
> +Interrupts from virtual devices and physical devices are delivered
> +to vLAPIC from vIOAPIC and vMSI. vIOMMU needs to remap interrupt during
> +this procedure.
> +
> ++---------------------------------------------------+
> +|Qemu                       |VM                     |
> +|                           | +----------------+    |
> +|                           | |  Device driver |    |
> +|                           | +--------+-------+    |
> +|                           |          ^            |
> +|       +----------------+  | +--------+-------+    |
> +|       | Virtual device |  | |  IRQ subsystem |    |
> +|       +-------+--------+  | +--------+-------+    |
> +|               |           |          ^            |
> +|               |           |          |            |
> ++---------------------------+-----------------------+
> +|hypervisor     |                      | VIRQ       |
> +|               |            +---------+--------+   |
> +|               |            |      vLAPIC      |   |
> +|               |VIRQ        +---------+--------+   |
> +|               |                      ^            |
> +|               |                      |            |
> +|               |            +---------+--------+   |
> +|               |            |      vIOMMU      |   |
> +|               |            +---------+--------+   |
> +|               |                      ^            |
> +|               |                      |            |
> +|               |            +---------+--------+   |
> +|               |            |   vIOAPIC/vMSI   |   |
> +|               |            +----+----+--------+   |
> +|               |                 ^    ^            |
> +|               +-----------------+    |            |
> +|                                      |            |
> ++---------------------------------------------------+
> +HW                                     |IRQ
> +                                +-------------------+
> +                                |   PCI Device      |
> +                                +-------------------+
> +
> +
> +vIOMMU hypercall
> +================
> +Introduce new domctl hypercall "xen_domctl_viommu_op" to create/destroy
            ^ a
> +vIOMMU and query vIOMMU capabilities that device model can support.
         ^ s                                ^ the
> +
> +* vIOMMU hypercall parameter structure
> +
> +/* vIOMMU type - specify vendor vIOMMU device model */
> +#define VIOMMU_TYPE_INTEL_VTD     (1u << 0)
> +
> +/* vIOMMU capabilities */
> +#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)
> +
> +struct xen_domctl_viommu_op {
> +    uint32_t cmd;
> +#define XEN_DOMCTL_create_viommu          0
> +#define XEN_DOMCTL_destroy_viommu         1
> +#define XEN_DOMCTL_query_viommu_caps      2
> +    union {
> +        struct {
> +            /* IN - vIOMMU type  */
> +            uint64_t viommu_type;
> +            /* IN - MMIO base address of vIOMMU. */
> +            uint64_t base_address;
> +            /* IN - Length of MMIO region */
> +            uint64_t length;
> +            /* IN - Capabilities with which we want to create */
> +            uint64_t capabilities;
> +            /* OUT - vIOMMU identity */
> +            uint32_t viommu_id;
> +        } create_viommu;
> +
> +        struct {
> +            /* IN - vIOMMU identity */
> +            uint32_t viommu_id;
> +        } destroy_viommu;
> +
> +        struct {
> +            /* IN - vIOMMU type */
> +            uint64_t viommu_type;
> +            /* OUT - vIOMMU Capabilities */
> +            uint64_t capabilities;
> +        } query_caps;
> +    } u;
> +};
> +
> +- XEN_DOMCTL_query_viommu_caps
> +    Query capabilities of vIOMMU device model. vIOMMU_type specifies
> +which vendor vIOMMU device model(E,G Intel VTD) is targeted and hypervisor
> +returns capability bits(E,G interrupt remapping bit).
> +
> +- XEN_DOMCTL_create_viommu
> +    Create vIOMMU device with vIOMMU_type, capabilities, MMIO
> +base address and length. Hypervisor returns viommu_id. Capabilities should
> +be in range of value returned by query_viommu_caps hypercall.
> +
> +- XEN_DOMCTL_destroy_viommu
> +    Destroy vIOMMU in Xen hypervisor with viommu_id as parameters.
> +
> +Now just suppport single vIOMMU for one VM and introduced domtcls are 
> compatible
> +with multi-vIOMMU support.
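For clarity, here is a sketch of how a toolstack might fill the create
request (the structure and constants are copied from the quoted header;
the helper, the example base address and the surrounding domctl
plumbing are my own illustration and elided respectively):

```c
#include <stdint.h>

/* Definitions copied from the quoted patch. */
#define VIOMMU_TYPE_INTEL_VTD     (1u << 0)
#define VIOMMU_CAP_IRQ_REMAPPING  (1u << 0)

#define XEN_DOMCTL_create_viommu  0

struct xen_domctl_viommu_op {
    uint32_t cmd;
    union {
        struct {
            uint64_t viommu_type;
            uint64_t base_address;
            uint64_t length;
            uint64_t capabilities;
            uint32_t viommu_id;   /* OUT - filled by the hypervisor */
        } create_viommu;
        struct {
            uint32_t viommu_id;
        } destroy_viommu;
        struct {
            uint64_t viommu_type;
            uint64_t capabilities;
        } query_caps;
    } u;
};

/*
 * Fill a create request for an Intel vIOMMU with interrupt remapping.
 * A real toolstack would pick base/len to match the guest memory
 * layout (and should first check the capability bit via
 * XEN_DOMCTL_query_viommu_caps), then pass the structure to the
 * hypervisor via the domctl interface.
 */
static void viommu_prepare_create(struct xen_domctl_viommu_op *op,
                                  uint64_t base, uint64_t len)
{
    op->cmd = XEN_DOMCTL_create_viommu;
    op->u.create_viommu.viommu_type = VIOMMU_TYPE_INTEL_VTD;
    op->u.create_viommu.base_address = base;
    op->u.create_viommu.length = len;
    op->u.create_viommu.capabilities = VIOMMU_CAP_IRQ_REMAPPING;
}
```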
> +
> +xl vIOMMU configuration

This should be "xl x86 vIOMMU configuration", since it's clearly x86
specific.

> +=======================
> +viommu="type=intel_vtd,intremap=1,x2apic=1"

Shouldn't this have some kind of array form? From the code I saw it
seems like you are adding support for domains having multiple IOMMUs,
in which case this should at least look like:

viommu = [
    'type=intel_vtd,intremap=1,x2apic=1',
    'type=intel_vtd,intremap=1,x2apic=1'
]

But then it doesn't specify which PCI bus each vIOMMU is attached to.

Also, why do you need the x2apic parameter? Is there any value in
providing a vIOMMU if it doesn't support x2APIC mode?

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
