|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v4 01/28] Xen/doc: Add Xen virtual IOMMU doc
On Fri, Feb 09, 2018 at 12:54:11PM +0000, Roger Pau Monné wrote:
>On Fri, Nov 17, 2017 at 02:22:08PM +0800, Chao Gao wrote:
>> From: Lan Tianyu <tianyu.lan@xxxxxxxxx>
>>
>> This patch is to add Xen virtual IOMMU doc to introduce motivation,
>> framework, vIOMMU hypercall and xl configuration.
>>
>> Signed-off-by: Lan Tianyu <tianyu.lan@xxxxxxxxx>
>> Signed-off-by: Chao Gao <chao.gao@xxxxxxxxx>
>> ---
>> docs/misc/viommu.txt | 120
>> +++++++++++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 120 insertions(+)
>> create mode 100644 docs/misc/viommu.txt
>>
>> diff --git a/docs/misc/viommu.txt b/docs/misc/viommu.txt
>> new file mode 100644
>> index 0000000..472d2b5
>> --- /dev/null
>> +++ b/docs/misc/viommu.txt
>> @@ -0,0 +1,120 @@
>> +Xen virtual IOMMU
>> +
>> +Motivation
>> +==========
>> +Enable more than 128 vcpu support
>> +
>> +The current requirements of HPC cloud service requires VM with a high
>> +number of CPUs in order to achieve high performance in parallel
>> +computing.
>> +
>> +To support >128 vcpus, X2APIC mode in guest is necessary because legacy
>> +APIC(XAPIC) just supports 8-bit APIC ID. The APIC ID used by Xen is
>> +CPU ID * 2 (ie: CPU 127 has APIC ID 254, which is the last one available
>> +in xAPIC mode) and so it only can support 128 vcpus at most. x2APIC mode
>> +supports 32-bit APIC ID and it requires the interrupt remapping
>> functionality
>> +of a vIOMMU if the guest wishes to route interrupts to all available vCPUs
>> +
>> +PCI MSI/IOAPIC can only send interrupt message containing 8-bit APIC ID,
>> +which cannot address cpus with >254 APIC ID. Interrupt remapping supports
>> +32-bit APIC ID and so it's necessary for >128 vcpus support.
>> +
>> +vIOMMU Architecture
>> +===================
>> +vIOMMU device model is inside Xen hypervisor for following factors
>> + 1) Avoid round trips between Qemu and Xen hypervisor
>> + 2) Ease of integration with the rest of hypervisor
>> + 3) PVH doesn't use Qemu
>> +
>> +* Interrupt remapping overview.
>> +Interrupts from virtual devices and physical devices are delivered
>> +to vLAPIC from vIOAPIC and vMSI. vIOMMU needs to remap interrupt during
>> +this procedure.
>> +
>> ++---------------------------------------------------+
>> +|Qemu |VM |
>> +| | +----------------+ |
>> +| | | Device driver | |
>> +| | +--------+-------+ |
>> +| | ^ |
>> +| +----------------+ | +--------+-------+ |
>> +| | Virtual device | | | IRQ subsystem | |
>> +| +-------+--------+ | +--------+-------+ |
>> +| | | ^ |
>> +| | | | |
>> ++---------------------------+-----------------------+
>> +|hypervisor | | VIRQ |
>> +| | +---------+--------+ |
>> +| | | vLAPIC | |
>> +| |VIRQ +---------+--------+ |
>> +| | ^ |
>> +| | | |
>> +| | +---------+--------+ |
>> +| | | vIOMMU | |
>> +| | +---------+--------+ |
>> +| | ^ |
>> +| | | |
>> +| | +---------+--------+ |
>> +| | | vIOAPIC/vMSI | |
>> +| | +----+----+--------+ |
>> +| | ^ ^ |
>> +| +-----------------+ | |
>> +| | |
>> ++---------------------------------------------------+
>> +HW |IRQ
>> + +-------------------+
>> + | PCI Device |
>> + +-------------------+
>> +
>> +
>> +vIOMMU hypercall
>> +================
>> +Introduce a new domctl hypercall "xen_domctl_viommu_op" to create
>> +vIOMMUs instance in hypervisor. vIOMMU instance will be destroyed
>> +during destroying domain.
>> +
>> +* vIOMMU hypercall parameter structure
>> +
>> +/* vIOMMU type - specify vendor vIOMMU device model */
>> +#define VIOMMU_TYPE_INTEL_VTD 0
>> +
>> +/* vIOMMU capabilities */
>> +#define VIOMMU_CAP_IRQ_REMAPPING (1u << 0)
>> +
>> +struct xen_domctl_viommu_op {
>> + uint32_t cmd;
>> +#define XEN_DOMCTL_viommu_create 0
>> + union {
>> + struct {
>> + /* IN - vIOMMU type */
>> + uint8_t type;
>> + /* IN - MMIO base address of vIOMMU. */
>> + uint64_t base_address;
>> + /* IN - Capabilities with which we want to create */
>> + uint64_t capabilities;
>> + /* OUT - vIOMMU identity */
>> + uint32_t id;
>> + } create;
>> + } u;
>> +};
>> +
>> +- XEN_DOMCTL_create_viommu
>> + Create vIOMMU device with type, capabilities and MMIO base address.
>> +Hypervisor allocates viommu_id for new vIOMMU instance and return back.
>> +The vIOMMU device model in hypervisor should check whether it can
>> +support the input capabilities and return error if not.
>> +
>> +vIOMMU domctl and vIOMMU option in configure file consider multi-vIOMMU
>> +support for single VM.(e.g, parameters of create vIOMMU includes vIOMMU id).
>> +But function implementation only supports one vIOMMU per VM so far.
>> +
>> +xl x86 vIOMMU configuration"
>> +============================
>> +viommu = [
>> + 'type=intel_vtd,intremap=1',
>> + ...
>> +]
>> +
>> +"type" - Specify vIOMMU device model type. Currently only supports Intel vtd
>> +device model.
>
>Although I see the point in being able to specify the vIOMMU type, is
>this really helpful from an admin PoV?
>
>What would happen for example if you try to add an Intel vIOMMU to a
>guest running on an AMD CPU? I guess the guest OSes would be quite
>surprised about that...
>
>I think the most common way to use this option would be:
>
>viommu = [
> 'intremap=1',
> ...
>]
Agree it.
>
>And vIOMMUs should automatically be added to guests with > 128 vCPUs?
>IIRC Linux requires a vIOMMU in order to run with > 128 vCPUs (which
>is quite arbitrary, but anyway...).
I think linux will only use 128 CPUs for this case on bare-metal.
Considering a benign VM shouldn't has a weird configuration -- has > 128
vcpus but has no viommu, adding vIOMMUs automatically when needed is
fine with me.
Thanks
Chao
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |