Re: [Xen-devel] [PATCH V3 1/29] Xen/doc: Add Xen virtual IOMMU doc
Hi Roger,
Thanks for the review.
On 2017-10-18 21:26, Roger Pau Monné wrote:
> On Thu, Sep 21, 2017 at 11:01:42PM -0400, Lan Tianyu wrote:
>> This patch is to add Xen virtual IOMMU doc to introduce motivation,
>> framework, vIOMMU hypercall and xl configuration.
>>
>> Signed-off-by: Lan Tianyu <tianyu.lan@xxxxxxxxx>
>> ---
>> docs/misc/viommu.txt | 136 +++++++++++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 136 insertions(+)
>> create mode 100644 docs/misc/viommu.txt
>>
>> diff --git a/docs/misc/viommu.txt b/docs/misc/viommu.txt
>> new file mode 100644
>> index 0000000..348e8c4
>> --- /dev/null
>> +++ b/docs/misc/viommu.txt
>> @@ -0,0 +1,136 @@
>> +Xen virtual IOMMU
>> +
>> +Motivation
>> +==========
>> +Enable more than 128 vcpu support
>> +
>> +Current HPC cloud services require VMs with a large number of vCPUs in
>> +order to achieve high performance in parallel computing.
>> +
>> +To support >128 vcpus, x2APIC mode in the guest is necessary because the
>> +legacy APIC (xAPIC) only supports 8-bit APIC IDs. The APIC ID used by Xen
>> +is CPU ID * 2 (i.e. CPU 127 has APIC ID 254, which is the last one
>> +available in xAPIC mode), so at most 128 vcpus can be supported. x2APIC
>> +mode supports 32-bit APIC IDs, and the interrupt remapping functionality
>> +of a vIOMMU is required if the guest wishes to route interrupts to all
>> +available vCPUs.
>> +
>> +The reason for this is that there is no modification for existing PCI MSI
>> +and IOAPIC when introduce X2APIC.
>
> I'm not sure the above sentence makes much sense. IMHO I would just
> remove it.
OK. Will remove.
>
>> +PCI MSI/IOAPIC can only send interrupt messages containing an 8-bit APIC
>> +ID, and so cannot address CPUs with APIC IDs >254. Interrupt remapping
>> +supports 32-bit APIC IDs, which makes it necessary for >128 vcpu support.
>> +
>> +
>> +vIOMMU Architecture
>> +===================
>> +The vIOMMU device model is implemented inside the Xen hypervisor for the
>> +following reasons:
>> + 1) Avoid round trips between Qemu and the Xen hypervisor
>> + 2) Ease of integration with the rest of the hypervisor
>> + 3) HVMlite/PVH doesn't use Qemu
>
> Just use PVH here, HVMlite == PVH now.
OK.
>
>> +
>> +* Interrupt remapping overview.
>> +Interrupts from virtual and physical devices are delivered to the vLAPIC
>> +via the vIOAPIC and vMSI. The vIOMMU needs to remap interrupts during
>> +this procedure.
>> +
>> ++---------------------------------------------------+
>> +|Qemu |VM |
>> +| | +----------------+ |
>> +| | | Device driver | |
>> +| | +--------+-------+ |
>> +| | ^ |
>> +| +----------------+ | +--------+-------+ |
>> +| | Virtual device | | | IRQ subsystem | |
>> +| +-------+--------+ | +--------+-------+ |
>> +| | | ^ |
>> +| | | | |
>> ++---------------------------+-----------------------+
>> +|hypervisor | | VIRQ |
>> +| | +---------+--------+ |
>> +| | | vLAPIC | |
>> +| |VIRQ +---------+--------+ |
>> +| | ^ |
>> +| | | |
>> +| | +---------+--------+ |
>> +| | | vIOMMU | |
>> +| | +---------+--------+ |
>> +| | ^ |
>> +| | | |
>> +| | +---------+--------+ |
>> +| | | vIOAPIC/vMSI | |
>> +| | +----+----+--------+ |
>> +| | ^ ^ |
>> +| +-----------------+ | |
>> +| | |
>> ++---------------------------------------------------+
>> +HW |IRQ
>> + +-------------------+
>> + | PCI Device |
>> + +-------------------+
>> +
>> +
>> +vIOMMU hypercall
>> +================
>> +Introduce a new domctl hypercall "xen_domctl_viommu_op" to create/destroy
>> +vIOMMUs.
>> +
>> +* vIOMMU hypercall parameter structure
>> +
>> +/* vIOMMU type - specify vendor vIOMMU device model */
>> +#define VIOMMU_TYPE_INTEL_VTD 0
>> +
>> +/* vIOMMU capabilities */
>> +#define VIOMMU_CAP_IRQ_REMAPPING (1u << 0)
>> +
>> +struct xen_domctl_viommu_op {
>> + uint32_t cmd;
>> +#define XEN_DOMCTL_create_viommu 0
>> +#define XEN_DOMCTL_destroy_viommu 1
>
> I would invert the order of the domctl names:
>
> #define XEN_DOMCTL_viommu_create 0
> #define XEN_DOMCTL_viommu_destroy 1
>
> It's clearer if the operation is the last part of the name.
OK. Will update.
>
>> + union {
>> + struct {
>> + /* IN - vIOMMU type */
>> + uint64_t viommu_type;
>
> Hm, do we really need a uint64_t for the IOMMU type? A uint8_t should
> be more that enough (256 different IOMMU implementations).
OK. Will update.
>
>> + /* IN - MMIO base address of vIOMMU. */
>> + uint64_t base_address;
>> + /* IN - Capabilities with which we want to create */
>> + uint64_t capabilities;
>> + /* OUT - vIOMMU identity */
>> + uint32_t viommu_id;
>> + } create_viommu;
>> +
>> + struct {
>> + /* IN - vIOMMU identity */
>> + uint32_t viommu_id;
>> + } destroy_viommu;
>
> Do you really need the destroy operation? Do we expect to hot-unplug
> vIOMMUs? Otherwise vIOMMUs should be removed when the domain is
> destroyed.
Yes, there is no such requirement so far; it was added just for multi-vIOMMU
consideration. I will remove it and add it back when it's really needed.
>
>> + } u;
>> +};
>> +
>> +- XEN_DOMCTL_create_viommu
>> + Create a vIOMMU device with the given viommu_type, capabilities and
>> +MMIO base address. The hypervisor allocates a viommu_id for the new
>> +vIOMMU instance and returns it. The vIOMMU device model in the hypervisor
>> +should check whether it can support the requested capabilities and return
>> +an error if not.
>> +
>> +- XEN_DOMCTL_destroy_viommu
>> + Destroy the vIOMMU identified by viommu_id in the Xen hypervisor.
>> +
>> +The vIOMMU domctl and the vIOMMU option in the configuration file are
>> +designed with multi-vIOMMU support for a single VM in mind (e.g. the
>> +parameters of the create/destroy vIOMMU operations include a vIOMMU id),
>> +but the current implementation only supports one vIOMMU per VM.
>> +
>> +Xen hypervisor vIOMMU command
>> +=============================
>> +Introduce the hypervisor command line option "viommu=1" to enable the
>> +vIOMMU functionality in the hypervisor. It is disabled by default.
>
> Hm, I'm not sure we really need this. At the end viommu will be
> disabled by default for guests, unless explicitly enabled in the
> config file.
This is according to Jan's earlier comment on the RFC patch
https://patchwork.kernel.org/patch/9733869/:
"It's actually a question whether in our current scheme a Kconfig
option is appropriate here in the first place. I'd rather see this be
an always built feature which needs enabling on the command line
for the time being."
>
> Thanks, Roger.
>
--
Best regards
Tianyu Lan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel