|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 1/7] xen: use domid check in is_hardware_domain
>>> On 18.03.14 at 22:34, Daniel De Graaf <dgdegra@xxxxxxxxxxxxx> wrote:
> Instead of checking is_privileged to determine if a domain should
> control the hardware, check that the domain_id is equal to zero (which
> is currently the only domain for which is_privileged is true). This
> allows other places where domain_id is checked for zero to be replaced
> with is_hardware_domain.
>
> The distinction between is_hardware_domain, is_control_domain, and
> domain 0 is based on the following disaggregation model:
>
> Domain 0 bootstraps the system. It may remain to perform requested
> builds of domains that need a minimal trust chain (i.e. vTPM domains).
> Other than being built by the hypervisor, nothing is special about this
> domain - although it may be useful to have is_control_domain() return
> true depending on the toolstack it uses to build other domains.
>
> The hardware domain manages devices for PCI pass-through to driver
> domains or can act as a driver domain itself, depending on the desired
> degree of disaggregation. It is also the domain managing devices that
> do not support pass-through: PCI configuration space access, parsing the
> hardware ACPI tables and system power or machine check events. This is
> the only domain where is_hardware_domain() is true. The return of
> is_control_domain() may be false for this domain.
>
> The control domain manages other domains, controls guest launch and
> shutdown, and manages resource constraints; is_control_domain() returns
> true. The functionality guarded by is_control_domain may in the future
> be adapted to use explicit hypercalls, eliminating the special treatment
> of this domain. It may be reasonable to have multiple control domains
> on a multi-tenant system.
>
> Guest domains and other service or driver domains are all treated
> identically by the hypervisor; the security policy may further constrain
> administrative actions on or communication between these domains.
>
> Signed-off-by: Daniel De Graaf <dgdegra@xxxxxxxxxxxxx>
> Cc: Ian Campbell <ian.campbell@xxxxxxxxxx>
Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
> Cc: Keir Fraser <keir@xxxxxxx>
> Cc: Stefano Stabellini <stefano.stabellini@xxxxxxxxxx>
> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx>
> Cc: Tim Deegan <tim@xxxxxxx>
> Cc: Xiantao Zhang <xiantao.zhang@xxxxxxxxx>
>
> ---
> Changes from v1:
> - Fix up some references to dom0 in comments
> - Show real CPUID values to hardware domain
> - Require xenoprof to be hardware domain due to lack of cleanup on destroy
> - Change a few ( d == dom0 ) references to is_hardware_domain
>
> xen/arch/arm/domain.c | 6 +++---
> xen/arch/arm/gic.c | 4 ++--
> xen/arch/arm/vgic.c | 2 +-
> xen/arch/arm/vtimer.c | 5 +++--
> xen/arch/arm/vuart.c | 2 +-
> xen/arch/x86/cpu/mcheck/vmce.c | 2 +-
> xen/arch/x86/domain.c | 3 ++-
> xen/arch/x86/hvm/i8254.c | 2 +-
> xen/arch/x86/mm.c | 2 +-
> xen/arch/x86/time.c | 4 ++--
> xen/arch/x86/traps.c | 21 +++++++++++----------
> xen/common/domain.c | 6 +++---
> xen/common/kernel.c | 2 +-
> xen/common/xenoprof.c | 8 +++++++-
> xen/drivers/passthrough/amd/pci_amd_iommu.c | 2 +-
> xen/drivers/passthrough/iommu.c | 2 +-
> xen/drivers/passthrough/vtd/iommu.c | 11 ++++++-----
> xen/drivers/passthrough/vtd/x86/vtd.c | 2 +-
> xen/include/xen/sched.h | 4 ++--
> 19 files changed, 50 insertions(+), 40 deletions(-)
>
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 46ee486..2c76a8f 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -540,10 +540,10 @@ int arch_domain_create(struct domain *d, unsigned int
> domcr_flags)
>
> /*
> * Virtual UART is only used by linux early printk and decompress code.
> - * Only use it for dom0 because the linux kernel may not support
> - * multi-platform.
> + * Only use it for the hardware domain because the linux kernel may not
> + * support multi-platform.
> */
> - if ( (d->domain_id == 0) && (rc = domain_vuart_init(d)) )
> + if ( is_hardware_domain(d) && (rc = domain_vuart_init(d)) )
> goto fail;
>
> return 0;
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index 0095b97..8168b7b 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -865,10 +865,10 @@ int gicv_setup(struct domain *d)
> int ret;
>
> /*
> - * Domain 0 gets the hardware address.
> + * The hardware domain gets the hardware address.
> * Guests get the virtual platform layout.
> */
> - if ( d->domain_id == 0 )
> + if ( is_hardware_domain(d) )
> {
> d->arch.vgic.dbase = gic.dbase;
> d->arch.vgic.cbase = gic.cbase;
> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> index 553411d..65f48f2 100644
> --- a/xen/arch/arm/vgic.c
> +++ b/xen/arch/arm/vgic.c
> @@ -82,7 +82,7 @@ int domain_vgic_init(struct domain *d)
> /* Currently nr_lines in vgic and gic doesn't have the same meanings
> * Here nr_lines = number of SPIs
> */
> - if ( d->domain_id == 0 )
> + if ( is_hardware_domain(d) )
> d->arch.vgic.nr_lines = gic_number_lines() - 32;
> else
> d->arch.vgic.nr_lines = 0; /* We don't need SPIs for the guest */
> diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c
> index 3d6a721..cb690bb 100644
> --- a/xen/arch/arm/vtimer.c
> +++ b/xen/arch/arm/vtimer.c
> @@ -54,10 +54,11 @@ int vcpu_domain_init(struct domain *d)
> int vcpu_vtimer_init(struct vcpu *v)
> {
> struct vtimer *t = &v->arch.phys_timer;
> - bool_t d0 = (v->domain == dom0);
> + bool_t d0 = is_hardware_domain(v->domain);
>
> /*
> - * Domain 0 uses the hardware interrupts, guests get the virtual
> platform.
> + * Hardware domain uses the hardware interrupts, guests get the virtual
> + * platform.
> */
>
> init_timer(&t->timer, phys_timer_expired, t, v->processor);
> diff --git a/xen/arch/arm/vuart.c b/xen/arch/arm/vuart.c
> index b9d3ced..c02a8a9 100644
> --- a/xen/arch/arm/vuart.c
> +++ b/xen/arch/arm/vuart.c
> @@ -46,7 +46,7 @@
>
> int domain_vuart_init(struct domain *d)
> {
> - ASSERT( !d->domain_id );
> + ASSERT( is_hardware_domain(d) );
>
> d->arch.vuart.info = serial_vuart_info(SERHND_DTUART);
> if ( !d->arch.vuart.info )
> diff --git a/xen/arch/x86/cpu/mcheck/vmce.c b/xen/arch/x86/cpu/mcheck/vmce.c
> index ed00f7c..c83375e 100644
> --- a/xen/arch/x86/cpu/mcheck/vmce.c
> +++ b/xen/arch/x86/cpu/mcheck/vmce.c
> @@ -430,7 +430,7 @@ int unmmap_broken_page(struct domain *d, mfn_t mfn,
> unsigned long gfn)
> int rc;
>
> /* Always trust dom0's MCE handler will prevent future access */
> - if ( d == dom0 )
> + if ( is_hardware_domain(d) )
> return 0;
>
> if (!mfn_valid(mfn_x(mfn)))
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index 0d563de..0cfca2a 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -1504,7 +1504,8 @@ void context_switch(struct vcpu *prev, struct vcpu
> *next)
> }
>
> set_cpuid_faulting(is_pv_vcpu(next) &&
> - (next->domain->domain_id != 0));
> + !is_control_domain(next->domain) &&
> + !is_hardware_domain(next->domain));
> }
>
> if (is_hvm_vcpu(next) && (prev != next) )
> diff --git a/xen/arch/x86/hvm/i8254.c b/xen/arch/x86/hvm/i8254.c
> index f7493b8..6a91f76 100644
> --- a/xen/arch/x86/hvm/i8254.c
> +++ b/xen/arch/x86/hvm/i8254.c
> @@ -540,7 +540,7 @@ int pv_pit_handler(int port, int data, int write)
> .data = data
> };
>
> - if ( (current->domain->domain_id == 0) && dom0_pit_access(&ioreq) )
> + if ( is_hardware_domain(current->domain) && dom0_pit_access(&ioreq) )
> {
> /* nothing to do */;
> }
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index fdc5ed3..ad48acc 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -4787,7 +4787,7 @@ long arch_memory_op(int op,
> XEN_GUEST_HANDLE_PARAM(void) arg)
> .max_mfn = MACH2PHYS_NR_ENTRIES - 1
> };
>
> - if ( !mem_hotplug && current->domain == dom0 )
> + if ( !mem_hotplug && is_hardware_domain(current->domain) )
> mapping.max_mfn = max_page - 1;
> if ( copy_to_guest(arg, &mapping, 1) )
> return -EFAULT;
> diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
> index f904af2..89308bd 100644
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -1839,7 +1839,7 @@ void tsc_set_info(struct domain *d,
> uint32_t tsc_mode, uint64_t elapsed_nsec,
> uint32_t gtsc_khz, uint32_t incarnation)
> {
> - if ( is_idle_domain(d) || (d->domain_id == 0) )
> + if ( is_idle_domain(d) || is_hardware_domain(d) )
> {
> d->arch.vtsc = 0;
> return;
> @@ -1943,7 +1943,7 @@ static void dump_softtsc(unsigned char key)
> "warp=%lu (count=%lu)\n", tsc_max_warp, tsc_check_count);
> for_each_domain ( d )
> {
> - if ( d->domain_id == 0 && d->arch.tsc_mode == TSC_MODE_DEFAULT )
> + if ( is_hardware_domain(d) && d->arch.tsc_mode == TSC_MODE_DEFAULT )
> continue;
> printk("dom%u%s: mode=%d",d->domain_id,
> is_hvm_domain(d) ? "(hvm)" : "", d->arch.tsc_mode);
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index ed4ae2d..21c8b63 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -732,18 +732,19 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t
> sub_idx,
> void pv_cpuid(struct cpu_user_regs *regs)
> {
> uint32_t a, b, c, d;
> + struct vcpu *curr = current;
>
> a = regs->eax;
> b = regs->ebx;
> c = regs->ecx;
> d = regs->edx;
>
> - if ( current->domain->domain_id != 0 )
> + if ( !is_control_domain(curr->domain) &&
> !is_hardware_domain(curr->domain) )
> {
> unsigned int cpuid_leaf = a, sub_leaf = c;
>
> if ( !cpuid_hypervisor_leaves(a, c, &a, &b, &c, &d) )
> - domain_cpuid(current->domain, a, c, &a, &b, &c, &d);
> + domain_cpuid(curr->domain, a, c, &a, &b, &c, &d);
>
> switch ( cpuid_leaf )
> {
> @@ -751,15 +752,15 @@ void pv_cpuid(struct cpu_user_regs *regs)
> {
> unsigned int _eax, _ebx, _ecx, _edx;
> /* EBX value of main leaf 0 depends on enabled xsave features
> */
> - if ( sub_leaf == 0 && current->arch.xcr0 )
> + if ( sub_leaf == 0 && curr->arch.xcr0 )
> {
> /* reset EBX to default value first */
> b = XSTATE_AREA_MIN_SIZE;
> for ( sub_leaf = 2; sub_leaf < 63; sub_leaf++ )
> {
> - if ( !(current->arch.xcr0 & (1ULL << sub_leaf)) )
> + if ( !(curr->arch.xcr0 & (1ULL << sub_leaf)) )
> continue;
> - domain_cpuid(current->domain, cpuid_leaf, sub_leaf,
> + domain_cpuid(curr->domain, cpuid_leaf, sub_leaf,
> &_eax, &_ebx, &_ecx, &_edx);
> if ( (_eax + _ebx) > b )
> b = _eax + _ebx;
> @@ -796,7 +797,7 @@ void pv_cpuid(struct cpu_user_regs *regs)
> __clear_bit(X86_FEATURE_DS, &d);
> __clear_bit(X86_FEATURE_ACC, &d);
> __clear_bit(X86_FEATURE_PBE, &d);
> - if ( is_pvh_vcpu(current) )
> + if ( is_pvh_vcpu(curr) )
> __clear_bit(X86_FEATURE_MTRR, &d);
>
> __clear_bit(X86_FEATURE_DTES64 % 32, &c);
> @@ -805,7 +806,7 @@ void pv_cpuid(struct cpu_user_regs *regs)
> __clear_bit(X86_FEATURE_VMXE % 32, &c);
> __clear_bit(X86_FEATURE_SMXE % 32, &c);
> __clear_bit(X86_FEATURE_TM2 % 32, &c);
> - if ( is_pv_32bit_vcpu(current) )
> + if ( is_pv_32bit_vcpu(curr) )
> __clear_bit(X86_FEATURE_CX16 % 32, &c);
> __clear_bit(X86_FEATURE_XTPR % 32, &c);
> __clear_bit(X86_FEATURE_PDCM % 32, &c);
> @@ -844,12 +845,12 @@ void pv_cpuid(struct cpu_user_regs *regs)
>
> case 0x80000001:
> /* Modify Feature Information. */
> - if ( is_pv_32bit_vcpu(current) )
> + if ( is_pv_32bit_vcpu(curr) )
> {
> __clear_bit(X86_FEATURE_LM % 32, &d);
> __clear_bit(X86_FEATURE_LAHF_LM % 32, &c);
> }
> - if ( is_pv_32on64_vcpu(current) &&
> + if ( is_pv_32on64_vcpu(curr) &&
> boot_cpu_data.x86_vendor != X86_VENDOR_AMD )
> __clear_bit(X86_FEATURE_SYSCALL % 32, &d);
> __clear_bit(X86_FEATURE_PAGE1GB % 32, &d);
> @@ -1860,7 +1861,7 @@ static inline uint64_t guest_misc_enable(uint64_t val)
> static int is_cpufreq_controller(struct domain *d)
> {
> return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
> - (d->domain_id == 0));
> + is_hardware_domain(d));
> }
>
> #include "x86_64/mmconfig.h"
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index ad8a1b6..b5868a5 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -237,7 +237,7 @@ struct domain *domain_create(
> else if ( domcr_flags & DOMCRF_pvh )
> d->guest_type = guest_type_pvh;
>
> - if ( domid == 0 )
> + if ( is_hardware_domain(d) )
> {
> d->is_pinned = opt_dom0_vcpus_pin;
> d->disable_migrate = 1;
> @@ -262,7 +262,7 @@ struct domain *domain_create(
> d->is_paused_by_controller = 1;
> atomic_inc(&d->pause_count);
>
> - if ( domid )
> + if ( !is_hardware_domain(d) )
> d->nr_pirqs = nr_static_irqs + extra_domU_irqs;
> else
> d->nr_pirqs = nr_static_irqs + extra_dom0_irqs;
> @@ -594,7 +594,7 @@ void domain_shutdown(struct domain *d, u8 reason)
> d->shutdown_code = reason;
> reason = d->shutdown_code;
>
> - if ( d->domain_id == 0 )
> + if ( is_hardware_domain(d) )
> dom0_shutdown(reason);
>
> if ( d->is_shutting_down )
> diff --git a/xen/common/kernel.c b/xen/common/kernel.c
> index b371f8f..7e83353 100644
> --- a/xen/common/kernel.c
> +++ b/xen/common/kernel.c
> @@ -303,7 +303,7 @@ DO(xen_version)(int cmd, XEN_GUEST_HANDLE_PARAM(void)
> arg)
> (1U << XENFEAT_auto_translated_physmap);
> if ( supervisor_mode_kernel )
> fi.submap |= 1U << XENFEAT_supervisor_mode_kernel;
> - if ( current->domain == dom0 )
> + if ( is_hardware_domain(current->domain) )
> fi.submap |= 1U << XENFEAT_dom0;
> #ifdef CONFIG_X86
> switch ( d->guest_type )
> diff --git a/xen/common/xenoprof.c b/xen/common/xenoprof.c
> index 52ab00d..3c30c3e 100644
> --- a/xen/common/xenoprof.c
> +++ b/xen/common/xenoprof.c
> @@ -601,9 +601,15 @@ static int xenoprof_op_init(XEN_GUEST_HANDLE_PARAM(void)
> arg)
> xenoprof_init.cpu_type)) )
> return ret;
>
> + /* Only the hardware domain may become the primary profiler here
> because
> + * there is currently no cleanup of xenoprof_primary_profiler or
> associated
> + * profiling state when the primary profiling domain is shut down or
> + * crashes. Once a better cleanup method is present, it will be
> possible to
> + * allow another domain to be the primary profiler.
> + */
> xenoprof_init.is_primary =
> ((xenoprof_primary_profiler == d) ||
> - ((xenoprof_primary_profiler == NULL) && (d->domain_id == 0)));
> + ((xenoprof_primary_profiler == NULL) && is_hardware_domain(d)));
> if ( xenoprof_init.is_primary )
> xenoprof_primary_profiler = current->domain;
>
> diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> index c26aabc..cf67494 100644
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -122,7 +122,7 @@ static void amd_iommu_setup_domain_device(
>
> BUG_ON( !hd->root_table || !hd->paging_mode || !iommu->dev_table.buffer
> );
>
> - if ( iommu_passthrough && (domain->domain_id == 0) )
> + if ( iommu_passthrough && is_hardware_domain(domain) )
> valid = 0;
>
> if ( ats_enabled )
> diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
> index 0cf0748..54d891f 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -817,7 +817,7 @@ static void iommu_dump_p2m_table(unsigned char key)
> ops = iommu_get_ops();
> for_each_domain(d)
> {
> - if ( !d->domain_id )
> + if ( is_hardware_domain(d) )
> continue;
>
> if ( iommu_use_hap_pt(d) )
> diff --git a/xen/drivers/passthrough/vtd/iommu.c
> b/xen/drivers/passthrough/vtd/iommu.c
> index beb3723..9c8e4a3 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -1327,7 +1327,7 @@ int domain_context_mapping_one(
> return res;
> }
>
> - if ( iommu_passthrough && (domain->domain_id == 0) )
> + if ( iommu_passthrough && is_hardware_domain(domain) )
> {
> context_set_translation_type(*context, CONTEXT_TT_PASS_THRU);
> agaw = level_to_agaw(iommu->nr_pt_levels);
> @@ -1723,7 +1723,7 @@ static int intel_iommu_map_page(
> return 0;
>
> /* do nothing if dom0 and iommu supports pass thru */
> - if ( iommu_passthrough && (d->domain_id == 0) )
> + if ( iommu_passthrough && is_hardware_domain(d) )
> return 0;
>
> spin_lock(&hd->mapping_lock);
> @@ -1767,7 +1767,7 @@ static int intel_iommu_map_page(
> static int intel_iommu_unmap_page(struct domain *d, unsigned long gfn)
> {
> /* Do nothing if dom0 and iommu supports pass thru. */
> - if ( iommu_passthrough && (d->domain_id == 0) )
> + if ( iommu_passthrough && is_hardware_domain(d) )
> return 0;
>
> dma_pte_clear_one(d, (paddr_t)gfn << PAGE_SHIFT_4K);
> @@ -1950,8 +1950,9 @@ static int intel_iommu_remove_device(u8 devfn, struct
> pci_dev *pdev)
> continue;
>
> /*
> - * If the device belongs to dom0, and it has RMRR, don't remove
> - * it from dom0, because BIOS may use RMRR at booting time.
> + * If the device belongs to the hardware domain, and it has RMRR,
> don't
> + * remove it from the hardware domain, because BIOS may use RMRR at
> + * booting time.
> */
> if ( is_hardware_domain(pdev->domain) )
> return 0;
> diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c
> b/xen/drivers/passthrough/vtd/x86/vtd.c
> index ca17cb1..f271a42 100644
> --- a/xen/drivers/passthrough/vtd/x86/vtd.c
> +++ b/xen/drivers/passthrough/vtd/x86/vtd.c
> @@ -111,7 +111,7 @@ void __init iommu_set_dom0_mapping(struct domain *d)
> {
> unsigned long i, j, tmp, top;
>
> - BUG_ON(d->domain_id != 0);
> + BUG_ON(!is_hardware_domain(d));
>
> top = max(max_pdx, pfn_to_pdx(0xffffffffUL >> PAGE_SHIFT) + 1);
>
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index 00f0eba..466949f 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -775,10 +775,10 @@ void watchdog_domain_destroy(struct domain *d);
> /*
> * Use this check when the following are both true:
> * - Using this feature or interface requires full access to the hardware
> - * (that is, this is would not be suitable for a driver domain)
> + * (that is, this would not be suitable for a driver domain)
> * - There is never a reason to deny dom0 access to this
> */
> -#define is_hardware_domain(_d) ((_d)->is_privileged)
> +#define is_hardware_domain(_d) ((_d)->domain_id == 0)
>
> /* This check is for functionality specific to a control domain */
> #define is_control_domain(_d) ((_d)->is_privileged)
> --
> 1.8.5.3
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |