[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] HVM support for e820_host (Was: Bug: Limitation of <=2GB RAM in domU persists with 4.3.0)
Hmm...gcc -O2 -fomit-frame-pointer -m64 -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable -DNDEBUG -fno-builtin -fno-common -Wredundant-decls -iwithprefix include -Werror -Wno-pointer-arith -pipe -I/root/rpmbuild/BUILD/xen-4.3.0/xen/include -I/root/rpmbuild/BUILD/xen-4.3.0/xen/include/asm-x86/mach-generic -I/root/rpmbuild/BUILD/xen-4.3.0/xen/include/asm-x86/mach-default -msoft-float -fno-stack-protector -fno-exceptions -Wnested-externs -mno-red-zone -mno-sse -fpic -fno-asynchronous-unwind-tables -DGCC_HAS_VISIBILITY_ATTRIBUTE -nostdinc -g -D__XEN__ -include /root/rpmbuild/BUILD/xen-4.3.0/xen/include/xen/config.h -DHAS_ACPI -DHAS_GDBSX -DHAS_PASSTHROUGH -MMD -MF .debug.o.d -c debug.c -o debug.o gcc -O2 -fomit-frame-pointer -m64 -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable -DNDEBUG -fno-builtin -fno-common -Wredundant-decls -iwithprefix include -Werror -Wno-pointer-arith -pipe -I/root/rpmbuild/BUILD/xen-4.3.0/xen/include -I/root/rpmbuild/BUILD/xen-4.3.0/xen/include/asm-x86/mach-generic -I/root/rpmbuild/BUILD/xen-4.3.0/xen/include/asm-x86/mach-default -msoft-float -fno-stack-protector -fno-exceptions -Wnested-externs -mno-red-zone -mno-sse -fpic -fno-asynchronous-unwind-tables -DGCC_HAS_VISIBILITY_ATTRIBUTE -nostdinc -g -D__XEN__ -include /root/rpmbuild/BUILD/xen-4.3.0/xen/include/xen/config.h -DHAS_ACPI -DHAS_GDBSX -DHAS_PASSTHROUGH -MMD -MF .domain.o.d -c domain.c -o domain.o domain.c: In function âarch_domain_destroyâ: domain.c:595: error: âstruct pv_domainâ has no member named âe820â make[4]: *** [domain.o] Error 1 It would seem you omitted this block from the original patch: === @@ -592,8 +592,8 @@ void arch_domain_destroy(struct domain *d) { if ( is_hvm_domain(d) ) hvm_domain_destroy(d); - else - xfree(d->arch.pv_domain.e820); + + xfree(d->arch.e820); free_domain_pirqs(d); if ( !is_idle_domain(d) ) ===Was that intentional? Does that block look OK to you? Should I re-add it? GordanOn Wed, 4 Sep 2013 22:04:42 -0400, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote: On Wed, Sep 04, 2013 at 02:11:06PM +0100, Gordan Bobic wrote:I have this at the point where it actually builds. Otherwise completely untested (will do that later today). Attached are: 1) libxl patch Modified from the original patch to _not_ implicitly enable e820_host when PCI devices are passed. 2) Mukesh's hypervisor e820 patch from here: http://lists.xen.org/archives/html/xen-devel/2013-05/msg01603.html Modified slightly to attempt to address Jan's comment on the same thread, and to adjust the diff line pointers to match against 4.3.0 release code.I think that was the old version. I spotted a bug in it that was causing a hang. And also the one that explains why libxl would refuse to setup the E820. The problem was that in the XENMEM_set_memory_map there was a check to make sure that the guest launched was not HVM. Also there was bug in the initial domain creation where the spinlock was only set for PV and not for HVM.3) A patch based on Konrad's earlier in this thread, with a few additions and changes to make it all compile. Some peer review would be most welcome - this is my first venture into Xen code, so please do assume that I have no idea what I'm doing at the moment. :) I added yet another E820MAX #define, this time to tools/firmware/hvmloader/e820.h If there is a better place to #include that via from e820.c, please point me in the right direction.I think I saw that #define in tools/libxc/xenctrl.h. But sincethe tools/firmware cannot link to the libxc (b/c it is a MinicontainedOS) I believe just having the #define in hvmloader/e820.h is the right call. Good first pass. I altered it a bit and got in the HVM guest the E820 entries printed out. Here is a big giant diff: diff --git a/tools/firmware/hvmloader/e820.c b/tools/firmware/hvmloader/e820.c index 2e05e93..3c80241 100644 --- a/tools/firmware/hvmloader/e820.c +++ b/tools/firmware/hvmloader/e820.c @@ -22,6 +22,9 @@ #include "config.h" #include "util.h" +#include "hypercall.h" +#include <xen/memory.h> +#include <errno.h> void dump_e820_table(struct e820entry *e820, unsigned int nr) { @@ -74,10 +77,20 @@ int build_e820_table(struct e820entry *e820, unsigned int bios_image_base) { unsigned int nr = 0; + struct xen_memory_map op; + struct e820entry map[E820MAX]; + int rc; if ( !lowmem_reserved_base ) lowmem_reserved_base = 0xA0000; + set_xen_guest_handle(op.buffer, map); + + rc = hypercall_memory_op ( XENMEM_memory_map, &op); + if ( rc != -ENOSYS) { /* It works!? */ + printf("%s:%d got %d op.nr_entries \n", __func__, __LINE__, op.nr_entries); + dump_e820_table(&map[0], op.nr_entries); + } /* Lowmem must be at least 512K to keep Windows happy) */ ASSERT ( lowmem_reserved_base > 512<<10 ); diff --git a/tools/firmware/hvmloader/e820.h b/tools/firmware/hvmloader/e820.h index b2ead7f..2fa700d 100644 --- a/tools/firmware/hvmloader/e820.h +++ b/tools/firmware/hvmloader/e820.h @@ -8,6 +8,9 @@ #define E820_RESERVED 2 #define E820_ACPI 3 #define E820_NVS 4 +#define E820_UNUSABLE 5 + +#define E820MAX 128 struct e820entry { uint64_t addr; diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index 0c32d0b..d8e2346 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c@@ -208,6 +208,8 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,libxl_defbool_setdefault(&b_info->disable_migrate, false); + libxl_defbool_setdefault(&b_info->e820_host, false); + switch (b_info->type) { case LIBXL_DOMAIN_TYPE_HVM: if (b_info->shadow_memkb == LIBXL_MEMKB_DEFAULT)@@ -280,7 +282,6 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,break; case LIBXL_DOMAIN_TYPE_PV: - libxl_defbool_setdefault(&b_info->u.pv.e820_host, false); if (b_info->shadow_memkb == LIBXL_MEMKB_DEFAULT) b_info->shadow_memkb = 0; if (b_info->u.pv.slack_memkb == LIBXL_MEMKB_DEFAULT)diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idlindex 85341a0..fd6389a 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl@@ -299,6 +299,8 @@ libxl_domain_build_info = Struct("domain_build_info",[("irqs", Array(uint32, "num_irqs")), ("iomem", Array(libxl_iomem_range, "num_iomem")), ("claim_mode", libxl_defbool), + # Use host's E820 for PCI passthrough. + ("e820_host", libxl_defbool), ("u", KeyedUnion(None, libxl_domain_type, "type", [("hvm", Struct(None, [("firmware", string), ("bios", libxl_bios_type),@@ -345,8 +347,6 @@ libxl_domain_build_info = Struct("domain_build_info",[("cmdline", string), ("ramdisk", string),("features", string, {'const': True}), - # Use host's E820 for PCI passthrough.- ("e820_host", libxl_defbool), ])), ("invalid", Struct(None, [])), ], keyvar_init_val = "LIBXL_DOMAIN_TYPE_INVALID")), diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c index a78c91d..94515a5 100644 --- a/tools/libxl/libxl_x86.c +++ b/tools/libxl/libxl_x86.c @@ -216,28 +216,41 @@ static int libxl__e820_alloc(libxl__gc *gc, uint32_t domid, struct e820entry map[E820MAX]; libxl_domain_build_info *b_info;- if (d_config == NULL || d_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM)- return ERROR_INVAL; - b_info = &d_config->b_info; - if (!libxl_defbool_val(b_info->u.pv.e820_host)) + if (!libxl_defbool_val(b_info->e820_host)) { + LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "%s:%d.",__func__, __LINE__); return ERROR_INVAL; - + } rc = xc_get_machine_memory_map(ctx->xch, map, E820MAX); if (rc < 0) { + LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "%s:%d.",__func__, __LINE__); errno = rc; return ERROR_FAIL; } nr = rc; - rc = e820_sanitize(ctx, map, &nr, b_info->target_memkb, - (b_info->max_memkb - b_info->target_memkb) + - b_info->u.pv.slack_memkb); + LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "%s:%d.nr:%d",__func__, __LINE__, nr); + if (d_config == NULL || d_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM) { + rc = e820_sanitize(ctx, map, &nr, b_info->target_memkb,+ (b_info->max_memkb - b_info->target_memkb));+ LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "%s:%d.rc%d",__func__, __LINE__, rc); + } else if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_PV) { + rc = e820_sanitize(ctx, map, &nr, b_info->target_memkb,+ (b_info->max_memkb - b_info->target_memkb) ++ b_info->u.pv.slack_memkb); + LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "%s:%d.rc%d",__func__, __LINE__, rc); + } + + LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "%s:%d.rc%d",__func__, __LINE__, rc); if (rc) return ERROR_FAIL; + LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "%s:%d.rc%d, nr:%d",__func__, __LINE__, rc, nr); + rc = xc_domain_set_memory_map(ctx->xch, domid, map, nr); + LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "%s:%d.rc%d",__func__, __LINE__, rc); if (rc < 0) { + LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "%s:%d.rc%d",__func__, __LINE__, rc); errno = rc; return ERROR_FAIL; } @@ -296,8 +309,7 @@ int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config, xc_shadow_control(ctx->xch, domid, XEN_DOMCTL_SHADOW_OP_SET_ALLOCATION, NULL, 0, &shadow, 0, NULL); } - if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_PV && - libxl_defbool_val(d_config->b_info.u.pv.e820_host)) { + if (libxl_defbool_val(d_config->b_info.e820_host)) { ret = libxl__e820_alloc(gc, domid, d_config); if (ret) { LIBXL__LOG_ERRNO(gc->owner, LIBXL__LOG_ERROR, diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c index ed99622..d98ca24 100644 --- a/tools/libxl/xl_cmdimpl.c +++ b/tools/libxl/xl_cmdimpl.c @@ -1291,11 +1291,7 @@ skip_vfb: if (!xlu_cfg_get_long (config, "pci_permissive", &l, 0)) pci_permissive = l;- /* To be reworked (automatically enabled) once the auto ballooning- * after guest starts is done (with PCI devices passed in). */ - if (c_info->type == LIBXL_DOMAIN_TYPE_PV) { - xlu_cfg_get_defbool(config, "e820_host", &b_info->u.pv.e820_host, 0); - } + xlu_cfg_get_defbool(config, "e820_host", &b_info->e820_host, 0); if (!xlu_cfg_get_list (config, "pci", &pcis, 0, 0)) { d_config->num_pcidevs = 0; @@ -1314,7 +1310,7 @@ skip_vfb: d_config->num_pcidevs++; }if (d_config->num_pcidevs && c_info->type == LIBXL_DOMAIN_TYPE_PV)- libxl_defbool_set(&b_info->u.pv.e820_host, true); + libxl_defbool_set(&b_info->e820_host, true); } switch (xlu_cfg_get_list(config, "cpuid", &cpuids, 0, 1)) { diff --git a/tools/libxl/xl_sxp.c b/tools/libxl/xl_sxp.c index a16a025..f34f0ba 100644 --- a/tools/libxl/xl_sxp.c +++ b/tools/libxl/xl_sxp.c @@ -87,6 +87,10 @@ void printf_info_sexp(int domid, libxl_domain_config *d_config) } } + printf("\t(e820_host %s)\n", + libxl_defbool_to_string(b_info->e820_host)); + + printf("\t(image\n"); switch (c_info->type) { case LIBXL_DOMAIN_TYPE_HVM: @@ -150,8 +154,6 @@ void printf_info_sexp(int domid, libxl_domain_config *d_config) printf("\t\t\t(kernel %s)\n", b_info->u.pv.kernel); printf("\t\t\t(cmdline %s)\n", b_info->u.pv.cmdline); printf("\t\t\t(ramdisk %s)\n", b_info->u.pv.ramdisk); - printf("\t\t\t(e820_host %s)\n", - libxl_defbool_to_string(b_info->u.pv.e820_host)); printf("\t\t)\n"); break; default: diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index 874742c..4796221 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -566,10 +566,9 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags) { /* 64-bit PV guest by default. */ d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 0; - - spin_lock_init(&d->arch.pv_domain.e820_lock); } + spin_lock_init(&d->arch.e820_lock); /* initialize default tsc behavior in case tools don't */ tsc_set_info(d, TSC_MODE_DEFAULT, 0UL, 0, 0); spin_lock_init(&d->arch.vtsc_lock); diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 54b1e6a..6c9b58c 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -3142,10 +3142,10 @@ static long hvm_memory_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) switch ( cmd & MEMOP_CMD_MASK ) { - case XENMEM_memory_map: case XENMEM_machine_memory_map: case XENMEM_machphys_mapping: return -ENOSYS; + case XENMEM_memory_map: case XENMEM_decrease_reservation: rc = do_memory_op(cmd, arg);current->domain->arch.hvm_domain.qemu_mapcache_invalidate = 1;@@ -3217,10 +3217,10 @@ static long hvm_memory_op_compat32(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) switch ( cmd & MEMOP_CMD_MASK ) { - case XENMEM_memory_map: case XENMEM_machine_memory_map: case XENMEM_machphys_mapping: return -ENOSYS; + case XENMEM_memory_map: case XENMEM_decrease_reservation: rc = compat_memory_op(cmd, arg);current->domain->arch.hvm_domain.qemu_mapcache_invalidate = 1;diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index e7f0e13..4c3ce9a 100644 --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -4740,19 +4740,13 @@ long arch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg) return rc; } - if ( is_hvm_domain(d) ) - { - rcu_unlock_domain(d); - return -EPERM; - } - e820 = xmalloc_array(e820entry_t, fmap.map.nr_entries); if ( e820 == NULL ) { rcu_unlock_domain(d); return -ENOMEM; } - +if ( copy_from_guest(e820, fmap.map.buffer, fmap.map.nr_entries) ){ xfree(e820); @@ -4760,11 +4754,11 @@ long arch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg) return -EFAULT; } - spin_lock(&d->arch.pv_domain.e820_lock); - xfree(d->arch.pv_domain.e820); - d->arch.pv_domain.e820 = e820; - d->arch.pv_domain.nr_e820 = fmap.map.nr_entries; - spin_unlock(&d->arch.pv_domain.e820_lock); + spin_lock(&d->arch.e820_lock); + xfree(d->arch.e820); + d->arch.e820 = e820; + d->arch.nr_e820 = fmap.map.nr_entries; + spin_unlock(&d->arch.e820_lock); rcu_unlock_domain(d); return rc; @@ -4778,26 +4772,26 @@ long arch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg) if ( copy_from_guest(&map, arg, 1) ) return -EFAULT; - spin_lock(&d->arch.pv_domain.e820_lock); + spin_lock(&d->arch.e820_lock); /* Backwards compatibility. */ - if ( (d->arch.pv_domain.nr_e820 == 0) || - (d->arch.pv_domain.e820 == NULL) ) + if ( (d->arch.nr_e820 == 0) || + (d->arch.e820 == NULL) ) { - spin_unlock(&d->arch.pv_domain.e820_lock); + spin_unlock(&d->arch.e820_lock); return -ENOSYS; }- map.nr_entries = min(map.nr_entries, d->arch.pv_domain.nr_e820);- if ( copy_to_guest(map.buffer, d->arch.pv_domain.e820, + map.nr_entries = min(map.nr_entries, d->arch.nr_e820); + if ( copy_to_guest(map.buffer, d->arch.e820, map.nr_entries) || __copy_to_guest(arg, &map, 1) ) { - spin_unlock(&d->arch.pv_domain.e820_lock); + spin_unlock(&d->arch.e820_lock); return -EFAULT; } - spin_unlock(&d->arch.pv_domain.e820_lock); + spin_unlock(&d->arch.e820_lock); return 0; }diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.hindex d79464d..c3f9f8e 100644 --- a/xen/include/asm-x86/domain.h +++ b/xen/include/asm-x86/domain.h @@ -234,11 +234,6 @@ struct pv_domain /* map_domain_page() mapping cache. */ struct mapcache_domain mapcache; - - /* Pseudophysical e820 map (XENMEM_memory_map). */ - spinlock_t e820_lock; - struct e820entry *e820; - unsigned int nr_e820; }; struct arch_domain @@ -313,6 +308,11 @@ struct arch_domain(possibly other cases in the future */uint64_t vtsc_kerncount; /* for hvm, counts all vtsc */ uint64_t vtsc_usercount; /* not used for hvm */ + + /* Pseudophysical e820 map (XENMEM_memory_map). */ + spinlock_t e820_lock; + struct e820entry *e820; + unsigned int nr_e820; } __cacheline_aligned; #define has_arch_pdevs(d) (!list_empty(&(d)->arch.pdev_list)) _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |