[Xen-devel] [RFC XEN PATCH v3 00/39] Add vNVDIMM support to HVM domains
Overview
==================

(RFC v2 can be found at https://lists.xen.org/archives/html/xen-devel/2017-03/msg02401.html)

This RFC v3 changes and grows considerably compared to previous versions.
The primary changes are listed below; most of them are intended to simplify
the first implementation and to avoid further growth.

1. Drop the support for maintaining the frametable and M2P table of PMEM
   in regular RAM. We may add this support back in the future.

2. Hide the host NFIT and deny access to host PMEM from Dom0. As a result,
   the kernel NVDIMM driver and the existing management utilities (e.g.
   ndctl) no longer work in Dom0. This works around the interference
   between PMEM accesses from Dom0 and from the Xen hypervisor. In the
   future, we may add a stub driver in Dom0 which will hold the PMEM pages
   being used by the Xen hypervisor and/or other domains.

3. As there is now no NVDIMM driver and no management utility in Dom0, we
   cannot easily specify an area of host NVDIMM (e.g., by /dev/pmem0) or
   manage NVDIMM in Dom0 (e.g., creating labels). Instead, we have to
   specify the exact MFNs of host PMEM pages in xl domain configuration
   files and via the newly added Xen NVDIMM management utility xen-ndctl.
   If some tasks do have to be handled by the existing driver and
   management utilities, such as recovery from hardware failures, they
   have to be accomplished outside the Xen environment. Once item 2 is
   solved in the future, the existing driver and management utilities
   should work in Dom0 again.

All patches can be found at
  Xen:  https://github.com/hzzhan9/xen.git nvdimm-rfc-v3
  QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v3

How to Test
==================

1. Build and install this patchset together with the associated QEMU
   patches.

2. Use xen-ndctl to get a list of the PMEM regions detected by the Xen
   hypervisor, e.g.

     # xen-ndctl list --raw
     Raw PMEM regions:
      0: MFN 0x480000 - 0x880000, PXM 3

   which indicates a PMEM region is present at MFN 0x480000 - 0x880000.

3.
   Set up a management area to manage the guest data areas.

     # xen-ndctl setup-mgmt 0x480000 0x4c0000
     # xen-ndctl list --mgmt
     Management PMEM regions:
      0: MFN 0x480000 - 0x4c0000, used 0xc00

   The first command sets up the PMEM area at MFN 0x480000 - 0x4c0000
   (1 GB) as a management area, which also manages itself. The second
   command lists all management areas; the 'used' field shows the number
   of pages that have been used from the beginning of that area.

   The size ratio between a management area and the areas it manages
   (including itself) should be at least 1:100 (i.e., 32 bytes of
   frametable plus 8 bytes of M2P table per 4 KB page). The size of a
   management area, as well as of a data area below, is currently
   restricted to a multiple of 256 MB, and the alignment is restricted
   to a multiple of 2 MB.

4. Set up a data area that can be used by guests.

     # xen-ndctl setup-data 0x4c0000 0x880000 0x480c00 0x4c0000
     # xen-ndctl list --data
     Data PMEM regions:
      0: MFN 0x4c0000 - 0x880000, MGMT MFN 0x480c00 - 0x48b000

   The first command sets up the remaining PMEM pages from MFN 0x4c0000
   to 0x880000 as a data area. The management pages from MFN 0x480c00 to
   0x4c0000 are specified to manage this data area. The management pages
   actually used can be seen in the output of the second command.

5. Assign data pages to a HVM domain by adding a line like the following
   to the domain configuration:

     vnvdimms = [ 'type=mfn, backend=0x4c0000, nr_pages=0x100000' ]

   which assigns 4 GB of PMEM starting from MFN 0x4c0000 to that domain.
   After the above setup steps, a 4 GB PMEM device should be present in
   the guest (e.g., as /dev/pmem0).

   There can be one or multiple entries in vnvdimms, which must not
   overlap with each other. Sharing PMEM pages between domains is not
   supported, so the PMEM pages assigned to different domains must not
   overlap either.

Patch Organization
==================

This RFC v3 is composed of the following 6 parts, according to the tasks
they solve. The tool stack patches are collected and distributed into the
corresponding parts.

- Part 0.
  Bug fix and code cleanup

  [01/39] x86_64/mm: fix the PDX group check in mem_hotadd_check()
  [02/39] x86_64/mm: drop redundant MFN to page conventions in cleanup_frame_table()
  [03/39] x86_64/mm: avoid cleaning the unmapped frame table

- Part 1. Detect host PMEM

  Detect host PMEM via NFIT. No frametable or M2P table is created for it
  in this part.

  [04/39] xen/common: add Kconfig item for pmem support
  [05/39] x86/mm: exclude PMEM regions from initial frametable
  [06/39] acpi: probe valid PMEM regions via NFIT
  [07/39] xen/pmem: register valid PMEM regions to Xen hypervisor
  [08/39] xen/pmem: hide NFIT and deny access to PMEM from Dom0
  [09/39] xen/pmem: add framework for hypercall XEN_SYSCTL_nvdimm_op
  [10/39] xen/pmem: add XEN_SYSCTL_nvdimm_pmem_get_regions_nr
  [11/39] xen/pmem: add XEN_SYSCTL_nvdimm_pmem_get_regions
  [12/39] tools/xen-ndctl: add NVDIMM management util 'xen-ndctl'
  [13/39] tools/xen-ndctl: add command 'list'

- Part 2. Setup host PMEM for management and guest data usage

  Allow users or admins in Dom0 to set up host PMEM pages for management
  and guest data usage.
  * Management PMEM pages are used to store the frametable and M2P of
    PMEM pages (including themselves), and are never mapped to guests.
  * Guest data PMEM pages can be mapped to guests and used as the backend
    storage of virtual NVDIMM devices.
  [14/39] x86_64/mm: refactor memory_add()
  [15/39] x86_64/mm: allow customized location of extended frametable and M2P table
  [16/39] xen/pmem: add XEN_SYSCTL_nvdimm_pmem_setup to setup management PMEM region
  [17/39] tools/xen-ndctl: add command 'setup-mgmt'
  [18/39] xen/pmem: support PMEM_REGION_TYPE_MGMT for XEN_SYSCTL_nvdimm_pmem_get_regions_nr
  [19/39] xen/pmem: support PMEM_REGION_TYPE_MGMT for XEN_SYSCTL_nvdimm_pmem_get_regions
  [20/39] tools/xen-ndctl: add option '--mgmt' to command 'list'
  [21/39] xen/pmem: support setup PMEM region for guest data usage
  [22/39] tools/xen-ndctl: add command 'setup-data'
  [23/39] xen/pmem: support PMEM_REGION_TYPE_DATA for XEN_SYSCTL_nvdimm_pmem_get_regions_nr
  [24/39] xen/pmem: support PMEM_REGION_TYPE_DATA for XEN_SYSCTL_nvdimm_pmem_get_regions
  [25/39] tools/xen-ndctl: add option '--data' to command 'list'

- Part 3. Hypervisor support to map host PMEM pages to HVM domains

  [26/39] xen/pmem: add function to map PMEM pages to HVM domain
  [27/39] xen/pmem: release PMEM pages on HVM domain destruction
  [28/39] xen: add hypercall XENMEM_populate_pmem_map

- Part 4. Pass ACPI from QEMU to Xen

  The guest NFIT and NVDIMM namespace devices are built by QEMU. This part
  implements the interface through which the device model passes its ACPI
  (DM ACPI) to Xen, and loads that DM ACPI. A simple blacklist mechanism
  is added to reject DM ACPI tables and namespace devices that may
  conflict with those built by Xen itself.

  [29/39] tools: reserve guest memory for ACPI from device model
  [30/39] tools/libacpi: expose the minimum alignment used by mem_ops.alloc
  [31/39] tools/libacpi: add callback to translate GPA to GVA
  [32/39] tools/libacpi: add callbacks to access XenStore
  [33/39] tools/libacpi: add a simple AML builder
  [34/39] tools/libacpi: add DM ACPI blacklists
  [35/39] tools/libacpi: load ACPI built by the device model

- Part 5. Remaining tool stack changes

  Add the xl domain configuration and generate the new QEMU options for
  vNVDIMM.
  [36/39] tools/xl: add xl domain configuration for virtual NVDIMM devices
  [37/39] tools/libxl: allow aborting domain creation on fatal QMP init errors
  [38/39] tools/libxl: initiate PMEM mapping via QMP callback
  [39/39] tools/libxl: build qemu options from xl vNVDIMM configs

 .gitignore                              |   1 +
 docs/man/xl.cfg.pod.5.in                |  33 ++
 tools/firmware/hvmloader/Makefile       |   3 +-
 tools/firmware/hvmloader/util.c         |  75 ++++
 tools/firmware/hvmloader/util.h         |  10 +
 tools/firmware/hvmloader/xenbus.c       |  44 +-
 tools/flask/policy/modules/dom0.te      |   2 +-
 tools/flask/policy/modules/xen.if       |   2 +-
 tools/libacpi/acpi2_0.h                 |   2 +
 tools/libacpi/aml_build.c               | 326 ++++++++++++++
 tools/libacpi/aml_build.h               | 116 +++++
 tools/libacpi/build.c                   | 330 ++++++++++++++
 tools/libacpi/libacpi.h                 |  23 +
 tools/libxc/include/xc_dom.h            |   1 +
 tools/libxc/include/xenctrl.h           |  88 ++++
 tools/libxc/xc_dom_x86.c                |  13 +
 tools/libxc/xc_domain.c                 |  15 +
 tools/libxc/xc_misc.c                   | 157 +++++++
 tools/libxl/Makefile                    |   5 +-
 tools/libxl/libxl.h                     |   5 +
 tools/libxl/libxl_create.c              |   4 +-
 tools/libxl/libxl_dm.c                  |  81 +++-
 tools/libxl/libxl_dom.c                 |  25 ++
 tools/libxl/libxl_qmp.c                 | 139 +++++-
 tools/libxl/libxl_types.idl             |  16 +
 tools/libxl/libxl_vnvdimm.c             |  79 ++++
 tools/libxl/libxl_vnvdimm.h             |  30 ++
 tools/libxl/libxl_x86_acpi.c            |  36 ++
 tools/misc/Makefile                     |   4 +
 tools/misc/xen-ndctl.c                  | 399 +++++++++++++++++
 tools/xl/xl_parse.c                     | 125 +++++-
 tools/xl/xl_vmcontrol.c                 |  15 +-
 xen/arch/x86/acpi/boot.c                |   4 +
 xen/arch/x86/acpi/power.c               |   7 +
 xen/arch/x86/dom0_build.c               |   5 +
 xen/arch/x86/domain.c                   |  32 +-
 xen/arch/x86/mm.c                       | 123 ++++-
 xen/arch/x86/setup.c                    |   4 +
 xen/arch/x86/shutdown.c                 |   3 +
 xen/arch/x86/tboot.c                    |   4 +
 xen/arch/x86/x86_64/mm.c                | 309 +++++++++----
 xen/common/Kconfig                      |   8 +
 xen/common/Makefile                     |   1 +
 xen/common/compat/memory.c              |   1 +
 xen/common/domain.c                     |   3 +
 xen/common/kexec.c                      |   3 +
 xen/common/memory.c                     |  44 ++
 xen/common/pmem.c                       | 769 ++++++++++++++++++++++++++++++++
 xen/common/sysctl.c                     |   9 +
 xen/drivers/acpi/Makefile               |   2 +
 xen/drivers/acpi/nfit.c                 | 298 +++++++++++++
 xen/include/acpi/actbl1.h               |  69 +++
 xen/include/asm-x86/domain.h            |   1 +
 xen/include/asm-x86/mm.h                |  10 +-
 xen/include/public/hvm/hvm_xs_strings.h |   8 +
 xen/include/public/memory.h             |  14 +-
 xen/include/public/sysctl.h             | 100 ++++-
 xen/include/xen/acpi.h                  |  10 +
 xen/include/xen/pmem.h                  |  76 ++++
 xen/include/xen/sched.h                 |   3 +
 xen/include/xsm/dummy.h                 |  11 +
 xen/include/xsm/xsm.h                   |  12 +
 xen/xsm/dummy.c                         |   4 +
 xen/xsm/flask/hooks.c                   |  17 +
 xen/xsm/flask/policy/access_vectors     |   4 +
 65 files changed, 4044 insertions(+), 128 deletions(-)
 create mode 100644 tools/libacpi/aml_build.c
 create mode 100644 tools/libacpi/aml_build.h
 create mode 100644 tools/libxl/libxl_vnvdimm.c
 create mode 100644 tools/libxl/libxl_vnvdimm.h
 create mode 100644 tools/misc/xen-ndctl.c
 create mode 100644 xen/common/pmem.c
 create mode 100644 xen/drivers/acpi/nfit.c
 create mode 100644 xen/include/xen/pmem.h

--
2.14.1

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
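[Editorial note appended for clarity: the 1:100 sizing rule mentioned in step 3 of "How to Test" follows directly from the per-page metadata cost stated there, i.e. 32 bytes of frametable plus 8 bytes of M2P table per 4 KB PMEM page. The following Python sketch computes the resulting lower bound on management pages for a given number of managed pages; it ignores the 256 MB size and 2 MB alignment restrictions, which add padding on top of this bound, and the function name is ours, not part of xen-ndctl.]

```python
PAGE_SIZE = 4096       # a PMEM page is 4 KB
FRAMETABLE_BYTES = 32  # frametable bytes per managed page (per the cover letter)
M2P_BYTES = 8          # M2P table bytes per managed page (per the cover letter)

def min_mgmt_pages(nr_managed_pages):
    """Lower bound on management pages needed to manage nr_managed_pages,
    ignoring the 256 MB size and 2 MB alignment rounding."""
    meta_bytes = nr_managed_pages * (FRAMETABLE_BYTES + M2P_BYTES)
    return -(-meta_bytes // PAGE_SIZE)  # ceiling division

# The example raw region MFN 0x480000 - 0x880000 covers 0x400000 pages (16 GB):
nr_pages = 0x880000 - 0x480000
print(hex(min_mgmt_pages(nr_pages)))  # -> 0xa000 pages (160 MB of metadata)
# 40 bytes of metadata per 4096-byte page is a ratio of ~1:102,
# hence the "at least 1:100" rule of thumb.
```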