Re: [Xen-devel] [RFC Design Doc v2] Add vNVDIMM support for Xen
>>> On 18.07.16 at 02:29, <haozhong.zhang@xxxxxxxxx> wrote:
> 4.2.2 Detection of Host pmem Devices
>
> The detection and initialization of host pmem devices require a
> non-trivial driver to interact with the corresponding ACPI namespace
> devices, parse namespace labels and take necessary recovery actions.
> Instead of duplicating the comprehensive Linux pmem driver in the Xen
> hypervisor, our design leaves this to Dom0 Linux and lets Dom0 Linux
> report detected host pmem devices to the Xen hypervisor.
>
> Our design takes the following steps to detect host pmem devices when
> Xen boots.
> (1) As when booting on bare metal, host pmem devices are detected by
>     the Dom0 Linux NVDIMM driver.
>
> (2) Our design extends the Linux NVDIMM driver to report the SPAs and
>     sizes of the pmem devices and reserved areas to the Xen hypervisor
>     via a new hypercall.
>
> (3) Xen hypervisor then checks
>     - whether SPA and size of the newly reported pmem device overlap
>       with any previously reported pmem devices;

... or with system RAM.

>     - whether the reserved area can fit in the pmem device and is
>       large enough to hold page_info structs for itself.

So "reserved" here means available for Xen's use, but not for more
general purposes? How would the area Linux uses for its own purposes
get represented?

> (4) Because the reserved area is now used by the Xen hypervisor, it
>     should not be accessible by Dom0 any more. Therefore, if a host
>     pmem device is recorded by the Xen hypervisor, Xen will unmap its
>     reserved area from Dom0. Our design also needs to extend the Linux
>     NVDIMM driver to "balloon out" the reserved area after it
>     successfully reports a pmem device to the Xen hypervisor.

... "balloon out" ... _after_? That'd be unsafe.

> 4.2.3 Get Host Machine Address (SPA) of Host pmem Files
>
> Before a pmem file is assigned to a domain, we need to know the host
> SPA ranges that are allocated to this file. We do this work in xl.
>
> If a pmem device /dev/pmem0 is given, xl will read
> /sys/block/pmem0/device/{resource,size} respectively for the start
> SPA and size of the pmem device.
>
> If a pre-allocated file /mnt/dax/file is given,
> (1) xl first finds the host pmem device where /mnt/dax/file is. Then
>     it uses the method above to get the start SPA of the host pmem
>     device.
> (2) xl then uses the fiemap ioctl to get the extent mappings of
>     /mnt/dax/file, and adds the corresponding physical offsets and
>     lengths in each mapping entry to the above start SPA to get the
>     SPA ranges pre-allocated for this file.

Remind me again: These extents never change, not even across reboot?
I think this would be good to be written down here explicitly. Hadn't
there been talk of using labels to be able to allow a guest to own the
exact same physical range again after reboot of guest or host?

> 3) When hvmloader loads a type 0 entry, it extracts the signature
>    from the data blob and searches for it in builtin_table_sigs[]. If
>    one is found, hvmloader will report an error and stop. Otherwise,
>    it will append the table to the end of the loaded guest ACPI.

Duplicate table names aren't generally collisions: There can, for
example, be many tables named "SSDT".

> 4) When hvmloader loads a type 1 entry, it extracts the device name
>    from the data blob and searches for it in builtin_nd_names[]. If
>    one is found, hvmloader will report an error and stop. Otherwise,
>    it will wrap the AML code snippet in "Device (name[4]) {...}" and
>    include it in a new SSDT which is then appended to the end of the
>    loaded guest ACPI.
But all of these could go into a single SSDT, instead of (as it sounds)
each into its own one?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
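
As an illustration of step (2) in 4.2.2 above: the design doc does not
spell out the hypercall payload, so the structure below is purely a
hypothetical sketch (every name is invented for the example), showing
the minimum a Dom0 NVDIMM driver would need to pass to Xen for each
detected pmem region so that the checks in step (3) can be performed:

    /*
     * Hypothetical example only -- the design doc does not define the
     * hypercall layout, so all names below are invented.  Dom0's NVDIMM
     * driver would pass one such record to Xen per detected pmem device.
     */
    struct xen_pmem_add {
        uint64_t spa;       /* start SPA of the pmem device             */
        uint64_t size;      /* size of the pmem device, in bytes        */
        uint64_t rsv_spa;   /* start SPA of the area reserved for Xen's
                               page_info structs; must lie within
                               [spa, spa + size)                        */
        uint64_t rsv_size;  /* size of that reserved area, in bytes     */
    };
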
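Likewise, a minimal sketch of the xl-side lookup described in 4.2.3,
based on the Linux FS_IOC_FIEMAP ioctl. The function name, the error
handling and the way dev_spa is obtained (read beforehand from
/sys/block/pmemN/device/resource) are assumptions for the example, not
the actual xl implementation:

    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>
    #include <inttypes.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <linux/fs.h>       /* FS_IOC_FIEMAP */
    #include <linux/fiemap.h>   /* struct fiemap, struct fiemap_extent */

    #define MAX_EXTENTS 128

    /*
     * Illustrative sketch only: print the host SPA ranges backing a
     * pre-allocated file on a DAX filesystem.  dev_spa is the start SPA
     * of the underlying pmem device.
     */
    static int print_file_spa_ranges(const char *path, uint64_t dev_spa)
    {
        struct fiemap *fm;
        unsigned int i;
        int fd, rc = -1;

        fd = open(path, O_RDONLY);
        if (fd < 0)
            return -1;

        fm = calloc(1, sizeof(*fm) +
                       MAX_EXTENTS * sizeof(struct fiemap_extent));
        if (!fm)
            goto out;

        fm->fm_start = 0;
        fm->fm_length = FIEMAP_MAX_OFFSET;   /* map the whole file */
        fm->fm_flags = FIEMAP_FLAG_SYNC;     /* flush before mapping */
        fm->fm_extent_count = MAX_EXTENTS;

        if (ioctl(fd, FS_IOC_FIEMAP, fm))
            goto out;

        for (i = 0; i < fm->fm_mapped_extents; i++) {
            struct fiemap_extent *e = &fm->fm_extents[i];

            /*
             * fe_physical is the extent's offset into the block device,
             * so the SPA range backing it starts at dev_spa + fe_physical.
             */
            printf("SPA %#" PRIx64 " - %#" PRIx64 "\n",
                   dev_spa + e->fe_physical,
                   dev_spa + e->fe_physical + e->fe_length - 1);
        }
        rc = 0;

     out:
        free(fm);
        close(fd);
        return rc;
    }

Whether these extents are stable across reboots (the question raised
above) is exactly what would need to be stated in the doc for such a
lookup to be meaningful.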