Re: [Xen-devel] [RFC Design Doc v2] Add vNVDIMM support for Xen
On 08/03/16 03:47, Jan Beulich wrote:
> >>> On 03.08.16 at 11:37, <haozhong.zhang@xxxxxxxxx> wrote:
> > On 08/03/16 02:45, Jan Beulich wrote:
> >> >>> On 03.08.16 at 08:54, <haozhong.zhang@xxxxxxxxx> wrote:
> >> > On 08/02/16 08:46, Jan Beulich wrote:
> >> >> >>> On 18.07.16 at 02:29, <haozhong.zhang@xxxxxxxxx> wrote:
> >> >> > (4) Because the reserved area is now used by Xen hypervisor, it
> >> >> > should not be accessible by Dom0 any more. Therefore, if a host
> >> >> > pmem device is recorded by Xen hypervisor, Xen will unmap its
> >> >> > reserved area from Dom0. Our design also needs to extend Linux
> >> >> > NVDIMM driver to "balloon out" the reserved area after it
> >> >> > successfully reports a pmem device to Xen hypervisor.
> >> >>
> >> >> ... "balloon out" ... _after_? That'd be unsafe.
> >> >>
> >> >
> >> > Before ballooning is accomplished, the pmem driver does not create any
> >> > device node under /dev/ and hence no one except the pmem driver can
> >> > access the reserved area on pmem, so I think it's okay to balloon
> >> > after reporting.
> >>
> >> Right now Dom0 isn't allowed to access any memory in use by Xen
> >> (and not explicitly shared), and I don't think we should deviate
> >> from that model for pmem.
> >
> > In this design, Xen hypervisor unmaps the reserved area from Dom0 so
> > that Dom0 cannot access the reserved area afterwards. And "balloon" is
> > in fact not memory ballooning, because the Linux kernel never allocates
> > from pmem like normal ram. In my current implementation, it's just to
> > remove the reserved area from a resource struct covering pmem.
>
> Ah, in that case please either use a different term, or explain what
> "balloon out" is meant to mean in this context.
>
> >> >> > 4.2.3 Get Host Machine Address (SPA) of Host pmem Files
> >> >> >
> >> >> > Before a pmem file is assigned to a domain, we need to know the host
> >> >> > SPA ranges that are allocated to this file. We do this work in xl.
> >> >> >
> >> >> > If a pmem device /dev/pmem0 is given, xl will read
> >> >> > /sys/block/pmem0/device/{resource,size} respectively for the start
> >> >> > SPA and size of the pmem device.
> >> >> >
> >> >> > If a pre-allocated file /mnt/dax/file is given,
> >> >> > (1) xl first finds the host pmem device where /mnt/dax/file is. Then
> >> >> > it uses the method above to get the start SPA of the host pmem
> >> >> > device.
> >> >> > (2) xl then uses the fiemap ioctl to get the extent mappings of
> >> >> > /mnt/dax/file, and adds the corresponding physical offsets and
> >> >> > lengths in each mapping entry to the above start SPA to get the SPA
> >> >> > ranges pre-allocated for this file.
> >> >>
> >> >> Remind me again: These extents never change, not even across
> >> >> reboot? I think this would be good to be written down here explicitly.
> >> >
> >> > Yes
> >> >
> >> >> Hadn't there been talk of using labels to be able to allow a guest to
> >> >> own the exact same physical range again after reboot of guest or
> >> >> host?
> >> >
> >> > You mean labels in the NVDIMM label storage area? As defined in the
> >> > Intel NVDIMM Namespace Specification, labels are used to specify
> >> > namespaces. For a pmem interleave set (possibly crossing several dimms),
> >> > at most one pmem namespace (and hence at most one label) is
> >> > allowed. Therefore, labels cannot be used to partition pmem.
> >>
> >> Okay. But then how do particular ranges get associated with the
> >> owning guest(s)? Merely by SPA would seem rather fragile to me.
> >>
> >
> > By using the file name, e.g. if I specify vnvdimm = [ 'file=/mnt/dax/foo' ]
> > in a domain config file, the SPA ranges occupied by /mnt/dax/foo are
> > mapped to the domain. If the same file is used every time the domain is
> > created, the same virtual device will be seen by that domain.
>
> So what if the file got deleted and re-created in between? Since
> I don't think you can specify the SPAs to use when creating such
> a file, such an operation would be quite different from removing
> and re-adding e.g. a specific PCI device (to be used by a guest)
> on a host (while the guest is not running).
>

If the file is modified in between, the guest will see a virtual pmem
device with different data. But the usage of pmem is similar to that of a
disk: if a file with the same content is given every time, the guest gets
a virtual pmem/disk with the same data as at the last reboot/shutdown;
keeping the data unchanged between multiple boots is out of the scope of
Xen.

Haozhong
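As an illustration of the "remove the reserved area from a resource struct"
step discussed above, here is a minimal sketch of what the Dom0-side driver
change could look like. It is not the actual patch: the function name
pmem_drop_reserved_area, the parameters pmem_res and reserved_size, and the
assumption that the reserved area sits at the start of the pmem range are
all hypothetical; only adjust_resource() and resource_size() are existing
kernel helpers.

/*
 * Hedged sketch only: shrink the pmem resource so it no longer covers
 * the area reserved for Xen.  'pmem_res' and 'reserved_size' are
 * hypothetical parameters; the real driver keeps its own bookkeeping.
 */
#include <linux/ioport.h>
#include <linux/errno.h>

static int pmem_drop_reserved_area(struct resource *pmem_res,
                                   resource_size_t reserved_size)
{
        /* Assumption: the reserved area sits at the start of the range. */
        resource_size_t new_start = pmem_res->start + reserved_size;
        resource_size_t new_size;

        if (reserved_size >= resource_size(pmem_res))
                return -EINVAL;
        new_size = resource_size(pmem_res) - reserved_size;

        /*
         * After this, the resource tree no longer covers the reserved
         * area, which Xen has already unmapped from Dom0, so nothing in
         * Dom0 can claim or map it through the normal paths.
         */
        return adjust_resource(pmem_res, new_start, new_size);
}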
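For the SPA lookup described in section 4.2.3, a rough user-space sketch of
what xl could do is below. It follows the flow in the text: read the
device's start SPA from /sys/block/pmem0/device/resource, then issue the
FS_IOC_FIEMAP ioctl on the pre-allocated file and add each extent's
physical offset to that start SPA. The function names, the fixed cap of 32
extents, and the minimal error handling are illustrative only, not the
actual libxl code.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <inttypes.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>
#include <linux/fiemap.h>

/* Read the start SPA of the host pmem device, e.g. from
 * /sys/block/pmem0/device/resource (a hex address). */
static int read_start_spa(const char *sysfs_path, uint64_t *spa)
{
    FILE *f = fopen(sysfs_path, "r");
    int rc;

    if (!f)
        return -1;
    rc = (fscanf(f, "%" SCNx64, spa) == 1) ? 0 : -1;
    fclose(f);
    return rc;
}

/* Print the host SPA ranges backing a pre-allocated DAX file.  A real
 * implementation would loop until all extents are fetched instead of
 * capping them at 32. */
static int print_file_spa_ranges(const char *path, uint64_t dev_start_spa)
{
    int fd = open(path, O_RDONLY);
    struct fiemap *fm;
    size_t sz = sizeof(*fm) + 32 * sizeof(struct fiemap_extent);
    unsigned int i;

    if (fd < 0)
        return -1;
    fm = calloc(1, sz);
    if (!fm) {
        close(fd);
        return -1;
    }

    fm->fm_start = 0;
    fm->fm_length = FIEMAP_MAX_OFFSET;   /* map the whole file */
    fm->fm_flags = FIEMAP_FLAG_SYNC;     /* flush before mapping */
    fm->fm_extent_count = 32;

    if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
        free(fm);
        close(fd);
        return -1;
    }

    for (i = 0; i < fm->fm_mapped_extents; i++) {
        struct fiemap_extent *e = &fm->fm_extents[i];
        /* fe_physical is the byte offset on the pmem device, so the SPA
         * range is the device start SPA plus that offset. */
        printf("SPA 0x%" PRIx64 " - 0x%" PRIx64 "\n",
               dev_start_spa + e->fe_physical,
               dev_start_spa + e->fe_physical + e->fe_length - 1);
    }

    free(fm);
    close(fd);
    return 0;
}

xl would then hand the resulting SPA ranges to the hypervisor when building
the guest's vNVDIMM, as described earlier in the thread.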