Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
On 01/26/16 23:30, Haozhong Zhang wrote:
> On 01/26/16 05:44, Jan Beulich wrote:
> > >>> On 26.01.16 at 12:44, <George.Dunlap@xxxxxxxxxxxxx> wrote:
> > > On Thu, Jan 21, 2016 at 2:52 PM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
> > >>>>> On 21.01.16 at 15:01, <haozhong.zhang@xxxxxxxxx> wrote:
> > >>> On 01/21/16 03:25, Jan Beulich wrote:
> > >>>> >>> On 21.01.16 at 10:10, <guangrong.xiao@xxxxxxxxxxxxxxx> wrote:
> > >>>> > c) the hypervisor should manage the PMEM resource pool and
> > >>>> > partition it among multiple VMs.
> > >>>>
> > >>>> Yes.
> > >>>>
> > >>>
> > >>> But I still do not quite understand this part: why must pmem
> > >>> resource management and partitioning be done in the hypervisor?
> > >>
> > >> Because that's where memory management belongs. And PMEM,
> > >> unlike PBLK, is just another form of RAM.
> > >
> > > I haven't looked more deeply into the details of this, but this
> > > argument doesn't seem right to me.
> > >
> > > Normal RAM in Xen is what might be called "fungible" -- at boot,
> > > all RAM is zeroed, and it basically doesn't matter at all what RAM
> > > is given to what guest. (There are restrictions of course: lowmem
> > > for DMA, contiguous superpages, &c; but within those groups, it
> > > doesn't matter *which* bit of lowmem you get, as long as you get
> > > enough to do your job.) If you reboot your guest or hand RAM back
> > > to the hypervisor, you assume that everything in it will
> > > disappear. When you ask for RAM, you can request some parameters
> > > that it will have (lowmem, on a specific node, &c), but you can't
> > > request a specific page that you had before.
> > >
> > > This is not the case for PMEM. The whole point of PMEM (correct me
> > > if I'm wrong) is to be used for long-term storage that survives
> > > over reboot. It matters very much that a guest be given the same
> > > PRAM after the host is rebooted that it was given before. It
> > > doesn't make any sense to manage it the way Xen currently manages
> > > RAM (i.e., that you request a page and get whatever Xen happens to
> > > give you).
> >
> > Interesting. This isn't the usage model I have been thinking about
> > so far. Having just gone back to the original 0/4 mail, I'm afraid
> > we're really left guessing, and you guessed differently than I did.
> > My understanding of the intentions of PMEM so far was that this is
> > a high-capacity alternative to normal RAM that is slower than DRAM
> > but much faster than e.g. swapping to disk. I.e. the persistent
> > aspect of it wouldn't matter at all in this case (other than for
> > PBLK, obviously).
> >
>
> Of course, pmem could be used in the way you thought because of its
> 'ram' aspect, but I think the more meaningful usage comes from its
> persistent aspect. For example, a journaling file system could store
> its logs in pmem rather than in normal RAM, so that if a power
> failure happens before the in-memory logs are completely written to
> disk, there would still be a chance to restore them from pmem after
> the next boot (rather than abandoning all of them).
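To make the journaling idea above concrete, here is a minimal sketch
(not from the original thread) of a commit-record pattern on a
DAX-mapped pmem file. The path, record layout, and function names are
illustrative assumptions; msync() stands in as the portable way to
force durability, where real pmem code would flush CPU cache lines
(e.g. via CLWB) instead:

/* Sketch: persist a log record, then a commit flag, in that order,
 * so a power failure can never expose a half-written record. */
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

struct log_record {
    uint64_t seq;        /* sequence number of this log entry */
    uint64_t len;        /* bytes used in payload[] */
    uint8_t  payload[240];
    uint64_t committed;  /* set last, after the payload is durable */
};

static int log_append(struct log_record *rec, const void *data,
                      size_t len, uint64_t seq)
{
    if (len > sizeof(rec->payload))
        return -1;
    rec->seq = seq;
    rec->len = len;
    memcpy(rec->payload, data, len);
    /* Make the payload durable before marking it committed. */
    if (msync(rec, sizeof(*rec), MS_SYNC) != 0)
        return -1;
    rec->committed = 1;
    return msync(rec, sizeof(*rec), MS_SYNC);
}

int main(void)
{
    /* "/mnt/pmem/journal" is a hypothetical file on a DAX-mounted
     * pmem file system. */
    int fd = open("/mnt/pmem/journal", O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return 1;
    if (ftruncate(fd, sizeof(struct log_record)) != 0)
        return 1;
    struct log_record *rec = mmap(NULL, sizeof(*rec),
                                  PROT_READ | PROT_WRITE,
                                  MAP_SHARED, fd, 0);
    if (rec == MAP_FAILED)
        return 1;
    int ret = log_append(rec, "hello", 5, 1);
    munmap(rec, sizeof(*rec));
    close(fd);
    return ret ? 1 : 0;
}

On recovery after a crash, the file system would replay only records
whose committed flag is set, which is exactly the property normal RAM
cannot provide.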
>
> (I'm still writing the design doc, which will include more details
> of the underlying hardware and the software interface of NVDIMM as
> exposed by current Linux.)
>
> > However, thinking through your usage model I have problems seeing
> > it work in a reasonable way even with virtualization left aside:
> > to my knowledge there's no established protocol on how multiple
> > parties (different versions of the same OS, or even completely
> > different OSes) would arbitrate using such memory ranges. And even
> > for a single OS it is, other than for disks (and hence PBLK), not
> > immediately clear how it would communicate from one boot to
> > another what information got stored where, or how it would react
> > to some or all of this storage having disappeared (just like a
> > disk which got removed, which - unless it held the boot partition
> > - would normally have pretty little effect on the OS coming back
> > up).
> >
>
> The label storage area is a persistent area on an NVDIMM that can be
> used to store partition information. It is not included in pmem (the
> part of the NVDIMM that is mapped into the system address space);
> instead, it can only be accessed through the NVDIMM _DSM method [1].
> However, what contents are stored there and how they are interpreted
> are left to software. One way is to follow the NVDIMM Namespace
> Specification [2] and store an array of labels, each describing the
> start address (from base 0 of pmem) and the size of one partition;
> such a partition is called a namespace. On Linux, each namespace is
> exposed as a /dev/pmemXX device.
>
> With virtualization, the (virtual) label storage area of a vNVDIMM
> and the corresponding _DSM method are emulated by QEMU. The virtual
> label storage area is not written to the host one; instead, we can
> reserve an area on pmem for it.
>
> Besides namespaces, we can also create DAX file systems on pmem and
> use files to partition it.

Forgot the references:

[1] NVDIMM DSM Interface Examples,
    http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
[2] NVDIMM Namespace Specification,
    http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf

> Haozhong
>
> > > So if Xen is going to use PMEM, it will have to invent an
> > > entirely new interface for guests, and it will have to keep
> > > track of those resources across host reboots. In other words, it
> > > will have to duplicate all the work that Linux already does.
> > > What do we gain from that duplication? Why not just leverage
> > > what's already implemented in dom0?
> >
> > Indeed if my guessing on the intentions was wrong, then the
> > picture completely changes (also for the points you've made
> > further down).
> >
> > Jan
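As a concrete illustration of the labels Haozhong describes above, a
single entry in the label storage area can be sketched in C roughly
as below. This follows my reading of the 128-byte label in the
Namespace Specification; [2] remains the authoritative definition and
the comments are paraphrases, not spec text:

#include <stdint.h>

/* Rough sketch of one namespace label as stored in the label
 * storage area, per the NVDIMM Namespace Specification [2]. */
struct namespace_label {
    uint8_t  uuid[16];    /* UUID shared by all labels of a namespace */
    char     name[64];    /* optional human-readable name */
    uint32_t flags;       /* e.g. read-only, updating */
    uint16_t nlabel;      /* total number of labels in the namespace */
    uint16_t position;    /* this label's position among them */
    uint64_t isetcookie;  /* interleave-set cookie for validation */
    uint64_t lbasize;     /* logical block size (0 for pmem namespaces) */
    uint64_t dpa;         /* start address of this range on the DIMM */
    uint64_t rawsize;     /* size of this range */
    uint32_t slot;        /* slot index in the label storage area */
    uint32_t unused;
};

Software (or QEMU's emulated _DSM, in the vNVDIMM case) reads this
array back through _DSM after every boot to rediscover which ranges
belong to which namespace, which is how a /dev/pmemXX device keeps
its identity across reboots.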