[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [V10 PATCH 0/4] pvh dom0 patches...

On Tue, May 06, 2014 at 06:00:53PM -0700, Mukesh Rathor wrote:
> On Tue, 6 May 2014 09:13:08 +0200
> Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
> > On 06/05/14 02:28, Mukesh Rathor wrote:
> > > On Mon, 5 May 2014 10:52:54 +0200
> > > Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
> ....
> > >> This was because I was using start_info->nr_pages as the number of
> > >> usable RAM pages, but AFAICT from the code in domain_build.c,
> > >> pvh_map_all_iomem is making holes in the p2m, but it is not adding
> > >> those freed pages back to the end of the memory map, so the value
> > >> in nr_pages is not the number of usable RAM pages, but the number
> > >> of pages in the p2m map (taking into account both usable RAM pages
> > >> and p2m_mmio_direct regions).
> > >>
> > >> I'm not sure if this logic is correct, shouldn't the freed pages by
> > >> pvh_map_all_iomem be added to the end of the memory map?
> > > 
> > > Yeah, it's confusing a bit. Let me talk in terms of linux.
> > > 
> > > In case of PV dom0, linux parses the e820 and punches holes in
> > > the p2m itself. See xen_set_identity_and_release(). For pvh, we
> > > can skip some of that (since it already happened in xen), but we
> > > still use the "released" variable to keep track of that. Then
> > > later, xen_populate_chunk() adds those "released" pages back via 
> > > XENMEM_populate_physmap. This happens for both PV and PVH, so the 
> > > pages are added back. 
> > > 
> > > xen_memory_setup():
> > >         /*
> > >          * Populate back the non-RAM pages and E820 gaps that had
> > > been
> > >          * released. */
> > >         populated = xen_populate_chunk(map, memmap.nr_entries,
> > >                         max_pfn, &last_pfn, xen_released_pages);
> > > 
> > > 
> > > Perhaps your logic that does similar for PV needs to make sure it
> > > is populating the pages back for pvh also?  You just don't need to 
> > > punch holes in the p2m as you do for PV, for PVH you can skip that 
> > > part.  Hope that makes sense.
> > 
> > Sure, that makes a lot of sense. What I was wondering is why don't we
> > move the logic of "xen_populate_chunk" into Xen code itself, like we
> > do for the holes part. If Xen is in charge of doing the holes in the
> > memory map, it should also take care of adding those pages back to
> > the end of the memory map, what we do right now is very awkward IMHO.
> We don't because the guest needs to know the last pfn mapped. PV guest
> starts with nr_pages as max pfn mapped and then walks the e820 punching
> holes, keeping track of last_pfn etc... That code in linux is a bit
> fragile, there is max_pfn, lastpfn, max_pages, xen_start_info->nr_pages, 
> extra_pages, to name a few! 
> IOW, even if we did this in xen, guest would still need to have the
> exact same code just to adjust it's last_pfn. That would be dangerous
> if they go out of sync.

The Linux code already has that.
> Dont' you already have all that logic for PV dom0? You would just need
> to skip the p2m update.
> Jan, 
> As Konrad points out, there are other issues with that part of the code.
> So, IMO, we should wait and address that altogether for both PV and PVH,
> and for now leave this as is.
> If you still feel we should do this in xen (as in required for this
> patch series to go in), then we'd need to communicate last pfn to guest.
> However, note that guest would still have all that code for PV, for pvh
> we'd just skip it. Obvious way coming to mind is:
> unsigned long last_pfn_mapped; 
> added to struct start_info.  Alternative would be to add a 
> new sub call to perhaps getdomaininfo hcall or something similar. 
> Please lmk. 

Uh, why? The Linux generic code figures out the last_pfn by
enumerating the E820. We only need 'max_pfn_mapped' in the P2M code
to figure the top of the P2M array provided by us - so that we
can use the extend_brk for anything above that- but that is
flexible enough to deal with holes in the P2M and such.
We also skip all of that when we run in the PVH mode.

For example, right now the Linux PV code can run with an E820 that
looks like swiss cheese - aka like the hosts' one. That is easily
seen if you boot an PV guest with PCI passthrough devices - as the
libxl 'e820_host' option gets turned on which creates an E820
that looks like the hosts. Granted it does not populate the P2M
as such (it is all linear). But the point is that the Linux
code is capable of dealing with this and bring the P2M to sync.

Hoisting this up in the hypervisor would be a plus - as the
Linux code wouldn't have to do this anymore.

> thanks,
> mukesh
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel

Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.