
Re: [Xen-devel] [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree



On 09/16/2014 02:44 PM, Juergen Gross wrote:
On 09/16/2014 01:56 PM, David Vrabel wrote:
On 16/09/14 11:38, Juergen Gross wrote:
On 09/16/2014 12:14 PM, David Vrabel wrote:
On 16/09/14 04:52, Juergen Gross wrote:
On 09/15/2014 04:30 PM, David Vrabel wrote:
On 15/09/14 11:46, Juergen Gross wrote:
So you'd prefer:

1) >512GB pv-domains (including Dom0) will be supported only with new
   Xen (4.6?), no matter if the user requires migration to be supported

Yes.  >512 GiB and not being able to migrate are not obviously related
from the point of view of the end user (unlike assigning a PCI device).

Failing at domain save time is most likely too late for the end user.

What would you think about the following compromise:

We add a flag that indicates support of the multi-level p2m. Additionally,
the Linux kernel can ignore the flag not being set, either if started as
Dom0 or if told to do so via a kernel parameter.

This sounds fine, but this override should be via the command line
parameter only.  Crash dump analysis tools may not understand the
4-level p2m.

to:

2) >512GB pv-domains (especially Dom0 and VMs with direct hw access) can
   be started on current Xen versions, migration is possible only if Xen
   is new (4.6?)

There's also my preferred option:

3) >512 GiB PV domains are not supported.  Large guests must be PVH or
   PVHVM.

In theory okay, but not right now, I think. PVH Dom0 is not production
ready.

I'm not really seeing the need for such a large dom0.
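
A minimal sketch of the command-line override half of the compromise
proposed above, for illustration only (the parameter name "xen_512gb"
and the variable are hypothetical placeholders, not the ABI being
negotiated in this thread):

#include <linux/init.h>
#include <linux/types.h>

/* Hypothetical override: let the administrator explicitly accept a p2m
 * layout that Xen/the toolstack has not advertised support for. */
static bool xen_512gb_override __initdata;

static int __init xen_parse_512gb(char *arg)
{
	xen_512gb_override = true;
	return 0;
}
early_param("xen_512gb", xen_parse_512gb);

The decision would then be: use the multi-level p2m only if the support
flag is set, or if this override was given (whether Dom0 should also be
an implicit exception is the point being debated above).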

Okay, then I'd come back to V1 of my patches. This is the minimum
required to be able to boot up a system with Xen and more than 512GB of
memory without having to reduce the Dom0 memory via a Xen boot parameter.

Otherwise the hypervisor-built mfn_list mapped into the initial address
space will be too large.
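
To put numbers on "too large" (back-of-the-envelope arithmetic only,
assuming 4 KiB pages and 8-byte p2m entries, not code from the patches):

/* Each page of the p2m tree holds 4096 / 8 = 512 entries and each leaf
 * entry covers one 4 KiB page, so a 3-level tree tops out at 512 GiB: */
#define P2M_ENTRIES_PER_PAGE	(4096 / 8)			/* 512 */
#define P2M_3LEVEL_LIMIT	((1UL * P2M_ENTRIES_PER_PAGE *	\
				  P2M_ENTRIES_PER_PAGE *	\
				  P2M_ENTRIES_PER_PAGE) << 12)	/* 2^39 = 512 GiB */

/* The flat mfn_list needs 8 bytes per pfn, i.e. 1 GiB of initial
 * mappings for a 512 GiB domain, growing linearly with domain size. */
#define MFN_LIST_BYTES(ram_bytes)	((ram_bytes) / 4096 * 8)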

And no, I don't think setting the boot parameter is the solution here.
Dom0 should be usable on a huge machine without special parameters.

Ok. The case where dom0's p2m format matters is pretty specialized.

I also think a flat array for the p2m might be better (less complex).
There's plenty of virtual address space in a 64-bit guest to allow for
this.

Hmm, do you think we could reserve an area of many GBs for Xen in
virtual space? I suspect this would be rejected as another "Xen-ism".

alloc_vm_area()

Nice idea, but alloc_vm_area() allocates ptes for the whole area.
__get_vm_area() would be better, I think.
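
A minimal sketch of that difference (illustrative only; the function name
and "nr_pfns" are assumptions, not code from the patches): __get_vm_area()
merely reserves the virtual range, whereas alloc_vm_area() would also
populate ptes for the whole area up front.

#include <linux/errno.h>
#include <linux/init.h>
#include <linux/mm.h>
#include <linux/vmalloc.h>

static int __init xen_reserve_p2m_va(unsigned long nr_pfns)
{
	struct vm_struct *area;

	/* Reserve virtual address space for the linear p2m list without
	 * allocating page table entries for it yet. */
	area = __get_vm_area(PAGE_ALIGN(nr_pfns * sizeof(unsigned long)),
			     VM_IOREMAP, VMALLOC_START, VMALLOC_END);
	if (!area)
		return -ENOMEM;

	/* area->addr is the base of the (not yet backed) linear p2m list;
	 * individual pages can be mapped into it on demand later. */
	return 0;
}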


BTW: the mfn_list_list will still be required to be built as a tree.

The tools could be given the guest virtual address and walk the guest
page tables.

This is probably too much of a difference from the existing ABI to be
worth pursuing at this point.

Okay, coming back to the main question:

What to do regarding support of >512GB domains:

1. we need another level of the p2m map
2. we are trying the linear p2m table
    a) with a 4 level mfn_list_list
    b) with access to the p2m table via page tables
3. my V1 patches are okay, as they enable Dom0 to start on machines
    with huge memory

I thought a little bit more about this.

I like the idea of using the virtually mapped linear p2m list. It would
remove the need to build the p2m tree at an early boot stage, as the
initial mfn_list supplied by the hypervisor can be used until the kernel
builds its own list.
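
A minimal sketch of what a lookup against such a virtually mapped linear
list would look like (the names xen_p2m_addr and xen_p2m_size are assumed
here for illustration, not taken from the posted patches):

#include <asm/xen/page.h>	/* INVALID_P2M_ENTRY */

extern unsigned long *xen_p2m_addr;	/* base of the linear p2m list */
extern unsigned long xen_p2m_size;	/* number of entries in it */

static inline unsigned long pfn_to_mfn_linear(unsigned long pfn)
{
	if (pfn >= xen_p2m_size)
		return INVALID_P2M_ENTRY;
	return xen_p2m_addr[pfn];
}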

I'll try to create a patch doing this. As this is not affecting the
initial mapping of the initrd and mfn_list, I've posted V2 of my patches
to eliminate some of the limitations of those initial mappings.

Whether the mfn_list_list should be kept as a tree or (if support is
indicated by a flag) accessed by the tools via a page table walk can be
decided later.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

