
Re: [Xen-devel] [PATCH v2 16/17] libxc/xc_dom_arm: Copy ACPI tables to guest space



On Thu, 21 Jul 2016, Julien Grall wrote:
> Hi Stefano,
> 
> On 21/07/16 18:53, Stefano Stabellini wrote:
> > On Wed, 20 Jul 2016, Boris Ostrovsky wrote:
> > > On 07/20/2016 01:28 PM, Stefano Stabellini wrote:
> > > > On Wed, 20 Jul 2016, Boris Ostrovsky wrote:
> > > > > On 07/20/2016 09:41 AM, Julien Grall wrote:
> > > > > > 
> > > > > > On 20/07/2016 14:33, Boris Ostrovsky wrote:
> > > > > > > On 07/20/2016 08:33 AM, Julien Grall wrote:
> > > > > > > > Hi,
> > > > > > > > 
> > > > > > > > On 14/07/16 14:37, Stefano Stabellini wrote:
> > > > > > > > > On Wed, 13 Jul 2016, Julien Grall wrote:
> > > > > > > > > > Hello,
> > > > > > > > > > 
> > > > > > > > > > On 12/07/2016 17:58, Boris Ostrovsky wrote:
> > > > > > > > > > > On 07/12/2016 12:10 PM, Julien Grall wrote:
> > > > > > > > > > > > On 12/07/2016 16:08, Boris Ostrovsky wrote:
> > > > > > > > > > > > > On 07/12/2016 10:57 AM, Shannon Zhao wrote:
> > > > > > > > > > > > It will affect some other parts of the guest if we don't
> > > > > > > > > > > > increment the "maxmem" requested by the user. For ARM,
> > > > > > > > > > > > the ACPI blob will be exposed at a specific address that
> > > > > > > > > > > > is outside of the guest RAM (see the guest memory layout
> > > > > > > > > > > > in public/arch-arm.h).
> > > > > > > > > > > > 
> > > > > > > > > > > > We chose this solution over putting them in the RAM
> > > > > > > > > > > > because the ACPI tables are not easily relocatable
> > > > > > > > > > > > (compared to the device tree, initrd and kernel), so we
> > > > > > > > > > > > could not take advantage of superpages in both the
> > > > > > > > > > > > stage-2 (hypervisor) and stage-1 (kernel) page tables.
> > > > > > > > > > > Maybe this is something ARM-specific then. For x86 we will
> > > > > > > > > > > want to
> > > > > > > > > > > keep
> > > > > > > > > > > maxmem unchanged.
> > > > > > > > > > I don't think what I described in my previous mail is
> > > > > > > > > > ARM-specific. The pressure on the TLBs will be higher if Xen
> > > > > > > > > > does not use superpages in the stage-2 page tables (i.e. EPT
> > > > > > > > > > on x86), no matter the architecture.
> > > > > > > > > > 
> > > > > > > > > > IMHO, this seems to be a bigger drawback compared to adding
> > > > > > > > > > a few more kilobytes to maxmem in the toolstack for the ACPI
> > > > > > > > > > blob. You will lose them when creating the intermediate page
> > > > > > > > > > tables in any case.
> > > > > > > > > I agree with Julien. On ARM we have to increase maxmem because
> > > > > > > > > I don't
> > > > > > > > > think that shattering a superpage is acceptable for just a few
> > > > > > > > > KBs. In
> > > > > > > > > fact, it's not much about increasing maxmem, but it's about
> > > > > > > > > keeping
> > > > > > > > > the
> > > > > > > > > allocation of guest memory to the value passed by the user in
> > > > > > > > > "memory",
> > > > > > > > > so that it can be done in the most efficient way possible. (I
> > > > > > > > > am
> > > > > > > > > assuming users are going to allocate VMs of 2048MB, rather
> > > > > > > > > than
> > > > > > > > > 2049MB.)
> > > > > > > > > 
> > > > > > > > > I wouldn't want to end up adding to the performance tuning
> > > > > > > > > page on the
> > > > > > > > > wiki "Make sure to add 1 more MB to the memory of your VM to
> > > > > > > > > get the
> > > > > > > > > most out of the system."
> > > > > > > > > 
> > > > > > > > > I know that the location of the ACPI blob on x86 is different
> > > > > > > > > in guest
> > > > > > > > > memory space, but it seems to me that the problem would be the
> > > > > > > > > same. Do
> > > > > > > > > you have 1 gigabyte pages in stage-2 on x86? If so, I would
> > > > > > > > > think
> > > > > > > > > twice
> > > > > > > > > about this. Otherwise, if you only have 4K and 2MB
> > > > > > > > > allocations,
> > > > > > > > > then it
> > > > > > > > > might not make that much of a difference.
> > > > > > > > Looking at the x86 code, 1 gigabyte pages seems to be supported.
> > > > > > > > 
> > > > > > > > Boris, do you have any opinions on this?
> > > > > > > 
> > > > > > > I don't think I understand the superpage shattering argument.  In
> > > > > > > x86
> > > > > > > the tables live in regular RAM and a guest is free to use those
> > > > > > > addresses as regular memory.
> > > > > > > 
> > > > > > > This apparently is different from how ARM manages the tables (you
> > > > > > > said
> > > > > > > in an earlier message that they are not part of RAM) so I can see
> > > > > > > that
> > > > > > > taking memory from RAM allocation to store the tables may affect
> > > > > > > how
> > > > > > > mapping is done, potentially causing GB pages to be broken.
> > > > > > > 
> > > > > > > In fact (and I am totally speculating here) padding memory for x86
> > > > > > > may
> > > > > > > actually *cause* shattering because we will have (for example)
> > > > > > > 2049MB of
> > > > > > > RAM to deal with.
> > > > > > Correct me if I am wrong. In your series you are populating the page
> > > > > > at a specific address for the ACPI tables separately from the RAM
> > > > > > allocation. So you will shatter GB pages if the user provides 2048MB,
> > > > > > because the ACPI tables are accounted for in the 2048MB.
> > > > > And to be honest I am not convinced this was a well-selected address
> > > > > (0xfc000000). I am actually thinking about moving it down (this may
> > > > > require changing dsdt.asl). I don't know whether I will actually do it
> > > > > in this version, but it is certainly a possibility.
> > > > I don't understand how this statement fits in the discussion.
> > > > 
> > > > The memory allocation for the ACPI blob is done by the toolstack
> > > > separately from the rest of guest memory, leading to two separate
> > > > stage-2 pagetable allocations of less than 1-gigabyte pages. Is that
> > > > correct?
> > > 
> > > 
> > > If I move the tables lower into memory we won't have to do any extra
> > > allocation. The memory will have been already allocated for the guest;
> > > we just map it and copy the tables.
> > 
> > I see, thanks for the explanation. I think this could work for ARM too
> > and should avoid the stage-2 shattering issue described above.
> 
> But you will end up with a stage-1 shattering issue if you put the ACPI
> tables lower in the guest RAM, reducing the overall performance. That is why
> I first asked Shannon to put the ACPI tables outside of the guest RAM.

I am not sure about this actually: even with the ACPI blob in the middle
of guest memory, the guest OS could still use a single superpage for its
own stage-1 memory mappings. I don't know if Linux does it that way, but
it should be possible.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel