
Re: [Xen-devel] [PATCH v2 16/17] libxc/xc_dom_arm: Copy ACPI tables to guest space



Hi Stefano,

On 21/07/16 18:53, Stefano Stabellini wrote:
On Wed, 20 Jul 2016, Boris Ostrovsky wrote:
On 07/20/2016 01:28 PM, Stefano Stabellini wrote:
On Wed, 20 Jul 2016, Boris Ostrovsky wrote:
On 07/20/2016 09:41 AM, Julien Grall wrote:

On 20/07/2016 14:33, Boris Ostrovsky wrote:
On 07/20/2016 08:33 AM, Julien Grall wrote:
Hi,

On 14/07/16 14:37, Stefano Stabellini wrote:
On Wed, 13 Jul 2016, Julien Grall wrote:
Hello,

On 12/07/2016 17:58, Boris Ostrovsky wrote:
On 07/12/2016 12:10 PM, Julien Grall wrote:
On 12/07/2016 16:08, Boris Ostrovsky wrote:
On 07/12/2016 10:57 AM, Shannon Zhao wrote:
It will affect some other parts of the guest if we don't increment the "maxmem" requested by the user. For ARM the ACPI blob will be exposed at a specific address that is outside of the guest RAM (see the guest memory layout in public/arch-arm.h).

We chose this solution over putting them in RAM because the ACPI tables are not easily relocatable (compared to the device tree, initrd and kernel), so we could not take advantage of superpages in both the stage-2 (hypervisor) and stage-1 (kernel) page tables.
Maybe this is something ARM-specific then. For x86 we will want to keep maxmem unchanged.
I don't think what I described in my previous mail is ARM-specific. The pressure on the TLBs will be higher if Xen does not use superpages in the stage-2 page tables (i.e. EPT for x86), no matter the architecture.

IMHO, this seems to be a bigger drawback compared to adding a few more kilobytes to maxmem in the toolstack for the ACPI blob. You will lose them when creating the intermediate page tables in any case.
I agree with Julien. On ARM we have to increase maxmem because I don't think that shattering a superpage is acceptable for just a few KBs. In fact, it's not so much about increasing maxmem as about keeping the allocation of guest memory at the value passed by the user in "memory", so that it can be done in the most efficient way possible. (I am assuming users are going to allocate VMs of 2048MB, rather than 2049MB.)

I wouldn't want to end up adding to the performance tuning page on the wiki: "Make sure to add 1 more MB to the memory of your VM to get the most out of the system."

I know that the location of the ACPI blob on x86 is different in guest memory space, but it seems to me that the problem would be the same. Do you have 1-gigabyte pages in stage-2 on x86? If so, I would think twice about this. Otherwise, if you only have 4K and 2MB allocations, then it might not make that much of a difference.
Looking at the x86 code, 1-gigabyte pages seem to be supported.

Boris, do you have any opinions on this?

I don't think I understand the superpage shattering argument. In x86 the tables live in regular RAM and a guest is free to use those addresses as regular memory.

This apparently is different from how ARM manages the tables (you said in an earlier message that they are not part of RAM), so I can see that taking memory from the RAM allocation to store the tables may affect how mapping is done, potentially causing GB pages to be broken.

In fact (and I am totally speculating here) padding memory for x86 may actually *cause* shattering because we will have (for example) 2049MB of RAM to deal with.
Correct me if I am wrong. In your series you are populating the page at a specific address for the ACPI tables separately from the RAM allocation. So you will shatter GB pages if the user provides 2048MB, because the ACPI tables are accounted within the 2048MB.
And to be honest I am not convinced this was a well-selected address (0xfc000000). I am actually thinking about moving it down (this may require changing dsdt.asl). I don't know whether I will actually do it in this version but it is certainly a possibility.
I don't understand how this statement fits in the discussion.

The memory allocation for the ACPI blob is done by the toolstack separately from the rest of guest memory, leading to two separate stage-2 pagetable allocations of less than 1-gigabyte pages. Is that correct?


If I move the tables lower into memory we won't have to do any extra allocation. The memory will already have been allocated for the guest; we just map it and copy the tables.

I see, thanks for the explanation. I think this could work for ARM too and should avoid the stage-2 shattering issue described above.

But you will end up with a stage-1 shattering issue if you put the ACPI tables lower in the guest RAM, reducing the overall performance. That is why I first asked Shannon to put the ACPI tables outside of the guest RAM.

Julien, what do you think? I agree that having the ACPI blob separate would be cleaner, but using the same allocation scheme as x86 is important.

While I agree that having the same scheme is important, I care a lot more about performance.

So far, I have performance concerns about all of Boris's suggestions. I believe the impact is the same on x86; however, it seems that it is not important to the x86 folks.

I think the scheme I suggested is the best one, because it will maximize the theoretical performance of the guests (a guest is free not to use superpages).

I could be convinced to put the ACPI tables at the end of the guest RAM, although this would require more code in the toolstack because the ACPI base address will not be static anymore.

Regards,

--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

