
Re: [Xen-devel] [PATCH v2] tools/tests/mem-sharing/memshrtool share-all test

Hi Andres,
thanks for taking a look at this patch!

> In terms of higher level:
> - Are these really clone VMs? In order to nominate gfns, they must be 
> allocated … so, what was allocated in the target VM before this? How would 
> you share two 16GB domains if you have 2GB free before allocating the target 
> domain (setting aside how do you deal with CoW overflow, which is a separate 
> issue). You may consider revisiting the add to physmap sharing memop.
> - Can you document when should one call this? Or at least your envisioned 
> scenario. Ties in with the question before.

While add_to_physmap would be ideal for quickly cloning VMs, I haven't
found anything useful (documentation or a code sample) on doing it
that way. The only way I have found to clone a VM right now is to use
xl save/restore and then deduplicate the pages using nominate/share. I
do this in the following order:

1. Retrieve the origin VM's configuration.
2. Parse and modify the config, changing the VM's name, disk and
network interface. The disk assigned to the clone is a CoW disk (qcow2
or LVM); the network bridge is a new bridge, to avoid MAC/IP
collisions with the origin VM.
3. Create a FIFO on the filesystem (mkfifo /tmp/cloning).
4. Use xl to clone the VM's execution state and memory without
deduplication: xl pause <origin> && xl save -c <origin> /tmp/cloning |
xl restore -p <modified config> /tmp/cloning
5. Use the routine in this patch to deduplicate the memory.
6. Unpause the clone.
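Roughly, the sequence above looks like this as a script (the origin
name, clone config file and FIFO path are placeholders from my setup,
not part of the patch):

```shell
#!/bin/sh
# Sketch of the clone procedure above; origin-vm, clone.cfg and the
# FIFO path are placeholders, not part of the patch.
ORIGIN=origin-vm
CLONE_CFG=clone.cfg          # config with new name/disk/bridge
FIFO=/tmp/cloning

mkfifo "$FIFO"

# Pause the origin, then stream its state through the FIFO into the
# paused clone.  The '|' only serves to start both commands
# concurrently; the actual data flows through the FIFO.
xl pause "$ORIGIN"
xl save -c "$ORIGIN" "$FIFO" | xl restore -p "$CLONE_CFG" "$FIFO"

# ...deduplicate with the routine from this patch here (step 5)...

xl unpause "$ORIGIN"
# xl unpause <clone>         # step 6, once sharing is done
```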

This is quite wasteful, as you and Patrick pointed out.
Unfortunately, I haven't found a straightforward way to duplicate
only the execution state of a VM without duplicating its entire
memory, which would let me use add_to_physmap. If xl had an option in
its save routine to do a partial save, that would be great. I did
scan through the xl code to see how one might do that, but I'm not
even close to understanding xl's internals.

> - I think it's high time we batch sharing calls. I have not been able to do 
> this, but it should be relatively simple to submit a hypervisor patch to 
> achieve this. For what you are trying to do, it will give you a very nice 
> boost in performance.

That sounds like something that would be very useful.

>>> +#define PAGE_SIZE_KB (XC_PAGE_SIZE/1024)
> A matter of style, but in my view this is unneeded, see below.
>>> +        pages=info.max_memkb/PAGE_SIZE_KB;
> info.max_memkb >> 2, cleaner code, more in line with the code base.


>>> +        source_pages=source_info.max_memkb/PAGE_SIZE_KB;
> In most scenarios you would need to pause this, particularly as VMs may 
> self-modify their physmap (balloon, mmio, etc)

See my intended usage above (origin and clone should both be paused
during this operation).

>>> +
>>> +        if(pages != source_pages) {
>>> +            printf("Page count in source and destination domain doesn't 
>>> match "
> to stderr.


>>> +        for(share_page=0;share_page<=pages;++share_page) {
> The memory layout of an hvm is sparse. While tot pages will get you a lot of 
> sharing, it will not get you all. For example, for a VM with nominal 4GB of 
> RAM, the max gfn is around 4.25GB.

This is something that lacks documentation (or did I just fail to
find it?), so thanks for shedding some light on it! =) I spent days
trying to figure out the best way to get the list of valid gfns of a
domain, without success. This approach did seem to work OK, although
the number of pages shared this way was never 100%, as parts of the
memory fail at the nominate call, hence the continue; in the code for
those cases.
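For reference, the core of the loop is roughly the following (a
simplified sketch of the routine in the patch, error reporting
trimmed; it needs libxenctrl and a running Xen to actually build and
run):

```c
#include <stdint.h>
#include <xenctrl.h>

/* Simplified sketch of the share-all loop: nominate each gfn in both
 * domains and share the pair, skipping gfns that fail to nominate
 * (physmap holes, mmio, etc.).  Not the exact patch code. */
static long share_all(xc_interface *xch, uint32_t source_dom,
                      uint32_t client_dom, unsigned long pages)
{
    unsigned long gfn;
    long shared = 0;

    for ( gfn = 0; gfn < pages; ++gfn )
    {
        uint64_t source_handle, client_handle;

        if ( xc_memshr_nominate_gfn(xch, source_dom, gfn, &source_handle) )
            continue;   /* hole or unshareable page in the source */
        if ( xc_memshr_nominate_gfn(xch, client_dom, gfn, &client_handle) )
            continue;   /* hole or unshareable page in the clone */
        if ( xc_memshr_share_gfns(xch, source_dom, gfn, source_handle,
                                  client_dom, gfn, client_handle) == 0 )
            ++shared;
    }
    return shared;
}
```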

> Even for small VMs, you have gfns in the 3.75-4GB range. You should check 
> equality of max gfn, which might be a very difficult thing to achieve 
> depending on the stage of a VM's lifetime at which you call this.

Can you elaborate on this? (Is this documented anywhere? How would you
determine the max gfn of a domain?)
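Is it something like xc_domain_maximum_gpfn(), which wraps the
XENMEM_maximum_gpfn memop? A guess at how that would be used,
assuming the variant of the call that returns the max gpfn directly
(I gather the signature differs between versions):

```c
#include <stdio.h>
#include <xenctrl.h>

/* Guess: query the max gfn via xc_domain_maximum_gpfn()
 * (XENMEM_maximum_gpfn).  Assumes the variant that returns the value
 * directly; other versions take an out-parameter instead. */
static int print_max_gfn(xc_interface *xch, uint32_t domid)
{
    long max_gfn = xc_domain_maximum_gpfn(xch, domid);

    if ( max_gfn < 0 )
        return -1;
    printf("domain %u: max gfn 0x%lx\n", domid, max_gfn);
    return 0;
}
```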

> And you should have a policy for dealing with physmap holes (for example, is 
> there any point in sharing the VGA mmio? yes/no, your call, argue for it, 
> document it, etc)

I guess this depends on the intended usage of the clone. For my
purposes the closer the clone is to the origin the better. Of course,
there are situations where this is simply not possible (for example
cloning a VM with PCI passthrough devices).

