
Re: [Xen-devel] [PATCHv1] xen/balloon: disable memory hotplug in PV guests



On 03/18/2015 04:14 PM, Daniel Kiper wrote:
On Wed, Mar 18, 2015 at 01:59:58PM +0000, David Vrabel wrote:
On 18/03/15 13:57, Juergen Gross wrote:
On 03/18/2015 11:36 AM, David Vrabel wrote:
On 16/03/15 10:31, Juergen Gross wrote:
On 03/16/2015 11:03 AM, Daniel Kiper wrote:
On Mon, Mar 16, 2015 at 06:35:04AM +0100, Juergen Gross wrote:
On 03/11/2015 04:40 PM, Boris Ostrovsky wrote:
On 03/11/2015 10:42 AM, David Vrabel wrote:
On 10/03/15 13:35, Boris Ostrovsky wrote:
On 03/10/2015 07:40 AM, David Vrabel wrote:
On 09/03/15 14:10, David Vrabel wrote:
Memory hotplug doesn't work with PV guests because:

      a) The p2m cannot be expanded to cover the new sections.
Broken by 054954eb051f35e74b75a566a96fe756015352c8 (xen: switch to
linear virtual mapped sparse p2m list).

This one would be non-trivial to fix.  We'd need a sparse set of
vm_area's for the p2m or similar.

      b) add_memory() builds page tables for the new sections which means
         the new pages must have valid p2m entries (or a BUG occurs).
After some more testing this appears to be broken by:

25b884a83d487fd62c3de7ac1ab5549979188482 (x86/xen: set regions above
the end of RAM as 1:1), included in 3.16.

This one can be trivially fixed by setting the new sections in the p2m
to INVALID_P2M_ENTRY before calling add_memory().
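The ordering this fix relies on can be sketched with a toy model (all names
here are illustrative stand-ins, not the real kernel code, which uses
set_phys_to_machine() and INVALID_P2M_ENTRY in arch/x86/xen):

```python
# Toy model of the proposed fix: before a new section is handed to
# add_memory(), every new pfn's p2m entry is set to INVALID_P2M_ENTRY
# so the page-table build does not trip over stale 1:1 identity entries.
# All names and values are illustrative stand-ins for the kernel code.

INVALID_P2M_ENTRY = 0xFFFFFFFFFFFFFFFF
IDENTITY_BIT = 1 << 62  # stand-in for the identity-frame marker
MAX_PFN = 32


def add_memory(p2m, first_pfn, nr_pages):
    """Stand-in for add_memory(): building page tables over a frame
    whose p2m entry is not invalid BUGs in the real kernel."""
    for pfn in range(first_pfn, first_pfn + nr_pages):
        if p2m[pfn] != INVALID_P2M_ENTRY:
            raise RuntimeError("BUG: new section has a non-invalid p2m entry")


def hotplug_section(p2m, first_pfn, nr_pages):
    # The trivial fix: invalidate the new entries before add_memory().
    for pfn in range(first_pfn, first_pfn + nr_pages):
        p2m[pfn] = INVALID_P2M_ENTRY
    add_memory(p2m, first_pfn, nr_pages)


# Regions above the end of RAM start out identity-mapped (1:1).
p2m = [pfn | IDENTITY_BIT for pfn in range(MAX_PFN)]

try:
    add_memory(p2m, 16, 8)       # unfixed ordering: hits the BUG
    broken = False
except RuntimeError:
    broken = True

hotplug_section(p2m, 16, 8)      # invalidate-first ordering succeeds
print(broken)  # → True
```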
Have you tried 3.17? As I said yesterday, it worked for me (with
4.4 Xen).
No.  But there are three bugs that prevent it from working in 3.16+,
so I'm really not sure how you had it working in a 3.17 PV guest.

This is what I have:

[build@build-mk2 linux-boris]$ ssh root@tst008 cat /mnt/lab/bootstrap-x86_64/test_small.xm
extra="console=hvc0 debug earlyprintk=xen "
kernel="/mnt/lab/bootstrap-x86_64/vmlinuz"
ramdisk="/mnt/lab/bootstrap-x86_64/initramfs.cpio.gz"
memory=1024
maxmem = 4096
vcpus=1
maxvcpus=3
name="bootstrap-x86_64"
on_crash="preserve"
vif = [ 'mac=00:0F:4B:00:00:68, bridge=switch' ]
vnc=1
vnclisten="0.0.0.0"
disk=['phy:/dev/guests/bootstrap-x86_64,xvda,w']
[build@build-mk2 linux-boris]$ ssh root@tst008 xl create /mnt/lab/bootstrap-x86_64/test_small.xm
Parsing config from /mnt/lab/bootstrap-x86_64/test_small.xm
[build@build-mk2 linux-boris]$ ssh root@tst008 xl list |grep bootstrap-x86_64
bootstrap-x86_64                             2  1024     1     -b----       5.4
[build@build-mk2 linux-boris]$ ssh root@g-pvops uname -r
3.17.0upstream
[build@build-mk2 linux-boris]$ ssh root@g-pvops dmesg|grep paravirtualized
[    0.000000] Booting paravirtualized kernel on Xen
[build@build-mk2 linux-boris]$ ssh root@g-pvops grep MemTotal /proc/meminfo
MemTotal:         968036 kB
[build@build-mk2 linux-boris]$ ssh root@tst008 xl mem-set bootstrap-x86_64 2048
[build@build-mk2 linux-boris]$ ssh root@tst008 xl list |grep bootstrap-x86_64
bootstrap-x86_64                             2  2048     1     -b----       5.7
[build@build-mk2 linux-boris]$ ssh root@g-pvops grep MemTotal /proc/meminfo
MemTotal:        2016612 kB
[build@build-mk2 linux-boris]$



Regardless, it definitely doesn't work now because of the linear p2m
change.  What do you want to do about this?

Since backing out the p2m changes is not an option, I guess your patch
is the only short-term alternative.

But this still looks like a regression so perhaps Juergen can take a
look to see how it can be fixed.

Hmm, the p2m list is allocated for the maximum memory size of the
domain, which is obtained from the hypervisor. In the case of Dom0 it
is read via XENMEM_maximum_reservation; for a domU it is based on the
E820 memory map read via XENMEM_memory_map.

I just tested it with a 4.0-rc1 domU kernel with 512MB initial memory
and 4GB of maxmem. The E820 map looked like this:

[    0.000000] Xen: [mem 0x0000000000000000-0x000000000009ffff] usable
[    0.000000] Xen: [mem 0x00000000000a0000-0x00000000000fffff] reserved
[    0.000000] Xen: [mem 0x0000000000100000-0x00000000ffffffff] usable

So the complete 4GB was included, as it should be. The resulting p2m
list is allocated with the needed size:

[    0.000000] p2m virtual area at ffffc90000000000, size is 800000
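That reported size matches the expected arithmetic for a linear p2m list on
a 64-bit guest: one 8-byte entry per 4 KiB page. A quick sanity check (plain
arithmetic, not Xen code):

```python
# Sanity check of the reported p2m virtual area size:
# one 8-byte entry per 4 KiB page on a 64-bit guest.
PAGE_SIZE = 4096
ENTRY_SIZE = 8                 # bytes per p2m entry (64-bit)

maxmem = 4 * 1024**3           # 4 GiB of maxmem, as in the test above
entries = maxmem // PAGE_SIZE  # number of pfns the list must cover
p2m_bytes = entries * ENTRY_SIZE

print(hex(p2m_bytes))  # → 0x800000, matching "size is 800000" in the dmesg line
```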

So what is your problem here? Can you post the E820 map and the p2m map
info for your failing domain, please?

If you use memory hotplug then maxmem is not a limit from the guest
kernel's point of view (the host still must allow that operation, but
that is a separate, unrelated issue). The problem is that the p2m must
be dynamically expandable to support it. The earlier implementation
supported that and memory hotplug worked without any issue.

Okay, now I get it.

The problem with the earlier p2m implementation was that it was
expandable only up to 512GB of RAM. So we need some way to tell the
kernel how much virtual memory it should reserve for the p2m list if
memory hotplug is enabled. We could:

a) use a configurable maximum (e.g. for 512GB RAM as today)

I would set the p2m virtual area to cover up to 512 GB (needs 1 GB of
virt space) for a 64-bit guest and up to 64 GB (needs 64 MB of virt
space) for a 32-bit guest.

Is 64 GB a sensible default for 32-bit guests? This will need more than
10% of the available kernel virtual space (taking the fixmap etc. into
account). And a 64 GB sized 32-bit domain is hardly usable (you have to
play dirty tricks to even get it running).

I'd rather use a default of 4 GB which can be changed via a Kconfig
option. For 64-bit guests the default of 512 GB is okay, but it should
be configurable as well.
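The virtual-space figures quoted above follow directly from the entry sizes
(8 bytes per p2m entry on 64-bit, 4 bytes on 32-bit, one entry per 4 KiB
page); a quick check of each number in the discussion:

```python
# Virtual address space needed for a linear p2m list:
# one entry per 4 KiB page; 8-byte entries on 64-bit, 4-byte on 32-bit.
PAGE_SIZE = 4096
GiB = 1024**3
MiB = 1024**2


def p2m_virt_space(max_ram_bytes, entry_size):
    return (max_ram_bytes // PAGE_SIZE) * entry_size


print(p2m_virt_space(512 * GiB, 8) // GiB)  # → 1   (1 GiB for 512 GiB, 64-bit)
print(p2m_virt_space(64 * GiB, 4) // MiB)   # → 64  (64 MiB for 64 GiB, 32-bit)
print(p2m_virt_space(4 * GiB, 4) // MiB)    # → 4   (4 MiB for the 4 GiB default)
```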

Ok.

I have checked the new p2m code and I think this is a reasonable solution too.

Do I need any patches for xl to be able to test this? I did:

xl mem-max 2 4096
xl mem-set 2 4096

and get:

libxl: error: libxl.c:4779:libxl_set_memory_target: memory_dynamic_max must be less than or equal to memory_static_max

xl list -l shows:

...
    {
        "domid": 2,
        "config": {
            "c_info": {
                "type": "pv",
                "name": "sles11",
                "uuid": "c53944f1-1607-e367-278a-c7980b6cfdd0",
                "run_hotplug_scripts": "True"
            },
            "b_info": {
                "max_vcpus": 1,
                "avail_vcpus": [
                    0
                ],
                "numa_placement": "True",
                "max_memkb": 524288,
                "target_memkb": 524288,
                "video_memkb": 0,
                "shadow_memkb": 5120,
                "localtime": "False",
                "disable_migrate": "False",
                "blkdev_start": "xvda",
                "device_model_version": "qemu_xen",
                "device_model_stubdomain": "False",
                "sched_params": {

                },
...

which seems to reflect only the parameters from starting the domU:

name="sles11"
description="None"
uuid="c53944f1-1607-e367-278a-c7980b6cfdd0"
memory=512
maxmem=512
vcpus=1
on_poweroff="destroy"
on_reboot="restart"
on_crash="destroy"
localtime=0
keymap="de"
builder="linux"
bootloader="/usr/bin/pygrub"
bootargs=""
extra="xencons=tty "
disk=[ 'file:/home/sles11-2,xvda,w', ]
vif=[ 'mac=00:16:3e:06:a7:21,bridge=br0', ]

David

I thought your name was Daniel? ;-)


Juergen


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

