
Re: [Xen-devel] [Xen-users] nestedhvm.

I have not gotten as far as running xen directly from openstack.
That is still a desire and I hope to work towards that.

Right now I am booting these with xl.
I will try setting maxmem equal to memory and see what happens.

On 05/20/2014 12:59 PM, Andres Lagar-Cavilla wrote:
On May 20, 2014, at 12:37 PM, Tim Deegan <tim@xxxxxxx> wrote:

At 09:56 +0100 on 20 May (1400576182), Ian Campbell wrote:
Adding xen-devel and some relevant maintainers.

On 05/19/2014 11:40 AM, Ian Campbell wrote:
On Sun, 2014-05-18 at 08:02 -0400, Alvin Starr wrote:
I am trying to run nested hypervisors to do some OpenStack experiments.
I seem to be able to run xen-on-xen with no problems, but if I try to run
kvm-on-xen the system seems to spontaneously reboot.
I get the same results with Xen 4.3 or 4.4.
The dom0 is running Fedora 20.
The experiment environment is CentOS 6 with RDO.
On Mon, 2014-05-19 at 23:53 -0400, Alvin Starr wrote:
Here is the serial port output:
the boot log along with the panic.
Which contains:
        (XEN) mm locking order violation: 260 > 222
        (XEN) Xen BUG at mm-locks.h:118
(full stack trace is below)
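
For readers unfamiliar with this kind of message, here is a minimal Python sketch of a lock-ordering check. It assumes, without the thread confirming it, that the message means "a lock at level 260 is already held while a lock at level 222 is being requested". The class and method names are hypothetical and this is not Xen's code. The thread does not say which locks the two numbers correspond to.

```python
class LockOrderViolation(Exception):
    """Raised when a lock is requested out of order (stand-in for a BUG)."""


class OrderedLocks:
    """Track the lock levels held by one CPU and enforce ascending order."""

    def __init__(self):
        self.held = []  # levels in acquisition order; the last one is the highest

    def acquire(self, level):
        current = self.held[-1] if self.held else 0
        if current > level:
            # Same shape as the message in the log above.
            raise LockOrderViolation(
                f"mm locking order violation: {current} > {level}"
            )
        self.held.append(level)

    def release(self):
        self.held.pop()


locks = OrderedLocks()
locks.acquire(260)      # a higher-level lock is taken first
try:
    locks.acquire(222)  # then a lower-level lock is requested
except LockOrderViolation as err:
    print(err)          # mm locking order violation: 260 > 222
```

Under that assumption, the call trace fits a code path that requests a lower-ordered lock while already holding a higher-ordered one.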

That led me to
http://lists.xen.org/archives/html/xen-devel/2013-02/msg01372.html but
not to a patch. Was there one? I've grepped the git logs for hints but
not found it...
I don't believe there was, no.  I'm not convinced that making shadow
code do locked p2m lookups is the right answer, anyway, though I
suppose it would stop this particular crash.

In the meantime, at least it suggests a workaround, which is to boot
the KVM VM with max-mem == memory (or however Openstack expresses that).
The problem arises from the use of PoD in L1 in combination with nested
virtualization. L1 is the first-level VM, which runs the nested hypervisor. PoD
is populate-on-demand, which covers the gap between maxmem and real memory.
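
As a rough illustration of the workaround, the sketch below checks whether an xl guest config leaves a gap between `memory` and `maxmem`, which is the gap PoD would cover. The helper names and the sample config are hypothetical, and the 4096 MiB figure is a placeholder rather than a value from this thread. `memory`, `maxmem`, `nestedhvm` and `hap` are xl.cfg options. The parser only handles plain `key = integer` lines.

```python
import re


def parse_mib(cfg_text, key):
    """Return the integer value of a `key = N` line, or None if absent."""
    pattern = rf"^\s*{re.escape(key)}\s*=\s*(\d+)\s*(?:#.*)?$"
    match = re.search(pattern, cfg_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def pod_gap_mib(cfg_text):
    """MiB that populate-on-demand would have to cover (0 means no gap)."""
    memory = parse_mib(cfg_text, "memory")
    maxmem = parse_mib(cfg_text, "maxmem")
    if memory is None:
        raise ValueError("config has no 'memory' setting")
    if maxmem is None:
        return 0  # assumption: without maxmem, xl uses the memory value
    return max(0, maxmem - memory)


# Hypothetical L1 guest config with maxmem equal to memory.
L1_GUEST_CFG = """
builder = "hvm"
nestedhvm = 1
hap = 1
memory = 4096
maxmem = 4096
"""

print(pod_gap_mib(L1_GUEST_CFG))  # 0
```

A non-zero result, for example from `memory = 2048` with `maxmem = 4096`, would mean the guest boots with PoD in use.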

It might be that you need a small tweak to nova.conf. Kinda curious as to how 
you got to run openstack with new Xen, since a lot of production I've seen uses 
traditional xenserver. A different topic though.


(XEN) ----[ Xen-4.3.2  x86_64  debug=n  Not tainted ]----
(XEN) CPU:    23
(XEN) RIP:    e008:[<ffff82c4c01ec7bb>] p2m_flush_table+0x1db/0x1f0
(XEN) RFLAGS: 0000000000010286   CONTEXT: hypervisor
(XEN) rax: ffff8308299ed020   rbx: ffff831835cb0540   rcx: 0000000000000000
(XEN) rdx: ffff8308299e0000   rsi: 000000000000000a   rdi: ffff82c4c027d658
(XEN) rbp: ffff82c4c031b648   rsp: ffff8308299e7998   r8:  0000000000000004
(XEN) r9:  0000000000000000   r10: ffff82c4c022ce64   r11: 0000000000000003
(XEN) r12: ffff83202cf99000   r13: 0000000000000000   r14: 0000000000000009
(XEN) r15: 0000000000000000   cr0: 0000000080050033   cr4: 00000000000406f0
(XEN) cr3: 0000001834178000   cr2: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(XEN) Xen stack trace from rsp=ffff8308299e7998:
(XEN)    0000000000000008 ffff83202cf99000 0000000000000006 0000000000000000
(XEN)    0000000000000009 ffff82c4c01f0431 0000000000000000 ffff831835cb0010
(XEN)    0000000000371600 ffff82c4c01f1dc5 2000000000000000 00000000016e8400
(XEN)    ffff831836e38c58 ffff8308299e7a08 0000000001836e38 ffff831836e38000
(XEN)    0000000000000000 0000000000000000 0000000000000000 ffff831835cb0010
(XEN)    00000000000ee200 0000000000000000 0000000000000200 ffff831835cb0010
(XEN)    0000000000000001 0000000000371600 0000000000000200 ffff82c4c01ecf50
(XEN)    ffff83202cf99000 0000000700000006 0000000001836e37 ffff831835cb0010
(XEN)    ffff83202cf99000 ffff8308299e7af0 0000000000000200 0000000000371600
(XEN)    00000000016e8400 ffff82c4c01f3c8f ffff8308299e7aec 0000000035cb0010
(XEN)    0000000000000001 00000000016e8400 0000000000000200 ffff82c400000007
(XEN)    ffff83202cf99000 0000000700000000 ffff83040e4402c4 ffff831835cb0010
(XEN)    0000000000000009 0000000000f9f600 00000000000ee200 0000000000000200
(XEN)    ffff83202cf99000 ffff82c4c01f6019 00000000000ee200 ffff830800000200
(XEN)    ffff831835cb04f8 ffff8308299e7f18 0000000000000003 ffff8308299e7c68
(XEN)    0000000000000010 ffff82c4c01bcf83 ffff8308299e7ba0 ffff82c4c01f1222
(XEN)    6000001800000000 ffffffff810402c4 ffff8308299e7c50 ffff8300aebdd000
(XEN)    ffff8308299e7c50 ffff8300aebdd000 0000000000000000 ffff82c4c01c85dc
(XEN)    ffffffff81039e63 0a9b00100000000f 00000000ffffffff 0000000000000000
(XEN)    00000000ffffffff 0000000000000000 00000000ffffffff ffff831835cb0010
(XEN) Xen call trace:
(XEN)    [<ffff82c4c01ec7bb>] p2m_flush_table+0x1db/0x1f0
(XEN)    [<ffff82c4c01f0431>] p2m_flush_nestedp2m+0x21/0x30
(XEN)    [<ffff82c4c01f1dc5>] p2m_set_entry+0x565/0x650
(XEN)    [<ffff82c4c01ecf50>] set_p2m_entry+0x90/0x130
(XEN)    [<ffff82c4c01f3c8f>] p2m_pod_zero_check_superpage+0x21f/0x460
(XEN)    [<ffff82c4c01f6019>] p2m_pod_demand_populate+0x699/0x890
(XEN)    [<ffff82c4c01bcf83>] hvm_emulate_one+0xc3/0x1f0
(XEN)    [<ffff82c4c01f1222>] p2m_gfn_to_mfn+0x392/0x3c0
(XEN)    [<ffff82c4c01c85dc>] handle_mmio+0x7c/0x1e0
(XEN)    [<ffff82c4c01f10e1>] p2m_gfn_to_mfn+0x251/0x3c0
(XEN)    [<ffff82c4c01eca58>] __get_gfn_type_access+0x68/0x210
(XEN)    [<ffff82c4c01c1843>] hvm_hap_nested_page_fault+0xc3/0x510
(XEN)    [<ffff82c4c011a447>] csched_vcpu_wake+0x367/0x580

Any hints on what the problem may be or a good place to start to look to
diagnose it?
You'll need to gather some logs, I think. Ideally get a serial console log or,
failing that, try using "noreboot" on your hypervisor command line to
see the last messages before it reboots.
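
A small sketch of how such a command line might be put together. Only "noreboot" comes from this thread. The log-level and serial options are other Xen command-line options that are commonly combined with it, and the serial settings shown are placeholders that depend on the hardware. The function name is hypothetical.

```python
def xen_debug_cmdline(existing=""):
    """Append debugging options to a Xen command line, skipping duplicates."""
    wanted = [
        "noreboot",          # stay on the panic screen instead of rebooting
        "loglvl=all",        # log all hypervisor messages
        "guest_loglvl=all",  # log all guest-related messages
        "com1=115200,8n1",   # placeholder serial settings
        "console=com1,vga",  # send the console to serial and VGA
    ]
    options = existing.split()
    present = {opt.split("=", 1)[0] for opt in options}
    for opt in wanted:
        if opt.split("=", 1)[0] not in present:
            options.append(opt)
    return " ".join(options)


print(xen_debug_cmdline("dom0_mem=4096M"))
```

The resulting string goes wherever the bootloader configuration keeps the hypervisor's options, which varies by distribution.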


Alvin Starr                   ||   voice: (905)513-7688
Netvel Inc.                   ||   Cell:  (416)806-0133
alvin@xxxxxxxxxx              ||
