
Re: [Xen-devel] PoD code killing domain before it really gets started



>>> On 26.07.12 at 18:14, George Dunlap <george.dunlap@xxxxxxxxxxxxx> wrote:
> Yes, this is a very strange circumstance: because p2m_demand_populate() 
> shouldn't happen until at least one PoD entry has been created; and that 
> shouldn't happen until after c0...7ff have been populated with 4k pages.

Meanwhile I was told that this is very likely caused by an access
originating in Dom0. Just a few minutes ago I also got hold of call
stacks (as already seen with the original messages, each bad access
produces two instances):

...
(XEN) gpmpod(1, 8000, 9) -> 0 [dom0]
(XEN) gpmpod(1, 8200, 9) -> 0 [dom0]

[coming from

printk("gpmpod(%d, %lx, %u) -> %d [dom%d]\n", d->domain_id, gfn, order, rc, 
current->domain->domain_id);

at the end of guest_physmap_mark_populate_on_demand()]

(XEN) p2m_pod_demand_populate: Dom1 out of PoD memory! (tot=1e0 ents=8200 dom0)

[altered message at the failure point in p2m_pod_demand_populate():

-    printk("%s: Out of populate-on-demand memory! tot_pages %" PRIu32 " 
pod_entries %" PRIi32 "\n",
-           __func__, d->tot_pages, p2md->pod.entry_count);
+    printk("%s: Dom%d out of PoD memory! (tot=%"PRIx32" ents=%"PRIx32" 
dom%d)\n",
+           __func__, d->domain_id, d->tot_pages, p2md->pod.entry_count, 
current->domain->domain_id);
+WARN_ON(1);

]

(XEN) Xen WARN at p2m.c:1155
(XEN) ----[ Xen-4.0.3_21548_04a-0.9.1  x86_64  debug=n  Tainted:    C ]----
(XEN) CPU:    2
(XEN) RIP:    e008:[<ffff82c4801cbf86>] p2m_pod_demand_populate+0x836/0xab0
...
(XEN) Xen call trace:
(XEN)    [<ffff82c4801cbf86>] p2m_pod_demand_populate+0x836/0xab0
(XEN)    [<ffff82c4801676b1>] get_page_and_type_from_pagenr+0x91/0x100
(XEN)    [<ffff82c4801f02d4>] ept_pod_check_and_populate+0x104/0x1a0
(XEN)    [<ffff82c4801f0482>] ept_get_entry+0x112/0x230
(XEN)    [<ffff82c48016be98>] do_mmu_update+0x16d8/0x1930
(XEN)    [<ffff82c4801f8c51>] do_iret+0xc1/0x1a0
(XEN)    [<ffff82c4801f4189>] syscall_enter+0xa9/0xae
(XEN)
(XEN) domain_crash called from p2m.c:1156
(XEN) Domain 1 reported crashed by domain 0 on cpu#2:
(XEN) p2m_pod_demand_populate: Dom1 out of PoD memory! (tot=1e0 ents=8200 dom0)
(XEN) Xen WARN at p2m.c:1155
(XEN) ----[ Xen-4.0.3_21548_04a-0.9.1  x86_64  debug=n  Tainted:    C ]----
(XEN) CPU:    2
(XEN) RIP:    e008:[<ffff82c4801cbf86>] p2m_pod_demand_populate+0x836/0xab0
...
(XEN) Xen call trace:
(XEN)    [<ffff82c4801cbf86>] p2m_pod_demand_populate+0x836/0xab0
(XEN)    [<ffff82c480108733>] send_guest_global_virq+0x93/0xe0
(XEN)    [<ffff82c4801cbfb2>] p2m_pod_demand_populate+0x862/0xab0
(XEN)    [<ffff82c4801f02d4>] ept_pod_check_and_populate+0x104/0x1a0
(XEN)    [<ffff82c4801f0482>] ept_get_entry+0x112/0x230
(XEN)    [<ffff82c48016890b>] mod_l1_entry+0x47b/0x650
(XEN)    [<ffff82c4801f0482>] ept_get_entry+0x112/0x230
(XEN)    [<ffff82c48016b21a>] do_mmu_update+0xa5a/0x1930
(XEN)    [<ffff82c4801f8c51>] do_iret+0xc1/0x1a0
(XEN)    [<ffff82c4801f4189>] syscall_enter+0xa9/0xae
(XEN)
(XEN) domain_crash called from p2m.c:1156

This clarifies at least why there are two events (and, despite
the code having changed quite a bit, this appears to still be the
case for -unstable): the MMU_NORMAL_PT_UPDATE case, PGT_l1_page_table
sub-case, calls (in -unstable terms) get_page_from_gfn(), but
ignores the return value (which ought to be NULL here) and only
partially inspects the returned type. As the type matches none of
the ones looked for, it happily proceeds into mod_l1_entry(), which
then calls get_page_from_gfn() again.
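Roughly, the pattern is the following (just an illustrative, self-contained
stand-in, not the actual do_mmu_update()/mod_l1_entry() code - all names and
types below are made up; the point is only the double p2m lookup, each of
which may trigger a demand-populate attempt):

#include <stdio.h>

/* Stand-ins, for illustration only - not the real Xen types/functions. */
typedef enum { p2m_ram_rw, p2m_populate_on_demand } p2m_type_t;
struct page_info;

/* Each call may invoke p2m_pod_demand_populate() for a PoD entry. */
static struct page_info *get_page_from_gfn_sketch(unsigned long gfn,
                                                  p2m_type_t *t)
{
    printf("p2m lookup of gfn %#lx (may demand-populate)\n", gfn);
    *t = p2m_populate_on_demand;
    return NULL;                 /* no backing page for the PoD entry */
}

static void mod_l1_entry_sketch(unsigned long gfn)
{
    p2m_type_t t;

    /* Second lookup of the same gfn -> second demand-populate attempt. */
    get_page_from_gfn_sketch(gfn, &t);
}

static void do_mmu_update_sketch(unsigned long gfn)
{
    p2m_type_t t;

    /* First lookup: return value ignored, type only partially inspected
     * (it matches none of the ones looked for), so we fall straight
     * through ... */
    (void)get_page_from_gfn_sketch(gfn, &t);
    /* ... into mod_l1_entry(), which looks the gfn up again. */
    mod_l1_entry_sketch(gfn);
}

int main(void)
{
    do_mmu_update_sketch(0x8200);
    return 0;
}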

> Although, it does look as though when populating 4k pages, the code 
> doesn't actually look to see if the allocation succeeded or not... oh 
> wait, no, it actually checks rc as a condition of the while() loop -- 
> but that is then clobbered by the xc_domain_set_pod_target() call.  But 
> surely if the 4k allocation failed, the set_target() call should fail as 
> well?  And in any case, there shouldn't yet be any PoD entries to cause 
> a demand-populate.
> 
> We probably should change "if(pod_mode)" to "if(rc == 0 && pod_mode)" or 
> something like that, just to be sure.  I'll spin up a patch.

I had also included this adjustment in the debugging patch, but
this clearly isn't related to the problem.

The domain indeed has 0x1e0 pages allocated, and a huge (still
growing) number of PoD entries. Apparently this fails so rarely
because it is pretty unlikely for there not to be a single clear
page that the PoD code could select as a victim, and because the
code in Dom0 doing the access likely also only infrequently
happens to kick in at the wrong time.

So in the end it presumably boils down to deciding whether such
an out-of-band Dom0 access is valid to be done (and I think it
is). If it is, then xc_hvm_build_x86.c:setup_guest() should
make sure any actually allocated pages (those coming from the
calls to xc_domain_populate_physmap_exact()) get cleared
when pod_mode is set.
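
Roughly something like this (just a sketch - the helper name and the
page_array/nr_pages parameters are made up, and the real change would of
course be folded into the existing setup_guest() allocation path):

#include <string.h>
#include <sys/mman.h>
#include <xenctrl.h>

/* Hypothetical helper: map each really-allocated gfn and zero it, so
 * that a later PoD sweep finds only clear pages among them. */
static int clear_allocated_pages(xc_interface *xch, uint32_t domid,
                                 const xen_pfn_t *page_array,
                                 unsigned long nr_pages)
{
    unsigned long i;

    for ( i = 0; i < nr_pages; i++ )
    {
        void *p = xc_map_foreign_range(xch, domid, XC_PAGE_SIZE,
                                       PROT_READ | PROT_WRITE,
                                       page_array[i]);

        if ( p == NULL )
            return -1;
        memset(p, 0, XC_PAGE_SIZE);
        munmap(p, XC_PAGE_SIZE);
    }

    return 0;
}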

Otoh, as pointed out in an as-yet-unanswered mail (see
http://lists.xen.org/archives/html/xen-devel/2012-07/msg01331.html),
these allocations could/should - when pod_mode is set -
similarly be done with XENMEMF_populate_on_demand set.
In such a case, _any_ Dom0 access to guest memory prior
to the call to xc_domain_set_pod_target() would kill the
domain, as there is not even a single page to be looked at
as a possible victim. Consequently, I would think that the
guest shouldn't be killed unconditionally when a PoD
operation didn't succeed - in particular not when the
access was from a foreign (i.e. the management) domain.
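
I.e. roughly (again just a sketch with made-up names, not the actual
xc_hvm_build_x86.c code):

#include <xenctrl.h>

/* Hypothetical wrapper around the existing 4k allocation call. */
static int populate_low_pages(xc_interface *xch, uint32_t domid,
                              xen_pfn_t *page_array,
                              unsigned long nr_pages, int pod_mode)
{
    unsigned int mem_flags = pod_mode ? XENMEMF_populate_on_demand : 0;

    /* Order-0 (4k) extents; with the flag set they become PoD entries
     * instead of being backed by real memory right away. */
    return xc_domain_populate_physmap_exact(xch, domid, nr_pages, 0,
                                            mem_flags, page_array);
}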

Jan



 

