
Re: [Xen-devel] PoD code killing domain before it really gets started



>>> On 26.07.12 at 18:14, George Dunlap <george.dunlap@xxxxxxxxxxxxx> wrote:
> Yes, this is a very strange circumstance: because p2m_demand_populate() 
> shouldn't happen until at least one PoD entry has been created; and that 
> shouldn't happen until after c0...7ff have been populated with 4k pages.

Meanwhile I was told that this is very likely caused by an access
originating in Dom0. Just a few minutes ago I also got hold of call
stacks (as already seen with the original messages, each bad access
produces two instances):

...
(XEN) gpmpod(1, 8000, 9) -> 0 [dom0]
(XEN) gpmpod(1, 8200, 9) -> 0 [dom0]

[coming from

printk("gpmpod(%d, %lx, %u) -> %d [dom%d]\n", d->domain_id, gfn, order, rc, 
current->domain->domain_id);

at the end of guest_physmap_mark_populate_on_demand()]

(XEN) p2m_pod_demand_populate: Dom1 out of PoD memory! (tot=1e0 ents=8200 dom0)

[altered message at the failure point in p2m_pod_demand_populate():

-    printk("%s: Out of populate-on-demand memory! tot_pages %" PRIu32 " 
pod_entries %" PRIi32 "\n",
-           __func__, d->tot_pages, p2md->pod.entry_count);
+    printk("%s: Dom%d out of PoD memory! (tot=%"PRIx32" ents=%"PRIx32" 
dom%d)\n",
+           __func__, d->domain_id, d->tot_pages, p2md->pod.entry_count, 
current->domain->domain_id);
+WARN_ON(1);

]

(XEN) Xen WARN at p2m.c:1155
(XEN) ----[ Xen-4.0.3_21548_04a-0.9.1  x86_64  debug=n  Tainted:    C ]----
(XEN) CPU:    2
(XEN) RIP:    e008:[<ffff82c4801cbf86>] p2m_pod_demand_populate+0x836/0xab0
...
(XEN) Xen call trace:
(XEN)    [<ffff82c4801cbf86>] p2m_pod_demand_populate+0x836/0xab0
(XEN)    [<ffff82c4801676b1>] get_page_and_type_from_pagenr+0x91/0x100
(XEN)    [<ffff82c4801f02d4>] ept_pod_check_and_populate+0x104/0x1a0
(XEN)    [<ffff82c4801f0482>] ept_get_entry+0x112/0x230
(XEN)    [<ffff82c48016be98>] do_mmu_update+0x16d8/0x1930
(XEN)    [<ffff82c4801f8c51>] do_iret+0xc1/0x1a0
(XEN)    [<ffff82c4801f4189>] syscall_enter+0xa9/0xae
(XEN)
(XEN) domain_crash called from p2m.c:1156
(XEN) Domain 1 reported crashed by domain 0 on cpu#2:
(XEN) p2m_pod_demand_populate: Dom1 out of PoD memory! (tot=1e0 ents=8200 dom0)
(XEN) Xen WARN at p2m.c:1155
(XEN) ----[ Xen-4.0.3_21548_04a-0.9.1  x86_64  debug=n  Tainted:    C ]----
(XEN) CPU:    2
(XEN) RIP:    e008:[<ffff82c4801cbf86>] p2m_pod_demand_populate+0x836/0xab0
...
(XEN) Xen call trace:
(XEN)    [<ffff82c4801cbf86>] p2m_pod_demand_populate+0x836/0xab0
(XEN)    [<ffff82c480108733>] send_guest_global_virq+0x93/0xe0
(XEN)    [<ffff82c4801cbfb2>] p2m_pod_demand_populate+0x862/0xab0
(XEN)    [<ffff82c4801f02d4>] ept_pod_check_and_populate+0x104/0x1a0
(XEN)    [<ffff82c4801f0482>] ept_get_entry+0x112/0x230
(XEN)    [<ffff82c48016890b>] mod_l1_entry+0x47b/0x650
(XEN)    [<ffff82c4801f0482>] ept_get_entry+0x112/0x230
(XEN)    [<ffff82c48016b21a>] do_mmu_update+0xa5a/0x1930
(XEN)    [<ffff82c4801f8c51>] do_iret+0xc1/0x1a0
(XEN)    [<ffff82c4801f4189>] syscall_enter+0xa9/0xae
(XEN)
(XEN) domain_crash called from p2m.c:1156

This clarifies at least why there are two events (and, despite
the code having changed quite a bit, this appears to still be the
case for -unstable): the MMU_NORMAL_PT_UPDATE case, PGT_l1_page_table
sub-case, calls (in -unstable terms) get_page_from_gfn(), but
ignores the return value (which ought to be NULL here) and only
partially inspects the returned type. As the type matches none of
the ones looked for, it happily proceeds into mod_l1_entry(), which
then calls get_page_from_gfn() again.
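Roughly, the pattern is the following (just an illustrative, self-contained
stand-in, not the actual do_mmu_update()/mod_l1_entry() code - all names and
types below are made up; the point is only the double p2m lookup, each of
which may trigger a demand-populate attempt):

#include <stdio.h>

/* Stand-ins, for illustration only - not the real Xen types/functions. */
typedef enum { p2m_ram_rw, p2m_populate_on_demand } p2m_type_t;
struct page_info;

/* Each call may invoke p2m_pod_demand_populate() for a PoD entry. */
static struct page_info *get_page_from_gfn_sketch(unsigned long gfn,
                                                  p2m_type_t *t)
{
    printf("p2m lookup of gfn %#lx (may demand-populate)\n", gfn);
    *t = p2m_populate_on_demand;
    return NULL;                 /* no backing page for the PoD entry */
}

static void mod_l1_entry_sketch(unsigned long gfn)
{
    p2m_type_t t;

    /* Second lookup of the same gfn -> second demand-populate attempt. */
    get_page_from_gfn_sketch(gfn, &t);
}

static void do_mmu_update_sketch(unsigned long gfn)
{
    p2m_type_t t;

    /* First lookup: return value ignored, type only partially inspected
     * (it matches none of the ones looked for), so we fall straight
     * through ... */
    (void)get_page_from_gfn_sketch(gfn, &t);
    /* ... into mod_l1_entry(), which looks the gfn up again. */
    mod_l1_entry_sketch(gfn);
}

int main(void)
{
    do_mmu_update_sketch(0x8200);
    return 0;
}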

> Although, it does look as though when populating 4k pages, the code 
> doesn't actually look to see if the allocation succeeded or not... oh 
> wait, no, it actually checks rc as a condition of the while() loop -- 
> but that is then clobbered by the xc_domain_set_pod_target() call.  But 
> surely if the 4k allocation failed, the set_target() call should fail as 
> well?  And in any case, there shouldn't yet be any PoD entries to cause 
> a demand-populate.
> 
> We probably should change "if(pod_mode)" to "if(rc == 0 && pod_mode)" or 
> something like that, just to be sure.  I'll spin up a patch.

I had also included this adjustment in the debugging patch, but
this clearly isn't related to the problem.

The domain indeed has 0x1e0 pages allocated, and a huge (still
growing) number of PoD entries. Apparently this fails so rarely
because it is pretty unlikely for there not to be a single clear
page that the PoD code could select as a victim, and because the
code in Dom0 doing the access likely also only infrequently
happens to kick in at the wrong time.

So in the end it presumably boils down to deciding whether such
an out-of-band Dom0 access is valid to be done (and I think it
is). If it is, then xc_hvm_build_x86.c:setup_guest() should
make sure any actually allocated pages (those coming from the
calls to xc_domain_populate_physmap_exact()) get cleared
when pod_mode is set.
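
Roughly something like this (just a sketch - the helper name and the
page_array/nr_pages parameters are made up, and the real change would of
course be folded into the existing setup_guest() allocation path):

#include <string.h>
#include <sys/mman.h>
#include <xenctrl.h>

/* Hypothetical helper: map each really-allocated gfn and zero it, so
 * that a later PoD sweep finds only clear pages among them. */
static int clear_allocated_pages(xc_interface *xch, uint32_t domid,
                                 const xen_pfn_t *page_array,
                                 unsigned long nr_pages)
{
    unsigned long i;

    for ( i = 0; i < nr_pages; i++ )
    {
        void *p = xc_map_foreign_range(xch, domid, XC_PAGE_SIZE,
                                       PROT_READ | PROT_WRITE,
                                       page_array[i]);

        if ( p == NULL )
            return -1;
        memset(p, 0, XC_PAGE_SIZE);
        munmap(p, XC_PAGE_SIZE);
    }

    return 0;
}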

Otoh, as pointed out in an as-yet-unanswered mail (see
http://lists.xen.org/archives/html/xen-devel/2012-07/msg01331.html),
these allocations could/should - when pod_mode is set -
similarly be done with XENMEMF_populate_on_demand set.
In such a case, _any_ Dom0 access to guest memory prior
to the call to xc_domain_set_pod_target() would kill the
domain, as there is not even a single page to be looked at
as a possible victim. Consequently, I would think that the
guest shouldn't be killed unconditionally when a PoD
operation didn't succeed - in particular not when the
access was from a foreign (i.e. the management) domain.
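
I.e. roughly (again just a sketch with made-up names, not the actual
xc_hvm_build_x86.c code):

#include <xenctrl.h>

/* Hypothetical wrapper around the existing 4k allocation call. */
static int populate_low_pages(xc_interface *xch, uint32_t domid,
                              xen_pfn_t *page_array,
                              unsigned long nr_pages, int pod_mode)
{
    unsigned int mem_flags = pod_mode ? XENMEMF_populate_on_demand : 0;

    /* Order-0 (4k) extents; with the flag set they become PoD entries
     * instead of being backed by real memory right away. */
    return xc_domain_populate_physmap_exact(xch, domid, nr_pages, 0,
                                            mem_flags, page_array);
}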

Jan



 

