Re: HVM/PVH Balloon crash
On 27.09.2021 00:53, Elliott Mitchell wrote:
> On Wed, Sep 15, 2021 at 08:05:05AM +0200, Jan Beulich wrote:
>> On 15.09.2021 04:40, Elliott Mitchell wrote:
>>> On Tue, Sep 07, 2021 at 05:57:10PM +0200, Jan Beulich wrote:
>>>> On 07.09.2021 17:03, Elliott Mitchell wrote:
>>>>> Could be this system is in an
>>>>> intergenerational hole, and some spot in the PVH/HVM code assumes
>>>>> that the presence of NPT guarantees the presence of an operational
>>>>> IOMMU. Otherwise if there was some copy and paste while writing IOMMU
>>>>> code, some portion of the IOMMU code might be checking for presence of
>>>>> NPT instead of presence of IOMMU.
>>>>
>>>> This is all very speculative; I consider what you suspect not very likely,
>>>> but also not entirely impossible. This is not least because for a
>>>> long time we've been running without shared page tables on AMD.
>>>>
>>>> I'm afraid without technical data and without knowing how to repro, I
>>>> don't see a way forward here.
>>>
>>> Downtimes are very expensive even for lower-end servers. Plus there is
>>> the issue the system wasn't meant for development and thus never had
>>> appropriate setup done.
>>>
>>> Experimentation with a system of similar age suggested another candidate.
>>> System has a conventional BIOS. Might some dependencies on the presence
>>> of UEFI have snuck into the NPT code?
>>
>> I can't think of any such, but as all of this is very nebulous I can't
>> really rule out anything.
>
> Getting everything right to recreate is rather inexact. Having an
> equivalent of `sysctl` to turn on the serial console while running might
> be handy...
>
> Luckily I got things together and...
>
> (XEN) mm locking order violation: 48 > 16
> (XEN) Xen BUG at mm-locks.h:82
Would you give the patch below a try? While against current staging it
looks to apply fine to 4.14.3.
Jan
x86/PoD: defer nested P2M flushes
With NPT or shadow in use, the p2m_set_entry() -> p2m_pt_set_entry() ->
write_p2m_entry() -> p2m_flush_nestedp2m() call sequence triggers a lock
order violation when the PoD lock is held around it. Hence such flushing
needs to be deferred. Steal the approach from p2m_change_type_range().
Reported-by: Elliott Mitchell <ehem+xen@xxxxxxx>
Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -24,6 +24,7 @@
#include <xen/mm.h>
#include <xen/sched.h>
#include <xen/trace.h>
+#include <asm/hvm/nestedhvm.h>
#include <asm/page.h>
#include <asm/paging.h>
#include <asm/p2m.h>
@@ -494,6 +495,13 @@ p2m_pod_offline_or_broken_replace(struct
static int
p2m_pod_zero_check_superpage(struct p2m_domain *p2m, gfn_t gfn);
+static void pod_unlock_and_flush(struct p2m_domain *p2m)
+{
+ pod_unlock(p2m);
+ p2m->defer_nested_flush = false;
+ if ( nestedhvm_enabled(p2m->domain) )
+ p2m_flush_nestedp2m(p2m->domain);
+}
/*
* This function is needed for two reasons:
@@ -514,6 +522,7 @@ p2m_pod_decrease_reservation(struct doma
gfn_lock(p2m, gfn, order);
pod_lock(p2m);
+ p2m->defer_nested_flush = true;
/*
* If we don't have any outstanding PoD entries, let things take their
@@ -665,7 +674,7 @@ out_entry_check:
}
out_unlock:
- pod_unlock(p2m);
+ pod_unlock_and_flush(p2m);
gfn_unlock(p2m, gfn, order);
return ret;
}
@@ -1144,8 +1153,10 @@ p2m_pod_demand_populate(struct p2m_domai
* won't start until we're done.
*/
if ( unlikely(d->is_dying) )
- goto out_fail;
-
+ {
+ pod_unlock(p2m);
+ return false;
+ }
/*
* Because PoD does not have cache list for 1GB pages, it has to remap
@@ -1167,6 +1178,8 @@ p2m_pod_demand_populate(struct p2m_domai
p2m_populate_on_demand, p2m->default_access);
}
+ p2m->defer_nested_flush = true;
+
/* Only reclaim if we're in actual need of more cache. */
if ( p2m->pod.entry_count > p2m->pod.count )
pod_eager_reclaim(p2m);
@@ -1229,8 +1242,9 @@ p2m_pod_demand_populate(struct p2m_domai
__trace_var(TRC_MEM_POD_POPULATE, 0, sizeof(t), &t);
}
- pod_unlock(p2m);
+ pod_unlock_and_flush(p2m);
return true;
+
out_of_memory:
pod_unlock(p2m);
@@ -1239,12 +1253,14 @@ out_of_memory:
p2m->pod.entry_count, current->domain->domain_id);
domain_crash(d);
return false;
+
out_fail:
- pod_unlock(p2m);
+ pod_unlock_and_flush(p2m);
return false;
+
remap_and_retry:
BUG_ON(order != PAGE_ORDER_2M);
- pod_unlock(p2m);
+ pod_unlock_and_flush(p2m);
/*
* Remap this 2-meg region in singleton chunks. See the comment on the
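
For readers following the commit message rather than the hunks, the ordering problem and the deferral can be sketched as a small stand-alone toy. This is only an illustration under stated assumptions, not Xen code: every `toy_*` name, the struct, and the counters are hypothetical, and the two level values are simply the numbers from the "48 > 16" log line. It also assumes that the entry-update path skips the nested flush while `defer_nested_flush` is set, which is the behaviour the patch relies on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical lock levels; 16 and 48 are just the two values from the log. */
enum { TOY_LEVEL_P2M = 16, TOY_LEVEL_POD = 48 };

struct toy_p2m {
    int  held_level;          /* highest level currently held, 0 if none */
    bool defer_nested_flush;  /* set while the PoD-level lock is held */
    int  violations;          /* counted instead of BUG() so the toy can run */
    int  flushes;             /* nested flushes actually performed */
};

/* Take a lock; asking for a lower level than the one held is a violation. */
static int toy_lock(struct toy_p2m *p, int level)
{
    int prev = p->held_level;

    assert(level > 0);
    if ( prev > level )
    {
        printf("mm locking order violation: %d > %d\n", prev, level);
        p->violations++;
    }
    else
        p->held_level = level;

    return prev;
}

/* Release a lock by restoring the level that was held before it. */
static void toy_unlock(struct toy_p2m *p, int prev)
{
    p->held_level = prev;
}

/* Stand-in for the nested flush, which needs the lower-level lock. */
static void toy_flush_nested(struct toy_p2m *p)
{
    int prev = toy_lock(p, TOY_LEVEL_P2M);

    p->flushes++;
    toy_unlock(p, prev);
}

/* Stand-in for the entry update: flush right away unless deferral is on. */
static void toy_set_entry(struct toy_p2m *p)
{
    if ( !p->defer_nested_flush )
        toy_flush_nested(p);
}

/* One PoD-style operation; returns the number of violations seen so far. */
static int toy_pod_operation(struct toy_p2m *p, bool defer)
{
    int prev = toy_lock(p, TOY_LEVEL_POD);

    p->defer_nested_flush = defer;
    toy_set_entry(p);
    toy_unlock(p, prev);

    if ( defer )
    {
        /* Mirrors the unlock-then-flush order of pod_unlock_and_flush(). */
        p->defer_nested_flush = false;
        toy_flush_nested(p);
    }

    return p->violations;
}
```

Calling `toy_pod_operation()` on a zero-initialised `struct toy_p2m` with `defer` set to `false` flushes while the higher-level lock is still held and records a violation; with `defer` set to `true` the same single flush happens only after the unlock, so no violation is recorded.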