
[Xen-devel] Race between ept_get_entry / ept_set_entry


  • To: xen-devel@xxxxxxxxxxxxxxxxxxx
  • From: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
  • Date: Thu, 26 Aug 2010 11:35:58 +0100
  • Cc: "Li, Xin" <xin.li@xxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>, Tim Deegan <Tim.Deegan@xxxxxxxxxx>
  • Delivery-date: Thu, 26 Aug 2010 03:38:38 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

In the course of doing some fixes for my populate-on-demand testing, I
found that a Windows Server 2008 VM with 30G static max and 24G RAM
(i.e., booting ballooned) crashed 1-2 times out of ten during boot,
reporting MMIO errors.

I managed to get a trace of this crash.  Strangely enough, the trace
indicated that the page the NPF occurred on was populate-on-demand --
but that hvm_hap_nested_page_fault() injected a GP anyway.

The only way this would be possible is if the gfn_to_mfn_query() in
the trace function got a p2m type of p2m_populate_on_demand, but the
gfn_to_mfn_current() in hvm_hap_nested_page_fault() got a p2m type of
p2m_mmio_dm.

Looking at the trace (snippet attached), the failed NPF happened on
d1v1; but almost simultaneously on d1v0, another NPF triggered a PoD
demand-populate.  That demand-populate happened to be of a superpage
that also covered the gpa that faulted on d1v1.

So, the first query on d1v1 (correctly) got a PoD; but the second
query, instead of either causing the demand-populate, or successfully
getting the result of d1v0's demand populate, returned failure,
causing the guest to crash.

I looked in the p2m-ept.c code, and noticed (once again) that
ept_get_entry() can be called without the p2m lock held.  I added
conditional locks, and am running the test again. The guest has now
booted 20 times successfully without crashing (whereas before, the
average was about 2 in 10 crashing).

Looking closely at the code, I can see one potential race:
* entry starts out PoD, not-present.
* v0 finds the entry PoD, allocates a page, calls set_p2m_entry(),
which calls ept_set_entry().
* v1 begins to walk the pagetable; at some point, it calls
ept_next_level(), which finds the flags all clear
((entry->epte & 7) == 0)
* v0 ept_set_entry() changes the p2m type from p2m_populate_on_demand
to p2m_ram_rw
* v1 ept_next_level() reads entry->avail1 and finds that it is not
p2m_populate_on_demand, so it returns GUEST_TABLE_MAP_FAILED
* v0 ept_set_entry() sets the flags to present.
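
The interleaving above can be replayed deterministically in a small
single-threaded model.  This is only a sketch: struct toy_epte, the
enum names and the reader/writer helpers are hypothetical stand-ins,
not the real Xen definitions, and I'm assuming the reader checks the
present bits and the type in two separate reads, as described in the
list.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the Xen names; not the real definitions. */
enum p2m_type { P2M_RAM_RW, P2M_MMIO_DM, P2M_POPULATE_ON_DEMAND };
enum walk_result { MAP_FAILED, MAP_POD_RETRY, MAP_NORMAL };

struct toy_epte {
    uint64_t flags;        /* low three bits: read/write/execute */
    enum p2m_type avail1;  /* software-defined type field */
};

/* Reader, first read: are the present bits all clear? */
static bool reader_sees_not_present(const struct toy_epte *e)
{
    return (e->flags & 7) == 0;
}

/* Reader, second read: a not-present entry is only OK if it is PoD. */
static enum walk_result reader_check_type(const struct toy_epte *e)
{
    return e->avail1 == P2M_POPULATE_ON_DEMAND ? MAP_POD_RETRY : MAP_FAILED;
}

/* Writer, first store: change the type. */
static void writer_set_type(struct toy_epte *e)
{
    e->avail1 = P2M_RAM_RW;
}

/* Writer, second store: mark the entry present. */
static void writer_set_present(struct toy_epte *e)
{
    e->flags |= 7;
}

/* Replay the racy order: v1 read, v0 store, v1 read, v0 store. */
static enum walk_result replay_racy(void)
{
    struct toy_epte e = { .flags = 0, .avail1 = P2M_POPULATE_ON_DEMAND };
    enum walk_result r = MAP_NORMAL;

    bool not_present = reader_sees_not_present(&e); /* v1 */
    writer_set_type(&e);                            /* v0 */
    if (not_present)
        r = reader_check_type(&e);                  /* v1 */
    writer_set_present(&e);                         /* v0 */
    return r;
}

/* Replay with the writer finishing before the reader starts. */
static enum walk_result replay_serialized(void)
{
    struct toy_epte e = { .flags = 0, .avail1 = P2M_POPULATE_ON_DEMAND };

    writer_set_type(&e);
    writer_set_present(&e);
    return reader_sees_not_present(&e) ? reader_check_type(&e) : MAP_NORMAL;
}
```

In this model replay_racy() ends up in the failure path, while
replay_serialized() sees a normal present entry.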

Is there a good reason not to just grab the p2m lock when walking the
ept tables?  We could conceivably do some cleverness to avoid this
kind of race, but unless there's a significant performance gain, I
think the simple approach is better.
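
For illustration, the conditional-lock idea (take the lock around the
walk unless the caller already holds it) looks roughly like the
following.  This is a toy model under my own assumptions, not the
patch I'm testing: struct toy_lock, the cpu argument and
toy_get_entry() are all made up for the example, and the lock is a
simple owner-tracking spinlock built on C11 atomics.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NO_OWNER (-1)

/* Hypothetical owner-tracking spinlock; owner is a cpu id or NO_OWNER. */
struct toy_lock {
    atomic_int owner;
};

/* True if the calling cpu already holds the lock. */
static bool toy_locked_by_me(struct toy_lock *l, int cpu)
{
    return atomic_load(&l->owner) == cpu;
}

/* Spin until the owner field changes from NO_OWNER to this cpu. */
static void toy_lock_acquire(struct toy_lock *l, int cpu)
{
    int expected = NO_OWNER;

    while (!atomic_compare_exchange_weak(&l->owner, &expected, cpu))
        expected = NO_OWNER; /* a failed exchange overwrote expected */
}

static void toy_lock_release(struct toy_lock *l)
{
    atomic_store(&l->owner, NO_OWNER);
}

/*
 * Stand-in for the table walk: read one slot under the lock, taking
 * the lock only if this cpu does not already hold it.
 */
static int toy_get_entry(struct toy_lock *l, int cpu,
                         const int *table, int idx)
{
    bool took_lock = !toy_locked_by_me(l, cpu);
    int value;

    if (took_lock)
        toy_lock_acquire(l, cpu);
    value = table[idx];
    if (took_lock)
        toy_lock_release(l);
    return value;
}
```

The point of the took_lock flag is that a caller which already holds
the lock neither deadlocks on it nor has it dropped underneath it.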

 -George
