[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] [PATCH 0 of 3] x86/mem_sharing: Improve performance of rmap, fix cascading bugs


  • To: xen-devel@xxxxxxxxxxxxx
  • From: Andres Lagar-Cavilla <andres@xxxxxxxxxxxxxxxx>
  • Date: Tue, 24 Apr 2012 15:48:20 -0400
  • Cc: andres@xxxxxxxxxxxxxx, tim@xxxxxxx
  • Delivery-date: Tue, 24 Apr 2012 19:44:02 +0000
  • Domainkey-signature: a=rsa-sha1; c=nofws; d=lagarcavilla.org; h=content-type :mime-version:content-transfer-encoding:subject:message-id:date :from:to:cc; q=dns; s=lagarcavilla.org; b=r2krlmilVC0uOo1QVUydWh cKjeBhnWG+ODkoaVkaBJcgAnEyyeJcRWyzbidY+FWbBWfhuCn+3o3J2WXATj+GiE LqtvuxJQsnWFnoYuxMDyQMYVhsKdIyAwu5BteiJKSc1cFxe9+gHhDlXxqQsnIjkR SmQ5FetOBjNCwHuMlhWXo=
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>

This is a repost of patches 2 and 3 of the series initially posted on Apr 12th.

The first patch has been split into two functionally isolated patches as per
Tim Deegan's request.

The original posting (suitably edited) follows
--------------------------------
The sharing subsystem does not scale elegantly with high degrees of page
sharing. The culprit is a reverse map that each shared frame maintains,
resolving to all domain pages pointing to the shared frame. Because the rmap is
implemented with a O(n) search linked-list, CoW unsharing can result in
prolonged search times.

The place where this becomes most obvious is during domain destruction, during
which all shared p2m entries need to be unshared. Destroying a domain with a
lot of sharing could result in minutes of hypervisor freeze-up: 7 minutes for a
2 GiB domain! As a result, errors cascade throughout the system, including soft
lockups, watchdogs firing, IO drivers timing out, etc.

The proposed solution is to mutate the rmap from a linked list to a hash table
when the number of domain pages referencing the shared frame exceeds a
threshold. This maintains minimal space use for the common case of relatively
low sharing, and switches to an O(1) data structure for heavily shared pages,
with an space overhead of one page. The threshold chosen is 256, as a single
page can fit 256 spill lists for 256 buckets in a hash table.

With these patches in place, domain destruction for a 2 GiB domain with a
shared frame including over a hundred thousand references drops from 7 minutes
to two seconds.

Signed-off-by: Andres Lagar-Cavilla <andres@xxxxxxxxxxxxxxxx>

 xen/arch/x86/mm/mem_sharing.c     |    8 +-
 xen/arch/x86/mm/mem_sharing.c     |  138 ++++++++++++++++++++++--------
 xen/arch/x86/mm/mem_sharing.c     |  170 +++++++++++++++++++++++++++++++++++--
 xen/include/asm-x86/mem_sharing.h |   13 ++-
 4 files changed, 274 insertions(+), 55 deletions(-)

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.