
Re: [Xen-devel] lock in vhpet


  • To: "Tim Deegan" <tim@xxxxxxx>
  • From: "Andres Lagar-Cavilla" <andres@xxxxxxxxxxxxxxxx>
  • Date: Mon, 23 Apr 2012 10:18:40 -0700
  • Cc: "Zhang, Yang Z" <yang.z.zhang@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Keir Fraser <keir@xxxxxxx>
  • Delivery-date: Mon, 23 Apr 2012 17:19:24 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>

> At 07:36 +0000 on 23 Apr (1335166577), Zhang, Yang Z wrote:
>> The p2m lock in __get_gfn_type_access() is the culprit. Here is the
>> profiling data from a 10-second run:
>>
>> (XEN) p2m_lock 1 lock:
>> (XEN)   lock:      190733(00000000:14CE5726), block: 67159(00000007:6AAA53F3)
>>
>> That data was collected while a win8 guest (16 vcpus) was idle. The 16
>> VCPUs spent a combined 30 seconds blocked during the 10 seconds of
>> profiling, i.e. about 18% of all CPU cycles (30s blocked out of
>> 16 VCPUs x 10s = 160 VCPU-seconds) were spent waiting for the p2m
>> lock. And that is for an idle guest; the impact is more serious when
>> running a workload inside the guest. I noticed that this change was
>> added by c/s 24770, and before it we did not take the p2m lock in
>> __get_gfn_type_access(). So is this lock really necessary?
>
> Ugh; that certainly is a regression.  We used to be lock-free on p2m
> lookups and losing that will be bad for perf in lots of ways.  IIRC the
> original aim was to use fine-grained per-page locks for this -- there
> should be no need to hold a per-domain lock during a normal read.
> Andres, what happened to that code?

Sigh, scratch a lot of the nonsense I just spewed about the "hpet gfn". There
is no actual page backing that one, so no concerns there.

It is still the case that ram_gpa is zero in many of the calls into
hvmemul_do_io, and there really is no point in doing a get_page(get_gfn(0))
for those. How about the following (there's more email after this patch):
# HG changeset patch
# Parent ccc64793187f7d9c926343a1cd4ae822a4364bd1
x86/emulation: No need to get_gfn on zero ram_gpa.

Signed-off-by: Andres Lagar-Cavilla <andres@xxxxxxxxxxxxxxxx>

diff -r ccc64793187f xen/arch/x86/hvm/emulate.c
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -60,33 +60,37 @@ static int hvmemul_do_io(
     ioreq_t *p = get_ioreq(curr);
     unsigned long ram_gfn = paddr_to_pfn(ram_gpa);
     p2m_type_t p2mt;
-    mfn_t ram_mfn;
+    mfn_t ram_mfn = _mfn(INVALID_MFN);
     int rc;

-    /* Check for paged out page */
-    ram_mfn = get_gfn_unshare(curr->domain, ram_gfn, &p2mt);
-    if ( p2m_is_paging(p2mt) )
-    {
-        put_gfn(curr->domain, ram_gfn);
-        p2m_mem_paging_populate(curr->domain, ram_gfn);
-        return X86EMUL_RETRY;
-    }
-    if ( p2m_is_shared(p2mt) )
-    {
-        put_gfn(curr->domain, ram_gfn);
-        return X86EMUL_RETRY;
-    }
-
-    /* Maintain a ref on the mfn to ensure liveness. Put the gfn
-     * to avoid potential deadlock wrt event channel lock, later. */
-    if ( mfn_valid(mfn_x(ram_mfn)) )
-        if ( !get_page(mfn_to_page(mfn_x(ram_mfn)),
-             curr->domain) )
-        {
-            put_gfn(curr->domain, ram_gfn);
-            return X86EMUL_RETRY;
-        }
-    put_gfn(curr->domain, ram_gfn);
+    /* Many callers pass a stub zero ram_gpa address. */
+    if ( ram_gfn != 0 )
+    {
+        /* Check for paged out page */
+        ram_mfn = get_gfn_unshare(curr->domain, ram_gfn, &p2mt);
+        if ( p2m_is_paging(p2mt) )
+        {
+            put_gfn(curr->domain, ram_gfn);
+            p2m_mem_paging_populate(curr->domain, ram_gfn);
+            return X86EMUL_RETRY;
+        }
+        if ( p2m_is_shared(p2mt) )
+        {
+            put_gfn(curr->domain, ram_gfn);
+            return X86EMUL_RETRY;
+        }
+
+        /* Maintain a ref on the mfn to ensure liveness. Put the gfn
+         * to avoid potential deadlock wrt event channel lock, later. */
+        if ( mfn_valid(mfn_x(ram_mfn)) )
+            if ( !get_page(mfn_to_page(mfn_x(ram_mfn)),
+                           curr->domain) )
+            {
+                put_gfn(curr->domain, ram_gfn);
+                return X86EMUL_RETRY;
+            }
+        put_gfn(curr->domain, ram_gfn);
+    }

     /*
      * Weird-sized accesses have undefined behaviour: we discard writes


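To make the shape of that change easier to see outside the diff, here is a
minimal stand-alone sketch of the pattern; this is toy code, and every name
in it is hypothetical, merely standing in for the Xen calls:

#include <stdbool.h>
#include <stdio.h>

#define STUB_GFN 0UL /* callers with no RAM operand pass address 0 */

/* Stand-in for the get_gfn_unshare()/get_page()/put_gfn() sequence:
 * take the contended p2m lock, resolve the gfn, grab a page
 * reference, then drop the lock again. */
static bool lookup_and_ref(unsigned long gfn)
{
    printf("p2m lookup for gfn %#lx\n", gfn);
    return true;
}

static int do_io(unsigned long gfn)
{
    /* The guard the patch adds: skip the p2m work entirely for the
     * stub address, so such emulations never touch the p2m lock. */
    if ( gfn != STUB_GFN && !lookup_and_ref(gfn) )
        return -1; /* the real code returns X86EMUL_RETRY here */

    /* ... perform the actual emulation ... */
    return 0;
}

int main(void)
{
    do_io(STUB_GFN); /* stub address: p2m lock never taken */
    do_io(0x1234);   /* real RAM operand: one lookup, one reference */
    return 0;
}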
If contention is coming in from the emul_rep_movs path, then that might be
taken care of with the following:
# HG changeset patch
# Parent 18b694840961cb6e3934628b23902a7979f00bac
x86/emulate: Relieve contention of p2m lock in emulation of rep movs.

get_two_gfns is used to query the source and target physical addresses of the
emulated rep movs. It is not necessary to hold onto the two gfns for the
duration of the emulation: further calls down the line will do the appropriate
unsharing or paging in, and unwind correctly on failure.

Signed-off-by: Andres Lagar-Cavilla <andres@xxxxxxxxxxxxxxxx>

diff -r 18b694840961 xen/arch/x86/hvm/emulate.c
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -714,25 +714,23 @@ static int hvmemul_rep_movs(
     if ( rc != X86EMUL_OKAY )
         return rc;

+    /* The query on the gfns is to establish the need for mmio. In the two
+     * mmio cases, a proper get_gfn for the "other" gfn will be enacted, with
+     * paging in or unsharing if necessary. In the memmove case, the gfn might
+     * change given the bytes mov'ed, and, again, proper get_gfn's will be
+     * enacted in __hvm_copy. */
     get_two_gfns(current->domain, sgpa >> PAGE_SHIFT, &sp2mt, NULL, NULL,
                  current->domain, dgpa >> PAGE_SHIFT, &dp2mt, NULL, NULL,
                  P2M_ALLOC, &tg);
-
+    put_two_gfns(&tg);
+
     if ( !p2m_is_ram(sp2mt) && !p2m_is_grant(sp2mt) )
-    {
-        rc = hvmemul_do_mmio(
+        return hvmemul_do_mmio(
             sgpa, reps, bytes_per_rep, dgpa, IOREQ_READ, df, NULL);
-        put_two_gfns(&tg);
-        return rc;
-    }

     if ( !p2m_is_ram(dp2mt) && !p2m_is_grant(dp2mt) )
-    {
-        rc = hvmemul_do_mmio(
+        return hvmemul_do_mmio(
             dgpa, reps, bytes_per_rep, sgpa, IOREQ_WRITE, df, NULL);
-        put_two_gfns(&tg);
-        return rc;
-    }

     /* RAM-to-RAM copy: emulate as equivalent of memmove(dgpa, sgpa, bytes). */
     bytes = *reps * bytes_per_rep;
@@ -747,10 +745,7 @@ static int hvmemul_rep_movs(
      * can be emulated by a source-to-buffer-to-destination block copy.
      */
     if ( ((dgpa + bytes_per_rep) > sgpa) && (dgpa < (sgpa + bytes)) )
-    {
-        put_two_gfns(&tg);
         return X86EMUL_UNHANDLEABLE;
-    }

     /* Adjust destination address for reverse copy. */
     if ( df )
@@ -759,10 +754,7 @@ static int hvmemul_rep_movs(
     /* Allocate temporary buffer. Fall back to slow emulation if this fails. */
     buf = xmalloc_bytes(bytes);
     if ( buf == NULL )
-    {
-        put_two_gfns(&tg);
         return X86EMUL_UNHANDLEABLE;
-    }

     /*
      * We do a modicum of checking here, just for paranoia's sake and to
@@ -773,7 +765,6 @@ static int hvmemul_rep_movs(
         rc = hvm_copy_to_guest_phys(dgpa, buf, bytes);

     xfree(buf);
-    put_two_gfns(&tg);

     if ( rc == HVMCOPY_gfn_paged_out )
         return X86EMUL_RETRY;


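The gist of that second patch, reduced to a stand-alone sketch (again toy
code: get_two_gfns/put_two_gfns below are stand-ins, not the real Xen
functions), is to hold the pair of gfn locks only for the type query rather
than across the whole copy:

#include <stdbool.h>
#include <stdio.h>

#define PAGE_SHIFT 12

struct two_gfns { unsigned long sgfn, dgfn; };

/* Stand-in for get_two_gfns(): lock both gfns (in a canonical order,
 * to avoid deadlock) and read back whether each side is RAM. */
static void get_two_gfns(struct two_gfns *tg, unsigned long sgfn,
                         unsigned long dgfn, bool *src_ram, bool *dst_ram)
{
    tg->sgfn = sgfn;
    tg->dgfn = dgfn;
    *src_ram = true; /* toy answer; the real code reads the p2m types */
    *dst_ram = true;
}

/* Stand-in for put_two_gfns(): unlock both gfns. */
static void put_two_gfns(struct two_gfns *tg)
{
    (void)tg;
}

static int rep_movs(unsigned long sgpa, unsigned long dgpa)
{
    struct two_gfns tg;
    bool src_ram, dst_ram;

    /* Query only, to pick the mmio vs. memmove path... */
    get_two_gfns(&tg, sgpa >> PAGE_SHIFT, dgpa >> PAGE_SHIFT,
                 &src_ram, &dst_ram);
    put_two_gfns(&tg); /* ...then release straight away. */

    /* The long-running copy re-takes the locks itself and can unwind
     * with a retry if paging or unsharing intervenes meanwhile. */
    printf("copy %s -> %s\n", src_ram ? "ram" : "mmio",
           dst_ram ? "ram" : "mmio");
    return 0;
}

int main(void)
{
    return rep_movs(0x2000, 0x5000);
}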
Let me know if any of this helps
Thanks,
Andres

>
> Making it an rwlock would be tricky as this interface doesn't
> differentiate readers from writers.  For the common case (no
> sharing/paging/mem-access) it ought to be a win since there is hardly
> any writing.  But it would be better to make this particular lock/unlock
> go away.
>
> Tim.
>
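For the record, the rwlock shape Tim describes would look roughly like the
sketch below, with pthreads standing in for the hypervisor's locking
primitives. The catch, as he says, is that a get_gfn caller never declares up
front whether it is a reader or a writer, so there is no obvious place to
choose the read lock over the write lock:

#include <pthread.h>
#include <stdio.h>

/* Toy one-level gfn -> mfn table guarded by a reader-writer lock. */
static pthread_rwlock_t p2m_lock = PTHREAD_RWLOCK_INITIALIZER;
static unsigned long p2m_table[1024];

/* The common case: plain lookups would share the lock... */
static unsigned long get_entry(unsigned long gfn)
{
    unsigned long mfn;
    pthread_rwlock_rdlock(&p2m_lock);
    mfn = p2m_table[gfn % 1024];
    pthread_rwlock_unlock(&p2m_lock);
    return mfn;
}

/* ...while type changes (paging out, unsharing) take it exclusively. */
static void set_entry(unsigned long gfn, unsigned long mfn)
{
    pthread_rwlock_wrlock(&p2m_lock);
    p2m_table[gfn % 1024] = mfn;
    pthread_rwlock_unlock(&p2m_lock);
}

int main(void)
{
    set_entry(7, 42);
    printf("gfn 7 -> mfn %lu\n", get_entry(7));
    return 0;
}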



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

