Re: [Xen-devel] lock in vhpet
> At 02:36 +0000 on 25 Apr (1335321409), Zhang, Yang Z wrote:
> > > > But actually, the first cs that introduced this issue is 24770.
> > > > When win8 boots with hpet enabled, it uses hpet as the time
> > > > source, and there are lots of hpet accesses and EPT violations.
> > > > In the EPT violation handler, it calls get_gfn_type_access to
> > > > get the mfn. Cs 24770 introduced the gfn_lock for p2m lookups,
> > > > and then the issue happens. After I removed the gfn_lock, the
> > > > issue went away. But in the latest xen, even if I remove this
> > > > lock, it still shows high cpu utilization.
> > >
> > > It would seem then that even the briefest lock-protected critical
> > > section would cause this? In the mmio case, the p2m lock taken in
> > > the hap fault handler is held during the actual lookup, and for a
> > > couple of branch instructions afterwards.
> > >
> > > In latest Xen, with the lock removed from get_gfn, on which lock
> > > is the time spent?
> >
> > Still the p2m_lock.
>
> Can you please try the attached patch?  I think you'll need this one
> plus the ones that take the locks out of the hpet code.

Right off the bat I'm getting a multitude of

(XEN) mm.c:3294:d0 Error while clearing mfn 100cbb7

and a hung dom0 during initramfs. I'm a little baffled as to why, but
it's there (32-bit dom0, XenServer6).

> This patch makes the p2m lock into an rwlock and adjusts a number of
> the paths that don't update the p2m so they only take the read lock.
> It's a bit rough, but I can boot a 16-way win7 guest with it.
>
> N.B. Since rwlocks don't show up in the existing lock profiling,
> please don't try to use the lock-profiling numbers to see if it's
> helping!
>
> Andres, this is basically the big-hammer version of your "take a
> pagecount" changes, plus the change you made to hvmemul_rep_movs().
> If this works I intend to follow it up with a patch to make some of
> the read-modify-write paths avoid taking the lock (by using a
> compare-exchange operation so they only take the lock on a write).
> If that succeeds I might drop put_gfn() altogether.

You mean cmpxchg the whole p2m entry? I don't think I parse the plan.
There are code paths that do get_gfn_query -> p2m_change_type ->
put_gfn. But I guess those could lock the p2m up front if they become
the only consumers of put_gfn left.

> But first it will need a lot of tidying up.  Noticeably missing:
> - SVM code equivalents to the vmx.c changes

load_pdptrs doesn't exist on svm. There are a couple of debugging
get_gfn_query calls that can be made unlocked, and svm_vmcb_restore
needs similar treatment to what you did in vmx.c. The question is who
will be able to test it...

> - grant-table operations still use the lock, because frankly I
>   could not follow the current code, and it's quite late in the
>   evening.

It's pretty complex, with serious nesting and ifdefs for arm and 32
bit. gfn_to_mfn_private callers will suffer from altering the current
meaning, as put_gfn resolves to the right thing for the ifdef'ed arch.
The other user is grant_transfer, which also relies on the page *not*
having an extra ref in steal_page. So it's a prime candidate to be
left alone.

> I also have a long list of uglinesses in the mm code that I found

Uh, ugly stuff, how could that have happened?

I have a few preliminary observations on the patch. Pasting the
relevant bits here, since the body of the patch seems to have been
lost by the email thread:

@@ -977,23 +976,25 @@ int arch_set_info_guest(
...
+        if ( !paging_mode_refcounts(d)
+             && !get_page_and_type(cr3_page, d, PGT_l3_page_table) )

Replace with && !get_page_type().

@@ -2404,32 +2373,33 @@ static enum hvm_copy_result __hvm_copy(
             gfn = addr >> PAGE_SHIFT;
         }
-        mfn = mfn_x(get_gfn_unshare(curr->domain, gfn, &p2mt));
+        page = get_page_from_gfn(curr->domain, gfn, &p2mt, P2M_UNSHARE);

Replace with (flags & HVMCOPY_to_guest) ? P2M_UNSHARE : P2M_ALLOC
(and the same logic when checking p2m_is_shared). Not truly related
to your patch, but since we're at it.
Same (__hvm_copy), further down:

-        if ( !p2m_is_ram(p2mt) )
+        if ( !page )
         {
-            put_gfn(curr->domain, gfn);
+            if ( page )
+                put_page(page);

The last two lines are redundant: page is known to be NULL there.

@@ -4019,35 +3993,16 @@ long do_hvm_op(unsigned long op, XEN_GUE
     case HVMOP_modified_memory:

A lot of error checking has been removed. At the very least:

    if ( page )
    {
        ...
    }
    else
    {
        rc = -EINVAL;
        goto param_fail3;
    }

arch/x86/mm.c:do_mmu_update -> you dropped all the paging/sharing
checks for target gfns of mmu updates of l2/3/4 entries. It seems
that this wouldn't work anyway; is that why you killed it?

+++ b/xen/arch/x86/mm/hap/guest_walk.c
@@ -54,34 +54,37 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PA
...
+    if ( !top_page )
     {
         pfec[0] &= ~PFEC_page_present;
-        __put_gfn(p2m, top_gfn);
+        put_page(top_page);

top_page is NULL here, so remove the put_page.

get_page_from_gfn_p2m, slow path: no need for p2m_lock/unlock, since
locking is already done by get_gfn_type_access/__put_gfn.

(I hope those observations made sense without inlining them in the
actual patch.)

Andres

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel