Re: [Xen-devel] lock in vhpet
> At 07:36 +0000 on 23 Apr (1335166577), Zhang, Yang Z wrote:
>> The p2m lock in __get_gfn_type_access() is the culprit. Here is the
>> profiling data over 10 seconds:
>>
>> (XEN) p2m_lock 1 lock:
>> (XEN) lock: 190733(00000000:14CE5726), block: 67159(00000007:6AAA53F3)
>>
>> This data was collected while a win8 guest with 16 vcpus was idle.
>> The 16 vcpus spent a total of 30 seconds blocked during the
>> 10-second profiling run; that is 30 s out of 16 x 10 = 160
>> vcpu-seconds, so about 18% of cpu cycles were spent waiting for the
>> p2m lock -- and that is with an idle guest. The impact is more
>> serious when a workload runs inside the guest. I noticed that this
>> change was added by cs 24770; before it, we did not take the p2m
>> lock in __get_gfn_type_access(). So is this lock really necessary?
>
> Ugh; that certainly is a regression. We used to be lock-free on p2m
> lookups and losing that will be bad for perf in lots of ways. IIRC
> the original aim was to use fine-grained per-page locks for this --
> there should be no need to hold a per-domain lock during a normal
> read. Andres, what happened to that code?

The fine-grained p2m locking code is stashed somewhere and untested.
Obviously not meant for 4.2. I don't think it'll be useful here: all
vcpus are hitting the same gfn for the hpet mmio address.

Thanks for the report, Yang. It would be great to understand at which
call sites the p2m lock is contended, so we can try to alleviate
contention at the right place.

Looking closely at the code, it would seem the only get_gfn in
hvmemul_do_io is called on ram_gpa == 0 (?!). Shouldn't ram_gpa
underlie the target mmio address for emulation?

Notwithstanding the above, we've optimized p2m access in hvmemul_do_io
to have as brief a critical section as possible: get_gfn, get_page,
put_gfn, do the work, put_page later. So contention might arise from
(bogusly) doing get_gfn(0) on every vcpu, when it should instead be
get_gfn(hpet_gfn). The only way to alleviate that contention would be
to use get_gfn_query_unlocked, and that is unsafe against paging or
sharing. I am very skeptical this is causing the slowdown you see,
since you're not using paging or sharing: the critical section
protected by the p2m lock is literally one cmpxchg.

The second source of contention might be hvmemul_rep_movs, which holds
the p2m lock for the duration of the mmio operation. I can optimize
that one using the get_gfn/get_page/put_gfn pattern mentioned above.

The third source of contention might be the __hvm_copy to/from the
target address of the hpet value read/written. That one can be
slightly optimized by doing the memcpy outside the scope of the p2m
lock.

>
> Making it an rwlock would be tricky as this interface doesn't
> differentiate readers from writers. For the common case (no
> sharing/paging/mem-access) it ought to be a win since there is hardly
> any writing. But it would be better to make this particular
> lock/unlock go away.
>

We had a discussion about rwlocks way back when. It's impossible (or
nearly so) to tell when an access might upgrade to write privileges,
so deadlock has a high likelihood.

Andres

> Tim.
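
For reference, here is a minimal sketch of the get_gfn/get_page/
put_gfn pattern discussed above. The names are modelled on the Xen
4.2-era p2m API, but exact signatures vary between versions, so treat
this as illustrative rather than as the actual hvmemul_do_io code:

    /* Illustrative only: the brief-critical-section pattern.  Names
     * follow the Xen p2m API of the 4.2 era; signatures differ
     * between versions, and this will not compile outside a Xen
     * tree. */
    static struct page_info *get_pinned_page(struct domain *d,
                                             unsigned long gfn)
    {
        p2m_type_t t;
        mfn_t mfn;
        struct page_info *page = NULL;

        mfn = get_gfn(d, gfn, &t);     /* takes the p2m lock */
        if ( mfn_valid(mfn) )
        {
            page = mfn_to_page(mfn);
            if ( !get_page(page, d) )  /* refcount pins the frame... */
                page = NULL;
        }
        put_gfn(d, gfn);               /* ...so the lock drops early */

        /* The long-running work (emulation, copying) happens after
         * this returns, without the p2m lock held; the caller calls
         * put_page() when done. */
        return page;
    }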
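Similarly, the __hvm_copy optimization -- moving the memcpy outside
the p2m lock -- might look roughly like the following. This is a
hypothetical sketch, not the actual patch, with the same caveat that
helper signatures differ across Xen versions:

    /* Hypothetical rework: translate and pin under the p2m lock,
     * then copy with the lock dropped. */
    static int copy_from_gfn(struct domain *d, void *buf,
                             unsigned long gfn, unsigned int off,
                             unsigned int len)
    {
        p2m_type_t t;
        mfn_t mfn;
        struct page_info *page;
        void *va;

        mfn = get_gfn(d, gfn, &t);               /* p2m lock taken */
        if ( !mfn_valid(mfn) ||
             !get_page(page = mfn_to_page(mfn), d) )
        {
            put_gfn(d, gfn);
            return -EINVAL;
        }
        put_gfn(d, gfn);                         /* p2m lock dropped */

        va = __map_domain_page(page);            /* copy is lock-free */
        memcpy(buf, (char *)va + off, len);
        unmap_domain_page(va);
        put_page(page);
        return 0;
    }

The page reference, not the lock, is what keeps the frame from going
away under paging/sharing while the copy runs.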
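The upgrade deadlock can be shown outside Xen entirely. Below is a
self-contained pthreads illustration, assuming nothing Xen-specific:
both threads take the lock for reading (the lookup), then discover
they must modify the entry and try to upgrade. POSIX leaves a wrlock
taken while holding the rdlock undefined; on common implementations
each wrlock blocks behind the outstanding readers forever:

    /* Self-contained demo (plain pthreads, not Xen code) of why
     * read->write lock upgrades are deadlock-prone.
     * Build: cc demo.c -lpthread */
    #include <pthread.h>
    #include <stdio.h>

    static pthread_rwlock_t p2m_lock = PTHREAD_RWLOCK_INITIALIZER;

    static void *lookup_then_modify(void *arg)
    {
        pthread_rwlock_rdlock(&p2m_lock); /* the lookup: reader */
        /* Lookup says the entry must change (populate, unshare). */
        pthread_rwlock_wrlock(&p2m_lock); /* upgrade attempt: blocks
                                           * while any reader --
                                           * including ourselves, on
                                           * most implementations --
                                           * still holds the lock */
        pthread_rwlock_unlock(&p2m_lock);
        pthread_rwlock_unlock(&p2m_lock);
        return NULL;
    }

    int main(void)
    {
        pthread_t a, b;
        pthread_create(&a, NULL, lookup_then_modify, NULL);
        pthread_create(&b, NULL, lookup_then_modify, NULL);
        pthread_join(a, NULL);            /* never returns: deadlock */
        pthread_join(b, NULL);
        puts("unreachable");
        return 0;
    }

This is why an rwlock only helps if the interface knows up front
whether an access can write -- which, as noted above, is (nearly)
impossible to tell here.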