
Re: [Xen-devel] lock in vhpet


  • To: "Tim Deegan" <tim@xxxxxxx>
  • From: "Andres Lagar-Cavilla" <andres@xxxxxxxxxxxxxxxx>
  • Date: Tue, 24 Apr 2012 06:28:56 -0700
  • Cc: "Zhang, Yang Z" <yang.z.zhang@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Keir Fraser <keir@xxxxxxx>
  • Delivery-date: Tue, 24 Apr 2012 13:29:12 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>

> At 08:26 -0700 on 23 Apr (1335169568), Andres Lagar-Cavilla wrote:
>> > At 07:36 +0000 on 23 Apr (1335166577), Zhang, Yang Z wrote:
>> >> The p2m lock in __get_gfn_type_access() is the culprit. Here is the
>> >> profiling data from a 10-second run:
>> >>
>> >> (XEN) p2m_lock 1 lock:
>> >> (XEN)   lock:      190733(00000000:14CE5726), block:
>> >> 67159(00000007:6AAA53F3)
>> >>
>> >> This data was collected while a win8 guest (16 vcpus) was idle: over
>> >> the 10-second profiling window the 16 VCPUs spent a combined 30
>> >> seconds blocked, i.e. 30 / (16 * 10) = ~18% of cpu cycles were spent
>> >> waiting for the p2m lock. And that is for an idle guest; the impact
>> >> is worse when running a workload inside the guest. I noticed that
>> >> this locking was added by cs 24770, and before it we did not take
>> >> the p2m lock in __get_gfn_type_access. So is this lock really
>> >> necessary?
>> >
>> > Ugh; that certainly is a regression.  We used to be lock-free on p2m
>> > lookups and losing that will be bad for perf in lots of ways.  IIRC the
>> > original aim was to use fine-grained per-page locks for this -- there
>> > should be no need to hold a per-domain lock during a normal read.
>> > Andres, what happened to that code?
>>
>> The fine-grained p2m locking code is stashed somewhere and untested.
>> Obviously not meant for 4.2. I don't think it'll be useful here: all
>> vcpus are hitting the same gfn for the hpet mmio address.
>
> We'll have to do _something_ for 4.2 if it's introducing an 18% CPU
> overhead in an otherwise idle VM.

An hpet mmio read or write hits get_gfn twice: once for gfn zero at the
beginning of hvmemul_do_io, and once during hvm copy. The patch I sent to
Yang yesterday removes the bogus first get_gfn. The second one still has
to perform a locked query.
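
For context, the hvm copy side does roughly the following for each page it
touches (a from-memory paraphrase of __hvm_copy in xen/arch/x86/hvm/hvm.c,
not the literal code; the function name below is invented):

/* Sketch of the per-page work in the hvm copy path; error handling
 * trimmed and len assumed to stay within one page. */
static int copy_one_page_sketch(struct domain *d, unsigned long gfn,
                                void *buf, unsigned int len)
{
    p2m_type_t p2mt;
    unsigned long mfn = mfn_x(get_gfn_unshare(d, gfn, &p2mt));
    void *p;

    if ( !p2m_is_ram(p2mt) || !mfn_valid(mfn) )
    {
        put_gfn(d, gfn);
        return -1;
    }

    /* The lookup above is the locked query: holding the p2m lock here
     * keeps the entry from being paged out, unshared or swept by PoD
     * between the translation and the memcpy below. */
    p = map_domain_page(mfn);
    memcpy(p, buf, len);
    unmap_domain_page(p);

    put_gfn(d, gfn);
    return 0;
}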

Yang, is there any way to get more information here? Identifying the call
sites that contend for the p2m lock would be crucial. Even something as
crude as sampling vcpu stack traces by hitting 'd' repeatedly on the
serial line and seeing what sticks out frequently would help.

>
> In any case I think this means I probably shouldn't take the patch that
> turns on this locking for shadow VMs.  They do a lot more p2m lookups.
>
>> The other source of contention might come from hvmemul_rep_movs, which
>> holds the p2m lock for the duration of the mmio operation. I can
>> optimize that one using the get_gfn/get_page/put_gfn pattern mentioned
>> above.
>
> But wouldn't that be unsafe?  What if the p2m changes during the
> operation?  Or, conversely, could you replace all uses of the lock in
> p2m lookups with get_page() on the result and still get what you need?

I've been thinking of wrapping the pattern into a handy p2m accessor. The
same pattern repeats itself in hvm_copy, tss/segment entry, page table
walking, nested hvm, etc., wherever the consumer wants to map the result
of the translation. By the time get_gfn_unshare returns, most possible
p2m transformations have already been taken care of (PoD allocation,
page-in, unsharing). Taking a reference to the underlying page then
prevents it from being paged out, so the vcpu can safely let go of the
p2m lock.
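
Roughly something along these lines (just a sketch; the accessor name is
invented and error handling is trimmed):

/* Sketch only: "get_gfn_and_page" is a made-up name for the accessor
 * described above.  Translate the gfn with get_gfn_unshare (settling
 * PoD allocation, page-in and unsharing), pin the frame with get_page,
 * and only then drop the p2m lock.  The caller maps/uses the page and
 * finishes with put_page(). */
static struct page_info *get_gfn_and_page(struct domain *d,
                                          unsigned long gfn,
                                          p2m_type_t *t)
{
    struct page_info *page = NULL;
    unsigned long mfn = mfn_x(get_gfn_unshare(d, gfn, t));

    if ( p2m_is_ram(*t) && mfn_valid(mfn) )
    {
        page = mfn_to_page(mfn);
        /* The reference keeps the frame from being paged out or
         * freed once the p2m lock is released below. */
        if ( !get_page(page, d) )
            page = NULL;
    }

    /* Safe to let go of the per-domain p2m lock now. */
    put_gfn(d, gfn);

    return page;
}

hvmemul_rep_movs could then hold page references, rather than the p2m
lock, for the duration of the copy.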

Conceivably guest_physmap_remove_page and guest_physmap_add_entry can
still come in and change the p2m entry.

Andres

>
> Tim.
>



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

