[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH RFC V10 15/18] kvm : Paravirtual ticketlocks support for linux guests running on KVM hypervisor



On 07/17/2013 03:04 PM, Gleb Natapov wrote:
On Wed, Jul 17, 2013 at 12:12:35AM +0530, Raghavendra K T wrote:
I do not think it is very rare to get interrupt between
local_irq_restore() and halt() under load since any interrupt that
occurs between local_irq_save() and local_irq_restore() will be
delivered
immediately after local_irq_restore(). Of course the chance of no
other
random interrupt waking lock waiter is very low, but waiter can sleep
for much longer then needed and this will be noticeable in
performance.

Yes, I meant the entire thing. I did infact turned WARN on
w->lock==null before halt() [ though we can potentially have irq right
after that ], but did not hit so far.
Depends on your workload of course. To hit that you not only need to get
interrupt in there, but the interrupt handler needs to take contended
spinlock.


Yes. Agree.


BTW can NMI handler take spinlocks? If it can what happens if NMI is
delivered in a section protected by local_irq_save()/local_irq_restore()?


Had another idea if NMI, halts are causing problem until I saw
PeterZ's reply similar to V2 of pvspinlock posted  here:

  https://lkml.org/lkml/2011/10/23/211

Instead of halt we started with a sleep hypercall in those
  versions. Changed to halt() once Avi suggested to reuse existing sleep.

If we use older hypercall with few changes like below:

kvm_pv_wait_for_kick_op(flags, vcpu, w->lock )
{
  // a0 reserved for flags
if (!w->lock)
return;
DEFINE_WAIT
...
end_wait
}

How would this help if NMI takes lock in critical section. The thing
that may happen is that lock_waiting->want may have NMI lock value, but
lock_waiting->lock will point to non NMI lock. Setting of want and lock
have to be atomic.

True. so we are here

        non NMI lock(a)
        w->lock = NULL;
        smp_wmb();
        w->want = want;
                               NMI
                         <---------------------
                          NMI lock(b)
                          w->lock = NULL;
                          smp_wmb();
                          w->want = want;
                          smp_wmb();
                          w->lock = lock;
                         ---------------------->
        smp_wmb();
        w->lock = lock;

so how about fixing like this?

again:
        w->lock = NULL;
        smp_wmb();
        w->want = want;
        smp_wmb();
        w->lock = lock;

if (!lock || w->want != want) goto again;


kvm_pv_wait_for_kick_op() is incorrect in other ways. It will spuriously
return to a guest since not all events that wake up vcpu thread
correspond to work for guest to do.


Okay. agree.

Only question is how to retry immediately with lock_spinning in
w->lock=null cases.

/me need to experiment that again perhaps to see if we get some benefit.


So I am,
1. trying to artificially reproduce this.

2. I replaced the halt with below code,
        if (arch_irqs_disabled())
                 halt();

and ran benchmarks.
But this results in degradation because, it means we again go back
and spin in irq enabled case.

Yes, this is not what I proposed.

True.


3. Now I am analyzing the performance overhead of safe_halt in irq
enabled case.
       if (arch_irqs_disabled())
                halt();
       else
                safe_halt();
Use of arch_irqs_disabled() is incorrect here.

Oops! sill me.

If you are doing it before
local_irq_restore() it will always be false since you disabled interrupt
yourself,

This was not the case. but latter is the one I missed.

  if you do it after then it is to late since interrupt can come
between local_irq_restore() and halt() so enabling interrupt and halt
are still not atomic.  You should drop local_irq_restore() and do

   if (arch_irqs_disabled_flags(flags))
        halt();
   else
        safe_halt();

instead.


Yes, I tested with below as suggested:

//local_irq_restore(flags);

         /* halt until it's our turn and kicked. */
         if (arch_irqs_disabled_flags(flags))
                 halt();
         else
                 safe_halt();

  //local_irq_save(flags);
I am seeing only a slight overhead, but want to give a full run to
check the performance.
Without compiling and checking myself the different between previous
code and this one should be a couple asm instruction. I would be
surprised if you can measure it especially as vcpu is going to halt
(and do expensive vmexit in the process) anyway.


Yes, right.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.