[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH RFC V6 0/11] Paravirtualized ticketlocks

On 04/05/2012 02:31 PM, Avi Kivity wrote:
On 04/02/2012 12:51 PM, Raghavendra K T wrote:
On 04/01/2012 07:23 PM, Avi Kivity wrote:
On 04/01/2012 04:48 PM, Raghavendra K T wrote:
I have patch something like below in mind to try:

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d3b98b1..5127668 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1608,15 +1608,18 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
         * else and called schedule in __vcpu_run.  Hopefully that
         * VCPU is holding the lock that we need and will release it.
         * We approximate round-robin by starting at the last boosted
+     * Priority is given to vcpu that are unhalted.
-    for (pass = 0; pass<    2&&    !yielded; pass++) {
+    for (pass = 0; pass<    3&&    !yielded; pass++) {
            kvm_for_each_vcpu(i, vcpu, kvm) {
                struct task_struct *task = NULL;
                struct pid *pid;
-            if (!pass&&    i<    last_boosted_vcpu) {
+            if (!pass&&    !vcpu->pv_unhalted)
+                continue;
+            else if (pass == 1&&    i<    last_boosted_vcpu) {
                    i = last_boosted_vcpu;
-            } else if (pass&&    i>    last_boosted_vcpu)
+            } else if (pass == 2&&    i>    last_boosted_vcpu)
                if (vcpu == me)

Actually I think this is unneeded.  The loops tries to find vcpus
are runnable but not running (vcpu_active(vcpu->wq)), and halted
don't match this condition.

Oh! I think I misinterpreted your statement. hmm I got it. you told to
remove if (vcpu == me) condition.

No, the entire patch is unneeded.  My original comment was:

from the PLE handler, don't wake up a vcpu that is sleeping because it
is waiting for a kick

But the PLE handler never wakes up sleeping vcpus anyway.

I agree with you. It is already doing that. But my approach here is
little different.

In 2 classes of vcpus we have (one is subset of another when we try to
do yield_to) viz,

1) runnable and kicked < (subset of) 2) just runnable

what we are trying to do here is targeting 1) first so that we get good
lock progress.

Here was the sequence I was talking.

vcpu1 releases lock->finds that vcpu2  is next candidate ->
kick hypercall -> kvm_pv_kick_cpu_op -> set kicked flag ->

at this point we have vcpu2 waiting for getting scheduled. But
above yield call can wake *anybody*.

I agree this is not serious (rather it is overhead) when there are are less number of vcpus, But when we have more number of vcpu/vm.. it may
not scale well. My attempt was to fix that.

Let me know if I am completely missing something..

There's still a conflict with PLE in that it may trigger during the spin
phase and send a random yield_to() somewhere.  Maybe it's sufficient to
tune the PLE timeout to be longer than the spinlock timeout.

Ok ... But we also should be cautious that, we may do more halt, though we are about to get spinlock.
Need more study on this.

Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.