
Re: [Xen-devel] [PATCH v6][RFC]xen: sched: convert RTDS from time to event driven model

On 2/26/2016 4:11 AM, Dario Faggioli wrote:
It looks like the current code doesn't add a vcpu to the replenishment queue
when vcpu_insert() is called. When the scheduler is initialized, all the vcpus
are added to the replenishment queue after waking up from sleep. This needs to
be changed (add the vcpu to the replq in vcpu_insert()) to make it consistent,
in the sense that when rt_vcpu_insert() is called, the vcpu needs to have a
corresponding replenishment event queued.

This way the ASSERT() covers both cases in __runq_insert(), enforcing the fact
that "when a vcpu is inserted to the runq/depletedq, a replenishment event is
waiting for it".
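
(To make the invariant concrete: the ASSERT() would sit at the top of
__runq_insert(), along the lines of the sketch below. I'm paraphrasing the
surrounding sched_rt.c code from memory rather than quoting the patch, so
treat the details, in particular the __vcpu_on_replq() helper name, as
assumptions:)

static void
__runq_insert(const struct scheduler *ops, struct rt_vcpu *svc)
{
    struct rt_private *prv = rt_priv(ops);
    struct list_head *runq = rt_runq(ops);
    struct list_head *iter;

    ASSERT( spin_is_locked(&prv->lock) );
    ASSERT( !__vcpu_on_q(svc) );
    /* The invariant under discussion: any vcpu entering the
     * runq/depletedq must already have a replenishment event queued. */
    ASSERT( __vcpu_on_replq(svc) );

    if ( svc->cur_budget > 0 )
    {
        /* A vcpu with budget left goes on the runq, kept EDF-sorted
         * by current deadline. */
        list_for_each ( iter, runq )
        {
            struct rt_vcpu *iter_svc = __q_elem(iter);
            if ( svc->cur_deadline <= iter_svc->cur_deadline )
                break;
        }
        list_add_tail(&svc->q_elem, iter);
    }
    else
        /* A depleted vcpu waits on the depletedq for its replenishment. */
        list_add(&svc->q_elem, &prv->depletedq);
}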

I am ok with this (calling replq_insert() in rt_vcpu_insert()). Not
just doing that unconditionally though, as a vcpu can, e.g., be paused
when the insert_vcpu hook is called, and in that case, I don't think we
want to enqueue the replenishment event, do we?


Yes.
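
Something like the sketch below is what I have in mind for rt_vcpu_insert().
Only a sketch: the guard (!__vcpu_on_q() plus vcpu_runnable()) is one
plausible way to skip, e.g., a paused vcpu, not necessarily the final form:

static void
rt_vcpu_insert(const struct scheduler *ops, struct vcpu *vc)
{
    struct rt_vcpu *svc = rt_vcpu(vc);
    s_time_t now;
    spinlock_t *lock;

    BUG_ON( is_idle_vcpu(vc) );

    lock = vcpu_schedule_lock_irq(vc);

    now = NOW();
    if ( now >= svc->cur_deadline )
        rt_update_deadline(now, svc);

    /* Only queue a replenishment event for a vcpu that can actually
     * run; a paused vcpu gets its event when it wakes up instead. */
    if ( !__vcpu_on_q(svc) && vcpu_runnable(vc) )
    {
        __replq_insert(ops, svc);

        if ( !vc->is_running )
            __runq_insert(ops, svc);
    }

    vcpu_schedule_unlock_irq(lock, vc);
}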

static void
rt_context_saved(const struct scheduler *ops, struct vcpu *vc)
{
    struct rt_vcpu *svc = rt_vcpu(vc);
    spinlock_t *lock = vcpu_schedule_lock_irq(vc);

    clear_bit(__RTDS_scheduled, &svc->flags);
    /* not insert idle vcpu to runq */
    if ( is_idle_vcpu(vc) )
        goto out;

    if ( test_and_clear_bit(__RTDS_delayed_runq_add, &svc->flags) &&
         likely(vcpu_runnable(vc)) )
    {
        __runq_insert(ops, svc);
        runq_tickle(ops, svc);
    }
    else
        __replq_remove(ops, svc);
out:
    vcpu_schedule_unlock_irq(lock, vc);
}

And, as said above, if you do this, try also removing the __replq_remove()
call from rt_vcpu_sleep(); this one should cover for that (and, actually,
more than just that!).

After all this, check whether you still have the assert in __replq_insert()
triggering, and let me know.

So after moving the __replq_remove() to rt_context_saved(), the assert in
__replq_insert() still fails when dom0 boots up.
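
(For reference, the failing assert sits at the top of __replq_insert(),
roughly as below. This is a paraphrase: the rt_replq() helper and the
replq_elem field name follow the patch's naming pattern, but I'm not quoting
the patch verbatim:)

static void
__replq_insert(const struct scheduler *ops, struct rt_vcpu *svc)
{
    struct list_head *replq = rt_replq(ops);
    struct list_head *iter;

    /* This is the assertion that fires during dom0 boot. */
    ASSERT( !__vcpu_on_replq(svc) );

    /* Keep the replenishment queue sorted by deadline, so the one
     * replenishment timer is always armed for the earliest event. */
    list_for_each ( iter, replq )
    {
        struct rt_vcpu *iter_svc =
            list_entry(iter, struct rt_vcpu, replq_elem);
        if ( svc->cur_deadline <= iter_svc->cur_deadline )
            break;
    }
    list_add_tail(&svc->replq_elem, iter);
}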

Well, maybe removing __replq_remove() from rt_vcpu_sleep() is not entirely
ok: if we do that, we fail to deal with the case where (still in
rt_vcpu_sleep()) __vcpu_on_q(svc) is true.

So, I'd say, do as I said above wrt rt_context_saved(). For rt_vcpu_sleep(),
you can try something like this:

static void
rt_vcpu_sleep(const struct scheduler *ops, struct vcpu *vc)
{
    struct rt_vcpu * const svc = rt_vcpu(vc);

    BUG_ON( is_idle_vcpu(vc) );
    SCHED_STAT_CRANK(vcpu_sleep);

    if ( curr_on_cpu(vc->processor) == vc )
        cpu_raise_softirq(vc->processor, SCHEDULE_SOFTIRQ);
    else if ( __vcpu_on_q(svc) )
    {
        __q_remove(svc);
        __replq_remove(ops, svc);  /* <=== *** LOOK HERE *** */
    }
    else if ( svc->flags & RTDS_delayed_runq_add )
        clear_bit(__RTDS_delayed_runq_add, &svc->flags);
}

In fact, in both the first and the third case, we will at some point pass
through rt_schedule()-->rt_context_saved(), and hit the __replq_remove() that
I made you put there.

In the middle case, the vcpu was just queued and, to make it go to sleep, it
is enough to remove it from the runq (or the depletedq, in the case of this
scheduler). We won't go through rt_schedule()-->rt_context_saved(), and hence
the __replq_remove() there won't be called.

Sorry for the oversight; can you try this?

That being said...

(XEN) Assertion '!vcpu_on_replq(svc)' failed at sched_rt.c:524
(XEN) ----[ Xen-4.7-unstable  x86_64  debug=y  Tainted:    C ]----

(XEN) Xen call trace:
(XEN)    [<ffff82d080128c07>] sched_rt.c#__replq_insert+0x2b/0x64
(XEN)    [<ffff82d08012a072>] sched_rt.c#rt_vcpu_wake+0xf2/0x12c
(XEN)    [<ffff82d08012be2c>] vcpu_wake+0x213/0x3d4
(XEN)    [<ffff82d08012c347>] vcpu_unblock+0x4b/0x4d
(XEN)    [<ffff82d080169cea>] vcpu_kick+0x20/0x6f
(XEN)    [<ffff82d080169d65>] vcpu_mark_events_pending+0x2c/0x2f
(XEN)    [<ffff82d08010762a>] event_2l.c#evtchn_2l_set_pending+0xa9/0xb9
(XEN)    [<ffff82d08010822a>] evtchn_send+0x158/0x183
(XEN)    [<ffff82d0801096fc>] do_event_channel_op+0xe21/0x147d
(XEN)    [<ffff82d0802439e2>] lstar_enter+0xe2/0x13c
(XEN)

... This says that we call __replq_insert() from rt_vcpu_wake() and find out
in there that vcpu_on_replq() is true.

However, in the v6 code for rt_vcpu_wake(), I can see this:

+    if( !__vcpu_on_replq(svc) )
+    {
+        __replq_insert(ops, svc);
+        ASSERT( vcpu_runnable(vc) );
+    }

which would make me think that, if vcpu_on_replq() is true, we
shouldn't be calling __replq_insert() in the first place.

So, have you made other changes wrt v6 when trying this?

The v6 code doesn't have that if statement commented out as I submitted it. But when I tried commenting it out, the assertion still failed.

It fails in v6 with:
rt_vcpu_sleep(): removing replenishment event for all cases
rt_context_saved(): not removing replenishment events
rt_vcpu_wake(): not checking if the event is already queued.

or with:
rt_vcpu_sleep(): not removing replenishment event at all
rt_context_saved(): removing replenishment events if not runnable
rt_vcpu_wake(): not checking if the event is already queued.

Also with:
rt_vcpu_sleep(): removing replenishment event if the vcpu is on runq/depletedq
rt_context_saved(): removing replenishment events if not runnable
rt_vcpu_wake(): not checking if the event is already queued.

I added debug prints in all these functions and noticed that it could be caused by a race between spurious wakeups and context switching. See the following events for the last modification above:

(XEN) cpu1 picked idle
(XEN) d0 attempted to change d0v1's CR4 flags 00000620 -> 00040660
(XEN) cpu2 picked idle
(XEN) vcpu1 sleeps on cpu
(XEN) cpu0 picked idle
(XEN) vcpu1 context saved not runnable
(XEN) vcpu1 wakes up nowhere
(XEN) cpu0 picked vcpu1
(XEN) vcpu1 sleeps on cpu
(XEN) cpu0 picked idle
(XEN) vcpu1 context saved not runnable
(XEN) vcpu1 wakes up nowhere
(XEN) cpu0 picked vcpu1
(XEN) cpu0 picked idle
(XEN) vcpu1 context saved not runnable
(XEN) cpu0 picked vcpu0
(XEN) vcpu1 wakes up nowhere
(XEN) cpu1 picked vcpu1     *** vcpu1 is on a cpu
(XEN) cpu1 picked idle      *** vcpu1 is waiting to be context switched
(XEN) vcpu2 wakes up nowhere
(XEN) cpu0 picked vcpu0
(XEN) cpu2 picked vcpu2
(XEN) cpu0 picked vcpu0
(XEN) cpu0 picked vcpu0
(XEN) d0 attempted to change d0v2's CR4 flags 00000620 -> 00040660
(XEN) cpu0 picked vcpu0
(XEN) vcpu2 sleeps on cpu
(XEN) vcpu1 wakes up nowhere      *** vcpu1 wakes up without sleep?

(XEN) Assertion '!__vcpu_on_replq(svc)' failed at sched_rt.c:526
(XEN) ----[ Xen-4.7-unstable  x86_64  debug=y  Tainted:    C ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82d08012a151>] sched_rt.c#rt_vcpu_wake+0x11f/0x17b
...
(XEN) Xen call trace:
(XEN)    [<ffff82d08012a151>] sched_rt.c#rt_vcpu_wake+0x11f/0x17b
(XEN)    [<ffff82d08012bf2c>] vcpu_wake+0x213/0x3d4
(XEN)    [<ffff82d08012c447>] vcpu_unblock+0x4b/0x4d
(XEN)    [<ffff82d080169cea>] vcpu_kick+0x20/0x6f
(XEN)    [<ffff82d080169d65>] vcpu_mark_events_pending+0x2c/0x2f
(XEN)    [<ffff82d08010762a>] event_2l.c#evtchn_2l_set_pending+0xa9/0xb9
(XEN)    [<ffff82d080108312>] send_guest_vcpu_virq+0x9d/0xba
(XEN)    [<ffff82d080196cbe>] send_timer_event+0xe/0x10
(XEN)    [<ffff82d08012a7b5>] schedule.c#vcpu_singleshot_timer_fn+0x9/0xb
(XEN)    [<ffff82d080131978>] timer.c#execute_timer+0x4e/0x6c
(XEN)    [<ffff82d080131ab9>] timer.c#timer_softirq_action+0xdd/0x213
(XEN)    [<ffff82d08012df32>] softirq.c#__do_softirq+0x82/0x8d
(XEN)    [<ffff82d08012df8a>] do_softirq+0x13/0x15
(XEN)    [<ffff82d080243ad1>] cpufreq.c#process_softirqs+0x21/0x30


So, it looks like a spurious wakeup for vcpu1 happens before it is completely context-switched off a cpu. But rt_vcpu_wake() didn't see it on the cpu with curr_on_cpu(), so it fell through the first two RETURNs.

I guess the replenishment queue check is necessary for this situation?
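
To make that concrete, the wake path I am describing has roughly the
following shape (a paraphrase of the v6 structure, not the literal patch; the
guard at the end is the check in question):

static void
rt_vcpu_wake(const struct scheduler *ops, struct vcpu *vc)
{
    struct rt_vcpu * const svc = rt_vcpu(vc);
    s_time_t now = NOW();

    BUG_ON( is_idle_vcpu(vc) );

    /* RETURN #1: the vcpu is currently running on a pcpu. */
    if ( unlikely(curr_on_cpu(vc->processor) == vc) )
    {
        SCHED_STAT_CRANK(vcpu_wake_running);
        return;
    }

    /* RETURN #2: the vcpu is already on the runq/depletedq. */
    if ( unlikely(__vcpu_on_q(svc)) )
    {
        SCHED_STAT_CRANK(vcpu_wake_onrunq);
        return;
    }

    /* A spurious wakeup racing with a context switch falls through
     * the two returns above and lands here, with RTDS_scheduled still
     * set and the replenishment event still queued. */
    if ( unlikely(svc->flags & RTDS_scheduled) )
    {
        set_bit(__RTDS_delayed_runq_add, &svc->flags);
        goto out;
    }

    if ( now >= svc->cur_deadline )
        rt_update_deadline(now, svc);

    __runq_insert(ops, svc);
    runq_tickle(ops, svc);

 out:
    /* The check in question: without it, the race above makes
     * __replq_insert() trip its ASSERT( !__vcpu_on_replq(svc) ). */
    if ( !__vcpu_on_replq(svc) )
    {
        __replq_insert(ops, svc);
        ASSERT( vcpu_runnable(vc) );
    }
}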

Thanks,
Tianyang

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel