Re: [Xen-devel] [PATCH v6][RFC] xen: sched: convert RTDS from time to event driven model
On 2/25/2016 6:31 PM, Dario Faggioli wrote:
> Hey again,
>
> Thanks for the quick turnaround. We are getting closer and closer,
> although (of course :-)) I have some more comments.
>
> However, is there a particular reason why you are keeping the RFC tag?
> As long as you do that, it's like saying that you are chasing feedback,
> but do not yourself think the patch should be considered for being
> upstreamed. As far as my opinion goes, this patch is not ready to go in
> right now (as I said, I've got questions and comments), but its status
> is way past RFC.

Oh, OK. I had the impression that RFC means "request for comments" and that it should always be used because, indeed, I am asking for comments.

> On Thu, 2016-02-25 at 15:05 -0500, Tianyang Chen wrote:
>> Changes since v5:
>>     removed unnecessary vcpu_on_replq() checks
>>     deadline_queue_insert() returns a flag to indicate whether the
>>     timer needs to be re-programmed
>>     removed unnecessary timer checks
>>     added helper functions to remove vcpus from queues
>>     coding style
>>
>> Changes since v4:
>>     removed unnecessary replenishment queue checks in vcpu_wake()
>>     extended replq_remove() to all cases in vcpu_sleep()
>>     used _deadline_queue_insert() helper function for both queues
>>     _replq_insert() and _replq_remove() program the timer internally
>>
>> Changes since v3:
>>     removed the running queue
>>     added a replenishment queue to keep track of replenishment events
>>     timer is now per scheduler
>>     timer is initialized on a valid cpu in a cpupool
>
> So, this does not belong here. It is OK to have it in this part of the
> email, but it should not land in the actual commit changelog once the
> patch is committed into Xen's git tree.
>
> The way to achieve that is to put this summary of changes below the
> actual changelog, and below the Signed-off-by lines, after a marker
> that looks like this: "---".
>
>> Budget replenishment and enforcement are separated by adding a
>> replenishment timer, which fires at the next most imminent release
>> time of all runnable vcpus. A new runningq has been added to keep
>> track of all vcpus that are on pcpus.
>
> Mmm.. is this the proper changelog? runningq is something we discussed,
> and that appeared in v2, but it is certainly no longer here... :-O

Oops...

> Wait, maybe you misunderstood and/or I did not make myself clear enough
> (in which case, sorry). I never meant to say "always stop the timer".
> Actually, in one of my last emails I said the opposite, and I think
> that would be the best thing to do.
>
> Do you think there is any specific reason why we need to always stop
> and restart it? If not, I think we can:
>  - have deadline_queue_remove() also return whether the element
>    removed was the first one (i.e., the one the timer was programmed
>    after);
>  - if it was, re-program the timer after the new front of the queue;
>  - if it wasn't, do nothing.
>
> Thoughts?

It was my thought originally that the timer needs to be re-programmed only when the vcpu at the front of the queue is taken off (a minimal sketch of this follows below). So, did you mean that when I manipulated repl_timer->expires manually, the timer should have been stopped via the proper timer API first? That manipulation is gone now. Also, set_timer() internally disables the timer, so I assume it is safe to just call set_timer() directly, right?
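To make that concrete, here is a minimal sketch of the removal policy described above, using this series' helper names (rt_replq(), __replq_elem(), __vcpu_on_replq(), repl_timer); the exact signatures are assumptions for illustration, not the final code:

    /* Remove elem from queue; return whether it was the head, i.e. the
     * event the replenishment timer is currently programmed after. */
    static inline int
    deadline_queue_remove(struct list_head *queue, struct list_head *elem)
    {
        int was_first = ( queue->next == elem );

        list_del_init(elem);
        return was_first;
    }

    static void
    __replq_remove(const struct scheduler *ops, struct rt_vcpu *svc)
    {
        struct rt_private *prv = rt_priv(ops);
        struct list_head *replq = rt_replq(ops);

        ASSERT( __vcpu_on_replq(svc) );

        if ( deadline_queue_remove(replq, &svc->replq_elem) )
        {
            if ( !list_empty(replq) )
            {
                /* The front changed: re-program after the new head. */
                struct rt_vcpu *next = __replq_elem(replq->next);

                set_timer(prv->repl_timer, next->cur_deadline);
            }
            else
                /* Queue is empty: no replenishment to wait for. */
                stop_timer(prv->repl_timer);
        }
        /* If it was not the head, the timer is already programmed
         * correctly: nothing to do. */
    }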
>> @@ -405,22 +500,38 @@ __runq_insert(const struct scheduler *ops, struct rt_vcpu *svc)
>>      /* add svc to runq if svc still has budget */
>>      if ( svc->cur_budget > 0 )
>> -    {
>> -        list_for_each(iter, runq)
>> -        {
>> -            struct rt_vcpu * iter_svc = __q_elem(iter);
>> -            if ( svc->cur_deadline <= iter_svc->cur_deadline )
>> -                break;
>> -        }
>> -        list_add_tail(&svc->q_elem, iter);
>> -    }
>> +        deadline_queue_insert(&__q_elem, svc, &svc->q_elem, runq);
>>      else
>>      {
>>          list_add(&svc->q_elem, &prv->depletedq);
>> +        ASSERT( __vcpu_on_replq(svc) );
>
> So, by doing this, you're telling me that, if the vcpu is added to the
> depleted queue, there must be a replenishment planned for it (or the
> ASSERT() would fail). But if we are adding it to the runq, there may or
> may not be a replenishment planned for it.
>
> Is this what we want? Why shouldn't a vcpu going into the runq also
> have a replenishment planned already (to happen at its next deadline)?

It looks like the current code does not add a vcpu to the replenishment queue when vcpu_insert() is called. When the scheduler is initialized, all the vcpus are added to the replenishment queue only after waking up from sleep. This needs to be changed (i.e. add the vcpu to the replq in vcpu_insert()) to make it consistent, in the sense that whenever rt_vcpu_insert() is called, the vcpu has a corresponding replenishment event queued. This way the ASSERT() covers both cases in __runq_insert(), enforcing the invariant that "when a vcpu is inserted into the runq/depletedq, a replenishment event is waiting for it" (a sketch of this follows below).

>> @@ -1087,10 +1193,6 @@ static void
>>  rt_context_saved(const struct scheduler *ops, struct vcpu *vc)
>>  {
>>      struct rt_vcpu *svc = rt_vcpu(vc);
>> -    struct rt_vcpu *snext = NULL;
>> -    struct rt_dom *sdom = NULL;
>> -    struct rt_private *prv = rt_priv(ops);
>> -    cpumask_t *online;
>>      spinlock_t *lock = vcpu_schedule_lock_irq(vc);
>>
>>      clear_bit(__RTDS_scheduled, &svc->flags);
>> @@ -1102,14 +1204,7 @@ rt_context_saved(const struct scheduler *ops, struct vcpu *vc)
>>           likely(vcpu_runnable(vc)) )
>>      {
>>          __runq_insert(ops, svc);
>> -        __repl_update(ops, NOW());
>> -
>> -        ASSERT(!list_empty(&prv->sdom));
>> -        sdom = list_entry(prv->sdom.next, struct rt_dom, sdom_elem);
>> -        online = cpupool_domain_cpumask(sdom->dom);
>> -        snext = __runq_pick(ops, online); /* pick snext from ALL cpus */
>> -
>> -        runq_tickle(ops, snext);
>> +        runq_tickle(ops, svc);
>>      }
>
> So, here we are. What I meant was to make this function look more or
> less like this:
>
>     static void
>     rt_context_saved(const struct scheduler *ops, struct vcpu *vc)
>     {
>         struct rt_vcpu *svc = rt_vcpu(vc);
>         spinlock_t *lock = vcpu_schedule_lock_irq(vc);
>
>         clear_bit(__RTDS_scheduled, &svc->flags);
>
>         /* do not insert idle vcpu to runq */
>         if ( is_idle_vcpu(vc) )
>             goto out;
>
>         if ( test_and_clear_bit(__RTDS_delayed_runq_add, &svc->flags) &&
>              likely(vcpu_runnable(vc)) )
>         {
>             __runq_insert(ops, svc);
>             runq_tickle(ops, svc);
>         }
>         else
>             __replq_remove(ops, svc);
>
>     out:
>         vcpu_schedule_unlock_irq(lock, vc);
>     }
>
> And, as said above, if you do this, try also removing the
> __replq_remove() call from rt_vcpu_sleep(): this one should cover for
> that (and, actually, for more than just that!).
>
> After all this, check whether you still have the ASSERT() in
> __replq_insert() triggering, and let me know.

So, after moving __replq_remove() to rt_context_saved(), the ASSERT() in __replq_insert() still fails when dom0 boots up; the full trace is below.
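For reference, here is a hypothetical sketch of the rt_vcpu_insert() change described above (helper names such as __replq_insert(), __vcpu_on_q() and rt_update_deadline() follow this series, but this is an illustration, not the actual patch):

    static void
    rt_vcpu_insert(const struct scheduler *ops, struct vcpu *vc)
    {
        struct rt_vcpu *svc = rt_vcpu(vc);
        s_time_t now = NOW();
        spinlock_t *lock;

        BUG_ON( is_idle_vcpu(vc) );

        lock = vcpu_schedule_lock_irq(vc);

        if ( now >= svc->cur_deadline )
            rt_update_deadline(now, svc);

        if ( !__vcpu_on_q(svc) && vcpu_runnable(vc) && !vc->is_running )
        {
            /*
             * Plan the replenishment before queueing the vcpu, so that
             * ASSERT( __vcpu_on_replq(svc) ) holds in __runq_insert()
             * for both the runq and the depletedq case.
             */
            __replq_insert(ops, svc);
            __runq_insert(ops, svc);
        }

        vcpu_schedule_unlock_irq(lock, vc);
    }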
(XEN) Assertion '!vcpu_on_replq(svc)' failed at sched_rt.c:524
(XEN) ----[ Xen-4.7-unstable  x86_64  debug=y  Tainted:    C ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82d080128c07>] sched_rt.c#__replq_insert+0x2b/0x64
(XEN) RFLAGS: 0000000000010002   CONTEXT: hypervisor (d0v3)
(XEN) rax: 0000000000000001   rbx: ffff83023b522940   rcx: 0000000000000001
(XEN) rdx: 00000031bb1b9980   rsi: ffff83023b486760   rdi: ffff83023b486760
(XEN) rbp: ffff8300bfcffd48   rsp: ffff8300bfcffd28   r8:  0000000000000004
(XEN) r9:  00000000deadbeef   r10: ffff82d08025f5a0   r11: 0000000000000206
(XEN) r12: ffff83023b486760   r13: ffff83023b522d80   r14: ffff83023b4b5000
(XEN) r15: 000000023a6e774b   cr0: 0000000080050033   cr4: 00000000000406a0
(XEN) cr3: 0000000231c0d000   cr2: ffff8802200d81f8
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) Xen stack trace from rsp=ffff8300bfcffd28:
(XEN)    ffff82d08019642e ffff83023b486760 ffff8300bfd47000 ffff82d080299b00
(XEN)    ffff8300bfcffd88 ffff82d08012a072 ffff8300bfcffd70 ffff8300bfd47000
(XEN)    000000023a6e75c0 ffff83023b522940 ffff82d08032bc00 0000000000000282
(XEN)    ffff8300bfcffdd8 ffff82d08012be2c ffff83023b4b5000 ffff83023b4f1000
(XEN)    ffff8300bfd44000 ffff8300bfd47000 0000000000000001 ffff83023b4b40c0
(XEN)    ffff880230c14440 0000000000000000 ffff8300bfcffde8 ffff82d08012c347
(XEN)    ffff8300bfcffe08 ffff82d080169cea ffff83023b4b5000 0000000000000003
(XEN)    ffff8300bfcffe18 ffff82d080169d65 ffff8300bfcffe38 ffff82d08010762a
(XEN)    ffff83023b4b40c0 ffff83023b4b5000 ffff8300bfcffe68 ffff82d08010822a
(XEN)    ffff8300bfcffe68 fffffffffffffff2 ffff88022061dcb4 0000000000000000
(XEN)    ffff8300bfcffef8 ffff82d0801096fc 0000000000000001 ffff8300bfcfff18
(XEN)    ffff8300bfcffef8 ffff82d080240e85 ffff8300bfcf8000 0000000000000000
(XEN)    0000000000000246 ffffffff810013aa 0000000000000003 ffffffff810013aa
(XEN)    ffff8300bfcffee8 ffff8300bfd44000 ffff8802205e8000 0000000000000000
(XEN)    ffff880230c14440 0000000000000000 00007cff403000c7 ffff82d0802439e2
(XEN)    ffffffff8100140a 0000000000000020 0000000000000003 ffff880230c71900
(XEN)    ffff8802206584d0 ffff880220658000 ffff88022061dcb8 0000000000000000
(XEN)    0000000000000206 0000000000000000 ffff880223000168 ffff880223408e00
(XEN)    0000000000000020 ffffffff8100140a 0000000000000000 ffff88022061dcb4
(XEN)    0000000000000004 0001010000000000 ffffffff8100140a 000000000000e033
(XEN) Xen call trace:
(XEN)    [<ffff82d080128c07>] sched_rt.c#__replq_insert+0x2b/0x64
(XEN)    [<ffff82d08012a072>] sched_rt.c#rt_vcpu_wake+0xf2/0x12c
(XEN)    [<ffff82d08012be2c>] vcpu_wake+0x213/0x3d4
(XEN)    [<ffff82d08012c347>] vcpu_unblock+0x4b/0x4d
(XEN)    [<ffff82d080169cea>] vcpu_kick+0x20/0x6f
(XEN)    [<ffff82d080169d65>] vcpu_mark_events_pending+0x2c/0x2f
(XEN)    [<ffff82d08010762a>] event_2l.c#evtchn_2l_set_pending+0xa9/0xb9
(XEN)    [<ffff82d08010822a>] evtchn_send+0x158/0x183
(XEN)    [<ffff82d0801096fc>] do_event_channel_op+0xe21/0x147d
(XEN)    [<ffff82d0802439e2>] lstar_enter+0xe2/0x13c
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion '!vcpu_on_replq(svc)' failed at sched_rt.c:524
(XEN) ****************************************
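For illustration, the wake path implied by the call trace could look like the sketch below (helper names follow this series; the guard around __replq_insert() is just one possible way to avoid the double insertion, not necessarily the right fix):

    static void
    rt_vcpu_wake(const struct scheduler *ops, struct vcpu *vc)
    {
        struct rt_vcpu *svc = rt_vcpu(vc);
        s_time_t now = NOW();

        BUG_ON( is_idle_vcpu(vc) );

        /* Already running or already queued: nothing to do. */
        if ( unlikely(curr_on_cpu(vc->processor) == vc) ||
             unlikely(__vcpu_on_q(svc)) )
            return;

        /* A context switch is still in flight: let rt_context_saved()
         * take care of the queueing via __RTDS_delayed_runq_add. */
        if ( unlikely(test_bit(__RTDS_scheduled, &svc->flags)) )
        {
            set_bit(__RTDS_delayed_runq_add, &svc->flags);
            return;
        }

        if ( now >= svc->cur_deadline )
            rt_update_deadline(now, svc);

        /*
         * Queue a replenishment event only if one is not already
         * pending: an unconditional __replq_insert() here is exactly
         * what trips the ASSERT() in the trace above, since the vcpu
         * may still have an event queued from an earlier insertion.
         */
        if ( !__vcpu_on_replq(svc) )
            __replq_insert(ops, svc);

        __runq_insert(ops, svc);
        runq_tickle(ops, svc);
    }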
Thanks,
Tianyang

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel