
Re: [Xen-devel] [PATCH v2] Modified RTDS scheduler to use an event-driven model instead of polling.



>> Can you summarize the design discussion with Dario on the first
>> version of the patch? The above information is ok, but it does not say
>> much about the design. As you can see in our discussion with Dario,
>> there exist several design choices to achieve this. Explicitly stating
>> it here can help people understand what this patch is doing.
>>
> Yes, that would be helpful. It's not required to summarize all the
> various ML threads in the actual changelog(s), but, really, a summary and
> maybe a few links and pointers in the cover letter, would be useful, to
> give interested people that may join late the chance to get some
> context.
>
> Note that, even for single patch and not only for series, it is ok to
> include a cover letter (i.e., a [PATCH 0/x]), e.g., right for this
> purpose.
>
> So, as I was saying, I like the architecture now. There are issues, or
> at least I have some questions about specific decisions or code chunks,
> but I think this is on the right track now.

I'll strengthen the summary/cover letter on future patches.

>> > +    struct timer replenishment_timer;
>> > +    unsigned NUM_CPUS;
>>
>> First, it does not follow the naming convention for variables. UPPER CASE
>> should be reserved for macro definitions and constants, which is not the
>> case here. You can use a more proper name, such as ncpus, and add a
>> comment explaining what it is used for.
>>
> Agreed.

Will change.
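
Concretely, I expect it to become something like this (just a sketch,
final comment wording to be decided):

    /* in struct rt_private */
    unsigned int ncpus;    /* number of pCPUs this scheduler instance
                            * currently manages */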

>> >      cpumask_t tickled;          /* cpus been tickled */
>> >  };
>> >
>> > @@ -178,6 +181,8 @@ struct rt_vcpu {
>> >      unsigned flags;             /* mark __RTDS_scheduled, etc.. */
>> >  };
>> >
>> > +static void runq_tickle(const struct scheduler *, struct rt_vcpu *);
>>
>> Not sure if declaring the function is the best approach or having
>> another patch to move the function forward. But anyway, we don't have
>> to worry about this right now, considering there are more
>> important issues to tackle in this patch.
>>
> Sure.
>
> Personally, I would prefer moving the function (in a separate
> patch). Forward declarations make it (slightly, yes, but still...) more
> difficult to jump (e.g., in editors) between call sites and
> where the actual code is.

I'd like to move it, but as mentioned that can be a separate patch.

>> > @@ -411,15 +422,77 @@ __runq_insert(const struct scheduler *ops, struct rt_vcpu *svc)
>> >              struct rt_vcpu * iter_svc = __q_elem(iter);
>> >              if ( svc->cur_deadline <= iter_svc->cur_deadline )
>> >                      break;
>> > +            else
>> > +                ++inserted_index;
>> >           }
>> >          list_add_tail(&svc->q_elem, iter);
>> > +        return inserted_index;
>> >      }
>> >      else
>> >      {
>> > -        list_add(&svc->q_elem, &prv->depletedq);
>> > +        // If we are inserting into a previously empty depletedq,
>> > +        //   rearm the replenishment timer. It will handle further
>> > +        //   scheduling until the depletedq is empty again (if ever)
>> > +        if( list_empty(depletedq) )
>> > +            set_timer( &prv->replenishment_timer, svc->cur_deadline );
>>
>> Hmm, can you handle the set_timer outside of this function? It should
>> be possible, and it would keep this runq_insert() function serving
>> only one purpose.
>>
> TBH, I think I like it being here. I mean, __runq_insert is called in
> many places (at least right now), and I wouldn't like to have the check
> and the arming repeated everywhere.
>
> Then, that being said, I'm still hoping that some of these calls could
> go away, and maybe there are call sites where we know that the timer
> won't be armed in any case... So, we'll have to see about this.
>
> So, just take this comment as "I don't dislike the timer machinery
> being in here". :-)

I thought it was an acceptable way of accomplishing this. We can focus on
trimming more later if this doesn't seem ideal.
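
For reference, the relevant bit boils down to the following (a sketch
using the names from the hunk above; keeping it inside __runq_insert()
means the empty-check and the timer arming are written once instead of
at every call site):

    else
    {
        /* The depletedq was empty, so (re)arm the one-shot replenishment
         * timer for the earliest replenishment instant, i.e. the
         * cur_deadline of the vcpu being queued. The timer then handles
         * further replenishments until the depletedq drains again. */
        if ( list_empty(depletedq) )
            set_timer(&prv->replenishment_timer, svc->cur_deadline);

        /* ...and queue the vcpu on the depletedq as before */
        list_add(&svc->q_elem, &prv->depletedq);
    }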


>> > +    // If we become one of top [# CPUs] in the runq, tickle it
>> > +    // TODO: make this work when multiple tickles are required
>>
>> Why do you need multiple tickles?
>>
> Because the loop above may have done more than one replenishment,
> which makes sense.
>
> While we are here. I've said that I don't like the fact that we have one
> big spinlock for the whole scheduler. However, we do have it right now,
> so let's make good use of it.
>
> So (but please double check what I'm about to say, and provide your
> view), it should be safe to access the various svc->vc->processor from
> inside here, since we're holding the big lock. If that's the case, then
> it should be fairly easy to figure out which are the pCPUs that need to
> reschedule (playing with the index, etc), and then use their
> vc->processor to decide what pCPU to actually tickle, isn't this the
> case?
>
> (let me know if I've explained myself...)
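
If I follow correctly, that would look roughly like the following
(sketch only; it assumes the big scheduler lock is held and uses the
ncpus counter proposed earlier):

    struct list_head *iter;
    unsigned int i = 0;

    /* The replenishment loop may have promoted more than one vcpu into
     * the top [# pCPUs] positions of the runq, so more than one pCPU
     * may need to reschedule. Walk those top entries and tickle once
     * per vcpu; runq_tickle() then decides which pCPU to actually
     * interrupt, and holding the big lock keeps the ->processor reads
     * involved in that decision safe, as discussed above. */
    list_for_each ( iter, &prv->runq )
    {
        struct rt_vcpu *svc = __q_elem(iter);

        if ( i++ >= prv->ncpus )
            break;
        runq_tickle(ops, svc);
    }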

>> > +    // Replenishment timer, for now always on CPU 0
>> > +    init_timer(&prv->replenishment_timer, replenishment_handler, ops, 0);
>>
> Oh, yes, I wanted to discuss this! It is true that in Xen, timers
> are bound to cpus (as is the case in, e.g., Linux, although both the
> interface and the implementation are different).
>
> So, for now, you're doing the right thing in just putting the
> replenishment timer on one pCPU. We'll see how to improve this later on.

I expected this would be a possible later improvement.

>> > +    // Calculate number of pCPUs present.
>> > +    // Note: assumes the set of pCPUs does not change while online.
>>
>> Dario, is this a valid assumption? I'm not sure if Xen supports cpu
>> hot plug and if cpu-hotplug (in case xen supports it) will invalidate
>> this assumption.
>>
> It's not, but...
>
>> > +    for_each_present_cpu(cpu)
>> > +    {
>> > +        ++(prv->NUM_CPUS);
>> > +    }
>> > +
>>
>> This is incorrect. The for_each_present_cpu() lists all cpus in the
>> system; it enumerates the cpu_present_map.
>>
>> What you want here is the number of cpus for the scheduler in the
>> current cpupool. You can increment this number when rt_alloc_pdata()
>> is called.
>>
> ... yes, exactly, that's something rather easy to fix. Just use the pCPU
> allocation and initialization routines (and the matching de-allocation
> ones) to keep the count updated.

Good idea. I knew it wasn't a purely valid assumption but wanted to
get a working patch out.
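
Roughly, I'd move the accounting into the pCPU init/deinit hooks
(sketch; it assumes the field is renamed to ncpus as suggested, and
that rt_free_pdata() is the matching de-allocation hook):

    /* in rt_alloc_pdata(ops, cpu): a pCPU is being handed to this
     * scheduler instance, i.e. added to its cpupool */
    prv->ncpus++;

    /* in rt_free_pdata(ops, pcpu, cpu): the pCPU is being taken away
     * again (moved to another cpupool or offlined) */
    ASSERT(prv->ncpus > 0);
    prv->ncpus--;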

>> > -    ret.time = MIN(snext->budget, MAX_SCHEDULE); /* sched quantum */
>> > +    /* Use minimum from runq */
>> > +    ret.time = MAX_SCHEDULE;
>>
>> Why is it not snext->budget? The budget enforcement only needs to be
>> invoked when the VCPU scheduled to run runs out of budget.
>>
> Yep, I was about to make the very same comment. :-)

I believe I put that there with the intention of having some scheduler
invocations even with an empty queue, but this may be a bug. I don't
think we should need a minimum invocation period, so I may remove this
and the associated constant.
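
If that is the way to go, the end result would simply be (sketch):

    /* invoke the scheduler again when the vcpu we are about to run
     * exhausts its budget; replenishments are handled by the timer */
    ret.time = snext->budget;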

Regards
~Dagaen Golomb
