Re: [Xen-devel] [PATCH v4 34/46] xen/sched: add fall back to idle vcpu when scheduling unit
On Fri, 2019-09-27 at 09:00 +0200, Juergen Gross wrote:
> When scheduling a unit with multiple vcpus there is no guarantee that
> all vcpus are available (e.g. above maxvcpus or vcpu offline). Fall
> back to the idle vcpu of the current cpu in that case. This requires
> storing the correct schedule_unit pointer in the idle vcpu as long as
> it is used as fallback vcpu.
>
> In order to modify the runstates of the correct vcpus when switching
> schedule units, merge sched_unit_runstate_change() into
> sched_switch_units() and loop over the affected physical cpus instead
> of the unit's vcpus. This in turn requires an access function to the
> current variable of other cpus.
>
> Today context_saved() is called in case previous and next vcpus
> differ when doing a context switch. With an idle vcpu being capable
> of substituting for an offline vcpu this is problematic when
> switching to an idle scheduling unit. An idle previous vcpu leaves us
> in doubt which schedule unit was active previously, so save the
> previous unit pointer in the per-schedule resource area. If it is
> NULL the unit has not changed and we don't have to set the previous
> unit to be not running.
>
> When running an idle vcpu in a non-idle scheduling unit, use a
> specific guest idle loop not performing any non-softirq tasklets and
> livepatching, in order to avoid populating the cpu caches with memory
> used by other domains (as far as possible). Softirqs are considered
> to be safe.
>
> In order to avoid livepatching when going to guest idle, another
> variant of reset_stack_and_jump() not calling
> check_for_livepatch_work is needed.
>
> Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
> Acked-by: Julien Grall <julien.grall@xxxxxxx>
> Reviewed-by: Dario Faggioli <dfaggioli@xxxxxxxx>

With one request.
> @@ -279,21 +293,11 @@ static inline void vcpu_runstate_change(
>      v->runstate.state = new_state;
>  }
>
> -static inline void sched_unit_runstate_change(struct sched_unit *unit,
> -    bool running, s_time_t new_entry_time)
> +void sched_guest_idle(void (*idle) (void), unsigned int cpu)
>  {
> -    struct vcpu *v;
> -
> -    for_each_sched_unit_vcpu ( unit, v )
> -    {
> -        if ( running )
> -            vcpu_runstate_change(v, v->new_state, new_entry_time);
> -        else
> -            vcpu_runstate_change(v,
> -                ((v->pause_flags & VPF_blocked) ? RUNSTATE_blocked :
> -                (vcpu_runnable(v) ? RUNSTATE_runnable : RUNSTATE_offline)),
> -                new_entry_time);
> -    }
> +    atomic_inc(&per_cpu(sched_urgent_count, cpu));
> +    idle();
> +    atomic_dec(&per_cpu(sched_urgent_count, cpu));
>  }
>

So, I recall that during review of v1, there was a discussion about why
we were marking this as urgent. You said it was to avoid latencies,
which makes sense. Jan said this is a policy that should be set
elsewhere, which also makes sense.

Ideally, we'd check with specific tests and benchmarks whether playing
with the urgent counter in here is really necessary/good. That was, in
fact, my plan, but I did not get round to actually doing it.

So, can we have a comment stating the reason why we're doing this? I'd
like for that information not to be lost in a random email thread on
xen-devel. And that would also remind us (me? :-P) to actually go and
check things with actual benchmarks, at some point.

I'd be fine if such a comment came in a follow-up patch (since it will
only add comments, I expect it can go in during the freeze).

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel