Xen project Mailing List

Re: [PATCH-for-4.17] xen/sched: fix restore_vcpu_affinity() by removing it

From: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>

Date: Fri, 21 Oct 2022 12:47:35 +0200

Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, Jan Beulich <jbeulich@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Dario Faggioli <dfaggioli@xxxxxxxx>, Henry Wang <Henry.Wang@xxxxxxx>

Delivery-date: Fri, 21 Oct 2022 10:47:52 +0000

Feedback-id: i1568416f:Fastmail

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Fri, Oct 21, 2022 at 08:58:06AM +0200, Juergen Gross wrote: > When the system is coming up after having been suspended, > restore_vcpu_affinity() is called for each domain in order to adjust > the vcpu's affinity settings in case a cpu didn't come to live again. > > The way restore_vcpu_affinity() is doing that is wrong, because the > specific scheduler isn't being informed about a possible migration of > the vcpu to another cpu. Additionally the migration is often even > happening if all cpus are running again, as it is done without check > whether it is really needed. > > As cpupool management is already calling cpu_disable_scheduler() for > cpus not having come up again, and cpu_disable_scheduler() is taking > care of eventually needed vcpu migration in the proper way, there is > simply no need for restore_vcpu_affinity(). > > So just remove restore_vcpu_affinity() completely. > > Fixes: 8a5d50dd0b04 ("xen: sched: simplify ACPI S3 resume path.") > Reported-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx> > Signed-off-by: Juergen Gross <jgross@xxxxxxxx> I can only test it on a different configuration right now, but I can confirm with this patch applied the system survives S3 (and it did crashed without it at least once on this config). With the now-unused function (also noticed by Jan) dealt with, you can add my: Tested-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx> > --- > xen/arch/x86/acpi/power.c | 3 -- > xen/common/sched/core.c | 70 --------------------------------------- > xen/include/xen/sched.h | 1 - > 3 files changed, 74 deletions(-) > > diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c > index 1bb4d78392..b76f673acb 100644 > --- a/xen/arch/x86/acpi/power.c > +++ b/xen/arch/x86/acpi/power.c > @@ -159,10 +159,7 @@ static void thaw_domains(void) > > rcu_read_lock(&domlist_read_lock); > for_each_domain ( d ) > - { > - restore_vcpu_affinity(d); > domain_unpause(d); > - } > rcu_read_unlock(&domlist_read_lock); > } > > diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c > index 83455fbde1..358fa077e3 100644 > --- a/xen/common/sched/core.c > +++ b/xen/common/sched/core.c > @@ -1196,76 +1196,6 @@ static void sched_reset_affinity_broken(const struct > sched_unit *unit) > v->affinity_broken = false; > } > > -void restore_vcpu_affinity(struct domain *d) > -{ > - unsigned int cpu = smp_processor_id(); > - struct sched_unit *unit; > - > - ASSERT(system_state == SYS_STATE_resume); > - > - rcu_read_lock(&sched_res_rculock); > - > - for_each_sched_unit ( d, unit ) > - { > - spinlock_t *lock; > - unsigned int old_cpu = sched_unit_master(unit); > - struct sched_resource *res; > - > - ASSERT(!unit_runnable(unit)); > - > - /* > - * Re-assign the initial processor as after resume we have no > - * guarantee the old processor has come back to life again. > - * > - * Therefore, here, before actually unpausing the domains, we should > - * set v->processor of each of their vCPUs to something that will > - * make sense for the scheduler of the cpupool in which they are in. > - */ > - lock = unit_schedule_lock_irq(unit); > - > - cpumask_and(cpumask_scratch_cpu(cpu), unit->cpu_hard_affinity, > - cpupool_domain_master_cpumask(d)); > - if ( cpumask_empty(cpumask_scratch_cpu(cpu)) ) > - { > - if ( sched_check_affinity_broken(unit) ) > - { > - sched_set_affinity(unit, unit->cpu_hard_affinity_saved, > NULL); > - sched_reset_affinity_broken(unit); > - cpumask_and(cpumask_scratch_cpu(cpu), > unit->cpu_hard_affinity, > - cpupool_domain_master_cpumask(d)); > - } > - > - if ( cpumask_empty(cpumask_scratch_cpu(cpu)) ) > - { > - /* Affinity settings of one vcpu are for the complete unit. > */ > - printk(XENLOG_DEBUG "Breaking affinity for %pv\n", > - unit->vcpu_list); > - sched_set_affinity(unit, &cpumask_all, NULL); > - cpumask_and(cpumask_scratch_cpu(cpu), > unit->cpu_hard_affinity, > - cpupool_domain_master_cpumask(d)); > - } > - } > - > - res = get_sched_res(cpumask_any(cpumask_scratch_cpu(cpu))); > - sched_set_res(unit, res); > - > - spin_unlock_irq(lock); > - > - /* v->processor might have changed, so reacquire the lock. */ > - lock = unit_schedule_lock_irq(unit); > - res = sched_pick_resource(unit_scheduler(unit), unit); > - sched_set_res(unit, res); > - spin_unlock_irq(lock); > - > - if ( old_cpu != sched_unit_master(unit) ) > - sched_move_irqs(unit); > - } > - > - rcu_read_unlock(&sched_res_rculock); > - > - domain_update_node_affinity(d); > -} > - > /* > * This function is used by cpu_hotplug code via cpu notifier chain > * and from cpupools to switch schedulers on a cpu. > diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h > index 557b3229f6..072e4846aa 100644 > --- a/xen/include/xen/sched.h > +++ b/xen/include/xen/sched.h > @@ -1019,7 +1019,6 @@ void vcpu_set_periodic_timer(struct vcpu *v, s_time_t > value); > void sched_setup_dom0_vcpus(struct domain *d); > int vcpu_temporary_affinity(struct vcpu *v, unsigned int cpu, uint8_t > reason); > int vcpu_set_hard_affinity(struct vcpu *v, const cpumask_t *affinity); > -void restore_vcpu_affinity(struct domain *d); > int vcpu_affinity_domctl(struct domain *d, uint32_t cmd, > struct xen_domctl_vcpuaffinity *vcpuaff); > > -- > 2.35.3 > -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab

Attachment: signature.asc
Description: PGP signature

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.