S3 resume issue in cpufreq -> get_cpu_idle_time->vcpu_runstate_get
On Tue, Aug 17, 2021 at 04:04:24PM +0200, Jan Beulich wrote:
> On 17.08.2021 15:48, Marek Marczykowski-Górecki wrote:
> > On Tue, Aug 17, 2021 at 02:29:20PM +0100, Andrew Cooper wrote:
> >> On 17/08/2021 14:21, Jan Beulich wrote:
> >>> On 17.08.2021 15:06, Andrew Cooper wrote:
> >>>> Perhaps we want the cpu_down() logic to explicitly invalidate their
> >>>> lazily cached values?
> >>> I'd rather do this on the cpu_up() path (no point clobbering what may
> >>> get further clobbered, and then perhaps not to a value of our liking),
> >>> yet then we can really avoid doing this from a notifier and instead do
> >>> it early enough in xstate_init() (taking care of XSS at the same time).
> >
> > Funny you mention notifiers. Apparently cpufreq driver does use it to
> > initialize things. And fails to do so:
> >
> > (XEN) Finishing wakeup from ACPI S3 state.
> > (XEN) CPU0: xstate: size: 0x440 (uncompressed 0x440) and states: 0x1f
> > (XEN) Enabling non-boot CPUs ...
> > (XEN) CPU1: xstate: size: 0x440 (uncompressed 0x440) and states: 0x1f
> > (XEN) ----[ Xen-4.16-unstable x86_64 debug=y Not tainted ]----
> > (XEN) CPU: 0
> > (XEN) RIP: e008:[<ffff82d04024ad2b>] vcpu_runstate_get+0x153/0x244
> > (XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor
> > (XEN) rax: 0000000000000000 rbx: ffff830049667c50 rcx: 0000000000000001
> > (XEN) rdx: 000000321d74d000 rsi: ffff830049667c50 rdi: ffff83025dcc0000
> > (XEN) rbp: ffff830049667c40 rsp: ffff830049667c10 r8: ffff83020511a820
> > (XEN) r9: ffff82d04057ef78 r10: 0180000000000000 r11: 8000000000000000
> > (XEN) r12: ffff83025dcc0000 r13: ffff830205118c60 r14: 0000000000000001
> > (XEN) r15: 0000000000000010 cr0: 000000008005003b cr4: 00000000003526e0
> > (XEN) cr3: 0000000049656000 cr2: 0000000000000028
> > (XEN) fsb: 0000000000000000 gsb: 0000000000000000 gss: 0000000000000000
> > (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
> > (XEN) Xen code around <ffff82d04024ad2b> (vcpu_runstate_get+0x153/0x244):
> > (XEN) 48 8b 14 ca 48 8b 04 02 <4c> 8b 70 28 e9 01 ff ff ff 4c 8d 3d dd 64 32 00
> > (XEN) Xen stack trace from rsp=ffff830049667c10:
> > (XEN) 0000000000000180 ffff83025dcbd410 ffff83020511bf30 ffff830205118c60
> > (XEN) 0000000000000001 0000000000000010 ffff830049667c80 ffff82d04024ae73
> > (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > (XEN) 0000000000000000 0000000000000000 ffff830049667cb8 ffff82d0402560a9
> > (XEN) ffff830205118320 0000000000000001 ffff83020511bf30 ffff83025dc7a6f0
> > (XEN) 0000000000000000 ffff830049667d58 ffff82d040254cb1 00000001402e9f74
> > (XEN) 0000000000000000 ffff830049667d10 ffff82d040224eda 000000000025dc81
> > (XEN) 000000321d74d000 ffff82d040571278 0000000000000001 ffff830049667d28
> > (XEN) ffff82d040228b44 ffff82d0403102cf 0000000000000000 ffff82d0402283a4
> > (XEN) ffff82d040459688 ffff82d040459680 ffff82d040459240 0000000000000004
> > (XEN) 0000000000000000 ffff830049667d68 ffff82d04025510e ffff830049667db0
> > (XEN) ffff82d040221ba4 0000000000000000 0000000000000001 0000000000000001
> > (XEN) 0000000000000000 ffff830049667e00 0000000000000001 ffff82d04058a5c0
> > (XEN) ffff830049667dc8 ffff82d040203867 0000000000000001 ffff830049667df0
> > (XEN) ffff82d040203c51 ffff82d040459400 0000000000000001 0000000000000010
> > (XEN) ffff830049667e20 ffff82d040203e26 ffff830049667ef8 0000000000000000
> > (XEN) 0000000000000003 0000000000000200 ffff830049667e50 ffff82d040270bac
> > (XEN) ffff83020116a640 ffff830258ff6000 0000000000000000 0000000000000000
> > (XEN) ffff830049667e70 ffff82d0402056aa ffff830258ff61b8 ffff82d0405701b0
> > (XEN) ffff830049667e88 ffff82d04022963c ffff82d0405701a0 ffff830049667eb8
> > (XEN) Xen call trace:
> > (XEN) [<ffff82d04024ad2b>] R vcpu_runstate_get+0x153/0x244

This is xen/common/sched/core.c:322. get_sched_res(v->processor) is
NULL at this point for CPU1.

The only place that calls set_sched_res() and doesn't expect the
previous value to be valid is cpu_schedule_up(). For non-BSP CPUs it is
called _only_ from the notifier at CPU_UP_PREPARE
(cpu_schedule_callback()), but that notifier explicitly excludes the
suspend/resume case:

static int cpu_schedule_callback(
    struct notifier_block *nfb, unsigned long action, void *hcpu)
{
    unsigned int cpu = (unsigned long)hcpu;
    int rc = 0;

    /*
     * All scheduler related suspend/resume handling needed is done in
     * cpupool.c.
     */
    if ( system_state > SYS_STATE_active )
        return NOTIFY_DONE;

But nothing in cpupool.c calls set_sched_res().

On the other hand, sched_rm_cpu() (which I believe is called as part of
parking the CPU) calls cpu_schedule_down(), which then calls
set_sched_res(cpu, NULL).

In short: the scheduler state for parked CPUs is not re-initialized
during resume, but cpufreq expects it to be...
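
The sequence described above can be sketched as a small stand-alone C
model. This is only an illustration, not Xen code: the function names
mirror the ones mentioned in this mail, but the data structures, the
reduced system_state enum, and the runstate_readable() and simulate_*()
helpers are simplified assumptions. It also assumes, as suspected above,
that parking a CPU ends up in cpu_schedule_down().

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 2

/* Reduced stand-in for Xen's system_state; only the ordering matters here. */
enum system_state { SYS_STATE_active, SYS_STATE_suspend, SYS_STATE_resume };

/* Hypothetical, heavily simplified scheduling resource. */
struct sched_resource { unsigned int master_cpu; };

static struct sched_resource resources[NR_CPUS];
static struct sched_resource *sched_res[NR_CPUS]; /* models the per-CPU pointer */
static enum system_state system_state = SYS_STATE_active;

static void set_sched_res(unsigned int cpu, struct sched_resource *res)
{
    sched_res[cpu] = res;
}

static struct sched_resource *get_sched_res(unsigned int cpu)
{
    return sched_res[cpu];
}

/* Models cpu_schedule_down(): the pointer is cleared when the CPU goes away. */
static void cpu_schedule_down(unsigned int cpu)
{
    set_sched_res(cpu, NULL);
}

/* Models the CPU_UP_PREPARE path of cpu_schedule_callback(). */
static void cpu_schedule_up_prepare(unsigned int cpu)
{
    /* Same check as in the excerpt: nothing is done during suspend/resume. */
    if ( system_state > SYS_STATE_active )
        return;

    resources[cpu].master_cpu = cpu;
    set_sched_res(cpu, &resources[cpu]);
}

/* Hypothetical consumer check: reports whether a dereference would be safe. */
static bool runstate_readable(unsigned int cpu)
{
    return get_sched_res(cpu) != NULL;
}

/* Normal boot: the notifier runs while the system is active. */
static bool simulate_boot(unsigned int cpu)
{
    system_state = SYS_STATE_active;
    cpu_schedule_up_prepare(cpu);
    return runstate_readable(cpu);
}

/* Boot, park, then bring the CPU up again during resume. */
static bool simulate_resume_of_parked_cpu(unsigned int cpu)
{
    simulate_boot(cpu);
    cpu_schedule_down(cpu);          /* CPU parked: pointer becomes NULL */
    system_state = SYS_STATE_resume;
    cpu_schedule_up_prepare(cpu);    /* skipped because of the state check */
    return runstate_readable(cpu);
}
```

In this model simulate_boot(1) leaves a valid pointer behind, while
simulate_resume_of_parked_cpu(1) leaves it NULL, which is the state
vcpu_runstate_get() appears to trip over in the trace.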
> > (XEN) [<ffff82d04024ae73>] F get_cpu_idle_time+0x57/0x59
> > (XEN) [<ffff82d0402560a9>] F cpufreq_statistic_init+0x191/0x210
> > (XEN) [<ffff82d040254cb1>] F cpufreq_add_cpu+0x3cc/0x5bb
> > (XEN) [<ffff82d04025510e>] F cpufreq.c#cpu_callback+0x27/0x32
> > (XEN) [<ffff82d040221ba4>] F notifier_call_chain+0x6c/0x96
> > (XEN) [<ffff82d040203867>] F cpu.c#cpu_notifier_call_chain+0x1b/0x36
> > (XEN) [<ffff82d040203c51>] F cpu_up+0xaf/0xc8
> > (XEN) [<ffff82d040203e26>] F enable_nonboot_cpus+0x6b/0x1f8
> > (XEN) [<ffff82d040270bac>] F power.c#enter_state_helper+0x152/0x60a
> > (XEN) [<ffff82d0402056aa>] F domain.c#continue_hypercall_tasklet_handler+0x4c/0xb9
> > (XEN) [<ffff82d04022963c>] F tasklet.c#do_tasklet_work+0x76/0xac
> > (XEN) [<ffff82d040229920>] F do_tasklet+0x58/0x8a
> > (XEN) [<ffff82d0402e6607>] F domain.c#idle_loop+0x74/0xdd
> > (XEN)
> > (XEN) Pagetable walk from 0000000000000028:
> > (XEN) L4[0x000] = 000000025dce1063 ffffffffffffffff
> > (XEN) L3[0x000] = 000000025dce0063 ffffffffffffffff
> > (XEN) L2[0x000] = 000000025dcdf063 ffffffffffffffff
> > (XEN) L1[0x000] = 0000000000000000 ffffffffffffffff
> > (XEN)
> > (XEN) ****************************************
> > (XEN) Panic on CPU 0:
> > (XEN) FATAL PAGE FAULT
> > (XEN) [error_code=0000]
> > (XEN) Faulting linear address: 0000000000000028
> > (XEN) ****************************************
> >
> > This is after adding brutal `this_cpu(xcr0) = 0` in xstate_init().
>
> And presumably again only with "smt=0"?
Yes. With smt=1 suspend works fine on this particular machine. At least
with only dom0 running (haven't tried with any domU yet)...
> In any event, for us to not mix
> things, may I ask that you start a new thread for this further issue?
Sure.
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab