
Re: [Xen-devel] resume from S3 sleep not working in Dom0 - Xen4.2.1




fix-suspend-scheduler-v2
fix-suspend-scheduler-revert-affinity-part
s3-timerirq

All of these fixes have been proposed to the xen-devel list, but have
not yet been accepted, for one reason or another.
And I don't think comments on them have seen follow-ups.

Jan

I guess it's worth bringing this up again:

s3-timerirq: this was an empirical hack which for some reason is needed on the stable 4.2 we use, but not on the latest unstable. I didn't really investigate further, since it appeared to be fixed later on anyway.

fix-suspend-scheduler/revert-affinity: the big objection here was the part which reverts one of the hunks in Keir's commit. I tried for quite a few days to find a working fix which does not do this revert, using the posted suggestions, but was not successful:

- there was a crash in the Xen scheduler, which was fixable using your suggestion of masking softirqs during S3 (ugly)
- there was also a crash in the Xen ACPI cpufreq driver, which was similarly fixable using a band-aid S3 condition (ugly)
- unfortunately this turned out not to be all: Xen did not crash anymore at this point, but the dom0 kernel did, around the time it enables CPUs, in multiple places. At this point I didn't have a good explanation for it, and my opinion of the aggravating hunk was rather low, so I uttered a hearty curse and stuck a revert into my private patch queue.

The dom0 kernel crashes were as follows:

1)

[   60.657751] Enabling non-boot CPUs ...
[   60.657958] installing Xen timer for CPU 1
[   60.657987] cpu 1 spinlock event irq 279
[   60.658101] Disabled fast string operations
[   60.658466] CPU1 is up
[   60.658736] installing Xen timer for CPU 2
[   60.658784] cpu 2 spinlock event irq 285
[   60.659764] Disabled fast string operations
[   60.661811] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[   60.661817] IP: [<ffffffff8105f700>] build_sched_domains+0x770/0
(XEN) *** Serial input -> Xen (type 'CTRL-a' three times to switch input to DOM0)




2)
.332997] installing Xen timer for CPU 2emory
[   36.333061] cpu 2 spinlock event irq 285
[   36.333343] Disabled fast string operations
[   36.334939] CPU2 is up
[   36.335213] installing Xen timer for CPU 3
[   36.335244] cpu 3 spinlock event irq 291
[   36.335561] Disabled fast string operations
[   36.337461] CPU3 is up
[   36.339513] ACPI: Waking up from system sleep state S3
[ 36.350193] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
[   36.350211] IP: [<ffffffff81055f9a>] find_busiest_group+0x38a/0xbb0
[   36.350236] PGD 2f19067 PUD 2ec7067 PMD 0
[   36.350252] Oops: 0000 [#1] SMP
[   36.350263] CPU 1
[ 36.350267] Modules linked in: xt_mac ipt_MASQUERADE ebtable_filter ebtables iscsi_scst(O) xt_tcpudp scst_vdisk(O) xt_state crc32c xt_multiport libcrc32c iptable_filter iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack scst_cdrom(O) nf_defrag_ipv4 ip_tables scst(O) x_tables bridge stp llc nls_cp437 isofs zram(C) snd_hda_codec_hdmi snd_hda_codec_conexant microcode arc4 psmouse serio_raw i915 drm_kms_helper drm iwlwifi(O) mac80211(O) cfg80211(O) thinkpad_acpi nvram snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd soundcore snd_page_alloc i2c_algo_bit intel_agp video intel_gtt tpm_tis tpm tpm_bios sdhci_pci sdhci ehci_hcd e1000e
[   36.350437]
[   36.350445] Pid: 2730, comm: bash Tainted: G C O 3.2.23-orc #19 LENOVO 42404EU/42404EU
[   36.350463] RIP: e030:[<ffffffff81055f9a>]  [<ffffffff81055f9a>] find_busiest_group+0x38a/0xbb0
[   36.350481] RSP: e02b:ffff880002b71228  EFLAGS: 00010046
[   36.350490] RAX: 0000000000000040 RBX: 0000000000000000 RCX: 0000000000000000
[   36.350500] RDX: 0000000000000000 RSI: 0000000000000040 RDI: 0000000000000000
[   36.350510] RBP: ffff880002b713b8 R08: ffff880026109f00 R09: 0000000000000000
[   36.350519] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[   36.350529] R13: ffff880026109f80 R14: ffffffffffffffff R15: ffff880026109f98
[   36.350547] FS:  00007fc41e295700(0000) GS:ffff88002dc40000(0000) knlGS:0000000000000000
[   36.350558] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[   36.350566] CR2: 0000000000000004 CR3: 0000000026329000 CR4: 0000000000002660
[   36.350577] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   36.350587] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   36.350598] Process bash (pid: 2730, threadinfo ffff880002b70000, task ffff880027a7db40)
[   36.350608] Stack:
[   36.350613]  00ffffff00000002 0000000300000001 ffff880002b71498 ffff880002b71534
[   36.350630]  00ffffff00000002 0000000100000001 ffff8800262cf000 0000000000000008
[   36.350646]  ffffffff00000000 0000000000000000 0000000000000000 ffff88002dc4e2c8
[   36.350662] Call Trace:
[   36.350677]  [<ffffffff8105b158>] load_balance+0xb8/0x840
[   36.350690]  [<ffffffff8101b909>] ? sched_clock+0x9/0x10
[   36.350706]  [<ffffffff8108ccad>] ? sched_clock_cpu+0xbd/0x110
[   36.350718]  [<ffffffff81052b1c>] ? update_shares+0xcc/0x100
[   36.350735]  [<ffffffff8157b9b5>] __schedule+0x875/0x8d0
[   36.350749]  [<ffffffff81073ae2>] ? try_to_del_timer_sync+0x92/0x130
[   36.350762]  [<ffffffff8157bd3f>] schedule+0x3f/0x60
[   36.350773]  [<ffffffff8157c24d>] schedule_timeout+0x16d/0x320
[   36.350786]  [<ffffffff810728e0>] ? usleep_range+0x50/0x50
[   36.350800]  [<ffffffff8157de2e>] ? _raw_spin_unlock_irqrestore+0x1e/0x30
[ 36.350817] [<ffffffff8130c340>] acpi_ec_transaction_unlocked+0x134/0x1d8
[   36.350830]  [<ffffffff81086b90>] ? add_wait_queue+0x60/0x60
[   36.350842]  [<ffffffff8130c6c6>] acpi_ec_transaction+0x196/0x239
[   36.350856]  [<ffffffff8157de2e>] ? _raw_spin_unlock_irqrestore+0x1e/0x30
[   36.350869]  [<ffffffff8130c8a0>] acpi_ec_write+0x40/0x42
[   36.350881]  [<ffffffff8130c9a8>] acpi_ec_space_handler+0x9e/0xfc
[   36.350894]  [<ffffffff8130c90a>] ? acpi_ec_burst_disable+0x3d/0x3d
[ 36.350909] [<ffffffff813159c6>] acpi_ev_address_space_dispatch+0x179/0x1c8
[   36.350924]  [<ffffffff8131aafe>] acpi_ex_access_region+0x23e/0x24b
[   36.350936]  [<ffffffff8106e82c>] ? __sysctl_head_next+0x11c/0x130
[   36.350951]  [<ffffffff8131ae15>] acpi_ex_field_datum_io+0xf9/0x17a
[ 36.350965] [<ffffffff8131b148>] acpi_ex_write_with_update_rule+0xb5/0xc1
[   36.350989]  [<ffffffff8131acfa>] acpi_ex_insert_into_field+0x1ef/0x211
[ 36.351003] [<ffffffff8132b5a7>] ? acpi_ut_allocate_object_desc_dbg+0x45/0x7f
[   36.351018]  [<ffffffff8131980e>] acpi_ex_write_data_to_field+0x194/0x1c2
[ 36.351031] [<ffffffff813131e4>] ? acpi_ds_init_object_from_op+0x137/0x231
[   36.351044]  [<ffffffff8131d94f>] acpi_ex_store_object_to_node+0xa3/0xe2
[   36.351056]  [<ffffffff8131da51>] acpi_ex_store+0xc3/0x256
[   36.351066]  [<ffffffff8131b62b>] acpi_ex_opcode_1A_1T_1R+0x353/0x4a5
[   36.351078]  [<ffffffff8131260c>] acpi_ds_exec_end_op+0xf7/0x3e7
[   36.351092]  [<ffffffff81325ae7>] acpi_ps_parse_loop+0x7bd/0x94e
[   36.351105]  [<ffffffff81324ed9>] acpi_ps_parse_aml+0x96/0x275
[   36.351119]  [<ffffffff81326394>] acpi_ps_execute_method+0x1ce/0x276
[   36.351131]  [<ffffffff8132165b>] acpi_ns_evaluate+0xdf/0x1aa
[   36.351144]  [<ffffffff81320c9d>] acpi_evaluate_object+0xfb/0x1f4
[   36.351156]  [<ffffffff8130f8ee>] acpi_device_sleep_wake+0x95/0xc7
[ 36.351168] [<ffffffff8130fa60>] acpi_disable_wakeup_device_power+0x6e/0xc9
[   36.351182]  [<ffffffff813085e2>] acpi_disable_wakeup_devices+0x7b/0x95
[   36.351194]  [<ffffffff81308710>] acpi_pm_finish+0x39/0x55
[   36.351208]  [<ffffffff810a6034>] suspend_devices_and_enter+0x104/0x310
[   36.351222]  [<ffffffff810a63a7>] enter_state+0x167/0x190
[   36.351234]  [<ffffffff810a4d27>] state_store+0xb7/0x130
[   36.351246]  [<ffffffff812b54df>] kobj_attr_store+0xf/0x30
[   36.351260]  [<ffffffff811d382f>] sysfs_write_file+0xef/0x170
[   36.351274]  [<ffffffff811668d3>] vfs_write+0xb3/0x180
[   36.351286]  [<ffffffff81166bfa>] sys_write+0x4a/0x90
[   36.351300]  [<ffffffff81585d02>] system_call_fastpath+0x16/0x1b
[ 36.351308] Code: ff 48 8b bd a0 fe ff ff 44 88 85 78 fe ff ff e8 5d fb ff ff 44 0f b6 85 78 fe ff ff 0f 1f 44 00 00 49 8b 7d 10 4c 8b 4d 98 31 d2 <8b> 4f 04 4c 89 c8 48 c1 e0 0a 48 f7 f1 48 8b 4d a0 48 85 c9 48
[   36.351435] RIP  [<ffffffff81055f9a>] find_busiest_group+0x38a/0xbb0
[   36.351450]  RSP <ffff880002b71228>
[   36.351456] CR2: 0000000000000004
[   36.351465] ---[ end trace 5ad2b14b3a9050ae ]---
[ 36.352362] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[   36.352379] IP: [<ffffffff812ba531>] rb_next+0x1/0x50
[   36.352394] PGD 0
[   36.352402] Oops: 0000 [#2] SMP
[   36.352411] CPU 1
[ 36.352416] Modules linked in: xt_mac ipt_MASQUERADE ebtable_filter ebtables iscsi_scst(O) xt_tcpudp scst_vdisk(O) xt_state crc32c xt_multiport libcrc32c iptable_filter iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack scst_cdrom(O) nf_defrag_ipv4 ip_tables scst(O) x_tables bridge stp llc nls_cp437 isofs zram(C) snd_hda_codec_hdmi snd_hda_codec_conexant microcode arc4 psmouse serio_raw i915 drm_kms_helper drm iwlwifi(O) mac80211(O) cfg80211(O) thinkpad_acpi nvram snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd soundcore snd_page_alloc i2c_algo_bit intel_agp video intel_gtt tpm_tis tpm tpm_bios sdhci_pci sdhci ehci_hcd e1000e
[   36.352573]
[ 36.352580] Pid: 2730, comm: bash Tainted: G D C O 3.2.23-orc #19 LENOVO 42404EU/42404EU
[   36.352596] RIP: e030:[<ffffffff812ba531>]  [<fffffff
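
As a rough cross-check of trace 2, the faulting instruction can be read off the "Code:" line of the first oops: the byte marked with <> is where the faulting instruction starts. The sketch below is only an illustration, not part of any patch. It decodes just the one instruction form seen here (opcode 0x8b with an 8-bit displacement); the helper names faulting_bytes and decode_mov_load are made up for this example, and the RDI value is taken from the register dump of the same oops.

```python
import re

# Register names indexed by the 3-bit ModRM fields.
# Assumption: no REX.R/REX.B extension bits, so r8-r15 are not handled.
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]

# "Code:" line of the first oops in trace 2, copied from the log.
CODE_LINE = (
    "[ 36.351308] Code: ff 48 8b bd a0 fe ff ff 44 88 85 78 fe ff ff "
    "e8 5d fb ff ff 44 0f b6 85 78 fe ff ff 0f 1f 44 00 00 49 8b 7d 10 "
    "4c 8b 4d 98 31 d2 <8b> 4f 04 4c 89 c8 48 c1 e0 0a 48 f7 f1 48 8b "
    "4d a0 48 85 c9 48"
)


def faulting_bytes(code_line):
    """Return the bytes from the <..>-marked byte onward, as integers."""
    text = code_line.split("Code:", 1)[1]
    tokens = re.findall(r"<?[0-9a-f]{2}>?", text)
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return [int(tok.strip("<>"), 16) for tok in tokens[start:]]


def decode_mov_load(insn):
    """Decode `mov disp8(base),reg`: opcode 0x8b, ModRM mod=01, no SIB.

    Only a plain REX.W prefix (0x48) is recognised; anything else raises.
    Returns (base register, displacement, destination register).
    """
    i = 0
    wide = False
    if insn[i] == 0x48:  # REX.W: 64-bit destination register
        wide = True
        i += 1
    if insn[i] != 0x8B:
        raise ValueError("not a MOV r, r/m instruction")
    modrm = insn[i + 1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:
        raise ValueError("only disp8 addressing without SIB is handled")
    disp = insn[i + 2]
    if disp >= 0x80:  # disp8 is a signed byte
        disp -= 0x100
    dest = (REGS64 if wide else REGS32)[reg]
    return REGS64[rm], disp, dest


base, disp, dest = decode_mov_load(faulting_bytes(CODE_LINE))
rdi = 0x0  # RDI as shown in the register dump of the same oops
fault_address = rdi + disp
print(f"mov {disp:#x}(%{base}),%{dest} -> fault address {fault_address:#018x}")
```

The load goes through RDI, which is zero in the register dump, so the computed address matches the reported CR2 of 0000000000000004. Feeding the marked bytes of trace 3 (48 8b 78 08) to the same helper gives a load from 0x8(%rax), and RAX is zero there as well, matching CR2 of 0000000000000008.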




3)

[   47.833362] Resuming Xen processor info
(XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
(XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
(XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
(XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
(XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
(XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
(XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
(XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
[   47.886297] Enabling non-boot CPUs ...
[   47.890082] installing Xen timer for CPU 1
[   47.894257] cpu 1 spinlock event irq 48
[   47.899013] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[   47.906740] IP: [<ffffffff8149196b>] __cpuidle_register_device+0x2b/0x100
[   47.913578] PGD 34a4067 PUD 3ac3067 PMD 0
[   47.917825] Oops: 0000 [#1] SMP
[   47.921108] Modules linked in: ipt_MASQUERADE ebtable_filter ebtables iscsi_scst(O) xt_tcpudp xt_state xt_multiport iptable_filter scst_vdisk(O) iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack scst_cdrom(O) ip_tables scst(O) x_tables nls_cp437 isofs bridge stp llc zram(C) zsmalloc(C) hid_generic usbhid hid coretemp crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul microcode psmouse serio_raw arc4 iwldvm mac80211 i915 drm_kms_helper drm iwlwifi intel_agp i2c_algo_bit cfg80211 intel_gtt video ahci libahci e1000e [last unloaded: tpm_bios]
[   47.974636] CPU 0
[   47.976456] Pid: 2468, comm: pm-suspend Tainted: G         C O 3.8.0-orc #19 Intel Corporation SandyBridge Platform/Emerald Lake
[   47.988310] RIP: e030:[<ffffffff8149196b>]  [<ffffffff8149196b>] __cpuidle_register_device+0x2b/0x100
[   47.997605] RSP: e02b:ffff880025685c98  EFLAGS: 00010286
[   48.002970] RAX: 0000000000000000 RBX: ffff88002de40000 RCX: 0000000000000000
[   48.010154] RDX: ffff880025685fd8 RSI: 0000000000000007 RDI: ffff88002de40000
[   48.017336] RBP: ffff880025685cb8 R08: 0000000000021120 R09: 0000000000000000
[   48.024520] R10: 0000000000000030 R11: 0000000000000000 R12: ffff88002de40000
[   48.031742] R13: 00000000ffffffde R14: 00000000ffffffea R15: 0000000000000000
[   48.038927] FS:  00007fb599d0e700(0000) GS:ffff88002de00000(0000) knlGS:0000000000000000
[   48.047060] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[   48.052859] CR2: 0000000000000008 CR3: 000000000345b000 CR4: 0000000000002660
[   48.060043] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   48.067223] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   48.074450] Process pm-suspend (pid: 2468, threadinfo ffff880025684000, task ffff880003558000)
[   48.083102] Stack:
[   48.085179]  ffff88002de40000 ffff88002de40000 00000000ffffffde ffffffff81a6b480
[   48.092622]  ffff880025685cd8 ffffffff81491cc1 0000000000000001 ffff88002de40000
[   48.100064]  ffff880025685cf8 ffffffff813046df 0000000000000001 0000000000000001
[   48.107517] Call Trace:
[   48.110029]  [<ffffffff81491cc1>] cpuidle_register_device+0x31/0x80
[   48.116348]  [<ffffffff813046df>] intel_idle_cpu_init+0xbf/0x120
[   48.122423]  [<ffffffff813047b0>] cpu_hotplug_notify+0x70/0x80
[   48.128310]  [<ffffffff815a619d>] notifier_call_chain+0x4d/0x70
[   48.134281]  [<ffffffff8107969e>] __raw_notifier_call_chain+0xe/0x10
[   48.140686]  [<ffffffff81053bb0>] __cpu_notify+0x20/0x40
[   48.146050]  [<ffffffff81594c7c>] _cpu_up+0xf1/0x138
[   48.151070]  [<ffffffff8158ab39>] enable_nonboot_cpus+0x99/0xd0
[   48.157090]  [<ffffffff81097b8d>] suspend_devices_and_enter+0x25d/0x330
[   48.163752]  [<ffffffff81097def>] pm_suspend+0x18f/0x1f0
[   48.169117]  [<ffffffff81096dea>] state_store+0x8a/0x100
[   48.174483]  [<ffffffff812ac29f>] kobj_attr_store+0xf/0x30
[   48.180022]  [<ffffffff811c005f>] sysfs_write_file+0xef/0x170
[   48.185943]  [<ffffffff8115c253>] vfs_write+0xb3/0x180
[   48.191056]  [<ffffffff8115c592>] sys_write+0x52/0xa0
[   48.196160]  [<ffffffff815a614e>] ? do_page_fault+0xe/0x10
[   48.201700]  [<ffffffff815aa7d9>] system_call_fastpath+0x16/0x1b
[   48.207758] Code: 66 66 66 66 90 55 48 89 e5 48 83 ec 20 48 89 5d e0 4c 89 6d f0 48 89 fb 4c 89 75 f8 4c 89 65 e8 41 be ea ff ff ff e8 75 0a 00 00 <48> 8b 78 08 49 89 c5 e8 19 80 c1 ff 84 c0 74 53 8b 43 04 49 c7
[   48.226658] RIP  [<ffffffff8149196b>] __cpuidle_register_device+0x2b/0x100
[   48.233582]  RSP <ffff880025685c98>
[   48.237131] CR2: 0000000000000008

[   48.240521] ---[ end trace 535ebe28cd06b143 ]---
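
To compare the traces side by side, the oops headers can be pulled out of a captured console log with a few lines of Python. This is just a convenience sketch: the function name summarize_oopses is made up, the sample text is the header lines copied from the traces above, and the pattern assumes each "BUG:" line is followed by its own "IP:" line (if one is missing, the fault would be paired with the next oops's IP).

```python
import re

# One match per oops: the fault address from the "BUG:" line, then the
# symbol and offset from the next "IP:" line. \bIP: deliberately does not
# match the later "RIP:" lines. DOTALL lets the match span wrapped lines.
OOPS_RE = re.compile(
    r"NULL pointer dereference at\s+([0-9a-f]{16})"
    r".*?\bIP: \[<[0-9a-f]{16}>\]\s+(\w+)\+(0x[0-9a-f]+)",
    re.DOTALL,
)

# Header lines copied from the three traces (including the line wrap
# and the truncated "+0x770/0" exactly as they appear in the log).
SAMPLE_LOG = """\
[   60.661811] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[   60.661817] IP: [<ffffffff8105f700>] build_sched_domains+0x770/0
[   36.350193] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
[   36.350211] IP: [<ffffffff81055f9a>] find_busiest_group+0x38a/0xbb0
[   36.352362] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[   36.352379] IP: [<ffffffff812ba531>] rb_next+0x1/0x50
[   47.899013] BUG: unable to handle kernel NULL pointer dereference at 
0000000000000008
[   47.906740] IP: [<ffffffff8149196b>] __cpuidle_register_device+0x2b/0x100
"""


def summarize_oopses(log_text):
    """Return (symbol, offset, fault_address) for each oops header found."""
    return [
        (m.group(2), m.group(3), int(m.group(1), 16))
        for m in OOPS_RE.finditer(log_text)
    ]


for symbol, offset, address in summarize_oopses(SAMPLE_LOG):
    print(f"{symbol}+{offset}: NULL dereference at offset {address:#x}")
```

All four faults are at small offsets from address zero (0x18, 0x4, 0x10, 0x8), in the scheduler domain and load-balancing code, rb_next and cpuidle registration.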






 

