Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13
On Thu, Mar 19, 2020 at 01:28:10AM +0100, Dario Faggioli wrote:
> [Adding Juergen]
>
> On Wed, 2020-03-18 at 23:10 +0100, Marek Marczykowski-Górecki wrote:
> > On Wed, Mar 18, 2020 at 02:50:52PM +0000, Andrew Cooper wrote:
> > > On 18/03/2020 14:16, Marek Marczykowski-Górecki wrote:
> > > > Hi,
> > > >
> > > > In my test setup (inside KVM with nested virt enabled), I rather
> > > > frequently get a Xen crash on resume from S3. Full message below.
> > > >
> > > > This is Xen 4.13.0, with some patches, including "sched: fix
> > > > resuming from S3 with smt=0".
> > > >
> > > > Contrary to the previous issue, this one does not always happen -
> > > > I would say in about 40% of cases on this setup, but very rarely
> > > > on a physical setup.
> > > >
> > > > This is _without_ core scheduling enabled, and also with smt=off.
> > > >
> > > > Do you think it would be any different on xen-unstable? I can
> > > > try, but it isn't trivial in this setup, so I'd ask first.
> > > >
> Well, Juergen has fixed quite a few issues.
>
> Most of them were triggering with core-scheduling enabled, and I don't
> recall any of them which looked similar or related to this.
>
> Still, it's possible that the same issue causes different symptoms, and
> hence that maybe one of the patches would fix this too.

I've tested on master (d094e95fb7c), and reproduced exactly the same crash
(pasted below for completeness).
But there is more: additionally, in most (all?) cases after resume I've got
a soft lockup in Linux dom0 in smp_call_function_single() - see below. It
didn't happen before and the only change was Xen 4.13 -> master.

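For readers not familiar with the assertion in the crash below, here is a minimal sketch of the invariant its text expresses: the runqueue looked up for the unit's master CPU must be the same object as the runqueue pointer stored with the unit. This is a toy model in Python, not Xen code. `RunQueue`, `Unit`, `cpu_to_rq` and `wake_invariant_holds` are hypothetical names made up for illustration, and the sketch says nothing about why the two would diverge after S3 resume.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(eq=False)
class RunQueue:
    """Hypothetical stand-in for per-runqueue scheduler data."""
    rq_id: int


@dataclass
class Unit:
    """Hypothetical stand-in for a schedulable unit and its private data."""
    master_cpu: int              # plays the role of sched_unit_master(unit)
    rqd: Optional[RunQueue]      # plays the role of svc->rqd


def wake_invariant_holds(cpu_to_rq: Dict[int, RunQueue], unit: Unit) -> bool:
    """Toy version of 'c2rqd(sched_unit_master(unit)) == svc->rqd'.

    cpu_to_rq plays the role of the CPU -> runqueue lookup (assumption).
    Identity comparison is used because the assertion compares pointers.
    """
    return cpu_to_rq[unit.master_cpu] is unit.rqd


# Two runqueues, one per CPU, purely as example data.
rq0, rq1 = RunQueue(0), RunQueue(1)
cpu_to_rq = {0: rq0, 1: rq1}

consistent = Unit(master_cpu=1, rqd=rq1)   # unit and CPU agree on the runqueue
stale = Unit(master_cpu=1, rqd=rq0)        # unit still points at another runqueue
```

In this model `wake_invariant_holds(cpu_to_rq, consistent)` is true, while the `stale` unit is the kind of mismatch that a debug build would turn into the panic shown below.
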
Xen crash:

(XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at credit2.c:2133
(XEN) ----[ Xen-4.14-unstable x86_64 debug=y Not tainted ]----
(XEN) CPU: 1
(XEN) RIP: e008:[<ffff82d08023a3c5>] credit2.c#csched2_unit_wake+0x14f/0x151
(XEN) RFLAGS: 0000000000010002 CONTEXT: hypervisor (d0v1)
(XEN) rax: ffff8301ba8fafb0 rbx: ffff8300912238b0 rcx: 0000000000000000
(XEN) rdx: ffff8301ba8d81f0 rsi: 0000000000000000 rdi: ffff8301ba8d8016
(XEN) rbp: ffff830170db7d30 rsp: ffff830170db7d10 r8: deadbeefdeadf00d
(XEN) r9: deadbeefdeadf00d r10: 0000000000000000 r11: 0000000000000000
(XEN) r12: ffff8300912239a0 r13: ffff82d080433780 r14: 0000000000000000
(XEN) r15: 0000005bdb5286ad cr0: 0000000080050033 cr4: 0000000000000660
(XEN) cr3: 000000010e53c000 cr2: 00005ec1b2f56280
(XEN) fsb: 000079872ee29700 gsb: ffff88813ff00000 gss: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen code around <ffff82d08023a3c5> (credit2.c#csched2_unit_wake+0x14f/0x151):
(XEN) df e8 f9 c5 ff ff eb ad <0f> 0b 55 48 89 e5 41 57 41 56 41 55 41 54 53 48
(XEN) Xen stack trace from rsp=ffff830170db7d10:
(XEN) ffff830090a33000 ffff8300912238b0 ffff8300912238b0 ffff8301ba8d8010
(XEN) ffff830170db7d78 ffff82d08024253b 0000000000000202 ffff8301ba8d8010
(XEN) ffff830090a33000 ffff8300a864b000 000079872c600010 0000000000000000
(XEN) 0000000000000001 ffff830170db7d90 ffff82d080206e09 ffff8300a864b000
(XEN) ffff830170db7da8 ffff82d080206f1c 0000000000000000 ffff830170db7ec0
(XEN) ffff82d080204de7 ffff8301ba8cb001 ffff830170db7fff 0000000470db7e10
(XEN) 0000000000000000 ffff82e0021d0160 ffff88813ff15b28 ffff8301ba8cb000
(XEN) ffff8301ba8cb000 ffff8301ba88b000 ffff830170db7e10 0000001200000004
(XEN) 0000798728000005 0000000000000001 0000000000000005 000079872ee286e0
(XEN) 000079872c109e77 000000030000001c 00007986ec0013c0 ffff010a00000005
(XEN) 000000000002a240 000000000002bb30 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000001 00000005d1ea5ab2 0000000000000001
(XEN) 7ba0548d00000000 ffff830170db7ef8 ffff8301ba88b000 0000000000000001
(XEN) 0000000000000000 0000000000000000 ffff830170db7ee8 ffff82d0802d779d
(XEN) ffff8301ba88b000 0000000000000000 0000000000000000 00007cfe8f2480e7
(XEN) ffff82d080355432 ffff88813a1bef00 000079872ee28590 000079872ee28590
(XEN) ffff8881358e9c40 ffff88813a1bef00 ffff88813a1bef01 0000000000000282
(XEN) 0000000000000000 ffffc90001923e08 0000000000000000 0000000000000024
(XEN) ffffffff8100148a 0000000000000000 0000000000000000 000079872c600010
(XEN) 0000010000000000 ffffffff8100148a 000000000000e033 0000000000000282
(XEN) Xen call trace:
(XEN) [<ffff82d08023a3c5>] R credit2.c#csched2_unit_wake+0x14f/0x151
(XEN) [<ffff82d08024253b>] F vcpu_wake+0xdd/0x3ff
(XEN) [<ffff82d080206e09>] F domain_unpause+0x2f/0x3b
(XEN) [<ffff82d080206f1c>] F domain_unpause_by_systemcontroller+0x40/0x60
(XEN) [<ffff82d080204de7>] F do_domctl+0x9e1/0x16f1
(XEN) [<ffff82d0802d779d>] F pv_hypercall+0x548/0x560
(XEN) [<ffff82d080355432>] F lstar_enter+0x112/0x120
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 1:
(XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at credit2.c:2133
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...

Linux dom0 soft lockup:

[ 524.742089] watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [systemd:1]
[ 524.747897] Modules linked in: joydev br_netfilter xt_physdev xen_netback bridge stp llc loop ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter snd_hda_codec_generic ledtrig_audio ppdev snd_hda_intel snd_intel_nhlt snd_hda_codec snd_hda_core edac_mce_amd snd_hwdep snd_seq snd_seq_device snd_pcm pcspkr snd_timer snd parport_pc e1000e soundcore parport i2c_piix4 xenfs ip_tables dm_thin_pool dm_persistent_data libcrc32c dm_bio_prison bochs_drm drm_kms_helper drm_vram_helper ttm drm serio_raw ehci_pci ehci_hcd virtio_console virtio_scsi ata_generic pata_acpi floppy qemu_fw_cfg xen_privcmd xen_pciback xen_blkback xen_gntalloc xen_gntdev xen_evtchn uinput pkcs8_key_parser
[ 524.768696] CPU: 1 PID: 1 Comm: systemd Tainted: G W 5.4.25-1.qubes.x86_64 #1
[ 524.771407] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
[ 524.775056] RIP: e030:smp_call_function_single+0xe0/0x110
[ 524.776755] Code: 65 48 33 0c 25 28 00 00 00 75 3b c9 c3 4c 89 c2 4c 89 c9 48 89 e6 e8 5f fe ff ff 8b 54 24 18 83 e2 01 74 0b f3 90 8b 54 24 18 <83> e2 01 75 f5 eb ca 8b 05 3b 92 e0 01 85 c0 75 80 0f 0b e9 79 ff
[ 524.783649] RSP: e02b:ffffc90000c0fc60 EFLAGS: 00000202
[ 524.788857] RAX: 0000000000000000 RBX: ffff888136632540 RCX: 0000000000000040
[ 524.791207] RDX: 0000000000000003 RSI: ffffffff82824c60 RDI: ffffffff820107c0
[ 524.793610] RBP: ffffc90000c0fca0 R08: 0000000000000000 R09: ffff88813b0007e8
[ 524.795737] R10: 0000000000000000 R11: ffffffff8265b6e8 R12: 0000000000000001
[ 524.797847] R13: ffffc90000c0fdb0 R14: ffffffff82feb744 R15: ffff88813b7c6800
[ 524.800156] FS: 000074e59239e5c0(0000) GS:ffff88813ff00000(0000) knlGS:0000000000000000
[ 524.802883] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 524.804661] CR2: 000074e59345a400 CR3: 00000001337e0000 CR4: 0000000000000660
[ 524.807097] Call Trace:
[ 524.807959] ? perf_cgroup_attach+0x70/0x70
[ 524.809433] ? _raw_spin_unlock_irqrestore+0x14/0x20
[ 524.811179] ? cgroup_move_task+0x109/0x150
[ 524.812623] task_function_call+0x4d/0x80
[ 524.814179] ? perf_cgroup_switch+0x190/0x190
[ 524.815738] perf_cgroup_attach+0x3f/0x70
[ 524.817125] cgroup_migrate_execute+0x35e/0x420
[ 524.818704] cgroup_attach_task+0x159/0x210
[ 524.820158] ? find_inode_fast.isra.0+0x8e/0xb0
[ 524.822055] cgroup_procs_write+0xd0/0x100
[ 524.823692] cgroup_file_write+0x9b/0x170
[ 524.825220] kernfs_fop_write+0xce/0x1b0
[ 524.826598] vfs_write+0xb6/0x1a0
[ 524.827776] ksys_write+0x67/0xe0
[ 524.828969] do_syscall_64+0x5b/0x1a0
[ 524.830083] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 524.831599] RIP: 0033:0x74e5933894b7
[ 524.832696] Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[ 524.838570] RSP: 002b:00007ffdfc2df548 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 524.841100] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 000074e5933894b7
[ 524.843469] RDX: 0000000000000005 RSI: 00007ffdfc2df70a RDI: 0000000000000017
[ 524.846368] RBP: 00007ffdfc2df70a R08: 0000000000000000 R09: 00007ffdfc2df590
[ 524.848816] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000005
[ 524.851009] R13: 00006149cb4f3800 R14: 0000000000000005 R15: 000074e59345a700

--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

Attachment: signature.asc

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel