Re: Random crash during guest start on x86
On 26.07.2022 09:51, Bertrand Marquis wrote:
>
>> On 25 Jul 2022, at 17:16, Jan Beulich <jbeulich@xxxxxxxx> wrote:
>>
>> On 25.07.2022 17:51, Bertrand Marquis wrote:
>>> On our CI we have randomly a crash during guest boot on x86.
>>
>> Afaict of a PV guest.
>>
>>> We are running on qemu x86_64 using Xen staging.
>>
>> Which may introduce unusual timing. An issue never hit on actual
>> hardware _may_ (but doesn't have to) be one in qemu itself.
>>
>>> The crash is happening randomly (something like 1 out of 20 times).
>>>
>>> This is always happening on the first guest we start; we never got it
>>> after the first guest was successfully started.
>>>
>>> Please tell me if you need any other info.
>>>
>>> Here is the guest kernel log:
>>> [...]
>>> [    6.679020] general protection fault, maybe for address 0x8800: 0000 [#1] PREEMPT SMP NOPTI
>>> [    6.679020] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.17.6 #1
>>> [    6.679020] RIP: e030:error_entry+0xaf/0xe0
>>> [    6.679020] Code: 29 89 c8 48 39 84 24 88 00 00 00 74 15 48 81 bc 24 88 00 00 00 63 10 e0 81 75 03 0f 01 f8 90 90 90 c3 48 89 8c 24 88 00 00 00 <0f> 01 f8 90 90 90 eb 11 0f 20 d8 90 90 90 90 90 48 25 ff e7 ff ff
>>
>> This is SWAPGS, which supposedly a PV guest should never hit. Data
>> further down suggests the kernel is still in the process of patching
>> alternatives, which may be the reason for the insn to still be there
>> (being at a point where exceptions are still unexpected).
>
> So the exception path is using alternative code? Sounds logical with the
> error output.
> But that does explain the original error.

SWAPGS sits pretty early on all kernel entry paths. If any instance of it
is subject to alternatives patching, then prior to patching such paths may
not be taken when running as a PV guest under Xen.

>>> [    6.679020] RSP: e02b:ffffffff82803a90 EFLAGS: 00000046
>>> [    6.679020] RAX: 0000000000008800 RBX: 0000000000000000 RCX: ffffffff81e00fa7
>>> [    6.679020] RDX: 0000000000000000 RSI: ffffffff81e009f8 RDI: 00000000000000eb
>>> [    6.679020] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
>>> [    6.679020] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>>> [    6.679020] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
>>> [    6.679020] FS: 0000000000000000(0000) GS:ffff88801f200000(0000) knlGS:0000000000000000
>>> [    6.679020] CS: 10000e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [    6.679020] CR2: 0000000000000000 CR3: 000000000280c000 CR4: 0000000000050660
>>> [    6.679020] Call Trace:
>>> [    6.679020]  <TASK>
>>> [    6.679020] RIP: e030:native_irq_return_iret+0x0/0x2
>>> [    6.679020] Code: 41 5d 41 5c 5d 5b 41 5b 41 5a 41 59 41 58 58 59 5a 5e 5f 48 83 c4 08 eb 0a 0f 1f 00 90 66 0f 1f 44 00 00 f6 44 24 20 04 75 02 <48> cf 57 0f 01 f8 eb 12 0f 20 df 90 90 90 90 90 48 81 e7 ff e7 ff
>>> [    6.679020] RSP: e02b:ffffffff82803b48 EFLAGS: 00000046 ORIG_RAX: 000000000000e030
>>> [    6.679020] RAX: 0000000000008800 RBX: ffffffff82803be0 RCX: ffffffff81e00f95
>>> [    6.679020] RDX: ffffffff81e00f94 RSI: ffffffff81e00f95 RDI: 00000000000000eb
>>> [    6.679020] RBP: 00000000000000eb R08: 0000000090001f0f R09: 0000000000000007
>>> [    6.679020] R10: ffffffff81e00f94 R11: ffffffff8285a6c0 R12: 0000000000000000
>>> [    6.679020] R13: ffffffff81e00f94 R14: 0000000000000006 R15: 0000000000000006
>>> [    6.679020]  ? asm_exc_general_protection+0x8/0x30
>>> [    6.679020]  ? restore_regs_and_return_to_kernel+0x1b/0x27
>>> [    6.679020]  ? restore_regs_and_return_to_kernel+0x1b/0x27
>>> [    6.679020]  ? restore_regs_and_return_to_kernel+0x1c/0x27
>>> [    6.679020]  ? restore_regs_and_return_to_kernel+0x1b/0x27
>>> [    6.679020]  ? restore_regs_and_return_to_kernel+0x1c/0x27
>>> [    6.679020] RIP: e030:insn_get_opcode.part.0+0xab/0x180
>>> [    6.679020] Code: 00 00 8b 43 4c a9 c0 07 08 00 0f 84 bf 00 00 00 c6 43 1c 01 31 c0 5b 5d c3 83 e2 03 be 01 00 00 00 eb b7 89 ef e8 65 e4 ff ff <89> 43 4c a8 30 75 21 e9 8e 00 00 00 0f b6 7b 03 40 84 ff 75 73 8b
>>> [    6.679020] RSP: e02b:ffffffff82803b70 EFLAGS: 00000246
>>> [    6.679020]  ? restore_regs_and_return_to_kernel+0x1b/0x27
>>> [    6.679020]  insn_get_modrm+0x6c/0x120
>>> [    6.679020]  ? restore_regs_and_return_to_kernel+0x1b/0x27
>>> [    6.679020]  insn_get_sib+0x40/0x80
>>> [    6.679020]  insn_get_displacement+0x82/0x100
>>> [    6.679020]  insn_decode+0xf8/0x100
>>> [    6.679020]  optimize_nops+0x60/0x1e0
>>> [    6.679020]  ? rcu_nmi_exit+0x2b/0x140
>>> [    6.679020]  ? restore_regs_and_return_to_kernel+0x1b/0x27
>>> [    6.679020]  ? native_iret+0x3/0x7
>>> [    6.679020]  ? restore_regs_and_return_to_kernel+0x1c/0x27
>>> [    6.679020]  ? restore_regs_and_return_to_kernel+0x1b/0x27
>>> [    6.679020]  apply_alternatives+0x165/0x2e0
>>
>> I have to admit that I'm a little lost with these "modern" stack traces,
>> where contexts apparently switch without being clearly annotated. It is
>> looking a little as if a #GP fault was happening somewhere here (hence
>> the asm_exc_general_protection further up), but I cannot work out where
>> (what insn) that would have come from.
>>
>> You may want to add some debugging code to the hypervisor to tell you
>> where exactly that #GP (if there is one in the first place) originates
>> from. With that it may then become a little more clear what's actually
>> going on (and why the behavior is random).
>
> I will check what I can do there, but as the crash is very random and only
> happening during our CI tests, this is not really easy to reproduce.
> If you have any example of code to do the debugging, I could run some
> tests with it.

Well, you want to show_execution_state() on the guest registers in
do_general_protection() or perhaps pv_emulate_privileged_op(), but only
for the first (or first few) #GP for every guest (or else things likely
get too noisy), and presumably also only when the guest is in kernel mode.
The resulting (guest) stack trace then would need taking apart, with the
guest kernel binary on the side.

>> As a final remark - you've Cc-ed the x86 hypervisor maintainers, but at
>> least from the data which is available so far this is more likely a
>> kernel issue. So kernel folks might be of more help ...
>
> I wanted to check if this could be a known issue first. The problem is
> happening in the kernel, I agree, but only when it is started as a Xen
> guest, so I assumed it could be related to Xen.

It is quite likely related to Xen, yes, but then still quite likely to the
Xen-specific parts in the kernel. In the end it all boils down to where
the first (suspected) #GP is coming from.

Jan
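A note on Jan's SWAPGS observation: the faulting bytes in the first Code:
line of the log, <0f> 01 f8 followed by 90 90 90, are a SWAPGS instruction
followed by NOP padding, which is what an alternatives site looks like
before patching. If, as Jan suspects, SWAPGS on these entry paths is
expressed through the kernel's ALTERNATIVE mechanism, the pattern would be
roughly the following (an illustrative paraphrase, not a verbatim quote of
the 5.17 source):

/*
 * Illustrative paraphrase only: SWAPGS expressed as an alternative that
 * is replaced by NOPs when the X86_FEATURE_XENPV feature flag is set,
 * i.e. the raw swapgs stays in place until apply_alternatives() has
 * processed this site.
 */
#define SWAPGS	ALTERNATIVE "swapgs", "", X86_FEATURE_XENPV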
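For the debugging Jan describes, a minimal sketch (not a tested patch from
this thread) might look like the code below, assuming it is called early
from do_general_protection() in xen/arch/x86/traps.c (or, as Jan mentions,
from pv_emulate_privileged_op()). The helper name and the crude rate-limit
counter are invented for illustration; show_execution_state(), guest_mode()
and guest_kernel_mode() are the existing Xen helpers Jan refers to.

/*
 * Rough sketch only: dump guest state for the first few kernel-mode #GP
 * faults, along the lines suggested in the thread.  Intended to be called
 * with the guest's registers early in do_general_protection().
 */
static void maybe_dump_guest_gp(const struct cpu_user_regs *regs)
{
    static unsigned int dumps;   /* crude global rate limit, example only */
    struct vcpu *curr = current;

    /* Only faults taken by a guest while in kernel mode are of interest. */
    if ( !guest_mode(regs) || !guest_kernel_mode(curr, regs) )
        return;

    if ( dumps++ >= 3 )
        return;

    printk("d%d: guest kernel #GP, error code %04x\n",
           curr->domain->domain_id, regs->error_code);
    show_execution_state(regs);  /* guest registers at the fault point */
}

The resulting guest-context dump would then need to be matched against the
guest kernel binary, as Jan notes, to find the instruction actually raising
the first #GP.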