
Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems

To: Jan Beulich <JBeulich@xxxxxxxx>
From: Andre Przywara <andre.przywara@xxxxxxx>
Date: Wed, 30 May 2012 16:02:48 +0200
Cc: jeremy@xxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxx, konrad.wilk@xxxxxxxxxx, "Shin, Jacob" <Jacob.Shin@xxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, hpa@xxxxxxxxx, mingo@xxxxxxx, tglx@xxxxxxxxxxxxx
Delivery-date: Wed, 30 May 2012 14:04:52 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 05/30/2012 03:33 PM, Jan Beulich wrote:

On 30.05.12 at 15:10, Andre Przywara <andre.przywara@xxxxxxx> wrote:

Because we are behind a family check before tweaking the topology
bit, we can use the standard rd/wrmsr variants for the CPUID feature
register.
This fixes a crash when using the kernel as a Xen Dom0 on affected
Trinity systems. The wrmsrl_amd_safe is not properly paravirtualized
yet (this will be fixed in another patch).


I'm not following: If the AMD variants (putting a special value into
%edi) can be freely replaced by the non-AMD variants, why did
the AMD special ones get used in the first place?

Older CPUs (K8) needed the AMD variants; starting with family 10h we can use the normal versions.
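
As a rough illustration of the rule just stated (a sketch, not kernel code): needs_amd_msr_variant() and the two family constants are names made up for this example. It only encodes what is said here, namely that K8 (family 0fh) needs the AMD variants and that family 10h and later do not. Families older than K8 are deliberately out of scope.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for this sketch; values are the x86 family numbers. */
#define AMD_FAMILY_K8  0x0f /* K8 */
#define AMD_FAMILY_10H 0x10 /* first family that works with the plain accessors */

/*
 * Returns true if the "AMD" rd/wrmsr variants are needed for this family,
 * false if the standard variants are enough.
 */
bool needs_amd_msr_variant(unsigned int family)
{
	/* The thread only talks about K8 and newer; older parts are not covered. */
	assert(family >= AMD_FAMILY_K8);
	return family < AMD_FAMILY_10H;
}
```

With this, a family 15h part such as Trinity yields false, which is why the patch can switch to the standard accessors behind its family check.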

Further, I can't see how checking_wrmsrl() is being paravirtualized
any better than wrmsrl_amd_safe() - both have nothing but an
exception handling fixup attached to the wrmsr invocation. Care
to point out what actual crash it is that was seen?

AFAIK, the difference is between the "l" and the regs version for rd/wrmsr. We have a patch already here to fix this. Will send it out soon. Jacob, can you comment on this?


The crash dump:

[    1.601021] ------------[ cut here ]------------
[    1.601025] kernel BUG at /buildrepos/linux-2.6-stable/arch/x86/include/asm/paravirt.h:133!
[    1.601031] invalid opcode: 0000 [#1] SMP
[    1.601038] CPU 0
[    1.601041] Modules linked in:
[    1.601047]
[    1.601050] Pid: 0, comm: swapper/0 Not tainted 3.4.0 #1 AMD Virgo Platform/Annapurna
[    1.601061] RIP: e030:[<ffffffff8169b4fe>]  [<ffffffff8169b4fe>] init_amd+0x21d/0x603
[    1.601072] RSP: e02b:ffffffff81c01df8  EFLAGS: 00010246
[    1.601076] RAX: 0000000000000000 RBX: ffffffff81ca7500 RCX: 0000000000000000
[    1.601081] RDX: 0000000000000020 RSI: 000000000000c000 RDI: ffffffff81c01e78
[    1.601086] RBP: ffffffff81c01ea8 R08: 0000000000004003 R09: 00000000ffffffff
[    1.601090] R10: 0000000000000000 R11: 00000000ffffffff R12: ffffffff81ca7574
[    1.601095] R13: ffffffff81ca7514 R14: ffffffff81ca754c R15: ffffffff81ca756c
[    1.601103] FS:  0000000000000000(0000) GS:ffff8801ce600000(0000) knlGS:0000000000000000
[    1.601108] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[    1.601112] CR2: 0000000000000000 CR3: 0000000001c0c000 CR4: 0000000000040660
[    1.601117] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    1.601122] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[    1.601127] Process swapper/0 (pid: 0, threadinfo ffffffff81c00000, task ffffffff81c14020)
[    1.601131] Stack:
[    1.601134]  ffffffff81c01ea8 ffffffff8169a7d8 0000001600000017 0000000000000000
[    1.601146]  ffff8801c1c3ac00 ffff8801c1c002d0 ffffffff81c01e48 ffffffff81140118
[    1.601157]  ffffffff811415cc ffff8801c1c04220 ffff8801c1c02480 00000000000000d0
[    1.601169] Call Trace:
[    1.601175]  [<ffffffff8169a7d8>] ? get_cpu_cap+0x1f2/0x201
[    1.601183]  [<ffffffff81140118>] ? init_kmem_cache_cpus+0x3e/0x4d
[    1.601189]  [<ffffffff811415cc>] ? sysfs_slab_alias+0x60/0x8d
[    1.601195]  [<ffffffff8169aa3d>] identify_cpu+0x256/0x34d
[    1.601201]  [<ffffffff81141649>] ? kmem_cache_alloc+0x50/0xe4
[    1.601209]  [<ffffffff81cd1c51>] identify_boot_cpu+0x10/0x3c
[    1.601216]  [<ffffffff81cd2052>] check_bugs+0x9/0x2d
[    1.601222]  [<ffffffff81cc7e08>] start_kernel+0x445/0x461
[    1.601227]  [<ffffffff81cc77d6>] ? kernel_init+0x1d8/0x1d8
[    1.601233]  [<ffffffff81cc72cf>] x86_64_start_reservations+0xba/0xc1
[    1.601240]  [<ffffffff81041797>] ? xen_setup_runstate_info+0x2c/0x36
[    1.601247]  [<ffffffff81ccbdc4>] xen_start_kernel+0x58b/0x592
[    1.601251] Code: 0f 85 d3 00 00 00 31 c0 48 83 3d cd 5c 58 00 00 48 8d 7d b0 b9 08 00 00 00 f3 ab c7 45 b4 05 10 01 c0 c7 45 cc 3a 20 5a 9c 75 04 <0f> 0b eb fe 4c 8d 75 b0 4c 89 f7 ff 14 25 b0 11 c2 81 85 c0 8b
[    1.601374] RIP  [<ffffffff8169b4fe>] init_amd+0x21d/0x603
[    1.601381]  RSP <ffffffff81c01df8>
[    1.601391] ---[ end trace a7919e7f17c0a725 ]---
[    1.601397] Kernel panic - not syncing: Attempted to kill the idle task!

Finally, I would question whether re-enabling the topology
extensions under Xen shouldn't be skipped altogether, perhaps
even on Dom0 (as the hypervisor is controlling this MSR), but in
any case on DomU: the hypervisor won't allow the write anyway
(read: it will ignore it, not fault on it) and will log a message
for each (v)CPU that attempts this.


This is probably right. Let me think about this.

Thanks for picking this up.

Regards,
Andre.

Signed-off-by: Andre Przywara <andre.przywara@xxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx # 3.4+
---
  arch/x86/kernel/cpu/amd.c |    4 ++--
  1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 146bb62..80ccd99 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -586,9 +586,9 @@ static void __cpuinit init_amd(struct cpuinfo_x86 *c)
            !cpu_has(c, X86_FEATURE_TOPOEXT)) {
                u64 val;

-               if (!rdmsrl_amd_safe(0xc0011005, &val)) {
+               if (!rdmsrl_safe(0xc0011005, &val)) {
                        val |= 1ULL << 54;
-                       wrmsrl_amd_safe(0xc0011005, val);
+                       checking_wrmsrl(0xc0011005, val);
                        rdmsrl(0xc0011005, val);
                        if (val & (1ULL << 54)) {
                                set_cpu_cap(c, X86_FEATURE_TOPOEXT);
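
The sequence in the hunk above (read the MSR, set bit 54, write it back, read again to confirm) can be sketched in userland C. This is only an illustration under stated assumptions: struct fake_msr and the fake_* helpers are hypothetical stand-ins, not kernel APIs. The writable flag models a hypervisor that silently ignores the write, as described for DomU earlier in this mail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOPOEXT_ENABLE_BIT 54 /* bit the patch sets in MSR 0xc0011005 */

/* Hypothetical stand-in for one MSR; not a kernel structure. */
struct fake_msr {
	uint64_t value;
	bool writable; /* false models a hypervisor that ignores writes */
};

/* Follows the convention of the "safe" accessors: return 0 on success. */
static int fake_rdmsr_safe(const struct fake_msr *msr, uint64_t *val)
{
	*val = msr->value;
	return 0;
}

static int fake_wrmsr_safe(struct fake_msr *msr, uint64_t val)
{
	if (msr->writable)
		msr->value = val;
	return 0; /* an ignored write does not fault */
}

/* Returns true only if the bit reads back as set after the write. */
bool try_enable_topoext(struct fake_msr *msr)
{
	uint64_t val;

	assert(msr != NULL);
	if (fake_rdmsr_safe(msr, &val) != 0)
		return false;
	val |= 1ULL << TOPOEXT_ENABLE_BIT;
	fake_wrmsr_safe(msr, val);
	/* Re-read: the write may have been dropped without any error. */
	if (fake_rdmsr_safe(msr, &val) != 0)
		return false;
	return (val & (1ULL << TOPOEXT_ENABLE_BIT)) != 0;
}
```

The re-read is what matters here: with an ignored write the function reports false and the caller would leave the feature flag unset, which matches the final check in the hunk.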



--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany

