Re: Xen panic due to xstate mismatch
Oh cool, thanks a lot for the explanation.
I added the "vzeroupper" line and Xen crashes, so it looks like the CPUID emulation is buggy. I was also able to try the same setup (Debian testing) in a VM running on virt-manager + KVM, and there it works fine (Xen in debug mode). I will print the xstate when running on virt-manager + KVM and also run the xen-cpuid command to see the difference, just out of curiosity, since your test has already pinpointed the issue.
Thanks again for the explanation. I will continue my testing later today; if you need me to test anything else, just ask and I will do my best.
Guillaume
On 02/02/2025 4:58 pm, Guillaume wrote:
> I attached the output of the `xl dmesg`. This is the 4.19.1 kernel I
> rebuild but I have the same issue with master (just for info).
Thanks. This is a TigerLake CPU, and:
> (XEN) Mitigating GDS by disabling AVX while virtualised - protections
> are best-effort
is why Xen is ignoring AVX.
Now, as to the bug. From the panic line, you're seeing:
> XSTATE 0x0000000000000003, uncompressed hw size 0x340 != xen size 0x240
xstate is XCR0_SSE | XCR0_X87, and the correct size for this
configuration is 0x240.
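As an aside, the two sizes in the panic line can be reproduced with a little arithmetic. The sketch below is not Xen code: the helper name `xsave_uncompressed_size` and the macros are hypothetical, and it only handles the x87, SSE and YMM components. It assumes the architectural uncompressed layout (512-byte legacy region, 64-byte header, YMM state at offset 576 with size 256), which real hardware enumerates through CPUID leaf 0xD.

```c
#include <assert.h>
#include <stdint.h>

/* XCR0 state-component bits (architectural). */
#define XCR0_X87 (1ULL << 0)
#define XCR0_SSE (1ULL << 1)
#define XCR0_YMM (1ULL << 2) /* AVX: upper halves of the YMM registers */

/* Layout of the uncompressed XSAVE area (assumed fixed for this sketch). */
#define XSAVE_LEGACY_SIZE 512u /* x87 + SSE region, FXSAVE compatible */
#define XSAVE_HEADER_SIZE  64u /* XSAVE header following the legacy region */
#define XSAVE_YMM_OFFSET  576u /* start of the AVX (YMM) component */
#define XSAVE_YMM_SIZE    256u /* size of the AVX (YMM) component */

/*
 * Hypothetical helper: uncompressed XSAVE size for an XCR0 value that
 * contains only the x87, SSE and YMM bits.
 */
static uint32_t xsave_uncompressed_size(uint64_t xcr0)
{
    /* The legacy region and header are always present. */
    uint32_t size = XSAVE_LEGACY_SIZE + XSAVE_HEADER_SIZE;

    assert(xcr0 & XCR0_X87); /* bit 0 of XCR0 must always be set */
    assert(!(xcr0 & ~(XCR0_X87 | XCR0_SSE | XCR0_YMM)));

    /* In the uncompressed format the size is the end of the highest
     * enabled component. */
    if ( xcr0 & XCR0_YMM )
        size = XSAVE_YMM_OFFSET + XSAVE_YMM_SIZE;

    return size;
}
```

With these assumptions, `xsave_uncompressed_size(0x3)` gives 0x240 and `xsave_uncompressed_size(0x7)` gives 0x340, which matches the two values in the panic message.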
The reason why it matters is that this is the amount of data the
processor will write out/read in for the XSAVE/XRSTOR instructions,
which are used for context switching. These instructions are also
available in userspace.
Here, VirtualBox is claiming that with AVX disabled, it will still write
out the AVX registers. This is buggy, but we're going to have to narrow
it down further.
Can you try building Xen with this additional line:

diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index af9e345a7ace..5a5011ba8b10 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -789,6 +789,8 @@ static void __init noinline xstate_check_sizes(void)
      */
     check_new_xstate(&s, X86_XCR0_SSE | X86_XCR0_X87);
+    asm volatile ("vzeroupper");
+
     if ( cpu_has_avx )
         check_new_xstate(&s, X86_XCR0_YMM);

and see if the result crashes or boots?
One possible bug is that VirtualBox is shadowing XCR0 and the real
setting in hardware is 0x7 (including XCR0_AVX) rather than 0x3. In
this case, the reported size is correct, and VirtualBox is failing to
honour the XSETBV setting.
Alternatively, another bug is that XCR0 is really 0x3, but the CPUID
emulation for max size is wrong, in which case the XSAVE/etc
instructions won't actually access beyond 0x240, and "all" that's wrong
is that we'll allocate a larger buffer than necessary.
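For anyone wanting to look at the reported sizes from inside a guest, a rough userspace sketch follows. It is not what Xen does internally: `read_xsave_sizes` and `struct xsave_sizes` are hypothetical names, and it assumes a GCC or Clang toolchain that provides `__get_cpuid_count` in `<cpuid.h>`. It reads CPUID leaf 0xD, subleaf 0, where EBX is the size needed for the components currently enabled in XCR0 and ECX is the size needed for every component the CPU supports. In userspace, XCR0 is whatever the running OS has set, so the numbers will not necessarily match what Xen sees at boot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical container for the two sizes reported by CPUID.0xD[0]. */
struct xsave_sizes {
    uint32_t current; /* EBX: size for components enabled in XCR0 */
    uint32_t max;     /* ECX: size for all supported components */
};

/*
 * Hypothetical helper: returns 1 and fills *out on success, or 0 if
 * leaf 0xD is not available (or the build target is not x86).
 */
static int read_xsave_sizes(struct xsave_sizes *out)
{
    assert(out != NULL);

#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid_count() returns 0 when the leaf is unsupported. */
    if ( !__get_cpuid_count(0xd, 0, &eax, &ebx, &ecx, &edx) )
        return 0;

    out->current = ebx;
    out->max = ecx;
    return 1;
#else
    return 0;
#endif
}
```

A caller could print `current` and `max` in hex and compare them between the VirtualBox and KVM guests; a `current` value larger than the enabled XCR0 bits justify would be consistent with the CPUID-emulation theory.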
The VZEROUPPER (an AVX instruction) should distinguish these two cases.
If Xen crashes with it in place, then the XCR0 register is correct and
it's CPUID which is buggy. If Xen boots with that in place, then
VirtualBox is shadowing XCR0 with a different value behind Xen's back.
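The reasoning above can be written down as a tiny decision function. This is only an illustration with hypothetical names (`diagnose`, `describe`, `enum xstate_diagnosis`). It relies on the architectural rule that VEX-encoded instructions such as VZEROUPPER raise #UD unless both the SSE and AVX bits are set in the real XCR0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the two candidate hypervisor bugs. */
enum xstate_diagnosis {
    CPUID_SIZE_WRONG, /* XCR0 really is 0x3; CPUID reports a bad size */
    XCR0_SHADOWED     /* real XCR0 still has AVX set; XSETBV not honoured */
};

/*
 * With AVX genuinely disabled in XCR0, VZEROUPPER faults (#UD) and Xen
 * crashes. If it executes cleanly, AVX must still be enabled in hardware.
 */
static enum xstate_diagnosis diagnose(bool crashed_on_vzeroupper)
{
    return crashed_on_vzeroupper ? CPUID_SIZE_WRONG : XCR0_SHADOWED;
}

/* Human-readable summary of a diagnosis. */
static const char *describe(enum xstate_diagnosis d)
{
    assert(d == CPUID_SIZE_WRONG || d == XCR0_SHADOWED);

    return d == CPUID_SIZE_WRONG
        ? "XCR0 is correct; CPUID size emulation is buggy"
        : "XCR0 is shadowed; hardware still has AVX enabled";
}
```

For example, `describe(diagnose(true))` corresponds to the "Xen crashes" outcome and points at CPUID.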
~Andrew