
Re: [PATCH V3 (resend) 01/19] x86: Create per-domain mapping of guest_root_pt



Hi Jan,

On 16/05/2024 08:17, Jan Beulich wrote:
> On 15.05.2024 20:25, Elias El Yandouzi wrote:
>> However, I noticed quite a weird bug while doing some testing. I may
>> need your expertise to find the root cause.
>
> Looks like you've overflowed the dom0 kernel stack, most likely because
> of recurring nested exceptions.
>
>> In the case where I have more vCPUs than pCPUs (and let's consider we
>> have one pCPU for two vCPUs), I noticed that I would always get a page
>> fault in dom0 kernel (5.10.0-13-amd64) at the exact same location. I did
>> a bit of investigation but I couldn't come to a clear conclusion.
>> Looking at the stack trace [1], I have the feeling the crash occurs in a
>> loop or a recursive call.
>>
>> I tried to identify where the crash occurred using addr2line:
>>
>>   > addr2line -e vmlinux-5.10.0-29-amd64 0xffffffff810218a0
>> debian/build/build_amd64_none_amd64/arch/x86/xen/mmu_pv.c:880
>>
>> It turns out to point to the closing bracket of the function
>> xen_mm_unpin_all()[2].
>>
>> I thought the crash could happen while returning from the function in
>> the assembly epilogue but the output of objdump doesn't even show the
>> address.
>>
>> The only theory I could think of was that because we only have one pCPU,
>> we may never execute one of the two vCPUs, and never set up the mapping
>> to the guest_root_pt in write_ptbase(), hence the page fault. This is
>> just a random theory, I couldn't find any hint suggesting it would be
>> the case though. Any idea how I could debug this?
>
> I guess you want to instrument Xen enough to catch the top level fault (or
> the 2nd from top, depending on where the nesting actually starts) to see
> why that happens. Quite likely some guest mapping isn't set up properly.


Julien helped me with this one and I believe we have identified the problem.

As you suggested, I wrote the mapping of the guest root PT into our per-domain section, root_pt_l1tab, within write_ptbase(): there we are always in the v == current case, and switch_cr3_cr4() always flushes the local TLB.

However, there is a path, in toggle_guest_mode(), where we can call update_cr3()/make_cr3() without calling write_ptbase(), so the mapping is not kept up to date. Instead, toggle_guest_mode() has a partly open-coded version of write_ptbase().

Would you rather see the mappings written in make_cr3(), or in toggle_guest_mode() within its partly open-coded version of write_ptbase()?
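
For illustration, the missed update can be modelled in a few lines of C. This is a toy sketch, not Xen code: every toy_ name, the struct layouts and the refresh_mapping flag are assumptions made up for the example. Only the names root_pt_l1tab, write_ptbase(), make_cr3() and toggle_guest_mode() come from this thread, and the real functions have different signatures and do much more.

```c
#include <assert.h>

#define TOY_MAX_VCPUS 4

/* Toy stand-in for the per-domain area holding one root PT slot per vCPU. */
struct toy_domain {
    unsigned long root_pt_l1tab[TOY_MAX_VCPUS];
};

/* Toy vCPU: two root page tables (kernel/user) and the one currently in use. */
struct toy_vcpu {
    struct toy_domain *d;
    unsigned int id;
    unsigned long kernel_pt;  /* hypothetical frame number, kernel mode */
    unsigned long user_pt;    /* hypothetical frame number, user mode */
    int in_kernel;
    unsigned long cr3;        /* root PT the vCPU currently runs on */
};

/* Pick the root PT matching the current guest mode. */
static void toy_make_cr3(struct toy_vcpu *v)
{
    v->cr3 = v->in_kernel ? v->kernel_pt : v->user_pt;
}

/* Hypothetical helper: point the per-domain slot at the current root PT. */
static void toy_map_root_pt(struct toy_vcpu *v)
{
    assert(v->id < TOY_MAX_VCPUS);
    v->d->root_pt_l1tab[v->id] = v->cr3;
}

/* Regular path: the mapping is refreshed together with the CR3 switch. */
static void toy_write_ptbase(struct toy_vcpu *v)
{
    toy_make_cr3(v);
    toy_map_root_pt(v);
}

/* Open-coded path: switches the root PT, refreshing the slot only on request. */
static void toy_toggle_guest_mode(struct toy_vcpu *v, int refresh_mapping)
{
    v->in_kernel = !v->in_kernel;
    toy_make_cr3(v);
    if (refresh_mapping)
        toy_map_root_pt(v);
}

/* The slot is stale when it no longer matches the root PT in use. */
static int toy_mapping_is_stale(const struct toy_vcpu *v)
{
    return v->d->root_pt_l1tab[v->id] != v->cr3;
}

/* Returns 1 if the mapping is stale after one mode toggle, 0 otherwise. */
int toy_demo(int refresh_mapping)
{
    struct toy_domain d = { { 0 } };
    struct toy_vcpu v = { &d, 0, 0x1000UL, 0x2000UL, 1, 0 };

    toy_write_ptbase(&v);                        /* slot and cr3 agree here */
    toy_toggle_guest_mode(&v, refresh_mapping);  /* kernel -> user switch */
    return toy_mapping_is_stale(&v);
}
```

In this model toy_demo(0) returns 1, i.e. the slot still points at the kernel root PT after the toggle, while toy_demo(1) returns 0. Moving the refresh into toy_make_cr3() would cover both callers at once, which corresponds to the first option in the question above; keeping it in the toggle path corresponds to the second.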

Elias




 

