
Re: [Xen-devel] PV guest with PCI passthrough crash on Xen 4.8.3 inside KVM when booted through OVMF



On Mon, Nov 27, 2023 at 6:27 AM Marek Marczykowski-Górecki
<marmarek@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, Nov 27, 2023 at 11:20:36AM +0000, Frediano Ziglio wrote:
> > On Sun, Nov 26, 2023 at 2:51 PM Marek Marczykowski-Górecki
> > <marmarek@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Mon, Feb 19, 2018 at 06:30:14PM +0100, Juergen Gross wrote:
> > > > On 16/02/18 20:02, Andrew Cooper wrote:
> > > > > On 16/02/18 18:51, Marek Marczykowski-Górecki wrote:
> > > > >> On Fri, Feb 16, 2018 at 05:52:50PM +0000, Andrew Cooper wrote:
> > > > >>> On 16/02/18 17:48, Marek Marczykowski-Górecki wrote:
> > > > >>>> Hi,
> > > > >>>>
> > > > >>>> As in the subject, the guest crashes on boot, before the kernel
> > > > >>>> outputs anything. I've isolated this to the conditions below:
> > > > >>>>  - PV guest has a PCI device assigned (e1000e emulated by QEMU in
> > > > >>>>    this case); without a PCI device it works
> > > > >>>>  - Xen (in KVM) is started through OVMF; with seabios it works
> > > > >>>>  - nested HVM is disabled in KVM
> > > > >>>>  - AMD IOMMU emulation is disabled in KVM; when enabled, qemu
> > > > >>>>    crashes on boot (looks like a qemu bug, unrelated to this one)
> > > > >>>>
> > > > >>>> Version info:
> > > > >>>>  - KVM host: OpenSUSE 42.3, qemu 2.9.1, 
> > > > >>>> ovmf-2017+git1492060560.b6d11d7c46-4.1, AMD
> > > > >>>>  - Xen host: Xen 4.8.3, dom0: Linux 4.14.13
> > > > >>>>  - Xen domU: Linux 4.14.13, direct boot
> > > > >>>>
> > > > >>>> Not sure if relevant, but initially I tried booting xen.efi with
> > > > >>>> /mapbs /noexitboot, and then the dom0 kernel crashed saying
> > > > >>>> something about a conflict between the e820 map and the kernel
> > > > >>>> mapping. But now those options are disabled.
> > > > >>>>
> > > > >>>> The crash message:
> > > > >>>> (XEN) d1v0 Unhandled invalid opcode fault/trap [#6, ec=0000]
> > > > >>>> (XEN) domain_crash_sync called from entry.S: fault at 
> > > > >>>> ffff82d080218720 entry.o#create_bounce_frame+0x137/0x146
> > > > >>>> (XEN) Domain 1 (vcpu#0) crashed on cpu#1:
> > > > >>>> (XEN) ----[ Xen-4.8.3  x86_64  debug=n   Not tainted ]----
> > > > >>>> (XEN) CPU:    1
> > > > >>>> (XEN) RIP:    e033:[<ffffffff826d9156>]
> > > > >>> This is #UD, which is most probably hitting a BUG().  addr2line
> > > > >>> this ^ to find some code to look at.
> > > > >> addr2line failed me
> > > > >
> > > > > By default, vmlinux is stripped and compressed.  Ideally you want to
> > > > > addr2line the vmlinux artefact in the root of your kernel build, which
> > > > > is the plain elf with debugging symbols.
> > > > >
> > > > > Alternatively, use scripts/extract-vmlinux on the binary you actually
> > > > > booted, which might get you somewhere.
> > > > >
> > > > >> , but System.map says it's xen_memory_setup. And it
> > > > >> looks like the BUG() is the same as I had in dom0 before:
> > > > >> "Xen hypervisor allocated kernel memory conflicts with E820 map".
> > > > >
> > > > > Juergen: Is there anything we can do to try and insert some dummy
> > > > > exception handlers right at PV start, so we could at least print a
> > > > > one-liner to the host console that is a little more helpful than Xen
> > > > > saying "something unknown went wrong"?
> > > >
> > > > You mean something like commit 42b3a4cb5609de757f5445fcad18945ba9239a07
> > > > added to kernel 4.15?
> > > >
> > > > >
> > > > >>
> > > > >> Disabling e820_host in guest config solved the problem. Thanks!
> > > > >>
> > > > >> Is this some bug in Xen or OVMF, or is it expected behavior and 
> > > > >> e820_host
> > > > >> should be avoided?
> > > > >
> > > > > I don't really know.  e820_host is a gross hack which shouldn't really
> > > > > be present.  The actual problem is that Linux can't cope with the
> > > > > memory layout it was given (and I can't recall if there is anything
> > > > > Linux could potentially do to cope).  OTOH, the toolstack, which knew
> > > > > about e820_host and chose to lay the guest out in an overlapping way,
> > > > > is probably also at fault.
> > > >
> > > > The kernel can cope with lots of E820 scenarios (e.g. by relocating
> > > > initrd or the p2m map), but moving itself out of the way is not
> > > > possible.
> > >
> > > I'm afraid I need to resurrect this thread...
> > >
> > > With recent kernels (6.6+), the e820_host=0 workaround is not an option
> > > anymore. It makes Linux not initialize xen-swiotlb (due to
> > > f9a38ea5172a3365f4594335ed5d63e15af2fd18), so PCI passthrough doesn't
> > > work at all. While I can add yet another layer of workaround (force
> > > xen-swiotlb with iommu=soft), that's getting unwieldy.
> > >
> > > Furthermore, I don't get the crash message anymore, even with debug
> > > hypervisor and guest_loglvl=all. Not even "Domain X crashed" in `xl
> > > dmesg`. It looks like the "crash" shutdown reason doesn't reach Xen, and
> > > it's considered clean shutdown (I can confirm it by changing various
> > > `on_*` settings (via libvirt) and observing which gets applied).
> > >
> > > I did most tests with 6.7-rc1, but I had already observed the issue
> > > on 6.6.1.
> > >
> > > This is on Xen 4.17.2. The L0 is running Linux 6.6.1 and uses
> > > QEMU 8.1.2 + OVMF 202308 to run Xen as L1.
> > >
> >
> > So basically you start the domain and, from the logs, it looks like
> > it's shutting down cleanly.
> > Can you see anything from the guest? Can you turn on some more
> > debugging at the guest level?
>
> No, it crashes before printing anything to the console, also with
> earlyprintk=xen.
>
> > I tried to get some more information from the initial crash but I
> > could not understand which guest code triggered the bug.
>
> I'm not sure which one it is this time (because I don't have Xen
> reporting the guest crash...) but last time it was here:
> https://github.com/torvalds/linux/blob/master/arch/x86/xen/setup.c#L873-L874

Hi Marek,

I too have run into this "Xen hypervisor allocated kernel memory
conflicts with E820 map" error when running Xen under KVM & OVMF with
SecureBoot.  OVMF built without SecureBoot did not trip over the
issue.  It was a little while back, but I still have some notes.

Non-SecureBoot
(XEN)  [0000000000810000, 00000000008fffff] (ACPI NVS)
(XEN)  [0000000000900000, 000000007f8eefff] (usable)

SecureBoot
(XEN)  [0000000000810000, 000000000170ffff] (ACPI NVS)
(XEN)  [0000000001710000, 000000007f0edfff] (usable)

Linux (under Xen) checks that __pa(_text) (= 0x1000000) is RAM, but it
is not.  Looking at the E820 map, there is a type 4 (ACPI NVS) region
defined:
[0000000000810000, 000000000170ffff] (ACPI NVS)

When OVMF is built with SMM (for SecureBoot) and S3Supported is true,
the memory range 0x900000-0x170ffff is additionally marked ACPI NVS
and Linux trips over this.  It becomes usable RAM under Non-SecureBoot
so Linux boots fine.

What I don't understand is why there is even a check that __pa(_text)
is RAM.  Xen logs that it places dom0 way up high in memory, so the
physical addresses of the kernel pages are much higher than 0x1000000.
The value 0x1000000 for __pa(_text) doesn't match reality.  Maybe there
is an expectation that the ACPI NVS and other reserved regions are
mapped 1:1?  I tried removing the BUG mentioned above, but it still
failed to boot.  I think I also removed a second BUG, but
unfortunately I don't have notes on either.

The other thing I noticed is that __pa() uses phys_base to shift its
output, but phys_base is not updated under Xen.  If __pa() is supposed
to reference the high memory addresses Xen assigned, then maybe
phys_base should be updated?

Booting under SecureBoot was tangential to my goal at the time, so I
didn't pursue it further.

Regards,
Jason



 

