
Re: Kernel panic when passing through 2 identical PCI devices



On Monday, 2 June 2025 15:43:37 CEST you wrote:
> On 02.06.2025 14:28, J. Roeleveld wrote:
>> I have a domain to which I pass through 4 PCI devices:
>> 2 NVMe drives
>> 83:00.0   Samsung 980 NVMe
>> 84:00.0   Samsung 980 NVMe
>>
>> 2 HBA Controllers
>> 86:00.0   LSI SAS3008
>> 87:00.0   LSI SAS3008
>>
>> This works fine with Xen version 4.18.4_pre1.
>> However, when trying to update to 4.19, this fails.

> To make it explicit: The domain in question is a PV one.

Yes. I tried converting it to PVH in the past, but PCI passthrough wasn't
working at all, and nothing I have found since suggests it works now.

>> Checking the output during boot, I think I found something. But my
>> knowledge is insufficient to figure out what is causing what I am seeing
>> and how to fix this.
>>
>> From the below (where I only focus on the 2 NVMe drives), it is similar to
>> the successful boot up until it tries "claiming resource
>> 0000:84:00.0/0". At that point sysfs fails because the entry for "84" is
>> already present.

> What would be interesting is to know why / how this 2nd registration
> happens.

The only guess I can make: they are the same brand/model/size; only the
serial numbers differ.
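(To rule that possibility in or out, the guest kernel exposes both values under /sys/class/nvme. A minimal sketch; the controller names nvme0/nvme1 are assumptions, and the sysfs root is a parameter only so the helper can be exercised outside the guest:)

```shell
# Sketch: confirm the two drives really only differ in serial number.
# nvme0/nvme1 are assumed names; adjust to how the guest enumerates them.
nvme_id() {
    root="${2:-/sys}"
    printf '%s: model=%s serial=%s\n' "$1" \
        "$(cat "$root/class/nvme/$1/model" 2>/dev/null)" \
        "$(cat "$root/class/nvme/$1/serial" 2>/dev/null)"
}
nvme_id nvme0
nvme_id nvme1
```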

> It's the same (guest) kernel version afaics, so something must
> behave differently on the host. Are you sure the sole (host side)
> difference is the hypervisor version? I.e. the Dom0 kernel version is the
> same in the failing and successful cases? I ask because there's very little
> Xen itself does that would play into pass-through device discovery /
> resource setup by a (PV) guest (which doesn't mean Xen can't screw things
> up). The more relevant component is the xen-pciback driver in Dom0.

I can confirm it's dependent on the Xen version.
Kernel version = 6.12.21
I get a successful boot with Xen version 4.18.4_pre1.
When I use Xen version 4.19.1, the boot fails due to this issue.

The kernel and initramfs do not differ between the two boots.

> Sadly the log provided does, to me at least, not have enough data to draw
> conclusions. Some instrumenting of the guest kernel may be necessary ...

The host boots using UEFI:

=== (xen.cfg in the EFI partition) ===
[global]
default=xen

[xen]
options=dom0_mem=24576M,max:24576M dom0_max_vcpus=4 dom0_vcpus_pin
gnttab_max_frames=512 sched=credit console=vga extra_guest_irqs=768,1024

kernel=gentoo-6.12.21.efi dozfs root=ZFS=zhost/host/root by=id elevator=noop
logo.nologo triggers=zfs quiet refresh softlevel=prexen nomodeset
nfs.callback_tcpport=32764 lockd.nlm_udpport=32768 lockd.nlm_tcpport=32768
xen-pciback.hide=(83:00.0)(84:00.0)(86:00.0)(87:00.0) xen-
pciback.passthrough=1

ramdisk=initramfs-6.12.21-gentoo-host.img
===
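As a cheap Dom0-side sanity check of the xen-pciback.hide= line above, one can verify which driver actually owns each hidden device. A sketch only; the BDFs are the ones from this thread, and the sysfs root is parameterised purely so the helper can be tested elsewhere:

```shell
# Sketch: report which Dom0 driver is bound to each passthrough device.
# If the hide= option took effect, all four should report "pciback".
pci_driver_of() {
    root="${2:-/sys}"
    link="$root/bus/pci/devices/$1/driver"
    if [ -e "$link" ]; then
        basename "$(readlink -f "$link")"
    else
        echo '(no driver)'
    fi
}
for bdf in 0000:83:00.0 0000:84:00.0 0000:86:00.0 0000:87:00.0; do
    echo "$bdf -> $(pci_driver_of "$bdf")"
done
```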

Please let me know what other information you need, and whether there is
anything I can try/test to get more information.
Does the mailing list allow gzipped text files as attachments? Or how would
you prefer the kernel configs of the host and guest?

If there are tests to do, please give me several to try, as I need to
schedule downtime for reboots.

Many thanks in advance,

Joost
