Re: [BUG] XHCI_NO_64BIT_SUPPORT on ASM1042A USB controller breaks PCIE passthrough
On 13.10.2025 15:15, Aliz 'Randomdude' wrote:
> Hi all. Many thanks for Xen.
>
> I'm attempting to perform PCI passthrough of my RocketU 1144D USB
> controller from an XCP-ng host (XCP-ng 8.3.0, kernel 4.19.0+1) to a
> Linux guest. This card uses a PLX PCIe switch IC and four ASM1042A USB
> controller ICs, of which I forward a single ASM1042A.
>
> The ASM1042A is detected in the guest VM and initially appears to work
> OK, but after I dd some gigabytes to an attached USB disk device, the
> controller appears to go away:
>
> [ 81.076381] xhci_hcd 0000:00:09.0: xHCI host not responding to stop endpoint command
> [ 81.079319] xhci_hcd 0000:00:09.0: xHCI host controller not responding, assume dead
> [ 81.081503] xhci_hcd 0000:00:09.0: HC died; cleaning up
> [ 81.083388] usb 5-1: USB disconnect, device number 2
>
> At this point, the controller is unusable until I reset it (via
> /sys/bus/pci/devices/../remove and /sys/bus/pci/rescan). I am able to
> trigger this behavior reliably, although sometimes some 30GB must be
> transferred before symptoms appear.
>
> The guest is running a 6.12.50 kernel I built from vanilla sources.
>
> After much head-scratching, I discovered that some older guest kernels
> function correctly, and do not exhibit the bug, allowing sustained use
> of the controller.
>
> I then proceeded to bisect my way to the following Linux kernel patch
> (see
> https://lists-ec2.96boards.org/archives/list/linux-stable-mirror@xxxxxxxxxxxxxxxx/thread/WEVQDDJC72LMLPQY37JOZZNKMJ7OHHFL/):
>
>> I've confirmed that both the ASMedia ASM1042A and ASM3242 have the same
>> problem as the ASM1142 and ASM2142/ASM3142, where they lose some of the
>> upper bits of 64-bit DMA addresses. As with the other chips, this can
>> cause problems on systems where the upper bits matter, and adding the
>> XHCI_NO_64BIT_SUPPORT quirk completely fixes the issue.
>>
>> Cc: stable@xxxxxxxxxxxxxxx
>> Signed-off-by: Forest Crossman cyrozap@xxxxxxxxx
>> Signed-off-by: Mathias Nyman mathias.nyman@xxxxxxxxxxxxxxx
>> ---
>>  drivers/usb/host/xhci-pci.c | 8 ++++++--
>>  1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
>> index 1f989a49c8c6..5bbccc9a0179 100644
>> --- a/drivers/usb/host/xhci-pci.c
>> +++ b/drivers/usb/host/xhci-pci.c
>> @@ -66,6 +66,7 @@
>>  #define PCI_DEVICE_ID_ASMEDIA_1042A_XHCI 0x1142
>>  #define PCI_DEVICE_ID_ASMEDIA_1142_XHCI 0x1242
>>  #define PCI_DEVICE_ID_ASMEDIA_2142_XHCI 0x2142
>> +#define PCI_DEVICE_ID_ASMEDIA_3242_XHCI 0x3242
>>
>>  static const char hcd_name[] = "xhci_hcd";
>>
>> @@ -276,11 +277,14 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
>>              pdev->device == PCI_DEVICE_ID_ASMEDIA_1042_XHCI)
>>          xhci->quirks |= XHCI_BROKEN_STREAMS;
>>      if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA &&
>> -        pdev->device == PCI_DEVICE_ID_ASMEDIA_1042A_XHCI)
>> +        pdev->device == PCI_DEVICE_ID_ASMEDIA_1042A_XHCI) {
>>          xhci->quirks |= XHCI_TRUST_TX_LENGTH;
>> +        xhci->quirks |= XHCI_NO_64BIT_SUPPORT;
>> +    }
>>      if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA &&
>>          (pdev->device == PCI_DEVICE_ID_ASMEDIA_1142_XHCI ||
>> -         pdev->device == PCI_DEVICE_ID_ASMEDIA_2142_XHCI))
>> +         pdev->device == PCI_DEVICE_ID_ASMEDIA_2142_XHCI ||
>> +         pdev->device == PCI_DEVICE_ID_ASMEDIA_3242_XHCI))
>>          xhci->quirks |= XHCI_NO_64BIT_SUPPORT;
>>
>>      if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA &&
>
> Reverting this patch fixes my immediate issue - the USB controller now
> functions as expected. However, I am way out of my depth here and
> strongly suspect that doing so will break things in subtle ways, and
> so this is where I hand off to the experts for proper analysis. In
> particular, I'd be interested to learn under which circumstances
> reverting this patch is dangerous - does 'systems where the upper bits
> matter' apply only to something relatively exotic? I ask in order to
> determine if it is safe to revert this patch in my homelab-grade
> setup.

I fear that with this report xen-devel@ isn't a useful list to send to;
you rather want to report to the corresponding Linux list.

Jan

> In case it is useful, here are further details of my set-up:
>
> * Dell R710 with BIOS 6.0.0
> * 2x E5630 CPU and 64GB RAM
> * XCP-ng 8.3.0 on the host
> * Guest OS is Linux 6.12.0, built from vanilla kernel.org sources
> * Guest runs in PVHVM mode
> * PCI controller is the RocketU 1144D, which uses a PLX PEX8609 PCIe
>   switch IC connected to four ASM1042A controllers (allowing me to
>   forward each controller to a separate VM)
> * The firmware on the ASM1042A is up-to-date AFAICT
> * The forwarded PCI device is connected to a JMS578-based disk array
>   containing three mechanical disks
> * The problem exhibits in the guest VM after I run 'dd if=/dev/urandom
>   of=/dev/<disk> bs=1M count=10240 conv=sync', although it sometimes
>   needs up to three invocations
> * After reverting the patch, I can run the above command without
>   problems ten times
> * The same hardware works OK in ESXi.
>
> I'm happy to provide further details, and please accept my apologies
> in advance for any breach of etiquette - I don't report this kind of
> bug very often.
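
As a rough illustration of what "systems where the upper bits matter" could mean, here is a minimal user-space sketch in C. It is not kernel code and not taken from the patch. The patch description only says that "some of the upper bits" of 64-bit DMA addresses are lost, so the mask below (keep the low 32 bits only) and both helper names are assumptions made purely for illustration.

```c
#include <stdint.h>

/*
 * ASSUMPTION: model the controller as keeping only the low 32 address
 * bits. The real chip's behaviour is described only as losing "some of
 * the upper bits", so this mask is hypothetical.
 */
#define ASSUMED_ADDR_MASK 0xFFFFFFFFULL

/* Hypothetical helper: the address the device would actually use. */
static uint64_t addr_seen_by_device(uint64_t dma_addr)
{
    return dma_addr & ASSUMED_ADDR_MASK;
}

/* Hypothetical helper: nonzero if the device would hit the wrong address. */
static int addr_is_corrupted(uint64_t dma_addr)
{
    return addr_seen_by_device(dma_addr) != dma_addr;
}
```

Under this model a buffer at 0xC0001000 (below 4 GiB) is unaffected, while one at 0x240001000 (above 4 GiB) would be accessed at 0x40001000 instead. Whether a passed-through device in a guest is ever handed DMA addresses above 4 GiB depends on the host memory layout and IOMMU setup, which this sketch does not model.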