
Re: Graphical glitches (not refreshing?) with Linux's xe driver + Xen 4.19



On Tue, Feb 24, 2026 at 04:58:25PM +0100, Marek Marczykowski-Górecki wrote:
> On Fri, Feb 13, 2026 at 02:23:06AM +0100, Marek Marczykowski-Górecki wrote:
> > On Thu, Feb 12, 2026 at 04:11:50PM +0100, Roger Pau Monné wrote:
> > > On Tue, Feb 10, 2026 at 07:06:20PM +0100, Marek Marczykowski-Górecki wrote:
> > > > Hi,
> > > > 
> > > > Recently I started testing compatibility with Intel Lunar Lake. This is
> > > > the first one that uses "xe" instead of "i915" Linux driver for iGPU.
> > > > I test it with Qubes OS 4.3, which uses Xen 4.19.4 and PV dom0 running
> > > > Linux 6.17.9 in this test.
> > > 
> > > Not sure it's going to help a lot, but does using a PVH dom0 make any
> > > difference?
> > 
> > Ok, now with the correct Xen version, it's better with PVH dom0. At
> > least on the login screen and few applications (from both dom0 and domU)
> > I don't see the glitches anymore. I can't do a full test, because PCI
> > passthrough doesn't seem to work with PVH dom0 on Xen 4.19 - and I need
> > it to start most VMs.
> > 
> > So, if the above test is representative, it's only about PV dom0.
> 
> Some further observations:
> 
> 1. My initial impression that Xen 4.17.6 is not affected was wrong.
> Apparently I got lucky and didn't wait long enough for glitches to
> appear. Unfortunately this means I have no way to bisect this...
> 
> 1a. Updated test procedure - either:
>   - start Qubes OS in full (including default system domUs) and try to
>     open an app in one of them (for example file manager or pdf viewer)
>   - start Linux up to lightdm login page, log in, log out, click on a
>     few lightdm menus (session type selector, poweroff menu etc.)
> 
> The second version works even if toolstack version in dom0 doesn't match
> Xen version. If no glitches are observed after doing either of those
> procedures, assume it's good.
> 
> 2. Xen staging is affected too, as is Xen staging-4.19 without
> any Qubes patches.
> 
> 3. After enabling CONFIG_DEBUG in Xen, xe.ko fails to load firmware:
> 
>     xe 0000:00:02.0: [drm] Tile0: GT0: Using GuC firmware from xe/lnl_guc_70.bin version 70.53.0
>     xe 0000:00:02.0: [drm] *ERROR* Tile0: GT0: load failed: status = 0x40000056, time = 0ms, freq = 1850MHz (req 1850MHz), done = -1
>     xe 0000:00:02.0: [drm] *ERROR* Tile0: GT0: load failed: status: Reset = 0, BootROM = 0x2B, UKernel = 0x00, MIA = 0x00, Auth = 0x01
>     xe 0000:00:02.0: [drm] *ERROR* Tile0: GT0: firmware production part check failure
>     xe 0000:00:02.0: [drm] *ERROR* Tile0: GT0: Failed to initialize uC (-EPROTO)
>     xe 0000:00:02.0: probe with driver xe failed with error -71
> 
> CONFIG_DEBUG is the only change between "xe.ko loads fine but there are
> glitches later on" and "xe.ko fails to load at all". Full console logs:
> https://gist.github.com/marmarek/47b5e62a2cdbae6678c2aecc5283cd3f, there
> are 3 files:
>   - CONFIG_DEBUG=n
>   - CONFIG_DEBUG=y
>   - CONFIG_DEBUG=y + iommu=debug
> 
> 4. Updating to Linux 7.0-rc1 doesn't help, for example:
> https://openqa.qubes-os.org/tests/168119#step/desktop_linux_manager_create_qube/11
> 
> Generally, it does feel like a bug in xe.ko, but I can't exclude some issue
> on Xen side too (especially given point 3 above).
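
For reference, the raw status word from point 3 can be split into the fields the driver prints. This is a minimal sketch: the bit positions are an assumption inferred from the two "load failed" lines above (they reproduce the printed values for 0x40000056), not copied from the driver source, and the helper names are made up for illustration.

```python
# Bit layout ASSUMED from the log in point 3, not taken from xe.ko sources:
#   bit 0      Reset
#   bits 7:1   BootROM
#   bits 15:8  UKernel
#   bits 18:16 MIA
#   bits 31:30 Auth


def field(value: int, high: int, low: int) -> int:
    """Extract bits high..low (inclusive) from value."""
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


def decode_guc_status(status: int) -> dict:
    """Split a GuC load status word into named fields (hypothetical helper)."""
    return {
        "Reset": field(status, 0, 0),
        "BootROM": field(status, 7, 1),
        "UKernel": field(status, 15, 8),
        "MIA": field(status, 18, 16),
        "Auth": field(status, 31, 30),
    }


if __name__ == "__main__":
    # Status value reported with CONFIG_DEBUG=y.
    for name, val in decode_guc_status(0x40000056).items():
        print(f"{name} = {val:#x}")
```

With that layout, 0x40000056 gives BootROM = 0x2b and Auth = 0x1, matching the second "load failed" line.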

After waiting some time (this time with Linux 6.19.5 and Xen built with
CONFIG_DEBUG=n), I get some timeout messages:

    [    8.122120] xe 0000:00:02.0: [drm] [ENCODER:204:DDI A/PHY A] failed to retrieve link info, disabling eDP
    [    8.148476] xe 0000:00:02.0: [drm] Tile0: GT0: Using GuC firmware from xe/lnl_guc_70.bin version 70.53.0
    [    8.803845] xe 0000:00:02.0: [drm] Tile0: GT0: ccs1 fused off
    [    8.804208] xe 0000:00:02.0: [drm] Tile0: GT0: ccs2 fused off
    [    8.804556] xe 0000:00:02.0: [drm] Tile0: GT0: ccs3 fused off
    [    8.822426] xe 0000:00:02.0: [drm] Tile0: GT1: Using GuC firmware from xe/lnl_guc_70.bin version 70.53.0
    [    8.827140] xe 0000:00:02.0: [drm] Tile0: GT1: Using HuC firmware from xe/lnl_huc.bin version 9.4.13
    [    8.829478] xe 0000:00:02.0: [drm] Tile0: GT1: Using GSC firmware from xe/lnl_gsc_1.bin version 104.0.5.1429
    [    8.852923] xe 0000:00:02.0: [drm] Tile0: GT1: vcs1 fused off
    [    8.853513] xe 0000:00:02.0: [drm] Tile0: GT1: vcs2 fused off
    [    8.854090] xe 0000:00:02.0: [drm] Tile0: GT1: vcs3 fused off
    [    8.854706] xe 0000:00:02.0: [drm] Tile0: GT1: vcs4 fused off
    [    8.855310] xe 0000:00:02.0: [drm] Tile0: GT1: vcs5 fused off
    [    8.855904] xe 0000:00:02.0: [drm] Tile0: GT1: vcs6 fused off
    [    8.856495] xe 0000:00:02.0: [drm] Tile0: GT1: vcs7 fused off
    [    8.857079] xe 0000:00:02.0: [drm] Tile0: GT1: vecs1 fused off
    [    8.857675] xe 0000:00:02.0: [drm] Tile0: GT1: vecs2 fused off
    [    8.858272] xe 0000:00:02.0: [drm] Tile0: GT1: vecs3 fused off
    [    8.975881] xe 0000:00:02.0: [drm] Registered 3 planes with drm panic
    [    8.976586] [drm] Initialized xe 1.1.0 for 0000:00:02.0 on minor 0
    [    8.980882] ACPI: video: Video Device [GFX0] (multi-head: yes  rom: no  post: no)
    [    9.033754] xe 0000:00:02.0: [drm] Tile0: GT1: found GSC cv104.1.0
    ...
    [ 1218.319232] xe 0000:00:02.0: [drm] Tile0: GT0: Engine reset: engine_class=rcs, logical_mask: 0x1, guc_id=3
    [ 1218.319890] xe 0000:00:02.0: [drm] Tile0: GT0: Timedout job: seqno=9883, lrc_seqno=9883, guc_id=3, flags=0x0 in Xorg [3245]
    [ 1218.320736] xe 0000:00:02.0: [drm] Xe device coredump has been created
    [ 1218.321140] xe 0000:00:02.0: [drm] Check your /sys/class/drm/card0/device/devcoredump/data
    [ 1222.285626] xe 0000:00:02.0: [drm] *ERROR* [CRTC:88:pipe A] flip_done timed out
    [ 1232.525685] xe 0000:00:02.0: [drm] *ERROR* flip_done timed out
    [ 1232.526280] xe 0000:00:02.0: [drm] *ERROR* [CRTC:88:pipe A] commit wait timed out
    [ 1242.765717] xe 0000:00:02.0: [drm] *ERROR* [CRTC:88:pipe A] flip_done timed out
    [ 1253.005696] xe 0000:00:02.0: [drm] *ERROR* flip_done timed out
    [ 1253.006248] xe 0000:00:02.0: [drm] *ERROR* [CRTC:88:pipe A] commit wait timed out
    [ 1263.245599] xe 0000:00:02.0: [drm] *ERROR* [CRTC:88:pipe A] flip_done timed out

The glitches appear much earlier than these timeouts, though.
Would the contents of /sys/class/drm/card0/device/devcoredump/data be
useful for debugging this?
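
If the dump is wanted, it probably has to be copied out soon after the "coredump has been created" message, since as far as I know the kernel's devcoredump facility discards the data after a timeout. A minimal sketch of saving it, assuming root access in dom0; the function name and the destination file name are made up for illustration:

```python
import shutil
import sys
from pathlib import Path

# Path as printed in the kernel message above.
DEFAULT_SRC = "/sys/class/drm/card0/device/devcoredump/data"


def save_devcoredump(src: str, dst: str) -> int:
    """Copy the dump at src to dst and return the number of bytes written."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return Path(dst).stat().st_size


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python3 save_dump.py xe-coredump.bin
    size = save_devcoredump(DEFAULT_SRC, sys.argv[1])
    print(f"saved {size} bytes to {sys.argv[1]}")
```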

Full log at https://openqa.qubes-os.org/tests/168813/file/serial0.txt
(warning, almost 200MB of those errors...)
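
Given the size of that log, a small filter may be easier than reading it directly. The following is a rough sketch that streams the file, counts the message types quoted above, and records the timestamp at which each first appears. The message fragments come from the excerpt in this mail; the function name and the idea that these four fragments are the interesting ones are assumptions.

```python
import re
import sys
from collections import Counter

# Kernel timestamp such as "[ 1218.319232]"; "[drm]" does not match.
TIMESTAMP = re.compile(r"\[\s*(\d+\.\d+)\]")

# Message fragments taken from the excerpt above.
PATTERNS = (
    "Engine reset",
    "Timedout job",
    "flip_done timed out",
    "commit wait timed out",
)


def summarize(lines):
    """Return (counts, first_seen) for the known xe error fragments."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        match = TIMESTAMP.search(line)
        if match is None:
            continue  # not a timestamped kernel message
        for pattern in PATTERNS:
            if pattern in line:
                counts[pattern] += 1
                first_seen.setdefault(pattern, float(match.group(1)))
                break
    return counts, first_seen


if __name__ == "__main__" and len(sys.argv) > 1:
    # Stream the file line by line so a ~200MB log is never held in memory.
    with open(sys.argv[1], errors="replace") as log:
        counts, first_seen = summarize(log)
    for pattern, count in counts.most_common():
        print(f"{count:8d}  first at {first_seen[pattern]:.6f}s  {pattern}")
```

Run it against a downloaded copy of serial0.txt, passing the file name as the only argument.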


-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
