Re: NULL pointer dereference in cpufreq_update_limits(?) under Xen PV dom0 - regression in 6.13
On Thu, Mar 27, 2025 at 11:14 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>
> On 27.03.2025 01:51, Marek Marczykowski-Górecki wrote:
> > Hi,
> >
> > I've got a report[1] that 6.13.6 crashes as listed below. It worked fine in
> > 6.12.11. We've tried few simple things to narrow the problem down, but
> > without much success.
> >
> > This is running in Xen 4.17.5, PV dom0, which probably is relevant here.
> > This is running on AMD Ryzen 9 7950X3D, with ASRock X670E Taichi
> > motherboard.
> > There are few more details in the original report (link below).
> >
> > The kernel package (including its config saved into /boot) is here:
> > https://yum.qubes-os.org/r4.2/current/host/fc37/rpm/kernel-latest-6.13.6-1.qubes.fc37.x86_64.rpm
> > https://yum.qubes-os.org/r4.2/current/host/fc37/rpm/kernel-latest-modules-6.13.6-1.qubes.fc37.x86_64.rpm
> >
> > The crash message:
> > [ 9.367048] BUG: kernel NULL pointer dereference, address: 0000000000000070
> > [ 9.368251] #PF: supervisor read access in kernel mode
> > [ 9.369273] #PF: error_code(0x0000) - not-present page
> > [ 9.370346] PGD 0 P4D 0
> > [ 9.371222] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
> > [ 9.372114] CPU: 0 UID: 0 PID: 128 Comm: kworker/0:2 Not tainted 6.13.6-1.qubes.fc37.x86_64 #1
> > [ 9.373184] Hardware name: ASRock X670E Taichi/X670E Taichi, BIOS 3.20 02/21/2025
> > [ 9.374183] Workqueue: kacpi_notify acpi_os_execute_deferred
> > [ 9.375124] RIP: e030:cpufreq_update_limits+0x10/0x30
> > [ 9.375840] Code: 84 00 00 00 00 00 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 48 8b 05 98 e4 21 02 <48> 8b 40 70 48 85 c0 74 06 e9 a2 36 38 00 cc e9 ec fe ff ff 66 66
> > [ 9.377009] RSP: e02b:ffffc9004058be28 EFLAGS: 00010246
> > [ 9.377667] RAX: 0000000000000000 RBX: ffff888005bf4800 RCX: ffff88805d635fa8
> > [ 9.378415] RDX: ffff888005bf4800 RSI: 0000000000000085 RDI: 0000000000000000
> > [ 9.379127] RBP: ffff888005cd7800 R08: 0000000000000000 R09: 8080808080808080
> > [ 9.379887] R10: ffff88800391abc0 R11: fefefefefefefeff R12: ffff888004e8aa00
> > [ 9.380669] R13: ffff88805d635f80 R14: ffff888004e8aa15 R15: ffff8880059baf00
> > [ 9.381514] FS: 0000000000000000(0000) GS:ffff88805d600000(0000) knlGS:0000000000000000
> > [ 9.382345] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 9.383045] CR2: 0000000000000070 CR3: 000000000202c000 CR4: 0000000000050660
> > [ 9.383786] Call Trace:
> > [ 9.384335] <TASK>
> > [ 9.384886] ? __die+0x23/0x70
> > [ 9.385456] ? page_fault_oops+0x95/0x190
> > [ 9.386036] ? exc_page_fault+0x76/0x190
> > [ 9.386636] ? asm_exc_page_fault+0x26/0x30
> > [ 9.387215] ? cpufreq_update_limits+0x10/0x30
> > [ 9.387805] acpi_processor_notify.part.0+0x79/0x150
> > [ 9.388402] acpi_ev_notify_dispatch+0x4b/0x80
> > [ 9.389013] acpi_os_execute_deferred+0x1a/0x30
> > [ 9.389610] process_one_work+0x186/0x3b0
> > [ 9.390205] worker_thread+0x251/0x360
> > [ 9.390765] ? srso_alias_return_thunk+0x5/0xfbef5
> > [ 9.391376] ? __pfx_worker_thread+0x10/0x10
> > [ 9.391957] kthread+0xd2/0x100
> > [ 9.392493] ? __pfx_kthread+0x10/0x10
> > [ 9.393043] ret_from_fork+0x34/0x50
> > [ 9.393575] ? __pfx_kthread+0x10/0x10
> > [ 9.394090] ret_from_fork_asm+0x1a/0x30
> > [ 9.394621] </TASK>
> > [ 9.395106] Modules linked in: gpio_generic amd_3d_vcache acpi_pad(-) loop fuse xenfs dm_thin_pool dm_persistent_data dm_bio_prison amdgpu amdxcp i2c_algo_bit drm_ttm_helper ttm crct10dif_pclmul drm_exec crc32_pclmul gpu_sched crc32c_intel drm_suballoc_helper polyval_clmulni drm_panel_backlight_quirks polyval_generic drm_buddy ghash_clmulni_intel sha512_ssse3 drm_display_helper sha256_ssse3 sha1_ssse3 xhci_pci cec nvme sp5100_tco xhci_hcd nvme_core nvme_auth video wmi xen_acpi_processor xen_privcmd xen_pciback xen_blkback xen_gntalloc xen_gntdev xen_evtchn scsi_dh_rdac scsi_dh_emc scsi_dh_alua uinput dm_multipath
> > [ 9.398698] CR2: 0000000000000070
> > [ 9.399266] ---[ end trace 0000000000000000 ]---
> > [ 9.399880] RIP: e030:cpufreq_update_limits+0x10/0x30
> > [ 9.400528] Code: 84 00 00 00 00 00 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 48 8b 05 98 e4 21 02 <48> 8b 40 70 48 85 c0 74 06 e9 a2 36 38 00 cc e9 ec fe ff ff 66 66
> > [ 9.401673] RSP: e02b:ffffc9004058be28 EFLAGS: 00010246
> > [ 9.402316] RAX: 0000000000000000 RBX: ffff888005bf4800 RCX: ffff88805d635fa8
> > [ 9.403060] RDX: ffff888005bf4800 RSI: 0000000000000085 RDI: 0000000000000000
> > [ 9.403819] RBP: ffff888005cd7800 R08: 0000000000000000 R09: 8080808080808080
> > [ 9.404581] R10: ffff88800391abc0 R11: fefefefefefefeff R12: ffff888004e8aa00
> > [ 9.405332] R13: ffff88805d635f80 R14: ffff888004e8aa15 R15: ffff8880059baf00
> > [ 9.406063] FS: 0000000000000000(0000) GS:ffff88805d600000(0000) knlGS:0000000000000000
> > [ 9.406830] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 9.407561] CR2: 0000000000000070 CR3: 000000000202c000 CR4: 0000000000050660
> > [ 9.408318] Kernel panic - not syncing: Fatal exception
> > [ 9.409022] Kernel Offset: disabled
> > (XEN) Hardware Dom0 crashed: 'noreboot' set - not rebooting.
> >
> > Looking at the call trace, it's likely related to ACPI, and Xen too, so
> > I'm adding relevant lists too.
> >
> > Any ideas?
> >
> > #regzbot introduced: v6.12.11..v6.13.6
>
> That code looks to have been introduced for 6.9, so I wonder if so far you merely
> were lucky not to have observed any "highest perf changed" notification. See
> 9c4a13a08a9b ("ACPI: cpufreq: Add highest perf change notification"), which imo
> merely adds a 2nd path to a pre-existing problem: cpufreq_update_limits() assumes
> that cpufreq_driver is non-NULL, and only checks cpufreq_driver->update_limits.
> But of course the assumption there may be legitimate, and it's logic elsewhere
> which is or has become flawed.

cpufreq_update_limits() needs to ensure that the driver is there.

The attached patch should address this issue, Marek please verify.

Attachment:
cpufreq-update-limits-fix.patch
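
For readers without the attachment at hand, here is a rough userspace sketch of the kind of check being discussed. It is not the attached patch and not the kernel's actual code: the struct is reduced to a single callback, and cpufreq_update_limits_checked(), demo_update_policy(), demo_driver_update_limits() and the two counters are hypothetical names made up for illustration. The oops above shows RAX as 0 and a fault address of 0x70, which is consistent with a member being read through a NULL driver pointer; the sketch simply tests the pointer before touching the callback.

```c
#include <stddef.h>

/* Userspace model only: names echo the kernel's, but the types are simplified
 * and this is an assumption-laden illustration, not the real structure. */
struct cpufreq_driver {
    void (*update_limits)(unsigned int cpu);   /* optional driver callback */
};

/* Stays NULL until a driver registers (for example, when none ever does). */
static struct cpufreq_driver *cpufreq_driver;

/* Hypothetical counters so the behaviour can be observed in a test. */
static int fallback_calls;
static int driver_calls;

/* Hypothetical stand-in for the generic path taken without a callback. */
static void demo_update_policy(unsigned int cpu)
{
    (void)cpu;
    fallback_calls++;
}

/* Hypothetical driver callback. */
static void demo_driver_update_limits(unsigned int cpu)
{
    (void)cpu;
    driver_calls++;
}

/* Returns 0 when the notification was handled, -1 when no driver is present. */
static int cpufreq_update_limits_checked(unsigned int cpu)
{
    struct cpufreq_driver *drv = cpufreq_driver;

    if (!drv)                     /* the check the unguarded version lacks */
        return -1;

    if (drv->update_limits)
        drv->update_limits(cpu);  /* driver-specific handling */
    else
        demo_update_policy(cpu);  /* generic fallback */

    return 0;
}
```

With cpufreq_driver left NULL, the function returns -1 instead of faulting; once a driver is set, either its callback or the fallback runs. Whether the real fix belongs at this spot or in the logic that lets the notification arrive without a driver is exactly the open question raised in the quoted reply.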