Xen project Mailing List

Re: [PATCH] x86/CPU: don't hard-code MTRR availability

From: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Date: Thu, 27 Mar 2025 10:50:40 +0100

Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Delivery-date: Thu, 27 Mar 2025 09:50:50 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu, Mar 27, 2025 at 10:15:03AM +0100, Jan Beulich wrote: > On 27.03.2025 09:21, Roger Pau Monné wrote: > > On Tue, Mar 25, 2025 at 08:18:11AM +0100, Jan Beulich wrote: > >> In particular if we're running virtualized, the underlying hypervisor > >> (which may be another Xen) may not surface MTRRs, and offer PAT only. > > > > At least for Xen, I think we offer MTRR uniformly, even on PVH > > guests? > > By default we do, but we discussed the option of offering PAT-only > environments > beyond leaving it open whether people disabling MTRR support in their guest > config are outside of supported terrain. > > >> Fixes: 5a281883cdc3 ("Hardcode many cpu features for x86/64 -- we know > >> 64-bit") > >> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx> > > > > Acked-by: Roger Pau Monné <roger.pau@xxxxxxxxxx> > > > > My main concern is whether the !mtrr path is still functional. Have > > you tried booting the resulting hypervisor with MTRR masked on CPUID? > > > > (or alternatively short-circuiting cpu_has_mtrr == 0?) > > I didn't think this would be a prereq here. If we boot in an environment truly > lacking MTRR, we'll crash on the first MTRR MSR access - none of those > accesses > use the safe accessors. Right, hopefully we don't have unprotected MTRR MSR accesses, so cpu_has_mtrr == 0 should prevent those. > Since you asked, I tried booting with "cpuid=no-mtrr". > As I'm doing this on a system with console, all I can say is that it takes > minutes to reach the point where we'd start setting up Dom0 (which itself then > takes so long that I timed out waiting for it to make progress), due to all > screen output becoming unbelievably slow after AP bringup. Surely something's > screwed somewhere, as VRAM accesses being slow (or fast) shouldn't depend on > AP > bringup having completed. I actually suspect it's not just VRAM accesses which > are slow, but that we're left running in uncachable mode altogether for > whatever > reason. > > What this maybe useful for is to figure out the reason of "Platform timer > appears to have unexpectedly wrapped <N> times", which I saw appear once. > > Given this, I'm actually uncertain whether it is legitimate then to take your > ack. I think it might be best if we can figure out what causes those issues (and possibly fix them) before taking this patch? Albeit you could argue that running excruciatingly slow is better than just crashing of an unhandled #GP from a rdmsr. Thanks, Roger.

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.