[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] x86/EPT: relax iPAT for "invalid" MFNs



On Tue, Jun 11, 2024 at 04:53:22PM +0200, Jan Beulich wrote:
> On 11.06.2024 15:52, Roger Pau Monné wrote:
> > On Tue, Jun 11, 2024 at 01:52:58PM +0200, Jan Beulich wrote:
> >> On 11.06.2024 13:08, Roger Pau Monné wrote:
> >>> I really wonder whether Xen has enough information to figure out
> >>> whether a hole (MMIO region) is supposed to be accessed as UC or
> >>> something else.
> >>
> >> It certainly hasn't, and hence is erring on the (safe) side of forcing
> >> UC.
> > 
> > Except that for the vesa framebuffer at least this is a bad choice :).
> 
> Well, yes, that's where we want WC to be permitted. But for that we only
> need to avoid setting iPAT; we still can uniformly hand back UC. Except
> (as mentioned elsewhere earlier) if the guest uses MTRRs rather than PAT
> to arrange for WC.

If we want to get this into 4.19, we likely want to go your proposed
approach then, as it's less risky.

I think a comment would be helpful to note that the fix here to not
enforce iPAT by still return UC is mostly done to accommodate vesa
regions mapped with PAT attributes to use WC.

I would also like to add some kind of note that special casing
!mfn_valid() might not be needed, but that removing it must be done
carefully to not cause regressions.

> >>>  Maybe the mfn_valid() check should be
> >>> inverted, and return WB when the underlying mfn is RAM, and otherwise
> >>> use the guest MTRRs to decide the cache attribute?
> >>
> >> First: Whether WB is correct for RAM isn't known. With some peculiar device
> >> assigned, the guest may want/need part of its RAM be e.g. WC or WT. (It's
> >> only without any physical devices assigned that we can be quite sure that
> >> WB is good for all of RAM.) Therefore, second, I think respecting MTRRs for
> >> RAM is less likely to cause problems than respecting them for MMIO.
> >>
> >> I think at this point the main question is: Do we want to do things at 
> >> least
> >> along the lines of this v1, or do we instead feel certain enough to switch
> >> the mfn_valid() to a comparison against INVALID_MFN (and perhaps moving it
> >> up to almost the top of the function)?
> > 
> > My preferred option would be the later, as that would remove a special
> > casing.  However, I'm unsure how much fallout this could cause - those
> > caching changes are always tricky and lead to unexpected fallout.
> 
> Which is the very reason why I tried to avoid going to far with this.
> 
> > OTOH the current !mfn_valid() check is very restrictive, as it forces
> > all MMIO to UC.
> 
> Which is why, in this v1, I'm relaxing only the iPAT part.
> 
> >  So by removing it we allow guest chosen types to take
> > effect, which are likely less restrictive than UC (whether those are
> > correct is another question).
> 
> No, guest chosen types still wouldn't come into play, due to what the
> switch() further down in the function does for p2m_mmio_direct.

Indeed.  That should also be removed if we decide for MMIO cache
attributes to be controlled by guest MTRRs.

> 
> >> One caveat here that I forgot to
> >> mention before: MFNs taken out of EPT entries will never be INVALID_MFN, 
> >> for
> >> the truncation that happens when populating entries. In that case we rely 
> >> on
> >> mfn_valid() to be "rejecting" them.
> > 
> > The only caller where mfns from EPT entries are passed to
> > epte_get_entry_emt() is in resolve_misconfig() AFAICT, and in that
> > case the EPT entry must be present for epte_get_entry_emt() to be
> > called.  So it seems to me that epte_get_entry_emt() can never be
> > called from an mfn constructed from an INVALID_MFN EPT entry (but it's
> > worth adding an assert for it).
> 
> Are you sure? I agree for the first of those two calls, but the second
> isn't quite as obvious. There we'd need to first prove that we will
> never create non-present super-page entries. Yet if I'm not mistaken
> for PoD we may create such.

I should go look then, didn't know PoD would do that.

Regards, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.