
[Xen-devel] Xen Security Advisory 154 (CVE-2016-2270) - x86: inconsistent cachability flags on guest mappings



> x86: enforce consistent cachability of MMIO mappings
> 
> We've been told by Intel that inconsistent cachability between
> multiple mappings of the same page can affect system stability only
> when the affected page is an MMIO one. Since the stale data issue is
> of no relevance to the hypervisor (since all guest memory accesses go
> through proper accessors and validation), handling of RAM pages
> remains unchanged here. Any MMIO mapped by domains however needs to be
> done consistently (all cachable mappings or all uncachable ones), in
> order to avoid Machine Check exceptions. Since converting existing
> cachable mappings to uncachable (at the time an uncachable mapping
> gets established) would in the PV case require tracking all mappings,
> allow MMIO to only get mapped uncachable (UC, UC-, or WC).
> 
> This also implies that in the PV case we mustn't use the L1 PTE update
> fast path when cachability flags get altered.
> 
> Since in the HVM case at least for now we want to continue honoring
> pinned cachability attributes for pages not mapped by the hypervisor,
> special case handling of r/o MMIO pages (forcing UC) gets added there.
> Arguably the counterpart change to p2m-pt.c may not be necessary, since
> UC- (which already gets enforced there) is probably strict enough.
> 
> Note that the shadow code changes include fixing the write protection
> of r/o MMIO ranges: shadow_l1e_remove_flags() and its siblings, other
> than l1e_remove_flags() and alike, return the new PTE (and hence
> ignoring their return values makes them no-ops).
> 
> This is CVE-2016-2270 / XSA-154.
> 
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> Acked-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

...

> --- a/xen/arch/x86/hvm/mtrr.c
> +++ b/xen/arch/x86/hvm/mtrr.c
> @@ -770,8 +770,17 @@ int epte_get_entry_emt(struct domain *d,
>      if ( v->domain != d )
>          v = d->vcpu ? d->vcpu[0] : NULL;
>  
> -    if ( !mfn_valid(mfn_x(mfn)) )
> +    if ( !mfn_valid(mfn_x(mfn)) ||
> +         rangeset_contains_range(mmio_ro_ranges, mfn_x(mfn),
> +                                 mfn_x(mfn) + (1UL << order) - 1) )
> +    {
> +        *ipat = 1;
>          return MTRR_TYPE_UNCACHABLE;
> +    }
> +
> +    if ( rangeset_overlaps_range(mmio_ro_ranges, mfn_x(mfn),
> +                                 mfn_x(mfn) + (1UL << order) - 1) )
> +        return -1;
>  
>      switch ( hvm_get_mem_pinned_cacheattr(d, gfn, order, &type) )
>      {

This doesn't look right. That second 'if(rangeset_overlaps_range(…))'
is tautologically false, because if it *is* true, the first if()
statement happens first and it's never reached.

The reason I'm looking is that the first if() statement is happening
for MMIO regions where it probably shouldn't¹. This means that guests
are mapping MMIO BARs of assigned devices and getting *forced* UC
(because *ipat=1) instead of taking the if(direct_mmio) path slightly
further down — which wouldn't set the 'ignore PAT' bit, and would thus
allow them to enable WC through their PAT.
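
To make the WC-via-PAT point concrete, this is how I understand the
effective-type calculation when the 'ignore PAT' bit is set vs. clear.
It's purely my own illustration of the UC/WB cases (not real Xen code
or SDM pseudocode), so the details may well be off:

/* Illustration only: with IPAT set, the EPT memory type wins outright,
 * so UC + IPAT means the guest's PAT is powerless.  With IPAT clear,
 * the EPT type combines with the guest PAT type much as an MTRR type
 * would, so EPT UC plus guest PAT WC yields WC. */
enum memtype { MT_UC = 0, MT_WC = 1, MT_WB = 6 };

static enum memtype effective_type(enum memtype ept_mt, int ipat,
                                   enum memtype guest_pat)
{
    if ( ipat )
        return ept_mt;               /* forced: guest PAT is ignored */
    if ( ept_mt == MT_UC )
        return guest_pat == MT_WC ? MT_WC : MT_UC;
    return guest_pat;                /* EPT WB: guest PAT decides */
}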

It makes me wonder if the first was actually intended to be
'!mfn_valid() && rangeset_contains_range(…)' — with logical && rather
than ||. That would make some sense because it's then explicitly
refusing to map pages which are in mmio_ro_ranges *and* mfn_valid().

And then there's a 'if (direct_mmio) return UC;' further down which
looks like it'd do the right thing for the use case I'm actually
testing. I may see if I can construct a straw man patch, but I'm kind
of unfamiliar with this code so it should be taken with a large pinch
of salt...
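
Something like the following is what I have in mind: literally just
s/||/&&/ as mused above, leaving the direct_mmio handling further down
to cope with ordinary MMIO BARs. It's not even compile-tested and the
hunk context is hand-made, so it really is only a straw man:

--- a/xen/arch/x86/hvm/mtrr.c
+++ b/xen/arch/x86/hvm/mtrr.c
@@ ... @@ int epte_get_entry_emt(struct domain *d,
-    if ( !mfn_valid(mfn_x(mfn)) ||
+    if ( !mfn_valid(mfn_x(mfn)) &&
          rangeset_contains_range(mmio_ro_ranges, mfn_x(mfn),
                                  mfn_x(mfn) + (1UL << order) - 1) )
     {
         *ipat = 1;
         return MTRR_TYPE_UNCACHABLE;
     }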

There is a separate question of whether it's actually safe to let the
guest map an MMIO page with both UC and WC simultaneously. Empirically,
it seems to be OK — I hacked a guest kernel not to enforce the mutual
exclusion, mapped the BAR with both UC and WC and ran two kernel
threads, reading and writing the whole BAR in a number of iterations.
The WC thread went a lot faster than the UC one, so it will have often
been touching the same locations as the UC thread as it 'overtook' it,
and nothing bad happened. This seems reasonable, as the dire warnings
and machine checks are more about *cached* vs. uncached mappings, not
WC vs. UC. But it would be good to have a definitive answer from Intel
and AMD about whether it's safe.
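
For the record, the guest-side test looked roughly like the module
below. It's a from-memory reconstruction rather than the exact code I
ran, BAR_PHYS/BAR_SIZE are placeholders, and it only works at all
because the kernel was hacked to skip the PAT memtype conflict check
which would normally refuse the second, conflicting ioremap:

/*
 * Map the same BAR both UC and WC and hammer it from two kthreads, so
 * the faster WC writer keeps overtaking the UC reader.  Error handling
 * is minimal; needs the memtype conflict check disabled as described.
 */
#include <linux/module.h>
#include <linux/kthread.h>
#include <linux/sched.h>
#include <linux/io.h>

#define BAR_PHYS 0xf0000000UL   /* placeholder: assigned device BAR */
#define BAR_SIZE 0x100000UL     /* placeholder: 1MiB */

static void __iomem *bar_uc, *bar_wc;
static struct task_struct *t_uc, *t_wc;

static int uc_reader(void *arg)
{
        while (!kthread_should_stop()) {
                size_t off;

                for (off = 0; off < BAR_SIZE; off += 4)
                        (void)readl(bar_uc + off);
                cond_resched();
        }
        return 0;
}

static int wc_writer(void *arg)
{
        while (!kthread_should_stop()) {
                size_t off;

                for (off = 0; off < BAR_SIZE; off += 4)
                        writel(0xdeadbeef, bar_wc + off);
                cond_resched();
        }
        return 0;
}

static int __init ucwc_init(void)
{
        bar_uc = ioremap_uc(BAR_PHYS, BAR_SIZE);
        bar_wc = ioremap_wc(BAR_PHYS, BAR_SIZE); /* normally refused */
        if (!bar_uc || !bar_wc)
                return -ENOMEM;

        t_uc = kthread_run(uc_reader, NULL, "ucwc-uc");
        t_wc = kthread_run(wc_writer, NULL, "ucwc-wc");
        return 0;
}

static void __exit ucwc_exit(void)
{
        kthread_stop(t_uc);
        kthread_stop(t_wc);
        iounmap(bar_uc);
        iounmap(bar_wc);
}

module_init(ucwc_init);
module_exit(ucwc_exit);
MODULE_LICENSE("GPL");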

-- 
dwmw2

¹ Or is that the problem — is mfn_valid() supposed to be true for MMIO
BARs, and this is actually an SR-IOV problem with resource assignment
failing to do that for VFs?



 

