|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] x86/AMD: expose SYSCFG, TOM, and TOM2 to Dom0
On 27.05.2021 10:33, Roger Pau Monné wrote:
> On Wed, May 26, 2021 at 02:59:00PM +0200, Jan Beulich wrote:
>> Sufficiently old Linux (3.12-ish) accesses these MSRs in an unguarded
>> manner. Furthermore these MSRs, at least on Fam11 and older CPUs, are
>> also consulted by modern Linux, and their (bogus) built-in zapping of
>> #GP faults from MSR accesses leads to it effectively reading zero
>> instead of the intended values, which are relevant for PCI BAR placement
>> (which ought to all live in MMIO-type space, not in DRAM-type one).
>>
>> For SYSCFG, only certain bits get exposed. In fact, whether to expose
>> MtrrVarDramEn is debatable: It controls use of not just TOM, but also
>> the IORRs. Introduce (consistently named) constants for the bits we're
>> interested in and use them in pre-existing code as well.
>
> I think we should also allow access to the IORRs MSRs for coherency
> (c001001{6,9}) for the hardware domain.
Hmm, originally I was under the impression that these could conceivably
be written by OSes, and hence would want dealing with separately. But
upon re-reading I see that they are supposed to be set by the BIOS alone.
So yes, let me add them for read access, taking care of the limitation
that I had to spell out.
This raises the question then though whether to also include SMMAddr
and SMMMask in the set - the former does get accessed by Linux as well,
and was one of the reasons for needing 6eef0a99262c ("x86/PV:
conditionally avoid raising #GP for early guest MSR reads").
Especially for SMMAddr, and maybe also for IORR_BASE, returning zero
for DomU-s might be acceptable. The respective masks, however, can
imo not sensibly be returned as zero. Hence even there I'd leave DomU
side handling (see below) for a later time.
>> As a welcome side effect, verbosity on/of debug builds gets (perhaps
>> significantly) reduced.
>>
>> Note that at least as far as those MSR accesses by Linux are concerned,
>> there's no similar issue for DomU-s, as the accesses sit behind PCI
>> device matching logic. The checked for devices would never be exposed to
>> DomU-s in the first place. Nevertheless I think that at least for HVM we
>> should return sensible values, not 0 (as svm_msr_read_intercept() does
>> right now). The intended values may, however, need to be determined by
>> hvmloader, and then get made known to Xen.
>
> Could we maybe come up with a fixed memory layout that hvmloader had
> to respect?
>
> Ie: DRAM from 0 to 3G, MMIO from 3G to 4G, and then the remaining
> DRAM from 4G in a contiguous single block?
>
> hvmloader would have to place BARs that don't fit in the 3G-4G hole at
> the end of DRAM (ie: after TOM2).
Such a fixed scheme may be too limiting, I'm afraid.
>> --- a/xen/arch/x86/msr.c
>> +++ b/xen/arch/x86/msr.c
>> @@ -339,6 +339,19 @@ int guest_rdmsr(struct vcpu *v, uint32_t
>> *val = msrs->tsc_aux;
>> break;
>>
>> + case MSR_K8_SYSCFG:
>> + case MSR_K8_TOP_MEM1:
>> + case MSR_K8_TOP_MEM2:
>> + if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
>> + goto gp_fault;
>> + if ( !is_hardware_domain(d) )
>> + return X86EMUL_UNHANDLEABLE;
>
> It might be clearer to also handle the !is_hardware_domain case here,
> instead of deferring to svm_msr_read_intercept:
>
> if ( is_hardware_domain(d) )
> rdmsrl(msr, *val);
> else
> *val = 0;
As said in the post-commit-message remark, I don't think returning 0
here is appropriate. I'd be willing to move DomU handling here, but
only once it's sane.
>> @@ -279,10 +286,7 @@
>> #define MSR_K8_TOP_MEM1 0xc001001a
>> #define MSR_K7_CLK_CTL 0xc001001b
>> #define MSR_K8_TOP_MEM2 0xc001001d
>> -#define MSR_K8_SYSCFG 0xc0010010
>>
>> -#define K8_MTRRFIXRANGE_DRAM_ENABLE 0x00040000 /* MtrrFixDramEn bit */
>> -#define K8_MTRRFIXRANGE_DRAM_MODIFY 0x00080000 /* MtrrFixDramModEn bit */
>> #define K8_MTRR_RDMEM_WRMEM_MASK 0x18181818 /* Mask: RdMem|WrMem */
>
> That last one seem to be unused, I wonder if you could also drop it as
> part of this cleanup?
It's for an unrelated set of MSRs, so I thought I'd leave it alone
here. Otoh the value is at best bogus anyway, as it assumes both
halves of fixed-size MTRRs get accessed separately. So I guess
since you ask for it, I'll drop it at this occasion.
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |