[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] x86/PV32: restore PAE-extended-CR3 logic
On Tue, Apr 04, 2023 at 12:31:31PM +0200, Jan Beulich wrote: > On 04.04.2023 12:12, Roger Pau Monné wrote: > > On Wed, Feb 15, 2023 at 03:54:11PM +0100, Jan Beulich wrote: > >> While the PAE-extended-CR3 VM assist is a 32-bit only concept, it still > >> applies to guests also when run on a 64-bit hypervisor: The "extended > >> CR3" format has to be used there as well, to fit the address in the only > >> 32-bit wide register there. As a result it was a mistake that the check > >> was never enabled for that case, and was then mistakenly deleted in the > >> course of removal of 32-bit-Xen code (218adf199e68 ["x86: We can assume > >> CONFIG_PAGING_LEVELS==4"]). > >> > >> Similarly during Dom0 construction kernel awareness needs to be taken > >> into account, and respective code was again mistakenly never enabled for > >> 32-bit Dom0 when running on 64-bit Xen (and thus wrongly deleted by > >> 5d1181a5ea5e ["xen: Remove x86_32 build target"]). > >> > >> At the same time restrict enabling of the assist for Dom0 to just the > >> 32-bit case. Furthermore there's no need for an atomic update there. > >> > >> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx> > >> --- > >> I was uncertain whether to add a check to the CR3 guest read path, > >> raising e.g. #GP(0) when the value read wouldn't fit but also may not > >> be converted to "extended" format (overflow is possible there in > >> principle because of the control tools "slack" in promote_l3_table()). > >> > >> In that context I was puzzled to find no check on the CR3 guest write > >> path even in 4.2: A guest (bogusly) setting the PCD or PWT bits (or any > >> of the low reserved ones) could observe anomalous behavior rather than > >> plain failure. > >> > >> As to a Fixes: tag - it's pretty unclear which of the many original > >> 32-on-64 changes to blame. I don't think the two cited commits should > >> be referenced there, as they didn't break anything that wasn't already > >> broken. > >> > >> --- a/xen/arch/x86/mm.c > >> +++ b/xen/arch/x86/mm.c > >> @@ -1520,6 +1520,23 @@ static int promote_l3_table(struct page_ > >> unsigned int partial_flags = page->partial_flags; > >> l3_pgentry_t l3e = l3e_empty(); > >> > >> + /* > >> + * PAE pgdirs above 4GB are unacceptable if a 32-bit guest does not > >> + * understand the weird 'extended cr3' format for dealing with > >> high-order > >> + * address bits. We cut some slack for control tools (before vcpu0 is > >> + * initialised). > > > > Don't we then need some check in the vCPU init path to assure that the > > cr3 is < 32bits if we allow those to initially be set? > > > > Or will the initialization unconditionally overwrite any previous cr3 > > value? > > That's not the way I understand this "cut some slack". Instead I read it > to be meant to cover for the VM-assist bit not being set, yet. Beyond > that it is assumed to be tool stack's responsibility to constrain > addresses suitably. If it doesn't, it'll simply break the guest. (There > is some guessing on my part involved here, as the original introduction > of that code didn't further explain things.) If it's just the guest that's broken I would think it's fine. As long as such mismatch doesn't cause issues in the hypervisor internal state. Did you see a toolstack setting such entries before pae_extended_cr3 is set? Thanks, Roger.
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |