 
	
Re: [PATCH for-4.16] Revert "x86/CPUID: shrink max_{,sub}leaf fields according to actual leaf contents"

On 25.11.2021 18:28, Andrew Cooper wrote:
> On 25/11/2021 10:43, Roger Pau Monné wrote:
>> On Thu, Nov 25, 2021 at 11:25:36AM +0100, Jan Beulich wrote:
>>> On 24.11.2021 22:11, Andrew Cooper wrote:
>>>> OSSTest has identified a 3rd regression caused by this change.  Migration
>>>> between Xen 4.15 and 4.16 on the nocera pair of machines (AMD Opteron 4133)
>>>> fails with:
>>>>
>>>>   xc: error: Failed to set CPUID policy: leaf 00000000, subleaf ffffffff, msr ffffffff (22 = Invalid argument): Internal error
>>>>   xc: error: Restore failed (22 = Invalid argument): Internal error
>>>>
>>>> which is a safety check to prevent resuming the guest when the CPUID data
>>>> has been truncated.  The problem is caused by shrinking of the max
>>>> policies, which is an ABI that needs handling compatibly between different
>>>> versions of Xen.
>>>>
>>>> Furthermore, shrinking of the default policies also breaks things in some
>>>> cases, because certain cpuid= settings in a VM config file which used to
>>>> have an effect will now be silently discarded.
>>>>
>>>> This reverts commit 540d911c2813c3d8f4cdbb3f5672119e5e768a3d, as well as
>>>> the partial fix attempt in 81da2b544cbb003a5447c9b14d275746ad22ab37 (which
>>>> added one new case where cpuid= settings might not apply correctly) and
>>>> restores the same behaviour as Xen 4.15.
>>>>
>>>> Fixes: 540d911c2813 ("x86/CPUID: shrink max_{,sub}leaf fields according to actual leaf contents")
>>>> Fixes: 81da2b544cbb ("x86/cpuid: prevent shrinking migrated policies max leaves")
>>>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>>> While not strictly needed with Roger having given his already,
>>> Acked-by: Jan Beulich <jbeulich@xxxxxxxx>
>>> to signal my (basic) agreement with the course of action taken.
>>> Nevertheless I fear this is going to become yet one more case where
>>> future action is promised, but things then die out.
>> I'm certainly happy to look at newer versions of this patch, but I
>> think we should consider doing the shrinking only on the toolstack
>> side, and only after all the manipulations on the policy have been
>> performed.
> 
> Correct.
> 
> The max policies cannot be shrunk - they are, by definition, the upper
> bounds that we audit against.  (More precisely, they must never end up
> lower than an older Xen used to offer on the same configuration, and
> must not be lower than anything the user may opt in to.)

I disagree: For one, the user cannot opt in to anything beyond max policy.
Or else that policy isn't "max" anymore. The user may opt in to a higher
than useful max (sub)leaf, but that's independent. I'm also not convinced
older Xen mistakenly offering too much can be taken as an excuse that we
can't ever go below that. We've done so in the past iirc, with workarounds
added elsewhere. Older Xen offering too high a max (sub)leaf again is
independent. Max (sub)leaf requests from the user should, contrary to my
view when submitting the original change, be honored. This would then
automatically include migrating-in guests.

> The default policies can in principle be shrunk, but only if the
> toolstack learns to grow max leaf too (which it will need to).
> Nevertheless, actually shrinking the default policies is actively
> unhelpful, because it is wasting time doing something which the
> toolstack needs to undo later.

I agree.

> The policy for new domains should be shrunk, but only after every other
> adjustment is made.  This is one small aspect of teaching the toolstack
> to properly understand CPUID (and MSR) policies, and has always been on
> the plan.

And this not being the case yet is getting too prominent for my taste
with the need to raise the max Intel leaves we know of for things like
AMX or KeyLocker. I didn't get shrinking done right; apologies for that.
But I continue to think that proper shrinking ought to be a prereq to
that, without delaying that work (effectively complete for AMX, partial
for KeyLocker) almost indefinitely.

Jan
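
The ordering issue discussed in this thread (shrink the policy only after
every other adjustment has been made) can be illustrated with a minimal
sketch in C. This is a toy model, not Xen's code: the `toy_policy` structure,
the `toy_*` helpers, the eight-leaf limit and the leaf values are all
hypothetical, and the rule that an override beyond `max_leaf` is dropped is
an assumption made to mirror the "silently discarded" behaviour described in
the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_LEAVES 8u   /* hypothetical size; real policies hold far more */

struct toy_leaf { uint32_t a, b, c, d; };

struct toy_policy {
    uint32_t max_leaf;                    /* what leaf 0 EAX would report */
    struct toy_leaf leaf[TOY_NR_LEAVES];
};

static bool toy_leaf_is_empty(const struct toy_leaf *l)
{
    return !(l->a | l->b | l->c | l->d);
}

/* Lower max_leaf to the highest leaf that carries any non-zero data. */
static void toy_shrink(struct toy_policy *p)
{
    assert(p->max_leaf < TOY_NR_LEAVES);

    while ( p->max_leaf > 0 && toy_leaf_is_empty(&p->leaf[p->max_leaf]) )
        p->max_leaf--;
}

/*
 * Apply one cpuid=-style override.  Assumption of this sketch: anything
 * beyond the current max_leaf is dropped rather than growing the policy.
 */
static bool toy_set_leaf(struct toy_policy *p, uint32_t idx,
                         struct toy_leaf val)
{
    if ( idx >= TOY_NR_LEAVES || idx > p->max_leaf )
        return false;

    p->leaf[idx] = val;
    return true;
}

/* Build a policy with one override, shrinking either too early or last. */
static uint32_t toy_build(bool shrink_first, uint32_t idx,
                          struct toy_leaf val)
{
    struct toy_policy p = { .max_leaf = TOY_NR_LEAVES - 1 };

    p.leaf[1].a = 0x00100f00;   /* placeholder data in leaf 1 only */

    if ( shrink_first )
        toy_shrink(&p);         /* early shrink: max_leaf drops to 1 */

    toy_set_leaf(&p, idx, val); /* the override may now be out of range */
    toy_shrink(&p);             /* final shrink, after all adjustments */

    return p.max_leaf;
}
```

In this model, calling `toy_build(true, 5, ...)` loses an override for leaf 5
because the early shrink has already lowered `max_leaf` to 1, whereas
`toy_build(false, 5, ...)` keeps it and only then trims the unused leaves
above it.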
 
 