[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] x86/ucode: Refresh raw CPU policy after microcode load
On 04.05.2023 10:08, Andrew Cooper wrote: > On 04/05/2023 8:08 am, Jan Beulich wrote: >> On 03.05.2023 20:58, Andrew Cooper wrote: >>> Loading microcode can cause new features to appear. >> Or disappear (LWP)? While I don't think we want to panic() in this >> case (we do on the S3 resume path when recheck_cpu_features() fails >> on the BSP), it would seem to me that we want to go a step further >> than you do and at least warn when a feature went amiss. Or is that >> too much of a scope-creeping request right here? > > You're correct that I ought to discuss the disappear case. But like > livepatching, it's firmly in the realm of "the end user takes > responsibility for trying this in their test system before running it in > production". Okay, with the case at least suitably mentioned Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx> > For LWP specifically, we ought to explicitly permit its disappearance in > recheck_cpu_features(), because this specific example is known to exist, > and known safe as Xen never used or virtualised LWP functionality. Right, but iirc we did expose it to guests earlier on. And I've used it as a known example only anyway. Who knows what vendors decide to make disappear the next time round ... > Crashing on S3 I may guess you meant to continue "... is bad anyway"? >>> @@ -677,6 +678,9 @@ static long cf_check microcode_update_helper(void *data) >>> spin_lock(µcode_mutex); >>> microcode_update_cache(patch); >>> spin_unlock(µcode_mutex); >>> + >>> + /* Refresh the raw CPU policy, in case the features have changed. >>> */ >>> + calculate_raw_cpu_policy(); >> I understand this is in line with what we do during boot, but there >> and here I wonder whether this wouldn't better deal with possible >> asymmetries (e.g. in case ucode loading failed on one of the CPUs), >> along the lines of what we do near the end of identify_cpu() for >> APs. (Unlike the question higher up, this is definitely only a >> remark here, not something I'd consider dealing with right in this >> change.) > > Asymmetry is an increasingly theoretical problem. Yeah, it exists in > principle, but Xen has no way of letting you explicitly get into that > situation. > > This too falls firmly into the "end user takes responsibility for > testing it properly first" category. > > We have explicit symmetric assumptions/requirements elsewhere (e.g. for > a single system, there's 1 correct ucode blob). > > We can acknowledge that asymmetry exists, but there is basically nothing > Xen can do about it other than highlight that something is very wrong on > the system. Odds are that a system which gets into such a state won't > survive much longer. Indeed. Hence the desire to at least log the fact, such that investigation of the sudden death won't take long. Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |