Re: [PATCH v3 08/10] x86/mtrr: let cache_aps_delayed_init replace mtrr_aps_delayed_init
On 28.09.22 18:32, Juergen Gross wrote:
> On 28.09.22 18:12, Borislav Petkov wrote:
>> On Wed, Sep 28, 2022 at 03:43:56PM +0200, Juergen Gross wrote:
>>> Would you feel better with adding a new enum member
>>> CPUHP_AP_CACHECTRL_ONLINE?
>>>
>>> This would avoid a possible source of failure during resume in case
>>> no slot for CPUHP_AP_ONLINE_DYN is found (quite improbable, but in
>>> theory possible).
>>
>> Let's keep that in the bag for the time when we get to cross that
>> bridge.
>>
>>> You wouldn't want to do that there, as there are multiple places
>>> where pm_sleep_enable_secondary_cpus() is being called.
>>
>> We want all of them, I'd say. They're all some sort of suspend AFAICT.
>> But yes, if we get to do it, that would need a proper audit.
>>
>>> Additionally not all cases are coming in via
>>> pm_sleep_enable_secondary_cpus(), as there is e.g. a call of
>>> suspend_enable_secondary_cpus() from kernel_kexec(), which wants to
>>> have the same handling.
>>
>> Which means, more hairy.
>>
>>> arch_thaw_secondary_cpus_begin() and arch_thaw_secondary_cpus_end()
>>> are the functions to mark start and end of the special region where
>>> the delayed MTRR setup should happen.
>>
>> Yap, it seems like the best solution at the moment. Want me to do a
>> proper patch and test it on real hw?
>
> I can do that.

Okay, let's define what is meant by "that" just to be on the same page.

The idea to use a hotplug callback seems to be rather risky IMHO. At least CPUHP_AP_ONLINE_DYN seems to be way too late, as there are several device drivers hooking in with the same or lower priority already. And device drivers might rely on PAT settings in PTEs or MTRR being set up correctly.

Another problematic case is CPUHP_AP_MICROCODE_LOADER, which is explicitly doing cache writeback and invalidation, which seems to be risky without having a sane PAT/MTRR state of the processor. It should be noted that the microcode loader is registered via late_initcall(), so boot isn't affected by the delayed MTRR/PAT init when booting.
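
To illustrate the ordering concern above, here is a minimal userspace sketch in C. It is not kernel code: the enum is a hypothetical, heavily reduced stand-in for enum cpuhp_state, all hp_* names are invented for illustration, and the relative order of the slots is an assumption taken from the description in this message (callbacks run in ascending state order, and the microcode loader and driver callbacks sit at or before the dynamic slot).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace model only. This enum is a hypothetical stand-in for the
 * kernel's enum cpuhp_state; only the relative order of entries matters.
 */
enum hp_state {
    HP_AP_CACHECTRL,   /* hypothetical early, preregistered slot */
    HP_AP_MICROCODE,   /* stands in for CPUHP_AP_MICROCODE_LOADER */
    HP_AP_DRIVER,      /* stands in for some device driver's callback */
    HP_AP_ONLINE_DYN,  /* stands in for CPUHP_AP_ONLINE_DYN */
    HP_NR_STATES
};

struct hp_cpu {
    int cache_ready;        /* nonzero once the modeled MTRR/PAT init ran */
    int ran_without_cache;  /* callbacks that ran before that init */
};

typedef void (*hp_callback)(struct hp_cpu *cpu);

static hp_callback hp_callbacks[HP_NR_STATES];

static void hp_register(enum hp_state state, hp_callback cb)
{
    assert((int)state >= 0 && (int)state < HP_NR_STATES);
    hp_callbacks[state] = cb;
}

/* Models the MTRR/PAT setup of the CPU being brought up. */
static void hp_cache_init(struct hp_cpu *cpu)
{
    cpu->cache_ready = 1;
}

/* Models a callback that relies on a sane MTRR/PAT state. */
static void hp_cache_user(struct hp_cpu *cpu)
{
    if (!cpu->cache_ready)
        cpu->ran_without_cache++;
}

/*
 * Bring up one modeled CPU with the cache init hooked at cache_slot
 * (expected to be HP_AP_CACHECTRL or HP_AP_ONLINE_DYN). Returns how many
 * cache-dependent callbacks ran before the cache init did.
 */
int hp_bringup(enum hp_state cache_slot)
{
    struct hp_cpu cpu = { 0, 0 };
    int state;

    for (state = 0; state < HP_NR_STATES; state++)
        hp_callbacks[state] = NULL;

    hp_register(HP_AP_MICROCODE, hp_cache_user);
    hp_register(HP_AP_DRIVER, hp_cache_user);
    hp_register(cache_slot, hp_cache_init);

    /* Callbacks are invoked in ascending state order. */
    for (state = 0; state < HP_NR_STATES; state++)
        if (hp_callbacks[state])
            hp_callbacks[state](&cpu);

    return cpu.ran_without_cache;
}
```

In this model, hooking the cache init at the dynamic slot lets both dependent callbacks run first, while the early preregistered slot lets none of them run first. The real kernel has many more states, so the sketch only shows the shape of the problem.
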
So the only safe way to use a hotplug callback would be to have a rather early preregistered slot in enum cpuhp_state.

Regarding resume and kexec, I'm no longer sure that doing the delayed MTRR/PAT init is such a great idea. It might save some milliseconds, but the risks mentioned above with e.g. microcode loading should apply there as well.

So right now I'm inclined to stay on the safe side by not adding any cpu hotplug hook, and to use just the same "delayed AP init" flag as today, merely renaming it. This would leave the delayed MTRR/PAT init in place for the resume and kexec cases, but deferring the MTRR/PAT cleanup due to this potential issue does not seem appropriate, as the cleanup isn't changing the behavior here.

We should, however, have a discussion in parallel or later, whether the whole thaw_secondary_cpus() handling is really okay or whether it should be changed in some way.

Juergen
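
For readers following along, the "delayed AP init" flag approach described in this message can be modeled roughly as below. This is a userspace sketch under stated assumptions, not the kernel implementation: all dl_* names are hypothetical, the begin/end helpers only loosely mirror the role of arch_thaw_secondary_cpus_begin() and arch_thaw_secondary_cpus_end(), and the idea that the deferred path needs a single rendezvous for all APs instead of one per AP is my reading of where the saved milliseconds would come from.

```c
#include <assert.h>
#include <stdbool.h>

#define DL_NR_CPUS 4    /* CPU 0 is the boot CPU, 1..3 are APs */

static bool dl_delayed_init;            /* models the "delayed AP init" flag */
static bool dl_cpu_ready[DL_NR_CPUS];   /* per-CPU modeled MTRR/PAT state */
static int dl_rendezvous_count;         /* modeled cross-CPU rendezvous */

/* Start of the special region: defer the per-AP cache setup. */
static void dl_thaw_begin(void)
{
    dl_delayed_init = true;
}

/* Called for each AP as it comes online. */
static void dl_ap_online(int cpu)
{
    assert(cpu > 0 && cpu < DL_NR_CPUS);
    if (dl_delayed_init)
        return;             /* deferred until dl_thaw_end() */
    dl_cpu_ready[cpu] = true;
    dl_rendezvous_count++;  /* one rendezvous per AP */
}

/* End of the special region: set up all APs in one go. */
static void dl_thaw_end(void)
{
    int cpu;

    if (!dl_delayed_init)
        return;
    for (cpu = 1; cpu < DL_NR_CPUS; cpu++)
        dl_cpu_ready[cpu] = true;
    dl_rendezvous_count++;  /* a single rendezvous for all APs */
    dl_delayed_init = false;
}

/* Returns true if every AP has its modeled cache state set up. */
bool dl_all_aps_ready(void)
{
    int cpu;

    for (cpu = 1; cpu < DL_NR_CPUS; cpu++)
        if (!dl_cpu_ready[cpu])
            return false;
    return true;
}

/* Models one resume cycle; returns the number of rendezvous needed. */
int dl_resume(bool delayed)
{
    int cpu;

    dl_delayed_init = false;
    dl_rendezvous_count = 0;
    for (cpu = 0; cpu < DL_NR_CPUS; cpu++)
        dl_cpu_ready[cpu] = (cpu == 0);

    if (delayed)
        dl_thaw_begin();
    for (cpu = 1; cpu < DL_NR_CPUS; cpu++)
        dl_ap_online(cpu);
    dl_thaw_end();

    return dl_rendezvous_count;
}
```

The model also makes the risk visible: between dl_thaw_begin() and dl_thaw_end(), an AP is online while its dl_cpu_ready entry is still false, which is the window in which a callback such as the microcode loader would run without a sane MTRR/PAT state.
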