Re: [Xen-devel] [PATCH v7 08/10] x86/microcode: Synchronize late microcode loading
On Wed, Jun 05, 2019 at 08:09:43AM -0600, Jan Beulich wrote:
>>>> On 27.05.19 at 10:31, <chao.gao@xxxxxxxxx> wrote:
>> This patch ports microcode improvement patches from linux kernel.
>>
>> Before you read any further: the early loading method is still the
>> preferred one and you should always do that. The following patch is
>> improving the late loading mechanism for long running jobs and cloud use
>> cases.
>>
>> Gather all cores and serialize the microcode update on them by doing it
>> one-by-one to make the late update process as reliable as possible and
>> avoid potential issues caused by the microcode update.
>>
>> Signed-off-by: Chao Gao <chao.gao@xxxxxxxxx>
>> Tested-by: Chao Gao <chao.gao@xxxxxxxxx>
>> [linux commit: a5321aec6412b20b5ad15db2d6b916c05349dbff]
>> [linux commit: bb8c13d61a629276a162c1d2b1a20a815cbcfbb7]
>> Cc: Kevin Tian <kevin.tian@xxxxxxxxx>
>> Cc: Jun Nakajima <jun.nakajima@xxxxxxxxx>
>> Cc: Ashok Raj <ashok.raj@xxxxxxxxx>
>> Cc: Borislav Petkov <bp@xxxxxxx>
>> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> Cc: Jan Beulich <jbeulich@xxxxxxxx>
>> ---
>> Changes in v7:
>> - Check whether 'timeout' is 0 rather than "<=0" since it is unsigned int.
>> - reword the comment above microcode_update_cpu() to clearly state that
>> one thread per core should do the update.
>>
>> Changes in v6:
>> - Use one timeout period for rendezvous stage and another for update stage.
>> - scale time to wait by the number of remaining cpus to respond.
>> It helps to find something wrong earlier and thus we can reboot the
>> system earlier.
>> ---
>>  xen/arch/x86/microcode.c | 171 ++++++++++++++++++++++++++++++++++++++++++-----
>>  1 file changed, 155 insertions(+), 16 deletions(-)
>>
>> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
>> index 23cf550..f4a417e 100644
>> --- a/xen/arch/x86/microcode.c
>> +++ b/xen/arch/x86/microcode.c
>> @@ -22,6 +22,7 @@
>> */
>>
>> #include <xen/cpu.h>
>> +#include <xen/cpumask.h>
>
>It seems vanishingly unlikely that you would need this explicit #include
>here, but it certainly isn't wrong.
>
>> @@ -270,31 +296,90 @@ bool microcode_update_cache(struct microcode_patch *patch)
>>      return true;
>>  }
>>
>> -static long do_microcode_update(void *patch)
>> +/* Wait for CPUs to rendezvous with a timeout (us) */
>> +static int wait_for_cpus(atomic_t *cnt, unsigned int expect,
>> +                         unsigned int timeout)
>>  {
>> -    int error, cpu;
>> -
>> -    error = microcode_update_cpu(patch);
>> -    if ( error )
>> +    while ( atomic_read(cnt) < expect )
>>      {
>> -        microcode_ops->free_patch(microcode_cache);
>> -        return error;
>> +        if ( !timeout )
>> +        {
>> +            printk("CPU%d: Timeout when waiting for CPUs calling in\n",
>> +                   smp_processor_id());
>> +            return -EBUSY;
>> +        }
>> +        udelay(1);
>> +        timeout--;
>>      }
>
>There's no comment here and nothing in the description: I don't
>recall clarification as to whether RDTSC is fine to be issued by a
>thread when ucode is being updated by another thread on the
>same core.
Yes. I think it is fine.
Ashok, could you share your opinion on this question?
>
>> +static int do_microcode_update(void *patch)
>> +{
>> +    unsigned int cpu = smp_processor_id();
>> +    unsigned int cpu_nr = num_online_cpus();
>> +    unsigned int finished;
>> +    int ret;
>> +    static bool error;
>>
>> -    microcode_update_cache(patch);
>> +    atomic_inc(&cpu_in);
>> +    ret = wait_for_cpus(&cpu_in, cpu_nr, MICROCODE_CALLIN_TIMEOUT_US);
>> +    if ( ret )
>> +        return ret;
>>
>> -    return error;
>> +    ret = microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
>> +    /*
>> +     * Load microcode update on only one logical processor per core.
>> +     * Here, among logical processors of a core, the one with the
>> +     * lowest thread id is chosen to perform the loading.
>> +     */
>> +    if ( !ret && (cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu))) )
>
>At the very least it's not obvious whether this hyper-threading-centric
>view ("logical processor") also applies to AMD's compute unit model
>(which reuses cpu_sibling_mask). It does, as the respective MSRs are
>per-compute-unit rather than per-core, but I'd appreciate if the
>wording could be adjusted to explicitly name both cases (multiple
>threads per core and multiple cores per CU).
OK. Will do
>
>> +    {
>> +        ret = microcode_ops->apply_microcode(patch);
>> +        if ( !ret )
>> +            atomic_inc(&cpu_updated);
>> +    }
>> +    /*
>> +     * Increase the wait timeout to a safe value here since we're serializing
>
>I'm struggling with the "increase": I don't see anything being increased
>here. You simply use a larger timeout than above.
>
>> +     * the microcode update and that could take a while on a large number of
>> +     * CPUs. And that is fine as the *actual* timeout will be determined by
>> +     * the last CPU finished updating and thus cut short
>> +     */
>> +    atomic_inc(&cpu_out);
>> +    finished = atomic_read(&cpu_out);
>> +    while ( !error && finished != cpu_nr )
>> +    {
>> +        /*
>> +         * During each timeout interval, at least a CPU is expected to
>> +         * finish its update. Otherwise, something goes wrong.
>> +         */
>> +        if ( wait_for_cpus(&cpu_out, finished + 1,
>> +                           MICROCODE_UPDATE_TIMEOUT_US) && !error )
>> +        {
>> +            error = true;
>> +            panic("Timeout when finishing updating microcode (finished %d/%d)",
>> +                  finished, cpu_nr);
>
>Why the setting of "error" when you panic anyway?
>
>And please use format specifiers matching the types of the
>further arguments (i.e. twice %u here, but please check other
>code as well).
>
>Furthermore (and I'm sure I've given this comment before) if
>you really hit the limit, how many panic() invocations are there
>going to be? You run this function on all CPUs after all.
"error" is there to avoid calling panic() on multiple CPUs simultaneously.
Roger is right: atomic primitives should be used here.
>
>On the whole, taking a 256-thread system as example, you
>allow the whole process to take over 4 min without calling
>panic().
>Leaving aside guests, I don't think Xen itself would
>survive this in all cases. We've found the need to process
>softirqs with far smaller delays, in particular from key handlers
>producing lots of output. At the very least there should be a
>bold warning logged if the system had been in stop-machine
>state for, say, longer than 100ms (value subject to discussion).
>
In theory, if you mean 256 cores, yes. Do you think a configurable,
run-time changeable upper bound on the whole process would address your
concern? The default for this upper bound could be set to a large value
(for example, 1s * the number of online cores), and the admin could
adjust/lower it according to the way the update is performed (serial or
parallel) and other requirements. Once the upper bound is reached, we
would call panic().
>> +        }
>> +
>> +        finished = atomic_read(&cpu_out);
>> +    }
>> +
>> +    /*
>> +     * Refresh CPU signature (revision) on threads which didn't call
>> +     * apply_microcode().
>> +     */
>> +    if ( cpu != cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
>> +        ret = microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
>
>Another option would be for the CPU doing the update to simply
>propagate the new value to all its siblings' cpu_sig values.
Will do.
>
>> @@ -337,12 +429,59 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>>          if ( patch )
>>              microcode_ops->free_patch(patch);
>>          ret = -EINVAL;
>> -        goto free;
>> +        goto put;
>>      }
>>
>> -    ret = continue_hypercall_on_cpu(cpumask_first(&cpu_online_map),
>> -                                    do_microcode_update, patch);
>> +    atomic_set(&cpu_in, 0);
>> +    atomic_set(&cpu_out, 0);
>> +    atomic_set(&cpu_updated, 0);
>> +
>> +    /* Calculate the number of online CPU core */
>> +    nr_cores = 0;
>> +    for_each_online_cpu(cpu)
>> +        if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
>> +            nr_cores++;
>> +
>> +    printk(XENLOG_INFO "%d cores are to update their microcode\n", nr_cores);
>> +
>> +    /*
>> +     * We intend to disable interrupt for long time, which may lead to
>> +     * watchdog timeout.
>> +     */
>> +    watchdog_disable();
>> +    /*
>> +     * Late loading dance. Why the heavy-handed stop_machine effort?
>> +     *
>> +     * - HT siblings must be idle and not execute other code while the other
>> +     *   sibling is loading microcode in order to avoid any negative
>> +     *   interactions cause by the loading.
>> +     *
>> +     * - In addition, microcode update on the cores must be serialized until
>> +     *   this requirement can be relaxed in the future. Right now, this is
>> +     *   conservative and good.
>> +     */
>> +    ret = stop_machine_run(do_microcode_update, patch, NR_CPUS);
>> +    watchdog_enable();
>> +
>> +    if ( atomic_read(&cpu_updated) == nr_cores )
>> +    {
>> +        spin_lock(&microcode_mutex);
>> +        microcode_update_cache(patch);
>> +        spin_unlock(&microcode_mutex);
>> +    }
>> +    else if ( atomic_read(&cpu_updated) == 0 )
>> +        microcode_ops->free_patch(patch);
>> +    else
>> +    {
>> +        printk("Updating microcode succeeded on part of CPUs and failed on\n"
>> +               "others due to an unknown reason. A system with different\n"
>> +               "microcode revisions is considered unstable. Please reboot and\n"
>> +               "do not load the microcode that triggers this warning\n");
>> +        microcode_ops->free_patch(patch);
>> +    }
>
>As said on an earlier patch, I think the cache can be updated if at
>least one CPU loaded the blob successfully. Additionally I'd like to
>ask that you log the number of successfully updated cores. And
>finally perhaps "differing" instead of "different" and omit "due to
>an unknown reason"?
Will do.
Thanks
Chao
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel