Re: [Xen-devel] [PATCH v8 14/16] x86/microcode: Synchronize late microcode loading
On Mon, Aug 05, 2019 at 10:43:16AM +0000, Jan Beulich wrote:
>On 01.08.2019 12:22, Chao Gao wrote:
>> @@ -234,6 +267,35 @@ bool microcode_update_cache(struct microcode_patch *patch)
>>  }
>>
>>  /*
>> + * Wait for a condition to be met with a timeout (us).
>> + */
>> +static int wait_for_condition(int (*func)(void *data), void *data,
>> +                              unsigned int timeout)
>> +{
>> +    while ( !func(data) )
>> +    {
>> +        if ( !timeout-- )
>> +        {
>> +            printk("CPU%u: Timeout in %s\n", smp_processor_id(), __func__);
>
>I don't think __func__ is helpful here for problem analysis. Instead
>I think you would want to log either func or __builtin_return_address(0).

Will do.

>
>> @@ -283,37 +345,105 @@ static int microcode_update_cpu(const struct microcode_patch *patch)
>>      return err;
>>  }
>>
>> -static long do_microcode_update(void *patch)
>> +static int do_microcode_update(void *patch)
>>  {
>> -    unsigned int cpu;
>> +    unsigned int cpu = smp_processor_id();
>> +    unsigned int cpu_nr = num_online_cpus();
>> +    int ret;
>>
>> -    /* store the patch after a successful loading */
>> -    if ( !microcode_update_cpu(patch) && patch )
>> +    /* Mark loading an ucode is in progress */
>> +    cmpxchg(&loading_state, LOADING_EXITED, LOADING_ENTERED);
>
>Why cmpxchg(), especially when you don't check whether the store
>has actually happened? (Same further down then.)

loading_state is set to "LOADING_EXITED" by the caller, so the first CPU
coming here stores "LOADING_ENTERED" to it. And we don't need any special
handling for the CPU that sets the state, so the return value isn't checked.

Here are my considerations about how to set the state:
1) We cannot set the state in the caller, because we would use this state
to block #NMI handling, and setting it in the caller may break the
stop_machine_run mechanism with patch 16.
2) The first CPU reaching here should set the state (meaning we cannot
choose one particular CPU, e.g. the BSP, to do it). With this, #NMI
handling is disabled system-wide before any CPU calls in.
Otherwise, if there is an exception for a CPU, it may still be in #NMI
handling when its sibling thread starts loading a ucode.
3) A simple assignment on every CPU is problematic in some cases. For
example, if one CPU comes in after the other CPUs timed out and left, it
might set the state back to "LOADING_ENTERED" and be blocked in #NMI
handling forever with patch 16.

That's why I chose to use a cmpxchg(). Regarding the cmpxchg() in the
error handling below, it could be replaced by a simple assignment, but I
am not sure whether that is better, considering cache line bouncing.

>
>> +    cpumask_set_cpu(cpu, &cpu_callin_map);
>> +    ret = wait_for_condition(wait_cpu_callin, (void *)(unsigned long)cpu_nr,
>> +                             MICROCODE_CALLIN_TIMEOUT_US);
>> +    if ( ret )
>>      {
>> -        spin_lock(&microcode_mutex);
>> -        microcode_update_cache(patch);
>> -        spin_unlock(&microcode_mutex);
>> -        patch = NULL;
>> +        cmpxchg(&loading_state, LOADING_ENTERED, LOADING_ABORTED);
>> +        return ret;
>>      }
>>
>> -    if ( microcode_ops->end_update )
>> -        microcode_ops->end_update();
>> +    /*
>> +     * Load microcode update on only one logical processor per core, or in
>> +     * AMD's term, one core per compute unit. The one with the lowest thread
>> +     * id among all siblings is chosen to perform the loading.
>> +     */
>> +    if ( (cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu))) )
>> +    {
>> +        static unsigned int panicked = 0;
>> +        bool monitor;
>> +        unsigned int done;
>> +        unsigned long tick = 0;
>>
>> -    cpu = cpumask_next(smp_processor_id(), &cpu_online_map);
>> -    if ( cpu < nr_cpu_ids )
>> -        return continue_hypercall_on_cpu(cpu, do_microcode_update, patch);
>> +        ret = microcode_ops->apply_microcode(patch);
>> +        if ( !ret )
>> +        {
>> +            unsigned int cpu2;
>>
>> -    /* Free the patch if no CPU has loaded it successfully. */
>> -    if ( patch )
>> -        microcode_free_patch(patch);
>> +            atomic_inc(&cpu_updated);
>> +            /* Propagate revision number to all siblings */
>> +            for_each_cpu(cpu2, per_cpu(cpu_sibling_mask, cpu))
>> +                per_cpu(cpu_sig, cpu2).rev = this_cpu(cpu_sig).rev;
>> +        }
>>
>> -    return 0;
>> +        /*
>> +         * The first CPU reaching here will monitor the progress and emit
>> +         * warning message if the duration is too long (e.g. >1 second).
>> +         */
>> +        monitor = !atomic_inc_return(&cpu_out);
>> +        if ( monitor )
>> +            tick = rdtsc_ordered();
>> +
>> +        /* Waiting for all cores or computing units finishing update */
>> +        done = atomic_read(&cpu_out);
>> +        while ( panicked && done != nr_cores )
>
>Don't you mean "!panicked" here, or || instead of && ? Otherwise I don't
>see how the loop would ever be entered.

Sorry. It should be !panicked.

Other comments are reasonable and I will follow your suggestions.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel