Re: [Xen-devel] [PATCH v2 04/12] x86/mce: handle LMCE locally
On 03/20/17 08:24 -0600, Jan Beulich wrote:
> >>> On 17.03.17 at 07:46, <haozhong.zhang@xxxxxxxxx> wrote:
[..]
> > @@ -1704,10 +1717,11 @@ static void mce_softirq(void)
> >  {
> >      int cpu = smp_processor_id();
> >      unsigned int workcpu;
> > +    bool nowait = !this_cpu(mce_in_process);
> >
> >      mce_printk(MCE_VERBOSE, "CPU%d enter softirq\n", cpu);
> >
> > -    mce_barrier_enter(&mce_inside_bar);
> > +    mce_barrier_enter(&mce_inside_bar, nowait);
> >
> >      /*
> >       * Everybody is here. Now let's see who gets to do the
> > @@ -1720,10 +1734,10 @@ static void mce_softirq(void)
> >
> >          atomic_set(&severity_cpu, cpu);
> >
> > -        mce_barrier_enter(&mce_severity_bar);
> > +        mce_barrier_enter(&mce_severity_bar, nowait);
> >          if (!mctelem_has_deferred(cpu))
> >              atomic_set(&severity_cpu, cpu);
> > -        mce_barrier_exit(&mce_severity_bar);
> > +        mce_barrier_exit(&mce_severity_bar, nowait);
> >
> >          /* We choose severity_cpu for further processing */
> >          if (atomic_read(&severity_cpu) == cpu) {
>
> The logic here looks pretty suspicious even without your changes,
> but I think we should try hard to not make it worse. I think you
> need to avoid setting severity_cpu in the LMCE case (with,
> obviously, further resulting adjustments).

Ah yes, this patch introduces a race condition between mce_cmn_handler()
and mce_softirq() on different CPUs: mce_cmn_handler() is handling an LMCE
on CPUx while mce_softirq() is handling another LMCE on CPUy, and both
modify the global severity_cpu. As both check severity_cpu later, their
modifications interfere with each other.

I'll make mce_cmn_handler() and mce_softirq() avoid touching severity_cpu
when handling an LMCE.

Thanks,
Haozhong
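For illustration only, below is a minimal standalone C sketch of the approach
described above: a CPU handling an LMCE skips the shared severity_cpu election
entirely, so concurrent LMCEs on different CPUs can no longer race on that
global. This is not the actual Xen patch; helper names such as lmce_pending()
and handle_local_telemetry() are hypothetical placeholders, and the barrier
logic is reduced to a single compare-and-swap.

    /*
     * Standalone sketch (not Xen code) of the idea: an LMCE affects only the
     * local CPU, so its handling never reads or writes the shared election
     * variable used for broadcast MCEs.
     */
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdio.h>

    static atomic_int severity_cpu = -1;  /* election result for broadcast MCEs */

    /* Hypothetical helpers standing in for the real per-CPU state. */
    static bool lmce_pending(int cpu)            { return cpu == 1; }
    static void handle_local_telemetry(int cpu)  { printf("CPU%d: handle LMCE locally\n", cpu); }
    static void handle_global_telemetry(int cpu) { printf("CPU%d: elected for broadcast MCE\n", cpu); }

    static void mce_softirq_sketch(int cpu)
    {
        if ( lmce_pending(cpu) )
        {
            /* LMCE: only this CPU is affected, so no barriers, no election. */
            handle_local_telemetry(cpu);
            return;
        }

        /* Broadcast MCE: elect one CPU via the shared variable (simplified). */
        int expected = -1;
        atomic_compare_exchange_strong(&severity_cpu, &expected, cpu);
        if ( atomic_load(&severity_cpu) == cpu )
            handle_global_telemetry(cpu);
    }

    int main(void)
    {
        mce_softirq_sketch(0);  /* broadcast path: uses severity_cpu */
        mce_softirq_sketch(1);  /* LMCE path: never touches severity_cpu */
        return 0;
    }

Since an LMCE is delivered to exactly one CPU, no other CPU needs to observe
its handling, which is why bypassing both the barriers and the severity_cpu
election is safe on that path.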