Re: [Xen-devel] [PATCH 12/19] x86/mce: handle LMCE locally
On 02/22/17 06:53 -0700, Jan Beulich wrote:
> >>> On 17.02.17 at 07:39, <haozhong.zhang@xxxxxxxxx> wrote:
> > --- a/xen/arch/x86/cpu/mcheck/barrier.c
> > +++ b/xen/arch/x86/cpu/mcheck/barrier.c
> > @@ -20,7 +20,7 @@ void mce_barrier_enter(struct mce_softirq_barrier *bar)
> >  {
> >      int gen;
> >
> > -    if (!mce_broadcast)
> > +    if ( !mce_broadcast || __get_cpu_var(lmce_in_process) )
>
> this_cpu() please instead of __get_cpu_var() (which we should get
> rid of rather sooner than later).

Will use this_cpu().

> > @@ -462,6 +474,7 @@ void mcheck_cmn_handler(const struct cpu_user_regs *regs)
> >      uint64_t gstatus;
> >      mctelem_cookie_t mctc = NULL;
> >      struct mca_summary bs;
> > +    bool *lmce_in_process = &__get_cpu_var(lmce_in_process);
> >
> >      mce_spin_lock(&mce_logout_lock);
> >
> > @@ -505,6 +518,8 @@ void mcheck_cmn_handler(const struct cpu_user_regs *regs)
> >      }
> >      mce_spin_unlock(&mce_logout_lock);
> >
> > +    *lmce_in_process = bs.lmce;
>
> You don't need a new local variable for this.
>
> > @@ -1709,6 +1724,7 @@ static void mce_softirq(void)
> >  {
> >      int cpu = smp_processor_id();
> >      unsigned int workcpu;
> > +    bool lmce = per_cpu(lmce_in_process, cpu);
>
> Is this flag valid to be looked at anymore at this point in time? MCIP
> was cleared a lot earlier, so there may well have been a 2nd #MC
> in between. In any event you again don't need the local variable
> here afaict.

A non-LMCE #MC coming in between does not cause a problem: as
lmce_in_process is cleared on all CPUs in that case, mce_softirq() on
all CPUs will sync with each other as before, and finally one of them
will handle the pending LMCE.

I think the problem is rather that one flag is not enough, not that it
is not needed. A single lmce_in_process flag misses the following case:

1) mcheck_cmn_handler() first handles a non-LMCE #MC on CPU#n and
   raises MACHINE_CHECK_SOFTIRQ.
2) Before mce_softirq() gets a chance to run on CPU#n, an LMCE comes to
   CPU#n. mcheck_cmn_handler() then sets lmce_in_process on CPU#n and
   raises MACHINE_CHECK_SOFTIRQ again.
3) mce_softirq() finally gets a chance to run on CPU#n. It sees
   lmce_in_process is set and consequently handles the pending MCEs on
   CPU#n without waiting for other CPUs. However, one of the pending
   MCEs is not an LMCE.

So I'm considering introducing another per-CPU flag "mce_in_process" to
indicate whether a non-LMCE #MC is pending for the softirq:

1) When a non-LMCE #MC comes to CPU#n, mcheck_cmn_handler() sets the
   mce_in_process flag on CPU#n.
2) When an LMCE comes to CPU#n, mcheck_cmn_handler() sets the
   lmce_in_process flag on CPU#n.
3) When mce_softirq() starts, it clears the lmce_in_process flag if
   mce_in_process is set, so it will not handle the non-LMCE #MC
   without waiting for other CPUs.
4) mce_softirq() clears both flags after exiting all MC barriers.

A rough sketch of how the two flags would be set and consumed is
appended below.

Haozhong
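To make the two-flag scheme above concrete, here is a minimal,
self-contained C sketch. It is not the Xen implementation: the per-CPU
variables are modelled as plain globals for a single CPU, the softirq
machinery and the MC barriers are replaced by printf() stubs, and the
function names *_sketch() are placeholders. Only the way mce_in_process
and lmce_in_process are set and consumed follows steps 1)-4) above.

    /* lmce_flags_sketch.c -- illustrative only, single CPU assumed. */
    #include <stdbool.h>
    #include <stdio.h>

    /* In Xen these would be per-CPU variables, one pair per CPU. */
    static bool mce_in_process;   /* non-LMCE #MC pending for softirq */
    static bool lmce_in_process;  /* LMCE pending for softirq */

    /* Steps 1) and 2): the #MC handler records what kind of MC is pending. */
    static void mcheck_cmn_handler_sketch(bool is_lmce)
    {
        if ( is_lmce )
            lmce_in_process = true;   /* step 2: LMCE on this CPU */
        else
            mce_in_process = true;    /* step 1: non-LMCE (broadcast) #MC */
        /* ... log telemetry, raise MACHINE_CHECK_SOFTIRQ ... */
    }

    /* Steps 3) and 4): the softirq decides whether barriers may be skipped. */
    static void mce_softirq_sketch(void)
    {
        /* Step 3: a non-LMCE #MC is also pending, so the LMCE fast path
         * must not be taken -- rendezvous with the other CPUs as before. */
        if ( mce_in_process )
            lmce_in_process = false;

        if ( lmce_in_process )
            printf("handle pending MCEs locally, no barriers\n");
        else
            printf("enter/exit MC barriers, sync with other CPUs\n");

        /* ... actual MCE processing would happen here ... */

        /* Step 4: clear both flags after leaving all MC barriers. */
        mce_in_process = false;
        lmce_in_process = false;
    }

    /* Replay the problematic interleaving: a broadcast #MC followed by an
     * LMCE on the same CPU before the softirq runs, then a lone LMCE. */
    int main(void)
    {
        mcheck_cmn_handler_sketch(false); /* non-LMCE #MC on CPU#n */
        mcheck_cmn_handler_sketch(true);  /* LMCE on CPU#n before softirq */
        mce_softirq_sketch();             /* must NOT take the local path */

        mcheck_cmn_handler_sketch(true);  /* a lone LMCE ... */
        mce_softirq_sketch();             /* ... may be handled locally */
        return 0;
    }

Running the sketch shows the first softirq falling back to the barrier
path (because mce_in_process is set) and the second one handling the
lone LMCE locally.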