Re: [Xen-devel] [PATCH v2 04/12] x86/mce: handle LMCE locally
>>> On 17.03.17 at 07:46, <haozhong.zhang@xxxxxxxxx> wrote:
> @@ -52,8 +52,8 @@ void mce_barrier_exit(struct mce_softirq_barrier *bar)
>      }
>  }
>
> -void mce_barrier(struct mce_softirq_barrier *bar)
> +void mce_barrier(struct mce_softirq_barrier *bar, bool nowait)
>  {
> -    mce_barrier_enter(bar);
> -    mce_barrier_exit(bar);
> +    mce_barrier_enter(bar, nowait);
> +    mce_barrier_exit(bar, nowait);
>  }
I don't think this currently unused function needs the extra parameter.
Should a future user find this necessary, it can be done then.
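(For illustration, a minimal self-contained sketch of the shape under discussion, not the Xen implementation: the enter/exit helpers take a flag that skips the rendezvous for a purely local MCE, while the unused wrapper keeps its one-argument form. The NR_CPUS_STUB and cpu_relax() stand-ins, and the single counter without the real barrier's generation tracking, are assumptions made only to keep the sketch compilable.)

#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS_STUB 4U                /* stand-in for num_online_cpus() */
#define cpu_relax()  ((void)0)         /* stand-in for a pause hint */

struct mce_softirq_barrier {
    atomic_uint val;                   /* CPUs currently inside the barrier */
};

void mce_barrier_enter(struct mce_softirq_barrier *bar, bool nowait)
{
    if (nowait)                        /* local MCE: nobody else will arrive */
        return;
    atomic_fetch_add(&bar->val, 1);
    while (atomic_load(&bar->val) != NR_CPUS_STUB)
        cpu_relax();                   /* spin until every CPU has entered */
}

void mce_barrier_exit(struct mce_softirq_barrier *bar, bool nowait)
{
    if (nowait)
        return;
    atomic_fetch_sub(&bar->val, 1);
    while (atomic_load(&bar->val) != 0)
        cpu_relax();                   /* spin until every CPU has left */
}

/* The currently unused wrapper can stay with its old signature. */
void mce_barrier(struct mce_softirq_barrier *bar)
{
    mce_barrier_enter(bar, false);
    mce_barrier_exit(bar, false);
}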
> --- a/xen/arch/x86/cpu/mcheck/mce.c
> +++ b/xen/arch/x86/cpu/mcheck/mce.c
> @@ -42,6 +42,13 @@ DEFINE_PER_CPU_READ_MOSTLY(struct mca_banks *, poll_bankmask);
>  DEFINE_PER_CPU_READ_MOSTLY(struct mca_banks *, no_cmci_banks);
>  DEFINE_PER_CPU_READ_MOSTLY(struct mca_banks *, mce_clear_banks);
>
> +/*
> + * Flag to indicate that at least one non-local MCE's on this CPU have
... one non-local MCE on this CPU has ...
> + * not been completed handled. It's set by mcheck_cmn_handler() and
> + * cleared by mce_softirq().
> + */
> +DEFINE_PER_CPU(bool, mce_in_process);
Also please consider whether mce_in_progress might not be a
better alternative name, and whether the non-local aspect
shouldn't also be expressed by the name chosen.
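(Purely as an illustration of the naming point; the identifier below is hypothetical and not taken from the patch:)

/* Hypothetical rename: "in progress" wording, non-local aspect in the name. */
DEFINE_PER_CPU(bool, nonlocal_mce_in_progress);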
> @@ -1704,10 +1717,11 @@ static void mce_softirq(void)
>  {
>      int cpu = smp_processor_id();
>      unsigned int workcpu;
> +    bool nowait = !this_cpu(mce_in_process);
>
>      mce_printk(MCE_VERBOSE, "CPU%d enter softirq\n", cpu);
>
> -    mce_barrier_enter(&mce_inside_bar);
> +    mce_barrier_enter(&mce_inside_bar, nowait);
>
>      /*
>       * Everybody is here. Now let's see who gets to do the
> @@ -1720,10 +1734,10 @@ static void mce_softirq(void)
>
>      atomic_set(&severity_cpu, cpu);
>
> -    mce_barrier_enter(&mce_severity_bar);
> +    mce_barrier_enter(&mce_severity_bar, nowait);
>      if (!mctelem_has_deferred(cpu))
>          atomic_set(&severity_cpu, cpu);
> -    mce_barrier_exit(&mce_severity_bar);
> +    mce_barrier_exit(&mce_severity_bar, nowait);
>
>      /* We choose severity_cpu for further processing */
>      if (atomic_read(&severity_cpu) == cpu) {
The logic here looks pretty suspicious even without your changes,
but I think we should try hard not to make it worse. I think you
need to avoid setting severity_cpu in the LMCE case (with,
obviously, further resulting adjustments). And it also needs to be
at least clarified (in the patch description) whether
this_cpu(mce_in_process) changing (namely from 0 to 1) after
you've latched it can't have any bad effect.
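(To make the severity_cpu point concrete, a sketch only, reusing the identifiers quoted above; whether the local path really just processes this CPU's deferred telemetry via mctelem_process_deferred(), and how the rest of the function is restructured, is of course the patch author's call:)

static void mce_softirq(void)
{
    int cpu = smp_processor_id();
    /*
     * NB: whether mce_in_process flipping to true after being read here can
     * have any bad effect still needs to be argued in the description.
     */
    bool lmce = !this_cpu(mce_in_process);

    if (lmce) {
        /* Local MCE: no barriers, and severity_cpu is never written. */
        mctelem_process_deferred(cpu, mce_delayed_action);
        return;
    }

    mce_barrier_enter(&mce_inside_bar, false);

    atomic_set(&severity_cpu, cpu);

    mce_barrier_enter(&mce_severity_bar, false);
    if (!mctelem_has_deferred(cpu))
        atomic_set(&severity_cpu, cpu);
    mce_barrier_exit(&mce_severity_bar, false);

    /* ... existing processing on the elected severity_cpu ... */

    mce_barrier_exit(&mce_inside_bar, false);
}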
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel