
Re: [Xen-devel] [PATCH] xen/mce: Don't spam the console with "CPUx: Temperature z" (v2)



On 13/06/14 19:09, Konrad Rzeszutek Wilk wrote:
> If the machine has been quite busy it ends up with these
> messages printed on the hypervisor console:
>
> (XEN) CPU3: Temperature/speed normal
> (XEN) CPU1: Temperature/speed normal
> (XEN) CPU0: Temperature/speed normal
> (XEN) CPU1: Temperature/speed normal
> (XEN) CPU0: Temperature/speed normal
> (XEN) CPU2: Temperature/speed normal
> (XEN) CPU3: Temperature/speed normal
> (XEN) CPU0: Temperature/speed normal
> (XEN) CPU2: Temperature/speed normal
> (XEN) CPU3: Temperature/speed normal
> (XEN) CPU1: Temperature/speed normal
> (XEN) CPU0: Temperature above threshold
> (XEN) CPU0: Running in modulated clock mode
> (XEN) CPU1: Temperature/speed normal
> (XEN) CPU2: Temperature/speed normal
> (XEN) CPU3: Temperature/speed normal
>
> While the state changes are important, repeated reports of an
> unchanged state are not. Add a per-CPU latch that remembers the
> last reported state and prints the message only when the state
> has changed since the last update.
>
> This was observed on Intel DQ67SW,
> BIOS SWQ6710H.86A.0066.2012.1105.1504 11/05/2012
>
> CC: Jan Beulich <jbeulich@xxxxxxxx>
> CC: Keir Fraser <keir@xxxxxxx>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>

Reviewed-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

>
> ---
> [v2: Redo per Daniel and Boris's review]
> [v3: Use per_cpu instead of __get_cpu_var per Andrew's review]
> ---
>  xen/arch/x86/cpu/mcheck/mce_intel.c |   19 ++++++++++++++-----
>  1 files changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/xen/arch/x86/cpu/mcheck/mce_intel.c b/xen/arch/x86/cpu/mcheck/mce_intel.c
> index ad06efc..bb4ce47 100644
> --- a/xen/arch/x86/cpu/mcheck/mce_intel.c
> +++ b/xen/arch/x86/cpu/mcheck/mce_intel.c
> @@ -49,11 +49,15 @@ static int __read_mostly nr_intel_ext_msrs;
>  #define INTEL_SRAR_INSTR_FETCH       0x150
>  
>  #ifdef CONFIG_X86_MCE_THERMAL
> +#define MCE_RING                0x1
> +static DEFINE_PER_CPU(int, last_state);
> +
>  static void intel_thermal_interrupt(struct cpu_user_regs *regs)
>  {
>      uint64_t msr_content;
>      unsigned int cpu = smp_processor_id();
>      static DEFINE_PER_CPU(s_time_t, next);
> +    int *this_last_state;
>  
>      ack_APIC_irq();
>  
> @@ -62,13 +66,17 @@ static void intel_thermal_interrupt(struct cpu_user_regs *regs)
>  
>      per_cpu(next, cpu) = NOW() + MILLISECS(5000);
>      rdmsrl(MSR_IA32_THERM_STATUS, msr_content);
> -    if (msr_content & 0x1) {
> -        printk(KERN_EMERG "CPU%d: Temperature above threshold\n", cpu);
> -        printk(KERN_EMERG "CPU%d: Running in modulated clock mode\n",
> -                cpu);
> +    this_last_state = &per_cpu(last_state, cpu);
> +    if ( *this_last_state == (msr_content & MCE_RING) )
> +        return;
> +    *this_last_state = msr_content & MCE_RING;
> +    if ( msr_content & MCE_RING )
> +    {
> +        printk(KERN_EMERG "CPU%u: Temperature above threshold\n", cpu);
> +        printk(KERN_EMERG "CPU%u: Running in modulated clock mode\n", cpu);
>          add_taint(TAINT_MACHINE_CHECK);
>      } else {
> -        printk(KERN_INFO "CPU%d: Temperature/speed normal\n", cpu);
> +        printk(KERN_INFO "CPU%u: Temperature/speed normal\n", cpu);
>      }
>  }
>  
> @@ -802,6 +810,7 @@ static int cpu_mcabank_alloc(unsigned int cpu)
>  
>      per_cpu(no_cmci_banks, cpu) = cmci;
>      per_cpu(mce_banks_owned, cpu) = owned;
> +    per_cpu(last_state, cpu) = -1;
>  
>      return 0;
>  out:
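
For readers who want to see the latch idea in isolation, here is a minimal
user-space sketch in C. It is not the Xen code: the plain array stands in for
the per-CPU variable, and NR_CPUS, THERM_LATCH and report_thermal() are
hypothetical names chosen for illustration. It assumes, as the patch does,
that bit 0 of the status value marks an active thermal event and that -1
means "nothing reported yet".

```c
#include <stdint.h>
#include <stdio.h>

#define NR_CPUS      4      /* hypothetical CPU count for this demo */
#define THERM_LATCH  0x1    /* assumed: bit 0 set means thermal event active */

/* One remembered state per CPU; -1 means "nothing reported yet". */
static int last_state[NR_CPUS] = { -1, -1, -1, -1 };

/* Returns 1 if a message was printed, 0 if the report was suppressed. */
static int report_thermal(unsigned int cpu, uint64_t status)
{
    int state = (int)(status & THERM_LATCH);

    if (last_state[cpu] == state)
        return 0;               /* unchanged since the last report: stay quiet */
    last_state[cpu] = state;    /* latch the new state before reporting it */

    if (state)
        printf("CPU%u: Temperature above threshold\n", cpu);
    else
        printf("CPU%u: Temperature/speed normal\n", cpu);
    return 1;
}
```

Calling report_thermal(0, 0) twice in a row prints only once; the second call
is suppressed until the low bit of the status changes. Starting from -1
rather than 0 ensures the first "normal" reading on each CPU is still
reported.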


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

