Xen project Mailing List

[Xen-devel] Re: [PATCH] Fix cpu online/offline bug: mce memory leak.

To: Haitao Shan <maillists.shan@xxxxxxxxx>

From: Keir Fraser <keir.xen@xxxxxxxxx>

Date: Tue, 01 Mar 2011 10:25:38 +0000

Cc: "Liu, Jinsong" <jinsong.liu@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>

Delivery-date: Tue, 01 Mar 2011 02:26:19 -0800

Domainkey-signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=user-agent:date:subject:from:to:cc:message-id:thread-topic :thread-index:in-reply-to:mime-version:content-type :content-transfer-encoding; b=PvHTxhjf3i2SIPlMQasRXHdU+/PNTio+v9Oa3rzCUP/k1ri5Q2FtOIJ6bqyexGHud9 upwrMQwHd1T+0Sn1wJ+34P+9760zoTE1cXOaKPFjp/3tvan8w1RBa9d+Uzw0EMDpeK4S KeP2ykxZiPo7d4UyecaHBBNiygJdSxRs+dCGU=

List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Thread-index: AcvX+wH4yCFvgaNhw0+5HxMnPYx7oA==

Thread-topic: [PATCH] Fix cpu online/offline bug: mce memory leak.

Well, it is the correct fix, certainly. The new notifier mechanism we borrowed from Linux should enable us to remove all dynamic allocations from a CPU's early bringup path. It's just a case of finding them all as some are tucked away in cpu/platform-specific areas (like this one). There are a couple of alterations for Jinsong to make, and then smoke test the patch again. When he resubmits with the alterations I will check the patch straight in for the next release candidate (RC7). K. On 01/03/2011 10:21, "Haitao Shan" <maillists.shan@xxxxxxxxx> wrote: > Ah, yes. I was misled by the patch name itself. :) > Smart fix, isn't it! I guess the issue is closed. Thanks, Jinsong. > > Shan Haitao > > 2011/3/1 Keir Fraser <keir.xen@xxxxxxxxx> >> On 01/03/2011 10:12, "Haitao Shan" <maillists.shan@xxxxxxxxx> wrote: >> >>> Hi, Keir, >>> >>> Fixing memory leak would be always good. But I don't agree this could fix >>> the >>> problem I observed. >>> Yes, indeed, I observed the issue during stress test of cpu offline/online >>> by >>> offlining all cpus except CPU0 one by one and then bringing them back. In >>> this >>> case, with memory leak, after some time, xmalloc at onlining could hit a >>> heap >>> page that is formerly owned by domains, since usable Xen heap memory will be >>> slowly used up. Without memory leak, xmalloc at onlining will be likely use >>> the memory that is freed just by offlining. So it won't trigger this >>> assertion. >>> >>> But let us think a typical usage model, you will offline all LPs on a socket >>> so that it can be later removed physically. Some other time, you will bring >>> them back. Between offlining and onlining, there could be a time interval as >>> long as one week and activities such as creating and killing many guests. >>> How >>> can we ensure that we won't meet this assertion at that time? >>> >>> It is my understanding that this memory leak triggered the issue I raised >>> but >>> the issue itself can be triggered in many other ways. Please do correct me >>> if >>> I am wrong. Thanks! >> >> The point is that the patch also moves the xmalloc() calls out of the new >> CPU's bringup path, and into CPU_UP_PREPARE context. This means the >> allocations will occur on a different CPU, before the new CPU begins >> execution, and with irqs enabled. That's why it should fix your crash. >> >> -- Keir >> >>> Shan Haitao >>> >>> 2011/3/1 Keir Fraser <keir.xen@xxxxxxxxx> >>>> Some comments inline below. Overall a good patch to have, but needs some >>>> minor adjustments! >>>> >>>> -- Keir >>>> >>>> On 01/03/2011 09:09, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote: >>>> >>>>> Fix cpu online/offline bug: mce memory leak. >>>>> >>>>> Current Xen mce logic didn't free mcabanks. This would be a memory leak >>>>> when >>>>> cpu offline. >>>>> When repeatly do cpu online/offline, this memory leak would make xenpool >>>>> shrink, and at a time point, >>>>> will call alloc_heap_pages --> flush_area_mask, which >>>>> ASSERT(local_irq_is_enabled()). >>>>> However, cpu online is irq disable, so it finally result in Xen crash. >>>>> >>>>> This patch fix the memory leak bug, and tested OK over 110,000 round cpu >>>>> online/offline. >>>>> >>>>> Signed-off-by: Liu, Jinsong <jinsong.liu@xxxxxxxxx> >>>>> >>>>> diff -r 1a364b17d66a xen/arch/x86/cpu/mcheck/mce_intel.c >>>>> --- a/xen/arch/x86/cpu/mcheck/mce_intel.c Fri Feb 25 01:26:01 2011 +0800 >>>>> +++ b/xen/arch/x86/cpu/mcheck/mce_intel.c Mon Feb 28 19:19:20 2011 +0800 >>>>> @@ -1227,9 +1227,24 @@ static void intel_init_mce(void) >>>>> mce_uhandler_num = sizeof(intel_mce_uhandlers)/sizeof(struct >>>>> mca_error_handler); >>>>> } >>>>> >>>>> -static int intel_init_mca_banks(void) >>>>> +static void cpu_mcabank_free(unsigned int cpu) >>>>> { >>>>> - struct mca_banks *mb1, *mb2, * mb3; >>>>> + struct mca_banks *mb1, *mb2, *mb3, *mb4; >>>>> + >>>>> + mb1 = per_cpu(mce_clear_banks, cpu); >>>>> + mb2 = per_cpu(no_cmci_banks, cpu); >>>>> + mb3 = per_cpu(mce_banks_owned, cpu); >>>>> + mb4 = per_cpu(poll_bankmask, cpu); >>>>> + >>>>> + mcabanks_free(mb1); >>>>> + mcabanks_free(mb2); >>>>> + mcabanks_free(mb3); >>>>> + mcabanks_free(mb4); >>>>> +} >>>>> + >>>>> +static void cpu_mcabank_alloc(unsigned int cpu) >>>>> +{ >>>>> + struct mca_banks *mb1, *mb2, *mb3; >>>>> >>>>> mb1 = mcabanks_alloc(); >>>>> mb2 = mcabanks_alloc(); >>>>> @@ -1237,22 +1252,23 @@ static int intel_init_mca_banks(void) >>>>> if (!mb1 || !mb2 || !mb3) >>>>> goto out; >>>>> >>>>> - __get_cpu_var(mce_clear_banks) = mb1; >>>>> - __get_cpu_var(no_cmci_banks) = mb2; >>>>> - __get_cpu_var(mce_banks_owned) = mb3; >>>>> + per_cpu(mce_clear_banks, cpu) = mb1; >>>>> + per_cpu(no_cmci_banks, cpu) = mb2; >>>>> + per_cpu(mce_banks_owned, cpu) = mb3; >>>>> + return; >>>>> >>>>> - return 0; >>>>> out: >>>>> mcabanks_free(mb1); >>>>> mcabanks_free(mb2); >>>>> mcabanks_free(mb3); >>>>> - return -ENOMEM; >>>>> } >>>>> >>>>> /* p4/p6 family have similar MCA initialization process */ >>>>> enum mcheck_type intel_mcheck_init(struct cpuinfo_x86 *c) >>>>> { >>>>> - if (intel_init_mca_banks()) >>>>> + if ( !this_cpu(mce_clear_banks) || >>>>> + !this_cpu(no_cmci_banks) || >>>>> + !this_cpu(mce_banks_owned) ) >>>>> return mcheck_none; >>>>> >>>>> intel_init_mca(c); >>>>> @@ -1301,13 +1317,19 @@ static int cpu_callback( >>>>> static int cpu_callback( >>>>> struct notifier_block *nfb, unsigned long action, void *hcpu) >>>>> { >>>>> + unsigned int cpu = (unsigned long)hcpu; >>>>> + >>>>> switch ( action ) >>>>> { >>>>> + case CPU_UP_PREPARE: >>>>> + cpu_mcabank_alloc(cpu); >>>>> + break; >>>>> case CPU_DYING: >>>>> cpu_mcheck_disable(); >>>>> break; >>>>> case CPU_DEAD: >>>>> cpu_mcheck_distribute_cmci(); >>>>> + cpu_mcabank_free(cpu); >>>>> break; >>>> >>>> You also need to cpu_mcabank_free(cpu) on CPU_UP_CANCELED. Else there's a >>>> memory leak on failed CPU bringup. >>>> >>>>> default: >>>>> break; >>>>> @@ -1322,6 +1344,8 @@ static struct notifier_block cpu_nfb = { >>>>> >>>>> static int __init intel_mce_initcall(void) >>>>> { >>>>> + void *hcpu = (void *)(long)smp_processor_id(); >>>>> + cpu_callback(&cpu_nfb, CPU_UP_PREPARE, hcpu); >>>>> register_cpu_notifier(&cpu_nfb); >>>>> return 0; >>>>> } >>>>> diff -r 1a364b17d66a xen/arch/x86/setup.c >>>>> --- a/xen/arch/x86/setup.c Fri Feb 25 01:26:01 2011 +0800 >>>>> +++ b/xen/arch/x86/setup.c Mon Feb 28 19:19:20 2011 +0800 >>>>> @@ -1203,6 +1203,8 @@ void __init __start_xen(unsigned long mb >>>>> >>>>> arch_init_memory(); >>>>> >>>>> + do_presmp_initcalls(); >>>>> + >>>>> identify_cpu(&boot_cpu_data); >>>>> if ( cpu_has_fxsr ) >>>>> set_in_cr4(X86_CR4_OSFXSR); >>>>> @@ -1235,8 +1237,6 @@ void __init __start_xen(unsigned long mb >>>>> initialize_keytable(); >>>>> >>>>> console_init_postirq(); >>>>> - >>>>> - do_presmp_initcalls(); >>>> >>>> This looks risky, especially so close to 4.1 release. Instead of moving >>>> do_presmp_initcalls(), please special-case cpu_mcabank_alloc() for CPU0 in >>>> intel_mcheck_init(), and remove your direct call to >>>> cpu_callback(...CPU_UP_PREPARE...) in intel_mce_initcall(). >>>> >>>>> for_each_present_cpu ( i ) >>>>> { >>>> >>>> >>> >>> >> >> > > _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxx http://lists.xensource.com/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.