
Re: [Xen-devel] cpuidle causing Dom0 soft lockups


  • To: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
  • From: Juergen Gross <juergen.gross@xxxxxxxxxxxxxx>
  • Date: Wed, 03 Feb 2010 13:18:51 +0100
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, "Yu, Ke" <ke.yu@xxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
  • Delivery-date: Wed, 03 Feb 2010 04:19:13 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Tian, Kevin wrote:
>> From: Jan Beulich
>> Sent: 3 February 2010 18:16
>>
>>>>> "Yu, Ke" <ke.yu@xxxxxxxxx> 02.02.10 18:07 >>>
>>>> Just fyi, we now also have seen an issue on a 24-CPU system that went
>>>> away with cpuidle=0 (and static analysis of the hang hinted in that
>>>> direction). All I can judge so far is that this likely has something
>>>> to do with our kernel's intensive use of the poll hypercall (i.e. we
>>>> see vCPU-s not waking up from the call despite there being pending
>>>> unmasked or polled for events).
>>> We just identified the cause of this issue, and is trying to find
>>> appropriate way to fix it.
>>
>> Hmm, while I agree that the scenario you describe can be a problem, I
>> don't think it can explain the behavior on the 24-CPU system pointed
>> out above, nor the one Juergen Gross pointed out yesterday.
> 
> Is 24-CPU system observed with same likelihood as 64-CPU system to
> hang at boot time, or less frequent? Ke just did some theoretical analysis
> by assuming some values. There could be other factors added to latency
> and each system may have different characteristics too. We can't
> draw conclusion whether smaller system will face same issue, by simply
> changing CPU number in Ke's formula. :-) Possibly you can provide cpuidle
> information on your 24-core system for further comparison.

My 4-core system hangs _always_. For minutes. If I press any key on the
console it will resume booting with soft lockup messages (all cpus were
in xen_safe_halt).
Sometimes another hang occurs, sometimes the system will come up without
further hangs.

Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@xxxxxxxxxxxxxx
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

