Re: [Xen-devel] dom0 hang
[Oops, adding back in the distro list, also adding Kevin Tian and Yu Ke,
who wrote cs 19460]

The functionality I was talking about, subtracting credits and clearing
BOOST, happens in csched_vcpu_acct() (which is different from
csched_acct()). csched_vcpu_acct() is called from csched_tick(), which
should still happen every 10ms on every cpu.

The patch I referred to (cs 19460) disables and re-enables the tickers in
xen/arch/x86/acpi/cpu_idle.c:acpi_processor_idle() every time the
processor idles. I can't see anywhere else that the tickers are disabled,
so it's probably something not properly re-enabling them again.

Try applying the attached patch to see if that changes anything. (I'm on
the road, so I can't repro the lockup issue.) If that doesn't work, try
disabling c-states and see if that helps. Then at least we'll know where
the problem lies.

 -George
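As a rough illustration of the mechanism described above, here is a small
standalone C model of the ticker/idle interaction: the per-cpu ticker drives
the 10ms csched_tick()/csched_vcpu_acct() accounting, and the idle path is
expected to suspend it on entering a deep C-state and resume it on exit. The
types, helper names, and constants below are assumptions made for this
sketch; it is not the actual cpu_idle.c or sched_credit.c code.

    /* tick_model.c: illustrative only; not Xen source. */
    #include <stdbool.h>
    #include <stdio.h>

    struct pcpu_ticker {
        bool armed;              /* stands in for the csched_pcpu ticker timer */
        unsigned long ticks;     /* accounting ticks that actually fired       */
    };

    static void tick_suspend(struct pcpu_ticker *t) { t->armed = false; }
    static void tick_resume(struct pcpu_ticker *t)  { t->armed = true;  }

    /* One idle entry/exit, mirroring what the idle path is said to do. */
    static void idle_cycle(struct pcpu_ticker *t, bool resume_skipped)
    {
        tick_suspend(t);         /* deep C-state: stop the per-cpu ticker */
        /* ... cpu sleeps ... */
        if (!resume_skipped)
            tick_resume(t);      /* the step suspected of being missed    */
    }

    /* Models one 10ms period: accounting only happens while armed. */
    static void maybe_tick(struct pcpu_ticker *t)
    {
        if (t->armed)
            t->ticks++;          /* real code would call csched_vcpu_acct() */
    }

    int main(void)
    {
        struct pcpu_ticker cpu1 = { .armed = true, .ticks = 0 };

        idle_cycle(&cpu1, /* resume_skipped= */ true);
        for (int ms = 0; ms < 100; ms += 10)
            maybe_tick(&cpu1);

        /* With the resume skipped, zero ticks fire: credits are never
         * debited and a BOOST vcpu is never demoted. */
        printf("accounting ticks after idle: %lu\n", cpu1.ticks);
        return 0;
    }

Run as-is (with resume_skipped set), the model reports zero accounting ticks,
which is the suspected failure mode: no ticks means credits are never debited
and BOOST is never cleared.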
On Thu, Jul 2, 2009 at 10:10 PM, Mukesh Rathor <mukesh.rathor@xxxxxxxxxx> wrote:
> That seems to only suspend csched_pcpu.ticker, which is csched_tick, and
> that only sorts the local runq.
>
> Again, we are concerned about csched_priv.master_ticker, which calls
> csched_acct, correct? So I can trace that?
>
> thanks,
> mukesh
>
>
> George Dunlap wrote:
>>
>> Ah, I see that there have been some changes to the tick stuff with the
>> c-state work (e.g., cs 19460). It looks like the tickers are supposed to
>> keep going, but perhaps tick_suspend() and tick_resume() aren't being
>> called properly. Let me take a closer look.
>>
>> -George
>>
>> On Thu, Jul 2, 2009 at 8:14 PM, Mukesh Rathor <mukesh.rathor@xxxxxxxxxx>
>> wrote:
>>>
>>> George Dunlap wrote:
>>>>
>>>> On Thu, Jul 2, 2009 at 4:19 AM, Mukesh Rathor <mukesh.rathor@xxxxxxxxxx>
>>>> wrote:
>>>>>
>>>>> dom0 hang:
>>>>> vcpu0 is trying to wake up a task and in try_to_wake_up() calls
>>>>> task_rq_lock(). Since the task has its cpu set to 1, it takes the
>>>>> runq lock for vcpu1. Next it calls resched_task(), which results in
>>>>> sending an IPI to vcpu1. For that, vcpu0 gets into the
>>>>> HYPERVISOR_event_channel_op hypercall and is waiting to return.
>>>>> Meanwhile, vcpu1 got running and is spinning on its runq lock in
>>>>> "schedule(): spin_lock_irq(&rq->lock);", the lock that vcpu0 is
>>>>> holding (while waiting to return from the hypercall).
>>>>>
>>>>> As I had noticed before, vcpu0 never gets scheduled in xen. So
>>>>> looking further into xen:
>>>>>
>>>>> xen:
>>>>> Both vcpus are on the same runq, in this case cpu1's. But the
>>>>> priority of vcpu1 has been set to CSCHED_PRI_TS_BOOST. As a result,
>>>>> the scheduler always picks vcpu1, and vcpu0 is starved. Also, I see
>>>>> in kdb that the scheduler timer is not set on cpu0. That would have
>>>>> allowed csched_load_balance() to kick in on cpu0. [Also, on cpu1,
>>>>> the accounting timer, csched_tick, is not set. Although csched_tick()
>>>>> is running on cpu0, it only checks the runq for cpu0.]
>>>>>
>>>>> Looks like c/s 19500 changed csched_schedule():
>>>>>
>>>>> -    ret.time = MILLISECS(CSCHED_MSECS_PER_TSLICE);
>>>>> +    ret.time = (is_idle_vcpu(snext->vcpu) ?
>>>>> +                -1 : MILLISECS(CSCHED_MSECS_PER_TSLICE));
>>>>>
>>>>> The quickest fix for us would be to just back that out.
>>>>>
>>>>> BTW, just a comment on the following (all in sched_credit.c):
>>>>>
>>>>>     if ( svc->pri == CSCHED_PRI_TS_UNDER &&
>>>>>          !(svc->flags & CSCHED_FLAG_VCPU_PARKED) )
>>>>>     {
>>>>>         svc->pri = CSCHED_PRI_TS_BOOST;
>>>>>     }
>>>>>
>>>>> combined with
>>>>>
>>>>>     if ( snext->pri > CSCHED_PRI_TS_OVER )
>>>>>         __runq_remove(snext);
>>>>>
>>>>> Setting CSCHED_PRI_TS_BOOST as the pri of a vcpu seems dangerous.
>>>>> To me, since csched_schedule() never checks the time accumulated by
>>>>> a vcpu at pri CSCHED_PRI_TS_BOOST, that is the same as pinning a
>>>>> vcpu to a pcpu: if that vcpu never makes progress, essentially, the
>>>>> system has lost a physical cpu. Optionally, csched_schedule() should
>>>>> always check the cpu time accumulated and reduce the priority over
>>>>> time. I can't tell right off if it already does that. Or something
>>>>> like that :)... my 2 cents.
>>>>
>>>> Hmm... what's supposed to happen is that eventually a timer tick will
>>>> interrupt vcpu1. If cpu1 is set to be "active", then it will be
>>>> debited 10ms worth of credit. Eventually, it will go into OVER, and
>>>> lose BOOST. If it's "inactive", then when the tick happens, it will
>>>> be set to "active" and be debited 10ms again, setting it directly into
>>>> OVER (and thus also losing BOOST).
>>>>
>>>> Can you see if the timer ticks are still happening, and perhaps put
>>>> some tracing in to verify that what I described above is happening?
>>>>
>>>> -George
>>>
>>> George,
>>>
>>> Is that in csched_acct()? Looks like that's somehow gotten removed. If
>>> true, then maybe that's the fundamental problem to chase.
>>>
>>> Here's what the trq looks like when hung, not in any schedule function:
>>>
>>> [0]xkdb> dtrq
>>> CPU[00]: NOW:0x00003f2db9af369e
>>>   1: exp=0x00003ee31cb32200 fn:csched_tick data:0000000000000000
>>>   2: exp=0x00003ee347ece164 fn:time_calibration data:0000000000000000
>>>   3: exp=0x00003ee69a28f04b fn:mce_work_fn data:0000000000000000
>>>   4: exp=0x00003f055895e25f fn:plt_overflow data:0000000000000000
>>>   5: exp=0x00003ee353810216 fn:rtc_update_second data:ffff83007f0226d8
>>>
>>> CPU[01]: NOW:0x00003f2db9af369e
>>>   1: exp=0x00003ee30b847988 fn:s_timer_fn data:0000000000000000
>>>   2: exp=0x00003f1b309ebd45 fn:pmt_timer_callback data:ffff83007f022a68
>>>
>>> thanks,
>>> Mukesh
>>>
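To make the accounting described in the quoted exchange concrete, here is a
second small standalone model of the demotion path: each 10ms tick debits the
running vcpu's credit, running through a tick costs BOOST, and exhausting
credit drops the vcpu to OVER. The priority values, the credit constant, and
the helper name are assumptions for this sketch only; the real logic lives in
csched_vcpu_acct()/csched_acct() in sched_credit.c.

    /* boost_model.c: illustrative only; not sched_credit.c. */
    #include <stdio.h>

    enum pri { PRI_OVER = 0, PRI_UNDER = 1, PRI_BOOST = 2 };

    struct model_vcpu {
        enum pri pri;
        int credit;                  /* arbitrary units */
    };

    /* What one accounting tick is expected to do to the running vcpu. */
    static void tick_acct(struct model_vcpu *v, int credits_per_tick)
    {
        v->credit -= credits_per_tick;      /* debit ~10ms worth of credit   */
        if (v->credit <= 0)
            v->pri = PRI_OVER;              /* exhausted: demote, lose BOOST */
        else if (v->pri == PRI_BOOST)
            v->pri = PRI_UNDER;             /* ran a full tick: clear BOOST  */
    }

    int main(void)
    {
        struct model_vcpu vcpu1 = { .pri = PRI_BOOST, .credit = 100 };
        int ticks_delivered = 0;   /* 0 models the hang: ticker never fires */

        for (int t = 0; t < ticks_delivered; t++)
            tick_acct(&vcpu1, 34);

        /* With zero ticks, vcpu1 stays at BOOST and keeps being picked,
         * starving vcpu0 on the same runq, matching the hang report. */
        printf("pri=%d credit=%d\n", vcpu1.pri, vcpu1.credit);
        return 0;
    }

With ticks_delivered set to 3, this model demotes vcpu1 to OVER, which is the
behaviour expected when the tickers are actually firing; with 0 ticks it stays
at BOOST indefinitely.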
Attachment: simplify-tick-resume.diff

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel