[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] remove blocked time accounting from xen "clockchip"
>>> On 18.10.11 at 22:42, Laszlo Ersek <lersek@xxxxxxxxxx> wrote: > ... because the "clock_event_device framework" already accounts for idle > time through the "event_handler" function pointer in > xen_timer_interrupt(). > > The patch is intended as the completion of [1]. It should fix the double > idle times seen in PV guests' /proc/stat [2]. It should be orthogonal to > stolen time accounting (the removed code seems to be isolated). After some more looking around I still think it's incorrect, albeit for a different reason: What tick_nohz_restart_sched_tick() accounts as idle time is *all* time that passed while in cpu_idle(). What gets accounted in do_stolen_accounting() (without your patch) is different: - time the vCPU was in RUNSTATE_blocked gets accounted as idle - time the vCPU was in RUNSTATE_runnable and RUNSTATE_offline gets accounted as stolen. That is, on an overcommitted system (and without your patch) I would expect you to not see the (full) double idle increment for a not fully idle and not fully loaded vCPU. Now we can of course say that for most purposes it's fine to ignore the difference between idle and steal time, but there must be a reason these have been getting accounted separately in Linux without regard to Xen. So I really think that what needs to be suppressed to address this is tick_nohz_restart_sched_tick()'s call to account_idle_ticks(). But simply forcing CONFIG_VIRT_CPU_ACCOUNTING is not an immediate option (not even when considering a Xen-only kernel), as that has further implications. Instead I wonder whether under Xen we should e.g. update per_cpu(tick_cpu_sched, cpu).idle_jiffies prior to calling tick_nohz_restart_sched_tick() (possibly implicitly by doing the update in do_stolen_accounting(), though that wouldn't work when the wakeup is due to an interrupt other than the timer one). The alternative of accounting the negative of the steal time as idle time in do_stolen_accounting() doesn't look correct either, since not all stolen time was necessarily accounted as idle (some would have got - also incorrectly - accounted as process or system time). So my conclusion is that only the full implementation of what CONFIG_VIRT_CPU_ACCOUNTING expects would actually get things sorted out properly (but that would imply Xen *and* native are willing to pay the performance price, or the enabling of this can be made dynamic so that at least native could remain unaffected). Or the negative stolen time should be accounted to the current process, the system, or as idle, depending on the context which do_stolen_accounting() runs in (just like timer_interrupt() did in the XenoLinux kernels for positive accounting). Jan > The approach may be completely misguided. > > [1] https://lkml.org/lkml/2011/10/6/10 > [2] http://lists.xensource.com/archives/html/xen-devel/2010-08/msg01068.html > > Signed-off-by: Laszlo Ersek <lersek@xxxxxxxxxx> > --- > arch/x86/xen/time.c | 17 ++--------------- > 1 files changed, 2 insertions(+), 15 deletions(-) > > diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c > index 163b467..377f6ae 100644 > --- a/arch/x86/xen/time.c > +++ b/arch/x86/xen/time.c > @@ -36,9 +36,8 @@ static DEFINE_PER_CPU(struct vcpu_runstate_info, > xen_runstate); > /* snapshots of runstate info */ > static DEFINE_PER_CPU(struct vcpu_runstate_info, xen_runstate_snapshot); > > -/* unused ns of stolen and blocked time */ > +/* unused ns of stolen time */ > static DEFINE_PER_CPU(u64, xen_residual_stolen); > -static DEFINE_PER_CPU(u64, xen_residual_blocked); > > /* return an consistent snapshot of 64-bit time/counter value */ > static u64 get64(const u64 *p) > @@ -115,7 +114,7 @@ static void do_stolen_accounting(void) > { > struct vcpu_runstate_info state; > struct vcpu_runstate_info *snap; > - s64 blocked, runnable, offline, stolen; > + s64 runnable, offline, stolen; > cputime_t ticks; > > get_runstate_snapshot(&state); > @@ -125,7 +124,6 @@ static void do_stolen_accounting(void) > snap = &__get_cpu_var(xen_runstate_snapshot); > > /* work out how much time the VCPU has not been runn*ing* */ > - blocked = state.time[RUNSTATE_blocked] - snap->time[RUNSTATE_blocked]; > runnable = state.time[RUNSTATE_runnable] - > snap->time[RUNSTATE_runnable]; > offline = state.time[RUNSTATE_offline] - snap->time[RUNSTATE_offline]; > > @@ -141,17 +139,6 @@ static void do_stolen_accounting(void) > ticks = iter_div_u64_rem(stolen, NS_PER_TICK, &stolen); > __this_cpu_write(xen_residual_stolen, stolen); > account_steal_ticks(ticks); > - > - /* Add the appropriate number of ticks of blocked time, > - including any left-overs from last time. */ > - blocked += __this_cpu_read(xen_residual_blocked); > - > - if (blocked < 0) > - blocked = 0; > - > - ticks = iter_div_u64_rem(blocked, NS_PER_TICK, &blocked); > - __this_cpu_write(xen_residual_blocked, blocked); > - account_idle_ticks(ticks); > } > > /* Get the TSC speed from Xen */ _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxx http://lists.xensource.com/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |