
Re: [Xen-devel] [PATCH] remove blocked time accounting from xen "clockchip"



On 12/21/11 09:32, Jan Beulich wrote:

Specifically, without further adjustments, on a 4:3 over-committed
system doing a kernel build, I'm seeing an over-accounting of
approximately 4%. I was able to reduce this to slightly above 1%
with the change below (experimental and not 32-bit compatible):

--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -3864,6 +3864,29 @@
        return ns;
  }

+#ifndef CONFIG_XEN
+#define _cputime_to_cputime64 cputime_to_cputime64
+#else
+#define NS_PER_TICK (1000000000 / HZ)
+static DEFINE_PER_CPU(u64, stolen_snapshot);
+static DEFINE_PER_CPU(unsigned int, stolen_residual);
+
+static cputime64_t _cputime_to_cputime64(cputime_t t)
+{
+       /* TODO: not entirely suitable for 32-bit */
+       u64 s = this_cpu_read(runstate.time[RUNSTATE_runnable]);
+       unsigned long adj = div_u64_rem(s - this_cpu_read(stolen_snapshot)
+                                         + this_cpu_read(stolen_residual),
+                                       NS_PER_TICK,
+                                       &__get_cpu_var(stolen_residual));
+
+       this_cpu_write(stolen_snapshot, s);
+       t = cputime_sub(t, jiffies_to_cputime(adj));
+
+       return cputime_le(t, cputime_max) ? cputime_to_cputime64(t) : 0;
+}
+#endif
+

Why is this not entirely suitable for 32-bit? div_u64_rem() does return a u64, but for the truncation to unsigned long to do actual damage, the return value (the number of ticks that was stolen from the VCPU) would have to exceed about 4*10^9, which seems improbable with 1 ms ticks. Also, cputime_sub() only requires, in order to leave any underflow detectable, that adj <= cputime_max (approximately 2*10^9). I'm likely looking in the wrong place, though.
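
To spell out the truncation arithmetic, here is a standalone
userspace toy (not the kernel code; the tick count is made up):

#include <stdint.h>
#include <stdio.h>

/*
 * Illustration of the 32-bit truncation in question: div_u64_rem()
 * returns a u64, but the patch stores the quotient in an unsigned
 * long, which is 32 bits wide on 32-bit kernels. At HZ=1000 (1 ms
 * ticks), 2^32 ticks correspond to roughly 49.7 days of stolen time.
 */
int main(void)
{
        uint64_t ticks = 5000000000ULL;         /* > 2^32 */
        unsigned long adj = (uint32_t)ticks;    /* simulate a 32-bit
                                                   unsigned long */

        printf("u64 tick count:   %llu\n", (unsigned long long)ticks);
        printf("after truncation: %lu\n", adj); /* 705032704 */
        return 0;
}

So the truncation only bites after an implausible backlog of stolen
time accumulates between two snapshots.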


  /*
   * Account user cpu time to a process.
   * @p: the process that the cpu time gets accounted to
@@ -3882,7 +3905,7 @@
        account_group_user_time(p, cputime);

        /* Add user time to cpustat. */
-       tmp = cputime_to_cputime64(cputime);
+       tmp = _cputime_to_cputime64(cputime);
        if (TASK_NICE(p) > 0)
                cpustat->nice = cputime64_add(cpustat->nice, tmp);
        else
@@ -3934,7 +3957,7 @@
  void __account_system_time(struct task_struct *p, cputime_t cputime,
                        cputime_t cputime_scaled, cputime64_t *target_cputime64)
  {
-       cputime64_t tmp = cputime_to_cputime64(cputime);
+       cputime64_t tmp = _cputime_to_cputime64(cputime);

        /* Add system time to process. */
        p->stime = cputime_add(p->stime, cputime);
@@ -3996,7 +4019,7 @@
  void account_idle_time(cputime_t cputime)
  {
        struct cpu_usage_stat *cpustat = &kstat_this_cpu.cpustat;
-       cputime64_t cputime64 = cputime_to_cputime64(cputime);
+       cputime64_t cputime64 = _cputime_to_cputime64(cputime);
        struct rq *rq = this_rq();

        if (atomic_read(&rq->nr_iowait) > 0)
@@ -4033,7 +4056,7 @@
                                                struct rq *rq)
  {
        cputime_t one_jiffy_scaled = cputime_to_scaled(cputime_one_jiffy);
-       cputime64_t tmp = cputime_to_cputime64(cputime_one_jiffy);
+       cputime64_t tmp = _cputime_to_cputime64(cputime_one_jiffy);
        struct cpu_usage_stat *cpustat = &kstat_this_cpu.cpustat;

        if (irqtime_account_hi_update()) {

I currently have no idea what the remaining 1% could be attributed to,
so thoughts from others would certainly be welcome.

What about irqtime_account_process_tick() calling account_idle_time() with "cputime_one_jiffy"? That could trigger the underflow in _cputime_to_cputime64() more frequently. In such a case we might want to decrease the idle time (i.e. account "negative" cputime against idle), but we can't.
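
In toy form (made-up numbers, plain ticks instead of cputime, not the
kernel code), the clamp behaves like this:

#include <stdint.h>
#include <stdio.h>

/* Toy model of the clamp at the end of _cputime_to_cputime64(). */
static uint64_t toy_convert(uint64_t t, uint64_t stolen_ticks)
{
        uint64_t res = t - stolen_ticks;        /* wraps on underflow,
                                                   like cputime_sub() */
        return res <= t ? res : 0;              /* stand-in for the
                                                   cputime_le() check */
}

int main(void)
{
        /* account_idle_time() receives a single jiffy; three stolen
           ticks since the last snapshot underflow it */
        printf("%llu\n", (unsigned long long)toy_convert(1, 3));
        /* prints 0 -- the two surplus ticks are dropped, because
           idle time cannot be decreased */
        return 0;
}

Each such underflowing tick drops the surplus stolen time instead of
carrying it forward.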

As always, apologies if what I wrote is painfully wrong.
Laszlo
