
Re: [Xen-devel] [timer/ticks related] dom0 hang during boot on large 1TB system



On Fri, 18 Dec 2009 07:02:55 +0000
Keir Fraser <keir.fraser@xxxxxxxxxxxxx> wrote:

> On 18/12/2009 04:36, "Mukesh Rathor" <mukesh.rathor@xxxxxxxxxx> wrote:
> 
> > The other fix I thought of was to change INITIAL_JIFFIES to
> > something sooner.
> > 
> > Would appreciate any help, I don't understand xen time management
> > well.
> 
> This isn't really Xen time code, but unchanged Linux time code. I
> don't know which tree you quoted the code from -- 2.6.18 has similar
> but not identical. Anyway, I suggest try using the jiffy-comparison
> macros from <linux/jiffies.h>: time_before(), time_after(), etc.
> These are designed to work even when jiffies wraps. Feel free to send
> patch(es) for that, if you test that out and it works okay.
> 
>  -- Keir
> 

Ok, I came up with the following patch. Jeremy, can you please take a
look also, and comment on my fix since I noticed you've got the same 
issue in your tree. Here's a summary for your benefit:

init/calibrate.c :  calibrate_delay_direct():

                start_jiffies = get_jiffies_64();
                while (get_jiffies_64() <= (start_jiffies + tick_divider)) {
                        pre_start = start;
                        read_current_timer(&start);
                }


If the first ever timer interrupt comes after start_jiffies is set, dom0 boot
may hang if delta in timer_interrupt() is so huge that it causes jiffies
to wrap. Delta appears to be very large when memory is more than 512GB on
certain boxes, causing the wraparound.

Why is delta in dom0's timer_interrupt() related to the memory on the system?
Because the hypervisor creates dom0, then scrubs pages, then unpauses the
vcpu. So it appears that a lot of page scrubbing results in a huge delta on
the first tick.

thanks,
Mukesh


Signed-off-by: Mukesh Rathor <mukesh.rathor@xxxxxxxxxx>

diff --git a/init/calibrate.c b/init/calibrate.c
index 06066a6..14f62c8 100644
--- a/init/calibrate.c
+++ b/init/calibrate.c
@@ -32,7 +32,7 @@ static unsigned long __devinit calibrate_delay_direct(void)
 {
        unsigned long pre_start, start, post_start;
        unsigned long pre_end, end, post_end;
-       unsigned long start_jiffies;
+       u64 start_jiffies;
        unsigned long tsc_rate_min, tsc_rate_max;
        unsigned long good_tsc_sum = 0;
        unsigned long good_tsc_count = 0;
@@ -64,8 +64,8 @@ static unsigned long __devinit calibrate_delay_direct(void)
        for (i = 0; i < MAX_DIRECT_CALIBRATION_RETRIES; i++) {
                pre_start = 0;
                read_current_timer(&start);
-               start_jiffies = jiffies;
-               while (jiffies <= (start_jiffies + tick_divider)) {
+               start_jiffies = get_jiffies_64();
+               while (get_jiffies_64() <= (start_jiffies + tick_divider)) {
                        pre_start = start;
                        read_current_timer(&start);
                }
@@ -73,7 +73,7 @@ static unsigned long __devinit calibrate_delay_direct(void)
 
                pre_end = 0;
                end = post_start;
-               while (jiffies <=
+               while (get_jiffies_64() <=
                        (start_jiffies + tick_divider * (1 + delay_calibration_ticks))) {
                        pre_end = end;
                        read_current_timer(&end);

Attachment: diff.out
Description: Binary data

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

 

