Re: [Xen-devel] HVM Migration of domU on Qemu-upstream DM causes stuck system clock with ACPI
On 31/05/13 15:07, George Dunlap wrote:
> On 31/05/13 13:40, Ian Campbell wrote:
>> On Fri, 2013-05-31 at 12:57 +0100, Alex Bligh wrote:
>>> --On 31 May 2013 12:49:18 +0100 George Dunlap
>>> <george.dunlap@xxxxxxxxxxxxx> wrote:
>>>
>>>> No -- Linux is asking, "Can you give me an alarm in 5ns?" And Xen is
>>>> saying, "No". So Linux is saying, "OK, how about 5us? 10us? 20us?" By
>>>> the time it reaches 4ms, Linux has had enough, and says, "If this timer
>>>> is so bad that it can't give me an event within 4ms it just won't use
>>>> timers at all, thank you very much."
>>>>
>>>> The problem appears to be that Linux thinks it's asking for something in
>>>> the future, but is actually asking for something in the past. It must
>>>> look at its watch just before the final domain pause, and then asks for
>>>> the time just after the migration resumes on the other side. So it
>>>> doesn't realize that 10ms (or something) has already passed, and that
>>>> it's actually asking for a timer in the past. The Xen timer driver in
>>>> Linux specifically asks Xen for times set in the past to return an error.
>>>> Xen is returning an error because the time is in the past, Linux thinks
>>>> it's getting an error because the time is too close in the future and
>>>> tries asking a little further away.
>>>>
>>>> Unfortunately I think this is something which needs to be fixed on the
>>>> Linux side; I don't really see how we can work around it in Xen.
>>> I don't think fixing it only on the Linux side is a great idea, not least
>>> as it makes any current Linux image not live migrateable reliably. That's
>>> pretty horrible.
>> Ultimately though a guest bug is a guest bug, we don't really want to be
>> filling the hypervisor with lots of quirky exceptions to interfaces in
>> order to work around them, otherwise where does it end?
>>
>> A kernel side fix can be pushed to the distros fairly aggressively (it's
>> mostly just a case of getting an upstream stable backport then filing
>> bugs with the main ones, we've done it before) and for users upgrading
>> the kernel via the distros is really not so hard and mostly reuses the
>> process they must have in place for guest kernel security updates and
>> other important kernel bugs anyway.
>
> In any case, it seems I was wrong -- Linux does "look at its watch"
> every time it asks.
>
> The generic timer interface is "set me a timer N nanoseconds in the
> future"; the Xen timer implementation executes
> pvclock_clocksource_read() and adds the delta. So it may well actually
> be a bug in Xen.
>
> Stand by for further investigation...

I've also seen this on FreeBSD PVHVM when doing live migration, which
also uses the single shot timer. It seems like the values in
vcpu_info->time are not updated as often as they should after the
migration. I've implemented a back-off mechanism to cope with that, but
this clearly looks like a bug in Xen.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
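The failure mode discussed in this thread can be sketched as a small simulation. This is only an illustration under stated assumptions, not Xen, Linux, or FreeBSD code: the function names, the 5us starting delta, the doubling step, and the 4ms cap are hypothetical stand-ins taken loosely from the figures quoted above. The guest computes a deadline from its own (possibly stale) view of the clock, and the hypervisor rejects any deadline that is not in its own future.

```python
# Illustrative simulation only. Every name and constant here is hypothetical
# and chosen to mirror the numbers quoted in the thread.

def hypervisor_set_timer(deadline_ns, true_now_ns):
    """Accept a single-shot timer only if its deadline is still in the future."""
    return deadline_ns > true_now_ns


def program_timer(guest_now_ns, true_now_ns,
                  min_delta_ns=5_000, max_delta_ns=4_000_000):
    """Ask for a timer at guest_now + delta, doubling delta after each rejection.

    guest_now_ns is the time the guest believes it is (for example, read from
    time information that has not been refreshed); true_now_ns is the
    hypervisor's actual time. Returns (accepted_delta_ns, attempts), or
    (None, attempts) if the guest gives up once delta would exceed the cap.
    """
    delta = min_delta_ns
    attempts = 0
    while delta <= max_delta_ns:
        attempts += 1
        if hypervisor_set_timer(guest_now_ns + delta, true_now_ns):
            return delta, attempts
        delta *= 2  # guest assumes the request was merely "too close"
    return None, attempts


if __name__ == "__main__":
    # Guest clock in sync: the first request succeeds.
    print(program_timer(guest_now_ns=0, true_now_ns=0))
    # Guest clock 100us behind: succeeds after a few doublings.
    print(program_timer(guest_now_ns=0, true_now_ns=100_000))
    # Guest clock 10ms behind: every request up to the cap is in the past.
    print(program_timer(guest_now_ns=0, true_now_ns=10_000_000))
```

In this sketch, a guest whose view of time lags by more than the cap never gets a request accepted, which matches the symptom described above: each rejection is read as "too close in the future" when the deadline is in fact already in the past.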