Re: [Xen-devel] HVM Migration of domU on Qemu-upstream DM causes stuck system clock with ACPI
On 31/05/13 13:40, Ian Campbell wrote:
> On Fri, 2013-05-31 at 12:57 +0100, Alex Bligh wrote:
>> --On 31 May 2013 12:49:18 +0100 George Dunlap <george.dunlap@xxxxxxxxxxxxx> wrote:
>>> No -- Linux is asking, "Can you give me an alarm in 5ns?" And Xen is saying, "No". So Linux is saying, "OK, how about 5us? 10us? 20us?" By the time it reaches 4ms, Linux has had enough, and says, "If this timer is so bad that it can't give me an event within 4ms it just won't use timers at all, thank you very much."
>>>
>>> The problem appears to be that Linux thinks it's asking for something in the future, but is actually asking for something in the past. It must look at its watch just before the final domain pause, and then asks for the time just after the migration resumes on the other side. So it doesn't realize that 10ms (or something) has already passed, and that it's actually asking for a timer in the past.
>>>
>>> The Xen timer driver in Linux specifically asks Xen for times set in the past to return an error. Xen is returning an error because the time is in the past, Linux thinks it's getting an error because the time is too close in the future and tries asking a little further away.
>>>
>>> Unfortunately I think this is something which needs to be fixed on the Linux side; I don't really see how we can work around it in Xen.
>>
>> I don't think fixing it only on the Linux side is a great idea, not least as it makes any current Linux image not live migrateable reliably. That's pretty horrible.
>
> Ultimately though a guest bug is a guest bug, we don't really want to be filling the hypervisor with lots of quirky exceptions to interfaces in order to work around them, otherwise where does it end?
>
> A kernel side fix can be pushed to the distros fairly aggressively (it's mostly just a case of getting an upstream stable backport then filing bugs with the main ones, we've done it before) and for users upgrading the kernel via the distros is really not so hard and mostly reuses the process they must have in place for guest kernel security updates and other important kernel bugs anyway.

In any case, it seems I was wrong -- Linux does "look at its watch" every time it asks. The generic timer interface is "set me a timer N nanoseconds in the future"; the Xen timer implementation executes pvclock_clocksource_read() and adds the delta.

So it may well actually be a bug in Xen. Stand by for further investigation...

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
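
For readers following the thread, the retry behaviour described above can be sketched as a toy simulation. This is only an illustration under stated assumptions, not the Linux or Xen code: `ToyHypervisor`, `program_event`, the doubling schedule, and the `ETIME` constant are all hypothetical names and values chosen to match the "5ns, 5us, 10us, 20us ... 4ms" description in the message. The guest computes each deadline as "its current time plus a delta", and the hypervisor rejects any deadline that is not in its own future.

```python
ETIME = 62  # placeholder error code meaning "deadline already passed" (assumption)


class ToyHypervisor:
    """Hypothetical stand-in: rejects timer deadlines that are not in the future."""

    def __init__(self, now_ns):
        self.now_ns = now_ns

    def set_timer(self, deadline_ns):
        # Refuse deadlines at or before the hypervisor's own notion of "now".
        if deadline_ns <= self.now_ns:
            return -ETIME
        return 0


def program_event(hyp, guest_now_ns, first_delta_ns=5,
                  min_delta_ns=5_000, limit_ns=4_000_000):
    """Ask for a timer at guest_now_ns + delta, backing off on each refusal.

    Returns (accepted_delta_ns, deltas_tried); accepted_delta_ns is None if
    every delta up to limit_ns was refused and the guest gives up.
    """
    tried = []
    delta = first_delta_ns
    while delta <= limit_ns:
        tried.append(delta)
        if hyp.set_timer(guest_now_ns + delta) == 0:
            return delta, tried
        # Assumed schedule: jump to the minimum delta, then double each time.
        delta = min_delta_ns if delta < min_delta_ns else delta * 2
    return None, tried


if __name__ == "__main__":
    guest_now = 1_000_000_000
    for offset_ns in (0, 15_000, 10_000_000):
        # offset_ns: how far the hypervisor's clock is ahead of the guest's.
        accepted, tried = program_event(ToyHypervisor(guest_now + offset_ns), guest_now)
        print(offset_ns, accepted, len(tried))
```

When the two clocks agree, the first request is accepted. With a small offset the guest recovers after a few retries. Once the offset exceeds the 4ms limit, every request lands in the hypervisor's past and the guest gives up, which is the failure mode described in the thread. The sketch does not say where such an offset would come from; that is the open question in the final message.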