Re: [Xen-devel] Live migrate with Linux >= 4.13 domU causes kernel time jumps and TCP connection stalls.
On 27/12/2018 22:12, Hans van Kranenburg wrote:
> So,
>
> On 12/24/18 1:32 AM, Hans van Kranenburg wrote:
>>
>> On 12/21/18 6:54 PM, Hans van Kranenburg wrote:
>>>
>>> We've been tracking down a live migration bug during the last three
>>> days here at work, and here's what we found so far.
>>>
>>> 1. Xen version and dom0 Linux kernel version don't matter.
>>> 2. DomU kernel is >= Linux 4.13.
>>>
>>> When using live migrate to another dom0, this often happens:
>>>
>>> [   37.511305] Freezing user space processes ... (elapsed 0.001 seconds) done.
>>> [   37.513316] OOM killer disabled.
>>> [   37.513323] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
>>> [   37.514837] suspending xenstore...
>>> [   37.515142] xen:grant_table: Grant tables using version 1 layout
>>> [18446744002.593711] OOM killer enabled.
>>> [18446744002.593726] Restarting tasks ... done.
>>> [18446744002.604527] Setting capacity to 6291456
>>
>> Tonight, I've been through 29 bisect steps to figure out a bit more. A
>> make defconfig with Xen PV enabled for the domU already reproduces the
>> problem, so a complete compile-and-test cycle only took about 7 minutes.
>>
>> So, it appears that this 18 gazillion seconds of uptime is a thing that
>> started happening earlier than the TCP situation. All of the test
>> scenarios resulted in these huge uptime numbers in dmesg, but not all
>> of them resulted in TCP connections hanging.
>>
>>> As a side effect, all open TCP connections stall, because the
>>> timestamp counters of packets sent to the outside world are affected:
>>>
>>> https://syrinx.knorrie.org/~knorrie/tmp/tcp-stall.png
>>
>> This has been happening since:
>>
>> commit 9a568de4818dea9a05af141046bd3e589245ab83
>> Author: Eric Dumazet <edumazet@xxxxxxxxxx>
>> Date:   Tue May 16 14:00:14 2017 -0700
>>
>>     tcp: switch TCP TS option (RFC 7323) to 1ms clock
>>
>> [...]
>>
>>> [...]
>>>
>>> 3. Since this is related to time and clocks, the last thing we tried
>>> today was, instead of using default settings, putting "clocksource=tsc
>>> tsc=stable:socket" on the Xen command line and "clocksource=tsc" on
>>> the domU Linux kernel command line. What we observed after doing this
>>> is that the failure happens less often, but it still happens.
>>> Everything else still applies.
>>
>> Actually, it seems that the important thing is that the uptimes of the
>> dom0s are not very close to each other. After rebooting all four back
>> without tsc options, and then rebooting one of them again a few hours
>> later, I could easily reproduce the problem again when live migrating
>> to the most recently rebooted server.
>>
>>> Additional question:
>>>
>>> It's 2018, should we have these "clocksource=tsc tsc=stable:socket"
>>> on Xen and "clocksource=tsc" anyway now, for Xen 4.11 and Linux 4.19
>>> domUs? All our hardware has 'TscInvariant = true'.
>>>
>>> Related: https://news.ycombinator.com/item?id=13813079
>>
>> This is still interesting.
>>
>> ---- >8 ----
>>
>> Now, the next question is... is 9a568de481 bad, or shouldn't there be
>> 18 gazillion whatever uptime already... In Linux 4.9 this doesn't
>> happen, so the next task will be to find out where that started.
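As an aside, those 18 gazillion numbers are consistent with a slightly
negative 64-bit nanosecond clock being printed as unsigned: 2^64 ns is
about 18446744073.7 seconds. A standalone demo (the ~71 s offset is made
up so the output lands near the values in the log above):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    /* hypothetical sched_clock() result: ~71 s "before zero" */
    int64_t  ns = -71116000000LL;
    /* printed as an unsigned 64-bit value, it wraps around */
    uint64_t u  = (uint64_t)ns;

    /* prints [18446744002.593551], the same ballpark as the log */
    printf("[%llu.%06llu]\n",
           (unsigned long long)(u / 1000000000ULL),
           (unsigned long long)((u % 1000000000ULL) / 1000ULL));
    return 0;
}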
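For context on the TCP side: since 9a568de4, the TSval in the TCP
timestamp option is derived from the 1 ms clock instead of jiffies, so a
jump of the underlying clocksource shows up 1:1 on the wire, and the
peer's PAWS check (RFC 7323) then drops segments whose timestamps appear
to go backwards. A simplified, self-contained sketch of that derivation
(not the actual kernel code):

#include <stdio.h>
#include <stdint.h>
#include <time.h>

/* stand-in for the kernel's monotonic nanosecond clock */
static uint64_t clock_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* TSval at 1 ms resolution, truncated to 32 bits like on the wire */
static uint32_t tcp_ts_ms(uint64_t ns)
{
    return (uint32_t)(ns / 1000000ULL);
}

int main(void)
{
    uint64_t now    = clock_ns();
    uint64_t jumped = 18446744002593711000ULL; /* ~the wrapped value above */

    printf("TSval now        : %u\n", tcp_ts_ms(now));
    printf("TSval after jump : %u\n", tcp_ts_ms(jumped));
    return 0;
}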
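The 'TscInvariant = true' observation can be double-checked from
userspace as well: invariant TSC is reported in CPUID leaf 0x80000007,
EDX bit 8. A small sketch, assuming gcc or clang on x86:

#include <stdio.h>
#include <cpuid.h>

int main(void)
{
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* __get_cpuid() returns 0 if the extended leaf is unsupported */
    if (!__get_cpuid(0x80000007, &eax, &ebx, &ecx, &edx)) {
        puts("CPUID leaf 0x80000007 not supported");
        return 1;
    }
    printf("Invariant TSC: %s\n", (edx & (1u << 8)) ? "yes" : "no");
    return 0;
}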
>
> And that's...
>
> commit f94c8d116997597fc00f0812b0ab9256e7b0c58f
> Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Date:   Wed Mar 1 15:53:38 2017 +0100
>
>     sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface
>
> a.k.a. v4.11-rc2~30^2
>
> Before this commit, the time listed in dmesg seems to follow the uptime
> of the domU; after it, the time in dmesg jumps up and down when live
> migrating to different dom0s, with the occasional/frequent jump to a
> number above 18000000000, which then also triggers the TCP timestamp
> breakage introduced with 9a568de4.
>
> So, the next question is... what now? Any ideas appreciated.
>
> Can anyone else reproduce this? I have super-common HP DL360 hardware
> and mostly default settings, so it shouldn't be that hard.
>
> Should I mail some other mailing list with a question? Which one? Do
> any of you Xen developers have more experience with the timekeeping
> code?

My gut feeling tells me that the above patch neglected Xen by leaving a
non-native TSC clock marked "stable" too often (the "only call
clear_sched_clock_stable() when we mark TSC unstable when we use
native_sched_clock()" part of the commit message).
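A toy model of that suspicion (deliberately simplified, not the actual
kernel code, but the names mirror the ones used in the commit):

#include <stdio.h>
#include <stdbool.h>

static bool sched_clock_stable = true;

/* rdtsc-based clock, used on bare metal */
static unsigned long long native_sched_clock(void) { return 0; }
/* pvclock-based clock, used by a Xen PV guest */
static unsigned long long xen_sched_clock(void)    { return 0; }

/* which sched_clock the kernel actually ends up using */
static unsigned long long (*active_sched_clock)(void) = xen_sched_clock;

static bool using_native_sched_clock(void)
{
    return active_sched_clock == native_sched_clock;
}

static void mark_tsc_unstable(void)
{
    /* post-f94c8d116997 behavior: the stable flag is only cleared
     * when the native TSC sched_clock is in use, so a Xen PV guest
     * keeps it set even though migration makes the clock jump */
    if (using_native_sched_clock())
        sched_clock_stable = false;
}

int main(void)
{
    mark_tsc_unstable();
    printf("sched_clock_stable: %s\n",
           sched_clock_stable ? "still true (the Xen PV case)" : "false");
    return 0;
}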
I can have a more thorough look after Jan. 7th.

Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel