Re: [Xen-devel] [xen-unstable test] 142973: regressions - FAIL
Jürgen Groß writes ("Re: [Xen-devel] [xen-unstable test] 142973: regressions - FAIL"):
> On 21.10.19 10:23, osstest service owner wrote:
> > flight 142973 xen-unstable real [real]
> > http://logs.test-lab.xenproject.org/osstest/logs/142973/
> >
> > Regressions :-(
> >
> > Tests which did not succeed and are blocking,
> > including tests which could not be run:
> >  test-amd64-amd64-xl-pvshim 18 guest-localmigrate/x10 fail REGR. vs. 142750
>
> Roger, I believe you have looked into that one?
>
> I guess the conversation via IRC with Ian regarding the race between
> blkback and OSStest was related to the issue?

I think this failure is something else.  What happens here is this:

  2019-10-21 02:58:32 Z executing ssh ... -v root@172.16.145.205 date
  [bunch of output from ssh]
  status (timed out) at Osstest/TestSupport.pm line 550.
  2019-10-21 02:58:42 Z exit status 4

172.16.145.205 is the guest here.  Ie, `ssh date guest' took longer
than 10s.

We can see that the guest networking is working soon after the
migration, because we got most of the way through the ssh protocol
exchange.  On the previous repetition the next message from ssh was

  debug1: SSH2_MSG_SERVICE_ACCEPT received

Looking at

  http://logs.test-lab.xenproject.org/osstest/logs/142973/test-amd64-amd64-xl-pvshim/rimava1---var-log-xen-console-guest-debian.guest.osstest--incoming.log

which is, I think, the log of the "new" instance of the guest, after
migration, there are messages about killing various services.  Eg

  [1918064738.820550] systemd[1]: systemd-udevd.service: Main process exited, code=killed, status=6/ABRT

They don't seem to be normal.  For example:

  http://logs.test-lab.xenproject.org/osstest/logs/142865/test-amd64-amd64-xl-pvshim/rimava1---var-log-xen-console-guest-debian.guest.osstest--incoming.log

is the previous xen-unstable flight and it doesn't have them.

I looked in

  http://logs.test-lab.xenproject.org/osstest/logs/142865/test-amd64-amd64-xl-pvshim/rimava1---var-log-xen-console-guest-debian.guest.osstest.log.gz

too, and that has some alarming messages from the kernel like

  [  686.692660] rcu_sched kthread starved for 1918092123128 jiffies! g18446744073709551359 c18446744073709551358 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=0

and accompanying stack traces.  But the test passed there.  I think
that is probably something else?  ABRT suggests guest memory
corruption.

> If this is the case, could you, Ian, please add the workaround you were
> thinking of to OSStest (unconditional by now, maybe make it conditional
> later)?

I can add the block race workaround, but I don't think it will help
with migration anyway.  The case where things go wrong is destroy.

Roger, am I right that a normal guest shutdown is race-free?  I think
we tear things down in a slower manner and will therefore end up
waiting for blkback?  Or is that not true?

Maybe the right workaround is to disable the code in osstest which
tries to clean up a previous failed run.  I think the kernel doesn't
mind multiple blkfronts (or indeed multiple other tasks) using the
same device at once.

Ian.
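
The failing step above amounts to a liveness probe: run `ssh root@guest
date' and fail if it does not complete within 10 seconds.  A minimal
sketch of that check, in Python rather than osstest's actual Perl
(Osstest/TestSupport.pm); the guest address and the 10s budget are taken
from the log excerpt, while the helper name is invented for illustration:

  import subprocess

  GUEST = "172.16.145.205"   # guest IP from the log excerpt above
  TIMEOUT = 10               # seconds; the "longer than 10s" budget

  def guest_responds(host: str, timeout: int = TIMEOUT) -> bool:
      """Run `ssh root@host date` and report whether it finished in time."""
      try:
          proc = subprocess.run(
              ["ssh", "-o", "BatchMode=yes", f"root@{host}", "date"],
              timeout=timeout,
              capture_output=True,
          )
      except subprocess.TimeoutExpired:
          # Corresponds to the "status (timed out)" failure in the log.
          return False
      return proc.returncode == 0

  print("guest ok" if guest_responds(GUEST) else "guest unresponsive")
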
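The blkback/OSStest race and its possible workaround can also be made
concrete.  If blkback is still releasing the disk of a destroyed (or
incompletely cleaned-up) guest when the next run starts, one
conservative approach is to wait until the old guest's vbd backend node
has disappeared from xenstore before reusing the device.  A hedged
sketch, assuming the standard /local/domain/0/backend/vbd/<domid>/<devid>
layout and the xenstore-exists utility available in dom0; the function
name and polling policy are illustrative, not actual osstest code:

  import subprocess
  import time

  def wait_backend_gone(domid: int, devid: int, timeout: int = 30) -> bool:
      """Poll xenstore until the vbd backend node for (domid, devid)
      is removed, i.e. blkback has finished tearing down, or give up."""
      path = f"/local/domain/0/backend/vbd/{domid}/{devid}"
      deadline = time.monotonic() + timeout
      while time.monotonic() < deadline:
          # xenstore-exists exits 0 iff the node is present.
          if subprocess.run(["xenstore-exists", path]).returncode != 0:
              return True   # node gone: blkback has let go of the disk
          time.sleep(1)
      return False          # blkback still holds the device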