[Xen-devel] xenbus and the message of doom
I was investigating a bug report[1] about newer kernels (>3.1) not booting
as HVM guests on Amazon EC2. For some reason git bisect did give me some
pain, but it led me at least close, and with some crash dump data I think
I figured out the problem.

commit ddacf5ef684a655abe2bb50c4b2a5b72ae0d5e05
Author: Olaf Hering <olaf@xxxxxxxxx>
Date:   Thu Sep 22 16:14:49 2011 +0200

    xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel

This change introduced a xs_reset_watches() call. The problem seems to be
that there is at least one version of Xen (I was able to reproduce with a
3.4.3 version, which I admit to deliberately not having updated) for which
xenstore will not return any reply to that request.

The backtraces in crash showed that xs_init() had called
xs_reset_watches(), which was happily idling in read_reply(). Effectively
nothing was going on and the boot just hung. With the xs_reset_watches()
call removed, I was able to boot under the same host. And for what it is
worth, there has not been an issue with Xen 4.1.1 and a 3.0 dom0 kernel.
Just this "older" release is trouble.

Now the big question is: should this never happen, meaning the host needs
urgent updating? Or should xs_talkv() set up a time limit and assume
failure when no message is received within it? I could imagine the latter
would at least lead to a more helpful "there is something wrong here,
dude" than just hanging around without any response. ;)

-Stefan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
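To make the second option concrete, here is a rough userspace sketch of a bounded wait for a reply. It is not the kernel's read_reply() or xs_talkv(); the function name read_reply_timeout(), the use of a plain file descriptor, and the millisecond timeout parameter are all assumptions made for illustration. It only shows the general shape: wait for data with a deadline and report -ETIMEDOUT instead of blocking forever.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a kernel or xenbus API): wait up to timeout_ms
 * for fd to become readable, then read at most len bytes into buf.
 *
 * Returns the number of bytes read, 0 on end-of-file, -ETIMEDOUT if
 * nothing arrived in time, or a negative errno value on other errors.
 */
ssize_t read_reply_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    ssize_t n;
    int ready;

    assert(buf != NULL);

    /* Retry if a signal interrupts the wait. For simplicity this
     * restarts the full timeout rather than tracking the remainder. */
    do {
        ready = poll(&pfd, 1, timeout_ms);
    } while (ready < 0 && errno == EINTR);

    if (ready < 0)
        return -errno;
    if (ready == 0)
        return -ETIMEDOUT;  /* the peer never answered */

    n = read(fd, buf, len);
    if (n < 0)
        return -errno;
    return n;
}
```

A caller that gets -ETIMEDOUT back could log a warning and carry on, which is roughly the "more helpful" behaviour asked about above. Whether a timeout is appropriate for every xenstore request is a separate question that this sketch does not try to answer.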