Re: [Xen-devel] Re: [Xen-users] old issue after 1024 live migrations seems to still exist.
Hi Ian & list,

I'll provide the specifics of my config, sure.

2010/7/22 Ian Campbell <Ian.Campbell@xxxxxxxxxx>:
> (dropping xen-users to avoid cross-posting)
> Do you have a reference to this old issue?

I googled for the old mailing list post, but no luck with the traffic on the Xen lists.
First of all, I'm glad if it's a different bug and doesn't exist for most people :)

> To be honest I think it is unlikely that you are seeing the actual same
> issue as a bug that old, even if your symptoms are very similar.
>
> Can you give details of your precise system configuration for both host
> and guest, hypervisor changeset (I don't know what Oracle VM 2.0 has in
> it), kernel changeset for both dom0 and domU etc.

dom0 (both identical):

xen_major              : 3
xen_minor              : 4
xen_extra              : .0
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xff400000
xen_changeset          : unavailable

[root@waxh0004 ~]# uname -a
Linux waxh0004 2.6.18-128.2.1.4.9.el5xen #1 SMP Fri Oct 9 14:57:31 EDT 2009 i686 i686 i386 GNU/Linux

domU:

debian:~# uname -a
Linux debian 2.6.26-2-xen-686 #1 SMP Wed Nov 4 23:23:33 UTC 2009 i686 GNU/Linux

(debian lenny from stacklet.com, kernel date was nov9 09)

> I am currently doing some live migration testing with guests under load
> (forkbomb) and am regularly doing 4-5000 successful migrations before I
> hit a very subtle deadlock in a PVops domU kernel. I have most likely in
> the past 4-5 years personally done tens of thousands of iterations of
> live migration in various scenarios and we know other people are
> regularly doing automated and manual test of these things so the problem
> you are seeing is almost certainly not a generic failure but must be
> specific to the version of one or more components in your system.

good!

> Are you seeing failure after precisely 1024 migrations in every case or
> is that just a rough figure?
> It might be worth

no, it was more like "just above 1000"; I also had some counter problem in the script.
Note that before that, a few times the migration ended with the domU being down,
so your hint below about a leak might just be the thing.

> using /usr/lib/xen/bin/lsevtchn to check what is happening to both the
> dom0 and domU event channels after each migration iteration. Once upon a

okay, will log that

> time I was seeing an evtchn leak in domU (now fixed) but that wouldn't
> fail after precisely 1024 iterations since there is always a number of
> non-leaking event channels also in use.
>
> Are you able to test with an up to date xen-3.4-testing or even better
> the xen-4.0-testing tree?

Retesting with Xen 4 would be a bit tricky. Oracle has an SDK domU that has all the dom0 sources, but it would still take a day of work, I'm afraid.
I'd hope some other people can do the testing on other versions; that's what I asked for, and why I didn't send this to xen-devel in the first place.
I fixed LAN management access to one of the hosts (for serial console/reboot/reset...), so on that one I could try re-testing with 3.4-testing.

If the issue doesn't show up in your tests then I agree it's probably just in the specific version. In that case I can just inform Oracle and they can look into it on their own.

>> > is it just the gratuitous arp?
>
> The Grat. ARP doesn't get sent by current PVops kernels (I don't know if
> you are using this since you haven't provided any details about your
> system configuration). A fix is pending in the network subsystem

I know I didn't. Because I just asked for someone else to run the script and retest ;p

> maintainers tree which I hope will be backported to 2.6.32.x when it
> goes into mainline during the next merge window.
> See 06c4648d46d1b757d6b9591a86810be79818b60c and
> 592970675c9522bde588b945388c7995c8b51328 in net-next-2.6.git.
> You will
> also need to configure sysctl to enable the arp_notify option for the
> devices setting net.ipv4.conf.all.arp_notify = 1 is likely sufficient.

classic domU kernel

I'll see whether I can get a newer dom0 kernel to work, but I'll be on vacation for a week now.
Considering that you successfully migrate a few thousand times, I'd suggest you forget about the issue until then.

Greetings,
Flo

-- 
'You don't need to worry about your future'

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
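
The suggestion in the thread to check event channels with /usr/lib/xen/bin/lsevtchn after each migration iteration could be automated roughly as follows. This is only a sketch under stated assumptions, not the script used in the thread: it assumes the tool is invoked as `lsevtchn <domid>` and prints one line per event channel, and the helper names `count_channels`, `channels_in_use` and `looks_like_leak` are hypothetical.

```python
import subprocess

# Path as quoted in the thread; adjust for the local installation.
LSEVTCHN = "/usr/lib/xen/bin/lsevtchn"


def count_channels(output):
    """Count non-empty lines (assumption: one output line per event channel)."""
    return sum(1 for line in output.splitlines() if line.strip())


def channels_in_use(domid):
    """Run lsevtchn for one domain (assumed invocation: lsevtchn <domid>)."""
    result = subprocess.run(
        [LSEVTCHN, str(domid)], capture_output=True, text=True, check=True
    )
    return count_channels(result.stdout)


def looks_like_leak(counts, min_growth=1):
    """Flag a series of per-iteration counts that never shrinks and
    grows by at least min_growth channels per iteration on average."""
    if len(counts) < 2:
        return False
    never_shrinks = all(b >= a for a, b in zip(counts, counts[1:]))
    average_growth = (counts[-1] - counts[0]) / (len(counts) - 1)
    return never_shrinks and average_growth >= min_growth
```

Calling `channels_in_use` for dom0 and for the guest after every migration and appending the results to a list gives a series that `looks_like_leak` can check. A count that climbs steadily toward the point where migrations start failing would support the leak theory, while a flat count would point elsewhere.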