Re: [Xen-devel] State of GPLPV tests - 28.11.11
> Hello James,
>
> I am still running tests 7 days a week on two test systems. Results are quite
> discouraging though. After experiencing crash after crash I wanted to test if
> the configuration I called "stable" (Xen 4.0.1, GPLPV 0.11.0.213, dom0 kernel
> 2.6.32.18-pvops0-ak3) was stable indeed. But even that config crashed when
> running my torture test. It is stable on our production systems - running
> other workloads of course.

What crash are you getting these days? Is it the same one as you used to
get?

> > One thing I thought of... virtualisation gives an interesting opportunity to
> > exaggerate race conditions. If you have 8 vCPU's in a DomU but only let
> > one or two physical CPUs service those 8 vCPU's,then it can give rise to
> > race conditions which could only be rarely seen (or never seen) in normal
> > operation. It's awful for performance but if you could try that and see if it
> > gives rise to crashes a bit more frequently it might help us track down the
> > problem.
>
> What exactly is the config you are talking about in terms of Xen/dom0
> command line? In terms of domU config files?

I don't remember the exact syntax, but if you specify vcpus=4 but only
let the DomU run on one physical cpu it might trip up more often, if the
problem is caused by a race. If the problem is an arithmetic error in
xennet then it won't help.

> As always, I monitor your mercurial repo ;-) How would you see the
> relationship of commits 952+953 to our problem? 952 seems to affect LSO in
> some way since LsoV1TransmitComplete.TcpPayload is finally wrong (could it
> be negative since tx_length is smaller than the fixed tx_length?). What about
> 953?

Not sure.

> One more thought: As mentioned earlier crashes often occurred after an
> uptime of 9-10 days and these crashes occurred too consistently to be a "by
> chance" event. In my torture tests I am NOT USING a Windows NTP service (I
> use the meinberg NTP daemon on Windows). But on production I do. Can
> you see any possible impact here?

It's certainly more likely for a stray UDP packet to cause an upset I
guess. As the packets pass through a Linux firewall (iptables in Dom0)
it's more likely that errant TCP packets will be dropped there.

Do you have a crash dump against 0.11.0.323?

James

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
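
For reference, the "vcpus=4 on one physical cpu" suggestion above might look roughly like the following in a domU config file. This is only a sketch, assuming an xm/xl-style config (which uses Python-like assignments) and assuming physical CPU 0 is the one to pin to; the exact syntax was not given in the thread, so check it against the documentation for the Xen version in use.

```python
# Sketch of a domU config fragment (assumed xm/xl-style syntax).
# Give the guest 4 virtual CPUs...
vcpus = 4
# ...but restrict all of them to physical CPU 0 (assumed choice of CPU),
# so the vCPUs contend for a single core and races may show up more often.
cpus = "0"
```

Expect poor guest performance with this setting; it is meant only for provoking a suspected race, not for production use.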