[Xen-devel] FTR: osstest load spikes, countermeasures



I wrote on IRC:

11:07 <Diziet> andyhhp: Looking at your xtf failure in 110095.  
11:08 <andyhhp> Diziet: yes - it is odd
11:08 <andyhhp> the test itself didn't fail
11:10 <Diziet> andyhhp: Jun 8 07:26:08 osstest uptime: 07:26:08 up 168
               days, 19:53, 9 users, load average: 19.87, 10.37, 9.17
11:10 <Diziet> So your failure coincides with a big load spike on the
               controller VM

Investigating, I discovered:

 * Normal load on the osstest VM is about 2.

 * There were occasional load spikes on the osstest VM (irregular,
   but several times a day), from around 5 up to 40.

 * There were no corresponding load spikes in the dom0.

 * Examination of process accounting logs, top, etc., suggested that
   the load spikes coincided with certain git invocations which used a
   lot of RAM.
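
For anyone wanting to spot such spikes in logged uptime output, here is
a minimal Python sketch. It is not what I actually ran; the function
names and the threshold of 5 (taken from the low end of the spikes
described above) are assumptions for illustration.

```python
import re

# Matches the tail of an uptime(1) line, e.g.
# "load average: 19.87, 10.37, 9.17"
_LOAD_RE = re.compile(
    r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)"
)


def parse_load_averages(line):
    """Return the (1min, 5min, 15min) load averages from an uptime line."""
    match = _LOAD_RE.search(line)
    if match is None:
        raise ValueError("no load averages found in: %r" % line)
    return tuple(float(value) for value in match.groups())


def is_spike(line, threshold=5.0):
    """Report whether the 1-minute load average reaches the threshold.

    The default threshold is an assumption, not a measured value.
    """
    one_minute, _five, _fifteen = parse_load_averages(line)
    return one_minute >= threshold
```

Fed the uptime line quoted in the IRC log above (with the wrapped line
joined back together), parse_load_averages returns the three figures
and is_spike reports a spike.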

I concluded that the most likely causes were (a) shortage of RAM for
some of the very large git trees osstest needs to deal with, and
(b) Linux's very poor (by default) IO scheduling.

To test (a), I increased the osstest VM's memory from 12 to 20 GB.
(Luckily the VM host had enough RAM for that.)  After that there was
only one load spike: this morning the load went to 14 at about the
same time as the logs were rotated, so the cause was probably
cron.weekly.

To fix (b) I have also changed the IO scheduler on the osstest VM, and
on the dom0, from "cfq" to "deadline".  (IME the cfq scheduler is
appallingly bad, and will easily make a machine with any IO load
effectively unusable.)
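
For reference, Linux exposes the per-device scheduler through
/sys/block/<device>/queue/scheduler, which lists the available
schedulers with the active one in square brackets. The following is a
rough Python sketch of inspecting and switching it, not a record of
the exact commands I used. The device name "xvda" and the function
names are hypothetical, writing to the file needs root, and the change
does not survive a reboot unless it is also made persistent some other
way.

```python
from pathlib import Path


def parse_scheduler_line(text):
    """Parse sysfs scheduler contents such as 'noop deadline [cfq]'.

    Returns (active, available): the bracketed name and the full list.
    """
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def set_scheduler(device, scheduler, sysfs_root="/sys"):
    """Switch the IO scheduler for one block device.

    Returns the previously active scheduler. sysfs_root is a parameter
    only so the function can be exercised against a scratch directory.
    """
    path = Path(sysfs_root) / "block" / device / "queue" / "scheduler"
    active, available = parse_scheduler_line(path.read_text())
    if scheduler not in available:
        raise ValueError(
            "%s not offered for %s (available: %s)"
            % (scheduler, device, ", ".join(available))
        )
    if active != scheduler:
        path.write_text(scheduler)
    return active


if __name__ == "__main__":
    # "xvda" is a hypothetical device name; substitute the real one.
    previous = set_scheduler("xvda", "deadline")
    print("scheduler was:", previous)
```

The set of scheduler names on offer depends on the kernel, which is
why the sketch checks the list before writing.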

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

