[Xen-devel] "right" way to gather domU stats in xen 3 & 4?
Hi all,

I'm building a xen agent for nagios / check_mk. Automatic inventory of VMs and the basic up / down reporting are reliable now, and I'm looking at the next items on my list.

* Free memory. This seems easy at first: look at xm info and that's mostly it. I can use a different color for the memory allocated to dom0 minus the dom0 lower balloon limit, but I'll also have a check that goes to full alarm if anyone is crazy enough to use dom0 ballooning. ;)
  What I don't know is whether I also need to subtract something for the Xen heap. Long ago it used to default to 32MB, I think. Can someone clue me in about that: is it relevant to the free / total memory figures in xm info?

* Per domU I also want to look at memory statistics.
  - One thing is mem vs. mem-max, to show ballooning.
  - The other thing is tmem. I don't know if I should spend the time getting it right, because I'm starting to get the impression that in the two and a half years since Dan added it (and now that tmem2 has been added), it has been considered implemented and working, but nobody bothers to make it work for everyone; e.g. the recent remark that the direct ballooning daemon was just a lab exercise ;)
    If you know of any people who successfully run xen with tmem2 and such, I'd love to work with them to build the nagios-style statistics. Otherwise I'll save myself the headaches.

* Per domU cpu percent (to show how much of the dom0 power the VM is consuming)...

Speed issues:
Checks in check_mk are usually fired off every minute, so it would be good if I could go directly via xenstore to collect and report my data within 1-2 seconds or less. Speed seems to be an issue I have to worry about: on my "top of the shelf" xen host it takes around 0.6 seconds to query a meager 5 VMs. That's just a 1.5GHz VIA box, but I'll have to see how long it takes for 100 VMs or more.

Documentation??
What I'm missing is some document that shows all the nodes in the xenstore that are readable. I've poked around a lot already, but the statistics are hiding from me.
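
To make the per-domU items above concrete, here is a minimal sketch of the arithmetic only, not of any Xen API. It assumes the agent can obtain a cumulative CPU-seconds counter per domain (something like the Time(s) column of xm list) plus the current and maximum memory in MiB. The function names and the numbers in the usage note are hypothetical.

```python
def cpu_percent(cpu_seconds_before, cpu_seconds_after, interval_seconds, host_cpus=1):
    """Percent of total host CPU time a domain used between two samples.

    Assumption: both samples come from a cumulative CPU-seconds counter
    for the same domain, taken interval_seconds apart.
    """
    if interval_seconds <= 0 or host_cpus <= 0:
        raise ValueError("interval_seconds and host_cpus must be positive")
    delta = cpu_seconds_after - cpu_seconds_before
    if delta < 0:
        # Counter went backwards, e.g. the domain was restarted between samples.
        return None
    return 100.0 * delta / (interval_seconds * host_cpus)


def ballooned_mib(maxmem_mib, current_mib):
    """Memory currently ballooned out of a domU in MiB (never negative)."""
    return max(0, maxmem_mib - current_mib)
```

With a one-minute check interval, cpu_percent(120.0, 150.0, 60.0, host_cpus=2) gives 25.0, and ballooned_mib(1024, 768) gives 256. The agent only has to remember the previous counter value per domain between runs, so nothing extra needs to be queried.
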
Also, I would like to use something that works in both xen4 and xen3. But that's not mandatory; I can fall back from xl to xenstore-read to xm to libvirt.

Why you might want to help:
Using check_mk you can pull off all kinds of crazy stuff with the data it collects:

* trend analysis on disk usage ("simple" example: get an alert if your vm store is growing at a rate that will let it run out of space in 3 days)
* if somebody feels they need it, use the block IO rates to trigger an event handler that will put io & cpu caps on a VM. (hosters might love that :)

I think most of these features are not implemented in any nagios checks so far.

If I just hack it in ksh, it *will work*, but be ugly and slow :)
And of course you won't have to bother with any config files to add a VM!

Maybe someone likes xenstore *a lot* and can point me at the right spots.

Florian

p.s.: could interested parties consider spending a day to improve the xm list output? It may technically make sense that a VM created using xm new has no ID and no status instead of "-------", and that a VM that is running but didn't use CPU during the microsecond we queried it is shown as blocking. But this has made life harder for each and every xen user for 5 or 6 years now, and technical reasons really don't cut it if they turn information into worthless bytes. (I still feel you would get an "-r-----" state most of the time back in Xen2...)

-- 
the purpose of libvirt is to provide an abstraction layer hiding all xen features added since 2006 until they were finally understood and copied by the kvm devs.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel