Re: Proposal - lightweight auto test binaries
On 13/01/2021 18:02, Marek Marczykowski-Górecki wrote:
> On Wed, Jan 13, 2021 at 03:45:10PM +0000, Andrew Cooper wrote:
>> On 12/01/2021 17:35, Ian Jackson wrote:
>>>  * When appropriately configured, the xen.git build system
>>>    will ship them into dist/install/usr/local/....
>>>
>>>  * There will be a specific directory containing tests which
>>>    are suitable for running automatically:
>>>      dist/install/usr/local/lib/xen/autotests
>>>
>>>  * The pass/success reporting API will be encoded into the
>>>    filename.  For now we define only one API:
>>>      dist/install/usr/local/lib/xen/autotests/simple/*
>> Whatever we go for, it ought to be an easy path, and honestly - that's a
>> mouthful to get to.  These tests need to be just as easy for developers
>> to use.
>>
>> How about opt/xen-autotest/* ?  Much easier in a deb/rpm build to filter
>> the things which clearly shouldn't be in a production system.  It's also
>> invariant of libdir which is an advantage for downstreams who test on
>> more than one distro.
> I can see them being packaged as a separate sub-package instead of not
> packaging at all. And many distributions have rules forbidding native
> packages from putting stuff into /opt. So a location somewhere in /usr
> would be preferable.
> I'm fine with dist/install/usr/local/lib/xen/autotests.
> For running manually, there could be a trivial test runner (shell
> script) that could be installed into dist/install/usr/local/sbin - to
> make it even more convenient to access.

local, or not, is down to --prefix, but I do like the idea of having a
trivial test runner put into sbin for humans to use.

Automation such as OSSTest and XenRT will want to track tests
individually for reporting and/or regression purposes.  But humans
wanting to test their change want an easy "run everything and confirm
ok".  This is precisely why `./xtf-runner -aqq` exists.

>>>  * A test may exit with one of the XTF exit statuses:
>>>      0 # XTF SUCCESS
>>>      3 # XTF SKIP
>>>      4 # XTF ERROR
>>>      5 # XTF FAILURE
>>>      6 # XTF CRASH
>>>    (osstest will treat anything other than 0 and 3 as "fail")
>>>    Any other termination should be treated as CRASH.
>>>
>>>  * If a test exits nonzero it should print some information about
>>>    the situation to stderr.
>>>
>>>  * Tests may print other information to stderr, which will be captured
>>>    and logged.
>>>
>>>  * Tests *must not* print anything to stdout.  Test runners *must*
>>>    ignore stdout (though they may log it, possibly mixed in with
>>>    stderr).  stdout is reserved for future extension.
>> I disagree.  It is far more important that the tests are pleasant for
>> humans to use, both in terms of developing them to begin with, and
>> maintaining them if they go wrong.
> I can see why such a restriction would be useful for future extensions,
> but I don't think it's strictly necessary. An extension can for example
> require stdout lines with a specific prefix and ignore others.

If it comes down to it, we can define a new ABI if we need to make a
change, and can't find a compatible way of doing it.  But I don't see
this plausibly happening for "simple".

>>>  * Tests should exit with XTF ERROR if they are passed any arguments.
>> ./test-foo --debug-this-harder would be a perfectly reasonable thing for
>> a human to use to investigate problems, and it is what we use in the
>> fuzzing harnesses for investigating crashes.
> A data point: the libvirt test suite has the very same requirement - test
> binaries need to be run without arguments and error when they get one.
> In addition to the error, you get a help message with env variables
> to use instead (VIR_TEST_DEBUG=1 for example). And I do run it with some
> dummy argument, to get this help message, because I can't remember how
> those variables are named. Every single time.
> But TBH, if the parameters weren't very obvious (like --verbose, or
> --debug), I'd run --help first too.
>
> In any case, unknown command line arguments should result in an error.

It's somewhat rude to lump every test with needing to use getopt, but I
suppose we ought to do something.
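For concreteness, a minimal sketch of what a test under the "simple" ABI
could look like, assuming only the exit codes, the no-arguments rule and
the stderr convention discussed above - the feature probes are made-up
placeholders, not a real Xen interface:

    /* Sketch of a "simple" autotest.  Only the exit codes and the
     * stderr rule come from the proposal; the probes are placeholders. */
    #include <stdio.h>

    #define XTF_SUCCESS 0
    #define XTF_SKIP    3
    #define XTF_ERROR   4
    #define XTF_FAILURE 5
    #define XTF_CRASH   6

    int main(int argc, char **argv)
    {
        int feature_present, feature_works;

        /* Arguments are not part of the "simple" ABI - reject them. */
        if ( argc > 1 )
        {
            fprintf(stderr, "%s: unexpected argument '%s'\n",
                    argv[0], argv[1]);
            return XTF_ERROR;
        }

        /* Placeholder probes - a real test would query Xen here. */
        feature_present = 1;
        feature_works   = 1;

        if ( !feature_present )
        {
            fprintf(stderr, "Feature not available - skipping\n");
            return XTF_SKIP;
        }

        if ( !feature_works )
        {
            fprintf(stderr, "Feature present but misbehaving\n");
            return XTF_FAILURE;
        }

        return XTF_SUCCESS;
    }

Everything diagnostic goes to stderr, stdout stays untouched, and any
argument results in XTF ERROR, as proposed.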
>> The problem is "for these configurations".  Even at the totally basic
>> level,
>>
>>  * PV and Shadow are conditional on Xen's Kconfig.
>>  * PV32 is further conditional on command line settings, and/or whether
>>    the hardware supports CET-SS.
>>  * HVM is dependent on Kconfig, hardware, firmware and command line
>>    settings.  HAP similarly.
>>
>> `xl create` doesn't handle missing CONFIG_SHADOW, or PV32 being disabled
>> cleanly, despite having suitable information available in `xl info`.
>> While this can (and should) be fixed, it's not helpful for the more
>> general testing case.
> It indeed is an issue, but in practice this can be worked out
> incrementally - when the test fails, see if the reason is an unsupported
> (by the test) configuration. And if so - add a condition for this
> specific case (to return SKIP or choose another configuration). Rinse
> and repeat.

That isn't robust.  For the examples given, they all yield EOPNOTSUPP,
and from that alone, you can't distinguish "Xen really can't do $X" from
"Someone regressed Xen's logic", nor can you distinguish "shadow
unavailable" from "HVM unavailable" like this.

Really, what you want is:

    if ( xen claims feature $X )
        success = ("test $X" == SUCCESS)
    else
        success = ("test $X" == EOPNOTSUPP)

Because that will allow you to spot "someone broke the domain
construction logic around $X".

Combined with the orchestration framework's idea of "what should Xen be
able to do", you also have:

    success = ("xen claims $X" == "xen ought to claim $X")

which allows you to spot "someone broke Xen's ability to spot/enable $X
on boot".

Other similar testable areas here are:

    foreach interesting CPUID bit:
        if ( bit supported in guest )
            success = ("vm configured with $X" == "$X visible in guest") &&
                      ("vm configured without $X" == "$X not visible in guest")
        else
            success = "vm configured with $X" rejected during construction

There is absolutely a time and a place for the simple tests, and there
is a lot of value in having them, but a small piece of C isn't capable
of doing this higher level reasoning.

Reasoning this fine grained *is* important given the sheer complexity of
virtualisation, and I fully intend that the "simple", and this
"not-quite-so-simple", set of autotests will be sufficiently quick as to
be minutes' worth of testing, and live in the Gitlab/patchew pre-commit
testing.

~Andrew

P.S. if it's not already clear by now, I've spent a lot of time trying
to figure out how to turn "something, somewhere, went wrong" bug reports
into something more useful for a test robot to report, at the point that
it has figured out that something went wrong.
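As a rough illustration of the cross-check described above, the sketch
below mirrors the "if ( xen claims feature $X )" pseudocode only.
xen_claims() and try_create_guest() are hypothetical stubs (filled in so
the example compiles), standing in for "does Xen advertise $X" and
"attempt to construct a guest using $X"; they are not a real xl/libxl
API.

    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    enum outcome { OUTCOME_SUCCESS, OUTCOME_EOPNOTSUPP, OUTCOME_OTHER };

    /* Stub: pretend Xen advertises everything except PV32. */
    static bool xen_claims(const char *feature)
    {
        return strcmp(feature, "pv32") != 0;
    }

    /* Stub: pretend construction succeeds exactly when Xen claims the
     * feature; a real harness would attempt the build via the toolstack. */
    static enum outcome try_create_guest(const char *feature)
    {
        return xen_claims(feature) ? OUTCOME_SUCCESS : OUTCOME_EOPNOTSUPP;
    }

    static bool check_feature(const char *feature)
    {
        enum outcome res = try_create_guest(feature);
        bool ok = xen_claims(feature) ? (res == OUTCOME_SUCCESS)
                                      : (res == OUTCOME_EOPNOTSUPP);

        if ( !ok )
            fprintf(stderr, "Domain construction around '%s' looks broken\n",
                    feature);

        return ok;
    }

    int main(void)
    {
        static const char *const features[] =
            { "pv", "pv32", "hvm", "shadow", "hap" };
        bool all_ok = true;
        unsigned int i;

        for ( i = 0; i < sizeof(features) / sizeof(features[0]); i++ )
            all_ok &= check_feature(features[i]);

        return all_ok ? 0 : 5; /* XTF SUCCESS / XTF FAILURE */
    }

The point is the decision logic: a clean EOPNOTSUPP is only a pass when
Xen itself says the feature is unavailable, which is what lets a harness
spot a regression in the domain construction path rather than silently
skipping it.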