Re: [Xen-devel] [PATCH v3 0/2][XTF] xtf/vpmu VPMU tests
On 28/04/17 20:35, Mohit Gambhir wrote:
>
> On 04/25/2017 02:50 PM, Andrew Cooper wrote:
>> On 24/04/17 18:54, Mohit Gambhir wrote:
>>> Mohit Gambhir (2):
>>>   xtf/vpmu: Add Intel PMU MSR addresses
>>>   xtf/vpmu: MSR read/write tests for VPMU
>>>
>>>  arch/x86/include/arch/msr-index.h |  11 +
>>>  tests/vpmu/Makefile               |   9 +
>>>  tests/vpmu/main.c                 | 504 ++++++++++++++++++++++++++++++++++++++
>>>  3 files changed, 524 insertions(+)
>>>  create mode 100644 tests/vpmu/Makefile
>>>  create mode 100644 tests/vpmu/main.c
>>>
>> Independently from the content of this series, how have you found XTF
>> to use?  Any issues or unexpected corner cases which can be improved?
>
> I think overall it was fairly easy to use and was very useful for the
> kind of testing we wanted to do for VPMU.  It helped me find 3 bugs -
> one hypervisor crash, one potential state leak and one case of missing
> functionality.  I didn't find any bugs/corner cases in XTF, but here
> are a few nice-to-have features that you can consider adding:
>
> 1. As said in responses to one of the comments, I think it would be
> nice to have some kind of "expected failure" return type that
> indicates that it is a known bug.  That's particularly useful if the
> person writing the tests is not a Xen developer.  You seem to have
> indicated that xtf_error() means that it is a known error, but that
> wasn't intuitive to me.

I clearly need to improve the documentation in this area.

The terminology is inherited from XenRT (our internal test system) and
learns from 10 years of lessons on how to sensibly report results in a
clear and concise fashion.

SUCCESS means "everything worked correctly".  FAILURE means "the test
positively identified that the subject under test behaved incorrectly".
ERROR means "something unexpected went wrong".  SKIP means "the test
didn't run", usually because the preconditions were not met.

The distinction between failure and error is important when triaging
results (and indeed for the priority order of investigation), but in the
end it still means that a human is going to have to look at the logs and
either fix a product bug/regression, or fix whatever is causing the
unexpected nature of the problem (which itself is either a testcase fix,
or more likely a testcase and product fix).

As for "expected failure", there is no such thing.  If your test case
has identified that Xen is malfunctioning, then that is a straight
failure.  Having any other kind of "it's broken, but pretend everything
went successfully" result is a sure-fire way for the problem to be
ignored forever more once it becomes the status quo in regression tests.

My method thus far is to fix all parts of the problem in Xen, then
introduce the complete XTF test which passes on master (but typically
will fail on older branches, and OSSTest takes care of relegating those
to its idea of "expected failure").  I accept that this is harder to do
for larger areas of functionality, but there is no point running a
regression test if we are expecting it not to pass, because it won't
identify other regressions which might slip in.  However, this is
predicated on getting master sorted before the test gets run in a
regression capacity.

(I am facing a similar dilemma with the nested-virt tests, which is why
I am leaning towards having a new category for tests or areas of
functionality which are in development and aren't yet ready to be part
of the regression tests.)
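To make the four result types above concrete, here is a minimal sketch
of how a test body might map onto them.  It is not taken from this
series: vpmu_available() and check_global_ctrl() are invented stand-ins
(stubbed so the example is self-contained), and only xtf_error() is
named above - xtf_skip(), xtf_failure() and xtf_success() are assumed
to be its siblings in the framework.

    #include <xtf.h>

    /* Hypothetical stand-ins for real test logic. */
    static bool vpmu_available(void)    { return true; }
    static int  check_global_ctrl(void) { return 0; }

    void test_main(void)
    {
        if ( !vpmu_available() )
        {
            /* SKIP: preconditions not met - the test didn't run. */
            xtf_skip("Skip: vPMU not advertised to this guest\n");
            return;
        }

        int rc = check_global_ctrl();

        if ( rc < 0 )
            /* ERROR: something unexpected went wrong. */
            xtf_error("Error: check_global_ctrl() returned %d\n", rc);
        else if ( rc > 0 )
            /* FAILURE: the subject under test misbehaved. */
            xtf_failure("Fail: GLOBAL_CTRL has an unexpected value\n");
        else
            /* SUCCESS: everything worked correctly. */
            xtf_success("Success: vPMU MSR behaviour as expected\n");
    }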
> 2. It's useful to print the __LINE__ number from where an
> error/failure is reported.

Is it?

I'm an advocate of the principle that if the message text contains
sufficient information to be considered useful, a __func/FILE/LINE__
reference isn't necessary.  (Quality of error message text is a frequent
review point of mine, with a maintainer's hat on.)

Nothing stops you formatting __LINE__ into the message, but I would
suggest that probably means your text could be better.
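As a hedged illustration of that point, compare the two fragments below
(the MSR name, the expectation and the in-scope variable val are made
up, not taken from the series):

    /* A location-only report: the reader still has to open the source
     * file to learn what actually went wrong. */
    xtf_failure("Fail: %s:%d\n", __FILE__, __LINE__);

    /* A descriptive report: the text alone says which operation
     * misbehaved and how, so a line number adds nothing. */
    xtf_failure("Fail: rdmsr(IA32_PMC0) got %#lx, expected 0\n", val);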
> 3. Is test_main() considered 1 test?  If so, it would be useful to
> define a test at a finer granularity.  For instance, in the VPMU case
> each MSR test should really be a separate test.  It would require
> adding some semantics to the names of the functions that define a
> test.  That way, it's easier to keep track of the health of the
> software, I think.

There is a tradeoff with the granularity of reporting.  It is a
conscious design choice that a single microkernel's worth of test logic
has a single net result (again, inherited from XenRT's eventual lessons
learnt from how to effectively deal with tens of thousands of machine
hours of tests per night, and hundreds of thousands of logical tests).

The common case for automation is "run all $N thousand XTF tests, and
raise tickets for any failures".  It is not useful for the automation
test framework to have to deal in terms of individual test steps' worth
of success/failure inside a single logical test.

On the other hand, when any non-success is encountered, a developer is
going to have to intervene (at which point the quality of error
information is important), but the test is designed to be easy and quick
to run in isolation, so investigating a single failure inside something
like this larger vpmu test shouldn't be difficult.
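As a hedged sketch of how that plays out in practice (the helper names
and MSR constants below are invented, not the ones in this series, and
rdmsr() is assumed to be the usual framework helper), each step can
fail with its own descriptive message while the microkernel as a whole
still resolves to one net result:

    #include <xtf.h>

    /* Hypothetical per-MSR checks. */
    static void test_global_ctrl(void)
    {
        uint64_t val = rdmsr(MSR_CORE_PERF_GLOBAL_CTRL);

        if ( val != 0 )
            xtf_failure("Fail: GLOBAL_CTRL expected 0, got %#lx\n", val);
    }

    static void test_fixed_ctr_ctrl(void)
    {
        uint64_t val = rdmsr(MSR_CORE_PERF_FIXED_CTR_CTRL);

        if ( val != 0 )
            xtf_failure("Fail: FIXED_CTR_CTRL expected 0, got %#lx\n", val);
    }

    void test_main(void)
    {
        /* Each step reports its own descriptive failures... */
        test_global_ctrl();
        test_fixed_ctr_ctrl();

        /* ...but, as described above, everything reported is reduced
         * to a single net result for the automation layer. */
        xtf_success("Success: all vPMU MSR checks passed\n");
    }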
> 4. Is there a way to build just one test?  Something like
> make test-hvm64-vpmu?

No, but I've been meaning to rewrite the build system to be less bad
than it currently is.  For now, the fact that total compilation of
everything is only a few seconds means that it isn't terribly high on my
priority list.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel