Re: [Xen-devel] [Notes for xen summit 2018 design session] Process changes: is the 6 monthly release Cadence too short, Security Process, ...
> On Jul 5, 2018, at 12:16 PM, Ian Jackson <ian.jackson@xxxxxxxxxx> wrote:
>
> Juergen Gross writes ("Re: [Xen-devel] [Notes for xen summit 2018 design
> session] Process changes: is the 6 monthly release Cadence too short,
> Security Process, ..."):
>> We didn't look at the sporadic failing tests thoroughly enough. The
>> hypercall buffer failure has been there for ages, a newer kernel just
>> made it more probable. This would have saved us some weeks.
>
> In general, as a community, we are very bad at this kind of thing.
>
> In my experience, the development community is not really interested
> in fixing bugs which aren't directly in their way.
>
> You can observe this easily in the way that regressions in Linux,
> spotted by osstest, are handled. Linux 4.9 has been broken for 43
> days. Linux mainline is broken too.
>
> We do not have a team of people reading these test reports, and
> chasing developers to fix them. I certainly do not have time to do
> this triage. On trees where osstest failures do not block
> development, things go unfixed for weeks, sometimes months.
>
> And overall my gut feeling is that tests which fail intermittently are
> usually blamed (even if this is not stated explicitly) on problems
> with osstest or with our test infrastructure. It is easy for
> developers to think this because if they wait, the test will get
> "lucky", and pass, and so there will be a push and the developers can
> carry on.
>
> I have a vague plan to sit down and think about how osstest's
> results analysers could respond better to intermittent failures.
> If I can, I would like intermittent failures to block pushes. That
> would at least help address the problem of heisenbugs (which are often
> actually quite serious issues) not being taken seriously.
>
> I would love to hear suggestions for how to get people to actually fix
> test failures in trees not maintained by the Xen Project and therefore
> not gated by osstest.

Well at the moment, investigation is ad-hoc. Basically everyone has to look to see *whether* there’s been a failure, and it’s nobody’s job in particular to try to chase it down to find out what it might be.

If we had a team, we could have a robot rotate between the teams to nominate one particular person per failure to take a look at the result and at least try to classify it, maybe try to find the appropriate person who may be able to take a deeper look.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel