[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [OSSTEST PATCH 14/14] duration_estimator: Move duration query loop into database
> On Jul 21, 2020, at 7:42 PM, Ian Jackson <ian.jackson@xxxxxxxxxxxxx> wrote: > > Stuff the two queries together: we use the firsty query as a WITH > clause. This is significantly faster, perhaps because the query > optimiser does a better job but probably just because it saves on > round trips. > > No functional change. > > Perf: subjectively this seemed to help when the cache was cold. Now I > have a warm cache and it doesn't seem to make much difference. > > Perf: runtime of my test case now ~5-7s. > > Example queries before (from the debugging output): > > Query A part I: > > SELECT f.flight AS flight, > j.job AS job, > f.started AS started, > j.status AS status > FROM flights f > JOIN jobs j USING (flight) > JOIN runvars r > ON f.flight=r.flight > AND r.name=? > WHERE j.job=r.job Did these last two get mixed up? My limited experience w/ JOIN ON and WHERE would lead me to expect we’re joining on `f.flight=r.flight and r.job = j.job`, and having `r.name = ?` as part of the WHERE clause. I see it’s the same in the combined query as well. > AND f.blessing=? > AND f.branch=? > AND j.job=? > AND r.val=? > AND (j.status='pass' OR j.status='fail' > OR j.status='truncated'!) > AND f.started IS NOT NULL > AND f.started >= ? > ORDER BY f.started DESC > > With bind variables: > "test-amd64-i386-xl-pvshim" > "guest-start" > > Query B part I: > > SELECT f.flight AS flight, > s.job AS job, > NULL as started, > NULL as status, > max(s.finished) AS max_finished > FROM steps s JOIN flights f > ON s.flight=f.flight > WHERE s.job=? AND f.blessing=? AND f.branch=? > AND s.finished IS NOT NULL > AND f.started IS NOT NULL > AND f.started >= ? > GROUP BY f.flight, s.job > ORDER BY max_finished DESC > > With bind variables: > "test-armhf-armhf-libvirt" > 'real' > "xen-unstable" > 1594144469 > > Query common part II: > > WITH tsteps AS > ( > SELECT * > FROM steps > WHERE flight=? AND job=? > ) > , tsteps2 AS > ( > SELECT * > FROM tsteps > WHERE finished <= > (SELECT finished > FROM tsteps > WHERE tsteps.testid = ?) > ) > SELECT ( > SELECT max(finished)-min(started) > FROM tsteps2 > ) - ( > SELECT sum(finished-started) > FROM tsteps2 > WHERE step = 'ts-hosts-allocate' > ) > AS duration Er, wait — you were doing a separate `duration` query for each row of the previous query? Yeah, that sounds like it could be a lot of round trips. :-) > > With bind variables from previous query, eg: > 152045 > "test-armhf-armhf-libvirt" > "guest-start.2" > > After: > > Query A (combined): > > WITH f AS ( > SELECT f.flight AS flight, > j.job AS job, > f.started AS started, > j.status AS status > FROM flights f > JOIN jobs j USING (flight) > JOIN runvars r > ON f.flight=r.flight > AND r.name=? > WHERE j.job=r.job > AND f.blessing=? > AND f.branch=? > AND j.job=? > AND r.val=? > AND (j.status='pass' OR j.status='fail' > OR j.status='truncated'!) > AND f.started IS NOT NULL > AND f.started >= ? > ORDER BY f.started DESC > > ) > SELECT flight, max_finished, job, started, status, > ( > WITH tsteps AS > ( > SELECT * > FROM steps > WHERE flight=f.flight AND job=f.job > ) > , tsteps2 AS > ( > SELECT * > FROM tsteps > WHERE finished <= > (SELECT finished > FROM tsteps > WHERE tsteps.testid = ?) > ) > SELECT ( > SELECT max(finished)-min(started) > FROM tsteps2 > ) - ( > SELECT sum(finished-started) > FROM tsteps2 > WHERE step = 'ts-hosts-allocate' > ) > AS duration > > ) FROM f I mean, in both queries (A and B), the transform should basically result in the same thing happening, as far as I can tell. I can try to analyze the duration query and see if I can come up with any suggestions, but that would be a different patch anyway. -George
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |