[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [OSSTEST PATCH v2 07/41] schema: Provide indices for sg-report-flight
George Dunlap writes ("Re: [OSSTEST PATCH v2 07/41] schema: Provide indices for sg-report-flight"): > > > > On Jul 31, 2020, at 12:37 PM, Ian Jackson <ian.jackson@xxxxxxxxxxxxx> wrote: > > > > These indexes allow very fast lookup of "relevant" flights eg when > > trying to justify failures. > > > > In my ad-hoc test case, these indices (along with the subsequent > > changes to sg-report-flight and Executive.pm, reduce the runtime of > > sg-report-flight from 2-3ks (unacceptably long!) to as little as > > 5-7s seconds - a speedup of about 500x. > > > > (Getting the database snapshot may take a while first, but deploying > > this code should help with that too by reducing long-running > > transactions. Quoted perf timings are from snapshot acquisition.) > > > > Without these new indexes there may be a performance change from the > > query changes. I haven't benchmarked this so I am setting the schema > > updates to be Preparatory/Needed (ie, "Schema first" as > > schema/README.updates has it), to say that the index should be created > > before the new code is deployed. > > > > Testing: I have tested this series by creating experimental indices > > "trial_..." in the actual production instance. (Transactional DDL was > > very helpful with this.) I have verified with \d that schema update > > instructions in this commit generate indexes which are equivalent to > > the trial indices. > > > > Deployment: AFter these schema updates are applied, the trial indices > > are redundant duplicates and should be deleted. ... > > I have no idea if building an index on a LIKE is a good idea or not, but it > certainly seems to be useful, so: > > Reviewed-by: George Dunlap <george.dunlap@xxxxxxxxxx> Thanks. This is a thing called a "partial index", where the index only covers some subset of the rows. The subset is determined a condition on the row contents. Such an index can be a lot smaller than an index on the whole table and also avoids slowing down updates that don't match the index condition. The idea is that when the query contains a condition that matches the index condition, the query planner can use this small on-topic index instead of wading through something large and irrelevant. The query planner is not always very bright about what conditions are subsets of what other conditions, and it runs without seeing the contents of bind variables. So with LIKE, for example, it's generally necessary to precisely replicate the index condition in the queries. That's why some of the queries in this series have things like this: AND r$ri.name LIKE 'built\_revision\_%' AND r$ri.name = ? where the Perl code passes in 'built_revison_something'. I hope this explanation was interesting :-). Ian.
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |