https://bugs.launchpad.net/ubuntu/+filebug-show-similar
Referrer: https://bugs.launchpad.net/ubuntu/+filebug/?no-redirect
is timing out on edge and production, a lot
duplicate detection uses an | clause between terms which leads to very bad selectivity - 200000 or more bugs are selected, which takes time to process. The goal is to show relevant bugs specifically allowing users to put in unique-and-missing terms and have it still hit.
To QA this, just go to https://bugs.staging.launchpad.net/ubuntu/+filebug and do a number of searches. They should all complete in a few seconds.
Note that the default timeout on staging is too low for it to work at all as yet - but thats due to the index being big and staging being small: raise the timeout to 25 seconds, and then you can seed the index into memory and subsequent queries will perform well.
https:/ /lp-oops. canonical. com/oops. py/?oopsid= 1662A16
SQL time: 17030 ms
Non-sql time: 816 ms
Total time: 17846 ms
Statement Count: 28
https:/ /bugs.launchpad .net/ubuntu/ +filebug
The query in that oops is: distribution = 1;
SELECT COUNT(CASE WHEN Bug.fti @@ ftq('sphinx') THEN TRUE ELSE null END) FROM Bug, BugTask WHERE BugTask.bug = Bug.id AND BugTask.
which is terrible
AFAIK its equivalent to distribution = 1;
SELECT COUNT(*) from Bug, bugtask where bug.fti @@ ftq('sphinx') and BugTask.bug = Bug.id AND BugTask.
which is tolerable
lpmain_staging=> explain analyse SELECT COUNT(*) from Bug, bugtask where bug.fti @@ ftq('sphinx') and BugTask.bug = Bug.id AND BugTask. distribution = 1;
QUERY PLAN ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ---- 74..5385. 75 rows=1 width=0) (actual time=481. 898..481. 899 rows=1 loops=1) 93..5384. 66 rows=432 width=0) (actual time=333. 412..481. 830 rows=40 loops=1) 93..1841. 57 rows=600 width=4) (actual time=333. 366..480. 218 rows=101 loops=1)
Filter: ((fti)::tsvector @@ '''sphinx' ''::tsquery) 701..323. 701 rows=10819 loops=1)
Index Cond: ((fti)::tsvector @@ '''sphinx' ''::tsquery)
Index Cond: (bugtask.bug = bug.id)
Filter: (bugtask. distribution = 1)
-------
Aggregate (cost=5385.
-> Nested Loop (cost=110.
-> Bitmap Heap Scan on bug (cost=110.
-> Bitmap Index Scan on bug_fti (cost=0.00..110.78 rows=600 width=0) (actual time=323.
-> Index Scan using bugtask__bug__idx on bugtask (cost=0.00..5.89 rows=1 width=4) (actual time=0.012..0.013 rows=0 loops=101)
Total runtime: 482.213 ms
(10 rows)
Time: 489.949 ms distribution = 1;
QUERY PLAN ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ----- 13..281594. 14 rows=1 width=386) (actual time=9579. 732..9579. 733 rows=1 loops=1) 09..280513. 73 rows=432158 width=386) (actual time=3051. 529..6739. 112 rows=436649 loops=1) 00..37077. 68 rows=432158 width=4) (actual time=0.031..562.978 rows=436649 loops=1)
Filter: (distribution = 1) 93..163550. 93 rows=599693 width=390) (actual time=3051. 151..3051. 151 rows=599696 loops=1) 00..163550. 93 rows=599693 width=390) (actual time=0. 010..1249. 606 rows=599696 loops=1)
lpmain_staging=> explain analyze SELECT COUNT(CASE WHEN Bug.fti @@ ftq('sphinx') THEN TRUE ELSE null END) FROM Bug, BugTask WHERE BugTask.bug = Bug.id AND BugTask.
-------
Aggregate (cost=281594.
-> Hash Join (cost=201501.
Hash Cond: (bugtask.bug = bug.id)
-> Seq Scan on bugtask (cost=0.
-> Hash (cost=163550.
-> Seq Scan on bug (cost=0.
Total runtime: 9579.850 ms
(8 rows)
Time: 9582.806 ms