A small optimization relevant to the 2.0.6 release that George is running: the staff-side copy visibility check includes the following query:
PERFORM 1
  FROM asset.call_number cn
  JOIN asset.copy cp ON (cp.call_number = cn.id)
  JOIN actor.org_unit a ON (cp.circ_lib = a.id)
  WHERE NOT cn.deleted
    AND NOT cp.deleted
    AND cp.circ_lib IN ( SELECT * FROM search.explode_array( search_org_list ) )
    AND cn.record IN ( SELECT * FROM search.explode_array( core_result.records ) )
  LIMIT 1;
Not only is the join to actor.org_unit unnecessary, it also has a bad effect on the query plan. Compare:
evergreen=# explain SELECT 1
evergreen-# FROM asset.call_number cn
evergreen-# JOIN asset.copy cp ON (cp.call_number = cn.id)
evergreen-# JOIN actor.org_unit a ON (cp.circ_lib = a.id)
evergreen-# WHERE NOT cn.deleted
evergreen-# AND NOT cp.deleted
evergreen-# AND cp.circ_lib IN ( SELECT * FROM search.explode_array( ARRAY[1, 2] ) )
evergreen-# AND cn.record IN ( SELECT * FROM search.explode_array( ARRAY[2, 3] ) )
evergreen-# LIMIT 1;
                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Limit  (cost=20.88..82.78 rows=1 width=0)
   ->  Nested Loop Semi Join  (cost=20.88..1569722.78 rows=25360 width=0)
         Join Filter: (cn.record = ('{2,3}'::integer[])[s.s])
         ->  Nested Loop  (cost=20.88..301722.78 rows=50720 width=8)
               ->  Hash Join  (cost=20.88..25198.05 rows=50720 width=8)
                     Hash Cond: (cp.circ_lib = a.id)
                     ->  Append  (cost=0.00..21833.43 rows=751401 width=12)
                           ->  Seq Scan on copy cp  (cost=0.00..21821.73 rows=751316 width=12)
                                 Filter: (NOT deleted)
                           ->  Seq Scan on unit cp  (cost=0.00..11.70 rows=85 width=12)
                                 Filter: (NOT deleted)
                     ->  Hash  (cost=20.71..20.71 rows=14 width=8)
                           ->  Hash Join  (cost=16.61..20.71 rows=14 width=8)
                                 Hash Cond: (('{1,2}'::integer[])[s.s] = a.id)
                                 ->  HashAggregate  (cost=15.00..17.00 rows=200 width=4)
                                       ->  Function Scan on generate_series s  (cost=0.00..12.50 rows=1000 width=4)
                                 ->  Hash  (cost=1.27..1.27 rows=27 width=4)
                                       ->  Seq Scan on org_unit a  (cost=0.00..1.27 rows=27 width=4)
               ->  Index Scan using call_number_pkey on call_number cn  (cost=0.00..5.44 rows=1 width=16)
                     Index Cond: (cn.id = cp.call_number)
                     Filter: (NOT cn.deleted)
         ->  Function Scan on generate_series s  (cost=0.00..12.50 rows=1000 width=4)
(22 rows)
with
evergreen=# explain SELECT 1
evergreen-# FROM asset.call_number cn
evergreen-# JOIN asset.copy cp ON (cp.call_number = cn.id)
evergreen-# WHERE NOT cn.deleted
evergreen-# AND NOT cp.deleted
evergreen-# AND cp.circ_lib IN ( SELECT * FROM search.explode_array( ARRAY[1, 2] ) )
evergreen-# AND cn.record IN ( SELECT * FROM search.explode_array( ARRAY[2, 3] ) )
evergreen-# LIMIT 1;
                                                         QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=15.00..66.58 rows=1 width=0)
   ->  Nested Loop Semi Join  (cost=15.00..9688659.75 rows=187850 width=0)
         Join Filter: (cp.circ_lib = ('{1,2}'::integer[])[s.s])
         ->  Nested Loop  (cost=15.00..4138521.12 rows=375701 width=4)
               Join Filter: (cn.id = cp.call_number)
               ->  Nested Loop  (cost=15.00..4511.81 rows=300396 width=8)
                     ->  HashAggregate  (cost=15.00..17.00 rows=200 width=4)
                           ->  Function Scan on generate_series s  (cost=0.00..12.50 rows=1000 width=4)
                     ->  Index Scan using asset_call_number_record_idx on call_number cn  (cost=0.00..22.41 rows=5 width=16)
                           Index Cond: (cn.record = ('{2,3}'::integer[])[s.s])
                           Filter: (NOT cn.deleted)
               ->  Append  (cost=0.00..13.70 rows=5 width=12)
                     ->  Index Scan using cp_avail_cn_idx on copy cp  (cost=0.00..11.43 rows=4 width=12)
                           Index Cond: (cp.call_number = cn.id)
                           Filter: (NOT cp.deleted)
                     ->  Index Scan using unit_avail_cn_idx on unit cp  (cost=0.00..2.27 rows=1 width=12)
                           Index Cond: (cp.call_number = cn.id)
                           Filter: (NOT cp.deleted)
         ->  Function Scan on generate_series s  (cost=0.00..12.50 rows=1000 width=4)
(19 rows)
This issue is resolved in 2.1+ by the changes to query_parser_fts made as part of the multi-homed support, but I'll push a fix for rel_2_0. Obviously this micro-optimization, though significant in George's particular situation and possibly for other 2.0.x users, doesn't address the "make searching more and more faster" issue in the general case.
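For clarity, a sketch of what the rel_2_0 fix presumably amounts to: the same visibility check with the superfluous actor.org_unit join dropped, which is exactly the shape of the second EXPLAIN above. (search_org_list and core_result.records are the surrounding PL/pgSQL function's variables, as in the original.)

```sql
-- Sketch only: the visibility check without the actor.org_unit join.
-- cp.circ_lib is already constrained by the IN list, so the join
-- added no filtering, only planner noise.
PERFORM 1
  FROM asset.call_number cn
  JOIN asset.copy cp ON (cp.call_number = cn.id)
  WHERE NOT cn.deleted
    AND NOT cp.deleted
    AND cp.circ_lib IN ( SELECT * FROM search.explode_array( search_org_list ) )
    AND cn.record IN ( SELECT * FROM search.explode_array( core_result.records ) )
  LIMIT 1;
```

With the join gone, the planner can drive the lookup from asset_call_number_record_idx and the per-call-number copy indexes rather than seq-scanning asset.copy, which is what the second plan shows.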