Comment 9 for bug 1438136

Revision history for this message
Mike Rylander (mrylander) wrote :

No problem.

I'm working with a comparable data set based on your estimates and actual costs, so I think I have a good handle on it now. What I'm seeing is that the statistics for the array are actually really good, but the planner is not, for some reason, noticing that. For instance:

# explain analyze select count(*) from metabib.record_attr_vector_list where vlist @@ '(610)';
                                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate (cost=3809.32..3809.33 rows=1 width=0) (actual time=902.652..902.652 rows=1 loops=1)
   -> Bitmap Heap Scan on record_attr_vector_list (cost=38.79..3803.90 rows=2167 width=0) (actual time=342.924..724.453 rows=1268532 loops=1)
         Recheck Cond: (vlist @@ '610'::query_int)
         -> Bitmap Index Scan on mrca_vlist_idx (cost=0.00..38.25 rows=2167 width=0) (actual time=333.671..333.671 rows=1486582 loops=1)
               Index Cond: (vlist @@ '610'::query_int)
 Total runtime: 902.712 ms
(6 rows)

Even with no constraining join conditions, it still wants to use a bitmap index scan. However, the stats say that the value 610 is in is in approximately 73% of the rows in that table:

# select row_number from (select unnest, row_number() over () from (select unnest(most_common_elems::text::int[]) from pg_stats where tablename = 'record_attr_vector_list' and attname = 'vlist')x)y where unnest=610;
 row_number
------------
        326
(1 row)

# select most_common_elem_freqs[326] from pg_stats where tablename = 'record_attr_vector_list' and attname = 'vlist'; most_common_elem_freqs
------------------------
                 0.7364
(1 row)

So, either it intarray-supplied operators are not using the stats, or Pg has a bug or deficiency. Either way, I'm continuing to look for a way to construct the query such that the stats are used.

More as the story develops...