authority indexes can fail on Postgres 9.3.0

Bug #1253163 reported by Galen Charlton
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Fix Released

Bug Description

We have observed that running a REINDEX TABLE on authority.record_entry can fail when running Evergreen on PostgreSQL 9.3.0:

evergreen=# select count(*) from authority.record_entry;
(1 row)

evergreen=# reindex table authority.record_entry ;
ERROR: could not read block 0 in file "base/16385/82148": read only 0 of 8192 bytes
CONTEXT: SQL statement "SELECT control_set FROM authority.record_entry WHERE id = auth_id"
PL/pgSQL function authority.normalize_heading(text,boolean) line 14 at SQL statement

The by_heading* indexes on authority.record_entry are the stock, non-unique BTREE ones.

Steps were taken to confirm that this wasn't simply index corruption specific to the test database, including dropping and recreating the by_heading* indexes, dropping and recreating both the normal_heading() functions and the indexes, and starting with a completely fresh 9.3 cluster. In all cases, 'reindex table authority.record_entry' would fail.

This problem is not observed with Postgres 9.1 and 9.2. Preliminary testing on our end suggests that converting the indexes to either hash or GIST avoids whatever seems to be wrong with the BTREE version of the indexes.

This is most probably a PostgreSQL bug, but I'm opening an Evergreen LP as well.

Evergreen 2.3, 2.4, and 2.5
PostgreSQL 9.3.0
Debian Wheezy

Galen Charlton (gmc)
tags: added: postgresql
Revision history for this message
Chris Sharp (chrissharp123) wrote :


I can confirm the issue on a server running a recent version of master.

Evergreen master
PostgreSQL 9.3.1
Ubuntu 12.04 LTS

Changed in evergreen:
status: New → Confirmed
Revision history for this message
Mike Rylander (mrylander) wrote :

While this needs to be reported up-stream to the Pg team, I think we should address it locally as well, in case the fix is a while coming. I see two options, each with a tradeoff:

  1) Switch to pgtrm-based gist or gin indexes. We lose the unique check -- so sayeth the pg docs: Only B-tree currently supports unique indexes. -- but we can do that other ways (before trigger, perhaps an exclusion constraint). For example, after "create extension pg_trgm;":
    - drop index authority.unique_by_heading_and_thesaurus; create index by_heading_and_thesaurus on authority.record_entry using gist (authority.normalize_heading(marc) gist_trgm_ops) WHERE deleted IS FALSE OR deleted = false;
    - drop index authority.by_heading; create index by_heading on authority.record_entry using gist (authority.simple_normalize_heading(marc) gist_trgm_ops) WHERE deleted IS FALSE OR deleted = false;

  2) Add two new columns to authority.record_entry to contain the two heading variants we need. Populate them at ingest time in a BEFORE trigger. Create the indexes we want directly on the column data.

(1) has the benefit of using less space, and hewing closer to how things work today -- less code and schema churn.
(2) has the benefit of no new deps (pg_trgm extention) and the the availability of the heading strings for other purposes -- though I can think of none right now -- and arguably simpler schema and query construction.


Yamil (ysuarez)
tags: added: authority
Changed in evergreen:
assignee: nobody → Mike Rylander (mrylander)
Revision history for this message
Mike Rylander (mrylander) wrote :
tags: added: pullrequest
Changed in evergreen:
assignee: Mike Rylander (mrylander) → nobody
milestone: none → 2.6.0-rc1
importance: Undecided → Critical
Revision history for this message
Mike Rylander (mrylander) wrote :

I've updated this branch after testing the upgrade script. Some obvious copy/paste-o's in there.

Dan Wells (dbw2)
tags: added: 2.6-rc-blocker
Revision history for this message
Dan Wells (dbw2) wrote :

I found this recent bug report which seemed relevant and interesting:

According to the reply there, this isn't really a PG bug, since we are lying about the immutability of this function when we do the select inside it.

I wonder, since the select on the same table is the one which dies, if we might be able to get around this by passing in the control_set value (which we certainly have in the context of this index creation). We'd still be lying, since we do other selects in there, but we'd be lying less, and it might be an alternate means of solving the problem.

On the other hand, what Mike has done here is more correct and foolproof, and is along the lines of what is suggested in the thread. If we go with these changes (and I'm expecting we will), should we also drop the fake immutability of this function? I can see that biting us in cases where we attempt to change authority configurations (a rare case, but likely to happen at times).

Revision history for this message
Mike Rylander (mrylander) wrote :

Good find, as usual, Dan!

So, indeed, it's the fact that we're lying to PG that is causing the pain. I'm surprised that the index in question is being consulted during the select that claims to fail, but that is really beside the point. I can't find a single smoking gun in the release notes, and the changelog for 9.3 is, um, long, but I suspect that a combination of visibility map changes and general index infrastructure improvements lead to a situation where PG needs to be able to trust our promises more than in the past.

If it's an open question as to whether we should address the underlying lie directly -- and I think it's pretty well answered by Tom Lane's response to the poster of a similar issue on the Pg bugs list -- I'm strongly inclined to spend the disk space on materialized, indexable values, else eventually we'll run afoul of this or something very much like it in the future.

The immutability of that function exists solely to allow it to be used in an index expression, so dropping that and calling it STABLE instead seems correct. Adding STRICT would be correct, too, as there's no reasonable non-NULL output for NULL input.

As for the situation when authority config is modified -- that's something we certainly need to discuss, but on another bug, I think. It's is related not only to heading definition changes, but to authority merge (and the associated timeouts) and, more broadly, reingest of both authority and bib data, due to schema or config changes, or batch updates. I sketched a couple ideas over on bug 1193490, but here and now is probably not the time to address that larger challenge...

Revision history for this message
Dan Wells (dbw2) wrote :

Mike, thanks for the comment. I only brought up the config question as possible further reason to change to STABLE, since the current immutability could potentially interfere with newer configurations being picked up.

It sounds like we agree to go ahead with the materialization being done in this branch, but with recognition that this is now a long-term change, since the PG behavior may never go back to the old, looser way.

It looks like we also agree that the functions should be lowered to STABLE. That isn't strictly part of this bug, so do we prefer to do that as part of this bug (just being opportunistic), do we open a separate bug for that, or do we roll that change into the bigger picture merge/reingest work being done?

Revision history for this message
Mike Rylander (mrylander) wrote :

IMO, we should go ahead and mark it STABLE and STRICT now. If we have cause to use the function more than once in a query, there's no need to recalculate the output for the same parameters. (In fact, we wouldn't want that anyway.) Since we're knocking around this function's uses in this bug, doing it here is probably less surprising (or, at least more context-y) than in a follow-on script doing general work.

Revision history for this message
Dan Wells (dbw2) wrote :

Okay, signed off on Mike's bits, and added a couple more commits:

1) Move functions from IMMUTABLE to STABLE STRICT, as discussed.

2) Not sure about this one, so feel free to leave out, but make a best-effort attempt to recreate the UNIQUE index we dropped.;a=shortlog;h=refs/heads/user/dbwells/pg-93-authorities-index-rebase

Revision history for this message
Ben Shum (bshum) wrote :

Thanks to Dan for walking me through this. This code is committed to master.

Changed in evergreen:
status: Confirmed → Fix Committed
Changed in evergreen:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.