fti.py wants to lock all replication sets

Bug #435674 reported by Stuart Bishop
Affects: Launchpad itself
Status: Won't Fix
Importance: High
Assigned to: Stuart Bishop

Bug Description

fti.py is attempting to lock all replication sets when it is run. This is bad, as it will either deadlock with the SSO servers and fail, or succeed and block the SSO servers from running while it updates the fti indexes.

Revision history for this message
Stuart Bishop (stub) wrote :

postgres@wildcherry:~/launchpad/database/schema$ ./fti.py

2009-09-23 23:06:20 INFO No need to rebuild full text index on archive
2009-09-23 23:06:20 INFO No need to rebuild full text index on bug
2009-09-23 23:06:20 INFO No need to rebuild full text index on bugtask
2009-09-23 23:06:20 INFO No need to rebuild full text index on binarypackagerelease
2009-09-23 23:06:20 INFO No need to rebuild full text index on cve
2009-09-23 23:06:20 INFO No need to rebuild full text index on distribution
2009-09-23 23:06:20 INFO No need to rebuild full text index on distributionsourcepackagecache
2009-09-23 23:06:20 INFO No need to rebuild full text index on distroseriespackagecache
2009-09-23 23:06:20 INFO No need to rebuild full text index on faq
2009-09-23 23:06:20 INFO No need to rebuild full text index on message
2009-09-23 23:06:20 INFO No need to rebuild full text index on messagechunk
2009-09-23 23:06:20 INFO No need to rebuild full text index on person
2009-09-23 23:06:20 INFO No need to rebuild full text index on product
2009-09-23 23:06:20 INFO No need to rebuild full text index on productreleasefile
2009-09-23 23:06:20 INFO No need to rebuild full text index on project
2009-09-23 23:06:20 INFO No need to rebuild full text index on shippingrequest
2009-09-23 23:06:20 INFO No need to rebuild full text index on specification
2009-09-23 23:06:20 INFO No need to rebuild full text index on question
2009-09-23 23:06:20 INFO Executing generated SQL using slonik
/tmp/slonikpX2jNU.sk:32: PGRES_FATAL_ERROR select "_sl".ddlScript_prepare(1, -1); - ERROR: deadlock detected
DETAIL: Process 19773 waits for AccessExclusiveLock on relation 876869 of database 876387; blocked by process 10280.
Process 10280 waits for AccessShareLock on relation 876880 of database 876387; blocked by process 19773.
CONTEXT: SQL statement "lock table "public"."account" in access exclusive mode"
PL/pgSQL function "altertablerestore" line 47 at EXECUTE statement
SQL statement "SELECT "_sl".alterTableRestore(tab_id) from "_sl".sl_table where tab_set in (select set_id from "_sl".sl_set where set_origin = "_sl".getLocalNodeId('_sl'))"
PL/pgSQL function "ddlscript_prepare" line 34 at PERFORM
2009-09-23 23:06:29 ERROR slonik script failed
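The DETAIL lines above describe a textbook lock-ordering cycle: slonik's ddlScript_prepare (process 19773) wants an AccessExclusiveLock on a table an SSO backend (process 10280) already holds, while 10280 wants an AccessShareLock on a table 19773 already holds. PostgreSQL resolves this by walking the wait-for graph and aborting one transaction when it finds a cycle. The sketch below is an illustrative Python model of that check, not PostgreSQL's actual detector; only the PIDs are taken from the log.

```python
def find_cycle(waits_for):
    """Return a list of PIDs forming a cycle in the wait-for graph,
    or None if every chain of waiters eventually terminates.

    waits_for maps a waiting PID to the PID holding the lock it wants.
    """
    for start in waits_for:
        seen = []
        pid = start
        while pid in waits_for:
            if pid in seen:
                # We revisited a PID: everything from its first
                # appearance onward is the deadlock cycle.
                return seen[seen.index(pid):]
            seen.append(pid)
            pid = waits_for[pid]
    return None

# From the log: 19773 (slonik) waits on 10280 (an SSO backend), which
# in turn waits on 19773 -- a cycle, so one transaction must be aborted.
print(find_cycle({19773: 10280, 10280: 19773}))  # -> [19773, 10280]
```

The usual cure, acquiring locks in a consistent order, is not available here: slonik's DDL path locks every replicated table at once, so it inevitably collides with concurrent SSO transactions.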

Stuart Bishop (stub)
Changed in launchpad-foundations:
status: New → Triaged
importance: Undecided → High
assignee: nobody → Stuart Bishop (stub)
milestone: none → 3.1.10
Stuart Bishop (stub) wrote:

"""In Slony-I version 1.0, this would only lock the tables in the specified replication set. As of 1.1, all replicated tables are locked (e.g. - triggers are removed at the start, and restored at the end). This deals with the risk that one might request DDL changes on tables in multiple replication sets."""

So this is unfortunately expected behavior and throws a spanner into our upgrade procedures.

Stuart Bishop (stub) wrote:

I believe this behavior stops being a problem if we move the master for the auth replication set to a dedicated server that does not have a replica of the lpmain replication set.

In this setup, the authdb tables will get locked on the Launchpad databases (launchpad_prod*). They will not be locked on the standalone database used for read-only mode, since it is not part of the replication cluster. They will not be locked on the authdb replication set master, since there is no lpmain replica there.
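The proposed topology can be modelled with a small sketch. Under the Slony-I 1.1 rule quoted earlier, DDL pushed through a replication set locks the replicated tables on every node carrying that set; a node that does not subscribe to the set (or sits outside the cluster entirely) is untouched. This is a simplified illustrative model, not Slony-I code, and the node and set names below are hypothetical placeholders for the topology described above.

```python
def locked_nodes(subscriptions, ddl_set):
    """Nodes whose replicated tables get locked when a slonik DDL
    script is executed against ddl_set: every node carrying that set.
    Simplified model of the Slony-I 1.1 behaviour quoted above."""
    return {node for node, sets in subscriptions.items() if ddl_set in sets}

# Hypothetical topology for the proposed split (names are illustrative):
topology = {
    "launchpad_prod": {"lpmain", "authdb"},  # main cluster nodes
    "authdb_master": {"authdb"},             # dedicated auth master
    # the standalone read-only database is outside the cluster entirely,
    # so it never appears in the subscription map at all
}

print(sorted(locked_nodes(topology, "lpmain")))  # -> ['launchpad_prod']
```

Because "authdb_master" carries no lpmain replica, lpmain DDL never reaches it, which is exactly the property the proposal is after.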

Francis J. Lacoste (flacoste) wrote: Re: [Bug 435674] Re: fti.py wants to lock all replication sets

On October 7, 2009, Stuart Bishop wrote:
> I believe this behavior stops being a problem if we move the master for
> the auth replication set to a dedicated server that does not have a
> replica of the lpmain replication set.
>
> In this setup, the authdb tables will get locked on the Launchpad
> databases (launchpad_prod*). They will not be locked on the standalone
> database used for read only mode, as it is stand alone. They will not be
> locked on the authdb replication set master as there is no lpmain
> replica there.
>

The plan was to still have a replica of the lpmain replication set on the auth
master DB. Otherwise, we'll need to set up another separate Launchpad replica
for the ShipIt/OpenID servers.

--
Francis J. Lacoste
<email address hidden>

Curtis Hovey (sinzui)
Changed in launchpad-foundations:
milestone: 3.1.10 → 3.1.11
Stuart Bishop (stub) wrote:

The current situation seems to be 'good enough' until the launchpad/auth split is finished early next year.

Changed in launchpad-foundations:
status: Triaged → Won't Fix