Person-picker takes a very long time consistently due to missing full text index

Bug #618356 reported by Robert Collins
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Launchpad itself
Fix Released
Critical
Stuart Bishop

Bug Description

Sprint:+huge-vocabulary is showing up on https://devpad.canonical.com/~stub/ppr/lpnet/latest-daily-timeout-candidates.html as taking a very long time - 38 seconds - with little variation.

Specification:+huge-vocabulary
OOPS-1962DY31 shows that this is an expensive search to pick a person.

Related branches

Deryck Hodge (deryck)
affects: launchpad → blueprint
tags: added: pg83
Curtis Hovey (sinzui)
tags: added: sprints
Changed in launchpad:
importance: High → Critical
Curtis Hovey (sinzui)
description: updated
Changed in launchpad:
assignee: nobody → Launchpad Teal Squad (launchpad-teal-squad)
summary: - Sprint:+huge-vocabulary takes a very long time consistently
+ Sprint:+huge-vocabulary/Specification:+huge-vocabulary takes a very long
+ time consistently
summary: - Sprint:+huge-vocabulary/Specification:+huge-vocabulary takes a very long
- time consistently
+ Person-picker takes a very long time consistently
description: updated
Curtis Hovey (sinzui)
tags: added: disclosure
Revision history for this message
Curtis Hovey (sinzui) wrote : Re: Person-picker takes a very long time consistently

I should note that the picker timed out searching for the user (who searched on his full name) because he wanted to assign himself as the driver. Maybe all fields role fields: owner, driver, bug_supervisor, security_contact, assignee, drafter, approver, reviewer should have an action to assign yourself. Bugtasks and Mp do this already.

Revision history for this message
Curtis Hovey (sinzui) wrote :
Download full text (4.3 KiB)

This query took 8269ms.

SELECT Person.account,
       Person.creation_comment,
       Person.creation_rationale,
       Person.datecreated,
       Person.defaultmembershipperiod,
       Person.defaultrenewalperiod,
       Person.displayname,
       Person.hide_email_addresses,
       Person.homepage_content,
       Person.icon,
       Person.id,
       Person.logo,
       Person.mailing_list_auto_subscribe_policy,
       Person.merged,
       Person.mugshot,
       Person.name,
       Person.personal_standing,
       Person.personal_standing_reason,
       Person.registrant,
       Person.renewal_policy,
       Person.subscriptionpolicy,
       Person.teamdescription,
       Person.teamowner,
       Person.verbose_bugnotifications,
       Person.visibility
FROM
  (SELECT DISTINCT Person.account, Person.creation_comment, Person.creation_rationale, Person.datecreated, Person.defaultmembershipperiod, Person.defaultrenewalperiod, Person.displayname, Person.hide_email_addresses, Person.homepage_content, Person.icon, Person.id, Person.logo, Person.mailing_list_auto_subscribe_policy, Person.merged, Person.mugshot, Person.name, Person.personal_standing, Person.personal_standing_reason, Person.registrant, Person.renewal_policy, Person.subscriptionpolicy, Person.teamdescription, Person.teamowner, Person.verbose_bugnotifications, Person.visibility, Person.name = 'david.wonderly'
   FROM (
           (SELECT Person.account, Person.creation_comment, Person.creation_rationale, Person.datecreated, Person.defaultmembershipperiod, Person.defaultrenewalperiod, Person.displayname, Person.hide_email_addresses, Person.homepage_content, Person.icon, Person.id, Person.logo, Person.mailing_list_auto_subscribe_policy, Person.merged, Person.mugshot, Person.name, Person.personal_standing, Person.personal_standing_reason, Person.registrant, Person.renewal_policy, Person.subscriptionpolicy, Person.teamdescription, Person.teamowner, Person.verbose_bugnotifications, Person.visibility
            FROM Person
            LEFT JOIN EmailAddress ON EmailAddress.person = Person.id
            WHERE Person.id IN
                (SELECT id
                 FROM
                   (SELECT Person.id, 100 AS rank
                    FROM Person
                    WHERE name = 'david.wonderly'
                    UNION ALL SELECT Person.id, rank(fti, ftq('david.wonderly'))
                    FROM Person
                    WHERE Person.fti @@ ftq('david.wonderly')
                    UNION ALL SELECT Person.id, 10 AS rank
                    FROM Person, IrcId
                    WHERE IrcId.person = Person.id
                      AND LOWER(IrcId.nickname) = 'david.wonderly'
                    UNION ALL SELECT Person.id, 1 AS rank
                    FROM Person, EmailAddress
                    WHERE EmailAddress.person = Person.id
                      AND LOWER(email) LIKE 'david.wonderly' || '%%'
                      AND EmailAddress.status IN (2, 4)) AS public_subquery
                 ORDER BY rank DESC LIMIT 100)
              AND Person.visibility = 1
              AND Person.merged IS NULL
              AND (NOT (Person.teamowner IS NULL)
                   OR EmailAdd...

Read more...

Revision history for this message
Robert Collins (lifeless) wrote :
Download full text (6.7 KiB)

explain from qastaging:

 Limit (cost=145627.02..145627.04 rows=7 width=266)
   -> Sort (cost=145627.02..145627.27 rows=101 width=266)
         Sort Key: ((person.name = 'david.wonderly'::text)), person.displayname, person.name
         -> Subquery Scan person (cost=145622.57..145625.10 rows=101 width=266)
               -> HashAggregate (cost=145622.57..145623.84 rows=101 width=266)
                     -> Subquery Scan person (cost=145613.74..145616.01 rows=101 width=266)
                           -> HashAggregate (cost=145613.74..145614.75 rows=101 width=755)
                                 -> Append (cost=144443.37..145607.42 rows=101 width=755)
                                       -> Nested Loop Left Join (cost=144443.37..145182.56 rows=100 width=755)
                                             Filter: ((NOT (public.person.teamowner IS NULL)) OR (public.emailaddress.status = 4))
                                             -> Nested Loop (cost=144443.37..145080.28 rows=100 width=755)
                                                   -> HashAggregate (cost=144443.37..144444.37 rows=100 width=4)
                                                         -> Limit (cost=144441.87..144442.12 rows=100 width=8)
                                                               -> Sort (cost=144441.87..144458.39 rows=6608 width=8)
                                                                     Sort Key: public_subquery.rank
                                                                     -> Subquery Scan public_subquery (cost=0.00..144189.32 rows=6608 width=8)
                                                                           -> Append (cost=0.00..144123.24 rows=6608 width=4)
                                                                                 -> Subquery Scan "*SELECT* 1" (cost=0.00..6.42 rows=1 width=4)
                                                                                       -> Index Scan using person__name__key on person (cost=0.00..6.41 rows=1 width=4)
                                                                                             Index Cond: (name = 'david.wonderly'::text)
                                                                                 -> Seq Scan on person (cost=0.00..44384.20 rows=5 width=54)
                                                                                       Filter: ((fti)::tsvector @@ '''david'' & ''wonder'' | ''davidwond'''::tsquery)
                                                                                 -> Subquery Scan "*SELECT* 3" (cost=0.00..1363.76 rows=125 width=4)
                                                                                       -> Nested Loop (cost=0.00..1362.51 rows=125 width=4)
                                                                                             -> Seq Scan on ircid (cost=0.00..570.19 rows=125 width=4)
                                                                                                   Filter: (lower(nickname) = 'david.wonderly'::text)
                                                                                             -> ...

Read more...

Revision history for this message
Robert Collins (lifeless) wrote :

This is the major estimated cost:

-> Subquery Scan "*SELECT* 4" (cost=578.23..98368.80 rows=6477 width=4)
        -> Nested Loop (cost=578.23..98304.03 rows=6477 width=4)
             -> Bitmap Heap Scan on emailaddress (cost=578.23..31482.49 rows=22092 width=4)
                   Filter: ((status = ANY ('{2,4}'::integer[])) AND (lower(email) ~~ 'david.wonderly%%'::text))
                    -> Bitmap Index Scan on emailaddress__lower_email__key (cost=0.00..572.71 rows=22376 width=0)
                           Index Cond: ((lower(email) >= 'david.wonderly'::text) AND (lower(email) < 'david.wonderlz'::text))
              -> Index Scan using person_pkey on person (cost=0.00..3.01 rows=1 width=4)
                    Index Cond: (public.person.id = public.emailaddress.person)

but grabbing just that row is 36ms, so the estimate is bogus - note the row count projected (22K) vs actual (1)

Revision history for this message
Robert Collins (lifeless) wrote :
Download full text (8.7 KiB)

explain analyze (mostly cold) on qastaging:

 Limit (cost=145627.02..145627.04 rows=7 width=266) (actual time=5228.161..5228.163 rows=1 loops=1)
   -> Sort (cost=145627.02..145627.27 rows=101 width=266) (actual time=5228.158..5228.159 rows=1 loops=1)
         Sort Key: ((person.name = 'david.wonderly'::text)), person.displayname, person.name
         Sort Method: quicksort Memory: 25kB
         -> Subquery Scan person (cost=145622.57..145625.10 rows=101 width=266) (actual time=5227.951..5227.952 rows=1 loops=1)
               -> HashAggregate (cost=145622.57..145623.84 rows=101 width=266) (actual time=5227.937..5227.938 rows=1 loops=1)
                     -> Subquery Scan person (cost=145613.74..145616.01 rows=101 width=266) (actual time=5227.908..5227.910 rows=1 loops=1)
                           -> HashAggregate (cost=145613.74..145614.75 rows=101 width=755) (actual time=5227.872..5227.874 rows=1 loops=1)
                                 -> Append (cost=144443.37..145607.42 rows=101 width=755) (actual time=4925.668..5227.847 rows=1 loops=1)
                                       -> Nested Loop Left Join (cost=144443.37..145182.56 rows=100 width=755) (actual time=4925.667..4925.671 rows=1 loops=1)
                                             Filter: ((NOT (public.person.teamowner IS NULL)) OR (public.emailaddress.status = 4))
                                             -> Nested Loop (cost=144443.37..145080.28 rows=100 width=755) (actual time=4899.215..4899.218 rows=1 loops=1)
                                                   -> HashAggregate (cost=144443.37..144444.37 rows=100 width=4) (actual time=4899.189..4899.190 rows=1 loops=1)
                                                         -> Limit (cost=144441.87..144442.12 rows=100 width=8) (actual time=4899.172..4899.174 rows=3 loops=1)
                                                               -> Sort (cost=144441.87..144458.39 rows=6608 width=8) (actual time=4899.172..4899.172 rows=3 loops=1)
                                                                     Sort Key: public_subquery.rank
                                                                     Sort Method: quicksort Memory: 25kB
                                                                     -> Subquery Scan public_subquery (cost=0.00..144189.32 rows=6608 width=8) (actual time=14.238..4899.083 rows=3 loops=1)
                                                                           -> Append (cost=0.00..144123.24 rows=6608 width=4) (actual time=14.237..4899.080 rows=3 loops=1)
                                                                                 -> Subquery Scan "*SELECT* 1" (cost=0.00..6.42 rows=1 width=4) (actual time=14.236..14.239 rows=1 loops=1)
                                                                                       -> Index Scan using person__name__key on person (cost=0.00..6.41 rows=1 width=4) (actual time=14.210..14.213 rows=1 loops=1)
                                                                                             Index Cond: (name = 'david.wonderly'::text), Person.defaultrenewalperiod, Person.displayname, Person.hide_email_add...

Read more...

Revision history for this message
Robert Collins (lifeless) wrote :

Digging in - SELECT id
                FROM
                  (SELECT Person.id, 100 AS rank
                   FROM Person
                   WHERE name = 'david.wonderly'
                   UNION ALL SELECT Person.id, rank(fti, ftq('david.wonderly'))
                   FROM Person
                   WHERE Person.fti @@ ftq('david.wonderly')
                   UNION ALL SELECT Person.id, 10 AS rank
                   FROM Person, IrcId
                   WHERE IrcId.person = Person.id
                     AND LOWER(IrcId.nickname) = 'david.wonderly'
                   UNION ALL SELECT Person.id, 1 AS rank
                   FROM Person, EmailAddress
                   WHERE EmailAddress.person = Person.id
                     AND LOWER(email) LIKE 'david.wonderly' || '%%'
                     AND EmailAddress.status IN (2, 4)) AS public_subquery
                ORDER BY rank DESC LIMIT 100;
   id
---------
 3020039
 3020039
 3020039
(3 rows)

Time: 1149.682 ms

taking out sections:

 SELECT id
                FROM
                  (SELECT Person.id, rank(fti, ftq('david.wonderly'))
                   FROM Person
                   WHERE Person.fti @@ ftq('david.wonderly')
                   UNION ALL SELECT Person.id, 10 AS rank
                   FROM Person, IrcId
                   WHERE IrcId.person = Person.id
                     AND LOWER(IrcId.nickname) = 'david.wonderly'
                   UNION ALL SELECT Person.id, 1 AS rank
                   FROM Person, EmailAddress
                   WHERE EmailAddress.person = Person.id
                     AND LOWER(email) LIKE 'david.wonderly' || '%%'
                     AND EmailAddress.status IN (2, 4)) AS public_subquery
                ORDER BY rank DESC LIMIT 100;
1135ms

SELECT id
                FROM
                  (SELECT Person.id, 10 AS rank
                   FROM Person, IrcId
                   WHERE IrcId.person = Person.id
                     AND LOWER(IrcId.nickname) = 'david.wonderly'
                   UNION ALL SELECT Person.id, 1 AS rank
                   FROM Person, EmailAddress
                   WHERE EmailAddress.person = Person.id
                     AND LOWER(email) LIKE 'david.wonderly' || '%%'
                     AND EmailAddress.status IN (2, 4)) AS public_subquery
                ORDER BY rank DESC LIMIT 100;
Time: 30.087 ms

So its the fti lookup.

Revision history for this message
Robert Collins (lifeless) wrote :

explain analyze SELECT Person.id, rank(fti, ftq('david.wonderly'))
                   FROM Person
                   WHERE Person.fti @@ ftq('david.wonderly');
                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Seq Scan on person (cost=0.00..44384.20 rows=5 width=54) (actual time=1123.050..1150.860 rows=1 loops=1)
   Filter: ((fti)::tsvector @@ '''david'' & ''wonder'' | ''davidwond'''::tsquery)
 Total runtime: 1150.917 ms

Indexes:
    "person_pkey" PRIMARY KEY, btree (id) CLUSTER
    "person__account__key" UNIQUE, btree (account)
    "person__name__key" UNIQUE, btree (name)
    "person__icon__idx" btree (icon) WHERE icon IS NOT NULL
    "person__logo__idx" btree (logo) WHERE logo IS NOT NULL
    "person__merged__idx" btree (merged) WHERE merged IS NOT NULL
    "person__mugshot__idx" btree (mugshot) WHERE mugshot IS NOT NULL
    "person__registrant__idx" btree (registrant)
    "person__teamowner__idx" btree (teamowner) WHERE teamowner IS NOT NULL
    "person_datecreated_idx" btree (datecreated)
    "person_sorting_idx" btree (person_sort_key(displayname, name))

There is no person fti index.

Either we need to add one, or drop the fti component of this query.

summary: - Person-picker takes a very long time consistently
+ Person-picker takes a very long time consistently due to missing full
+ text index
Ian Booth (wallyworld)
Changed in launchpad:
assignee: Launchpad Teal Squad (launchpad-teal-squad) → Ian Booth (wallyworld)
status: Triaged → In Progress
Revision history for this message
Ian Booth (wallyworld) wrote :

The person fti index is being added. It was an oversight that it wasn't created.

Changed in launchpad:
assignee: Ian Booth (wallyworld) → Stuart Bishop (stub)
Revision history for this message
Ian Booth (wallyworld) wrote :

Hi stub, can you please close this bug when the person fti index has been created, if it hasn't already been done.
Thanks.

Stuart Bishop (stub)
Changed in launchpad:
status: In Progress → Fix Released
Curtis Hovey (sinzui)
tags: added: person-picker
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.