DistroSeries:+needs-packaging timeouts

Bug #722794 reported by Ursula Junque
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Launchpad itself
Fix Released
Critical
Robert Collins

Bug Description

 18 SELECT COUNT(*) FROM (SELECT DISTINCT SourcePackageName.id, SourcePackageName.name, coalesce(tota ... $INT) AND archive.purpose = $INT AND section.name != $STRING AND packaging.id IS NULL)) AS "_tmp":
   GET: 18 Robots: 9 Local: 9
      8 https://launchpad.net/ubuntu/natty/+needs-packaging (DistroSeries:+needs-packaging)
       OOPS-1907C1055, OOPS-1907C285, OOPS-1907D513, OOPS-1907E288, OOPS-1907F245
      5 https://answers.launchpad.net/ubuntu/karmic/+needs-packaging (DistroSeries:+needs-packaging)
       OOPS-1907D739, OOPS-1907E21, OOPS-1907G723, OOPS-1907J2432, OOPS-1907N622
      2 https://launchpad.net/ubuntu/maverick/+needs-packaging (DistroSeries:+needs-packaging)
       OOPS-1907A193, OOPS-1907L219
   [3 other URLs]

Branch: launchpad-rev-12632.1
Revno: 12633
SQL time: 10722 ms
Non-sql time: 361 ms
Total time: 11083 ms
Statement Count: 28

two slow queries
25. 456 6058ms SQL-launchpad-main-slave
SELECT COUNT(*)
FROM
  (SELECT DISTINCT SourcePackageName.id, SourcePackageName.name, coalesce(total_bug_heat, 0) + coalesce(po_messages, 0) + CASE
                                                                                                                              WHEN spr.component = 1 THEN 1000
                                                                                                                              ELSE 0
                                                                                                                          END AS score, coalesce(bug_count, 0) AS bug_count, coalesce(total_messages, 0) AS total_messages
   FROM SourcePackageName
   JOIN SourcePackageRelease spr ON SourcePackageName.id = spr.sourcepackagename
   JOIN SourcePackagePublishingHistory spph ON spr.id = spph.sourcepackagerelease
   JOIN archive ON spph.archive = Archive.id
   JOIN section ON spph.section = section.id
   JOIN DistroSeries ON spph.distroseries = DistroSeries.id
   LEFT JOIN Packaging ON SourcePackageName.id = Packaging.sourcepackagename
   AND Packaging.distroseries = DistroSeries.id
   LEFT JOIN DistributionSourcePackage dsp ON dsp.sourcepackagename = spr.sourcepackagename
   AND dsp.distribution = DistroSeries.distribution
   LEFT JOIN
     (SELECT POTemplate.sourcepackagename, POTemplate.distroseries, SUM(POTemplate.messagecount) * 0.5 AS po_messages, SUM(POTemplate.messagecount) AS total_messages
      FROM POTemplate
      WHERE POTemplate.sourcepackagename IS NOT NULL
        AND POTemplate.distroseries = 106
      GROUP BY POTemplate.sourcepackagename, POTemplate.distroseries) messages ON SourcePackageName.id = messages.sourcepackagename
   AND DistroSeries.id = messages.distroseries
   WHERE (DistroSeries.id = 106
          AND spph.status IN (1, 2)
          AND archive.purpose = 1
          AND section.name != 'translations'
          AND packaging.id IS NULL)) AS "_tmp"
26. 6516 4491ms SQL-launchpad-main-slave
SELECT DISTINCT SourcePackageName.id, SourcePackageName.name, coalesce(total_bug_heat, 0) + coalesce(po_messages, 0) + CASE
                                                                                                                           WHEN spr.component = 1 THEN 1000
                                                                                                                           ELSE 0
                                                                                                                       END AS score,
                                                                                                                       coalesce(bug_count, 0) AS bug_count,
                                                                                                                       coalesce(total_messages, 0) AS total_messages
FROM SourcePackageName
JOIN SourcePackageRelease spr ON SourcePackageName.id = spr.sourcepackagename
JOIN SourcePackagePublishingHistory spph ON spr.id = spph.sourcepackagerelease
JOIN archive ON spph.archive = Archive.id
JOIN section ON spph.section = section.id
JOIN DistroSeries ON spph.distroseries = DistroSeries.id
LEFT JOIN Packaging ON SourcePackageName.id = Packaging.sourcepackagename
AND Packaging.distroseries = DistroSeries.id
LEFT JOIN DistributionSourcePackage dsp ON dsp.sourcepackagename = spr.sourcepackagename
AND dsp.distribution = DistroSeries.distribution
LEFT JOIN
  (SELECT POTemplate.sourcepackagename, POTemplate.distroseries, SUM(POTemplate.messagecount) * 0.5 AS po_messages, SUM(POTemplate.messagecount) AS total_messages
   FROM POTemplate
   WHERE POTemplate.sourcepackagename IS NOT NULL
     AND POTemplate.distroseries = 106
   GROUP BY POTemplate.sourcepackagename,
            POTemplate.distroseries) messages ON SourcePackageName.id = messages.sourcepackagename
AND DistroSeries.id = messages.distroseries
WHERE (DistroSeries.id = 106
       AND spph.status IN (1,
                           2)
       AND archive.purpose = 1
       AND section.name != 'translations'
       AND packaging.id IS NULL)
ORDER BY score DESC, SourcePackageName.name LIMIT 21
OFFSET 20

Tags: qa-ok timeout

Related branches

description: updated
tags: added: dba
description: updated
description: updated
Revision history for this message
Robert Collins (lifeless) wrote :

I suspect this page has definitional problems:
 - its reporting needs packaging for frozen distributions (not interesting as they are frozen)
 - its reporting more detail than is useful - the number of translation strings and total bug heat isn't particularly helpful AFAICT: if we want to drive this to zero, the fact its non zero should be sufficient

That said, the performance of the page varies wildly: it just rendered: 28 queries/external actions issued in 0.92 seconds, but the refresh before that timed out. - partly explained by the use of memcache, but if the miss-case is slow, we will see repeated timeouts here.

Some possibilities:
 - move it to distro rather than distro series (because a /distro/ only has one moving target for packaging into)
 - strip out, or denormalise the expensive things to show.

Revision history for this message
Curtis Hovey (sinzui) wrote :

This page is only valuable for the series in development. We saw users using this page and we pondered adding a redirect to the current series.

I think placing this under the distro can be done now...the distribution source package now shares the packaging link code with the series source package. I would want to add the link to the copyright file to this page because I often need to visit the SP to be certain of the link to choose.

We have been adding denormalised data to the dsp schema object. We could add more, but I think we need to decide on its value first. Much of the denormalised data is used to sort the list of packages by most need (has the most open bug, translations, or is in main). The sorting has been ridiculous in the past, and I remain dissatisfied with it--users do not believe it is sorted now. We could sort it alphabetically, or maybe we could use pop-con.

Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 722794] Re: DistroSeries:+needs-packaging timeouts

Thanks for commenting Curtis. What do you think of the idea of simply
not sorting it - or sorting it by (in_main, in_universe,
in_multiverse, other)? if its really just a TODO list, making it show
fast and be easy to work with is sufficient to let it get acted on.

Revision history for this message
Curtis Hovey (sinzui) wrote :

I vote to remove the sort.

Users do not think it is sorted now. It will not be a loss. While some Ubuntu developers might be disappointed that packages that really need upstream links are not favoured, the reality is that users are ignoring these packages in the listing. users see this listing and think:
* This listing seems to be random
* Some of the items in the listing cannot possibly have an upstream:
* I do not know anything about these packages
* I clicked through to the page and there are no suggestions.
* This timed out after 5 pages and I give up because I could not find anything I knew I could link.

Most of the suggestions were linked in the first few months of the suggestion feature! We now need users to register the upstream. User link packages to upstream based on their need, not Ubuntu's need.

So unsorting the listing may help with the last item. Filtering out impossible package links could reduce the work of the db and app, but I really do not know how to identify a meta package in our schema.

Revision history for this message
Robert Collins (lifeless) wrote :

Ok cool; we'll need *a* sort so that we can still paginate, but it can
be much simpler. We can drop the related 'how important is this'
columns and eager load those rather than joining.

Revision history for this message
Stuart Bishop (stub) wrote :
Revision history for this message
Stuart Bishop (stub) wrote :

For the query in comment #6, the following variant runs consistantly at 0.6-1s on the master:

SELECT
    spn.id, spn.name,
    coalesce(total_bug_heat, 0)
    + coalesce(po_messages, 0)
    + CASE WHEN component = 1 THEN 1000 ELSE 0 END AS score,
    coalesce(bug_count, 0) AS bug_count,
    coalesce(total_messages, 0) AS total_messages
FROM
    SourcepackageName AS spn, (
    SELECT
        spr.sourcepackagename,
        spr.component,
        bug_count,
        total_bug_heat,
        SUM(POTemplate.messagecount) * 0.5 AS po_messages,
        SUM(POTemplate.messagecount) AS total_messages
    FROM
        SourcePackageRelease AS spr
        JOIN SourcePackagePublishingHistory AS spph
            ON spr.id = spph.sourcepackagerelease
        JOIN Archive
            ON spph.archive = Archive.id
        JOIN Section
            ON spph.section = Section.id
        JOIN DistroSeries
            ON spph.distroseries = DistroSeries.id
        LEFT OUTER JOIN DistributionSourcePackage AS dsp
            ON dsp.sourcepackagename = spr.sourcepackagename
                AND dsp.distribution = DistroSeries.distribution
        LEFT OUTER JOIN POTemplate
            ON POTemplate.sourcepackagename = spr.sourcepackagename
                AND POTemplate.distroseries = DistroSeries.id
    WHERE
        DistroSeries.id = 93
        AND spph.status IN (1, 2)
        AND Archive.purpose = 1
        AND Section.name <> 'translations'
        AND NOT EXISTS (
            SELECT TRUE FROM Packaging
            WHERE
                Packaging.sourcepackagename = spr.sourcepackagename
                AND Packaging.distroseries = spph.distroseries)
    GROUP BY
        spr.sourcepackagename, spr.component, bug_count, total_bug_heat
    ) AS whatever
WHERE spn.id = whatever.sourcepackagename
ORDER BY score DESC, spn.name LIMIT 21 OFFSET 0;

Revision history for this message
Stuart Bishop (stub) wrote :

The slow count() query seems pointless and probably a side effect of our batching code or similar.

tags: removed: dba
description: updated
description: updated
Revision history for this message
Robert Collins (lifeless) wrote :

So I think the speed thing is strongly cold-cache dependent. I'm going to have a fiddle and see if I can get a much cheaper query.

Revision history for this message
Robert Collins (lifeless) wrote :

stubs query is better - much faster hot; cold it was 9 seconds, but we should take an incremental improvement... transcribing now.

Revision history for this message
Robert Collins (lifeless) wrote :

query is missing distinct on (score, spn.name), even with that its a cool 1000ms though.

Changed in launchpad:
assignee: nobody → Robert Collins (lifeless)
Revision history for this message
Launchpad QA Bot (lpqabot) wrote :
Changed in launchpad:
milestone: none → 11.04
tags: added: qa-needstesting
Changed in launchpad:
status: Triaged → Fix Committed
William Grant (wgrant)
tags: added: qa-ok
removed: qa-needstesting
William Grant (wgrant)
Changed in launchpad:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.