rosetta-pofile-stats.py needs optimizing

Bug #351011 reported by Stuart Bishop
This bug affects 3 people

Affects              Status    Importance  Assigned to  Milestone
Launchpad itself     Triaged   High        Unassigned
Ubuntu Translations  New       Undecided   Unassigned

Bug Description

In today's staging run, rosetta-pofile-stats.py took 6 hours to run. This makes it impractical to perform as part of the staging rollout and impractical to run daily on production.

Stuart Bishop (stub)
Changed in rosetta:
status: New → Triaged
Revision history for this message
Данило Шеган (danilo) wrote :

Can it not be done after staging roll-out (i.e. after staging has started up)? The extra load should not hurt on staging (it might even help emulate the load from production).

Not that we should not optimize it, of course.

Revision history for this message
Данило Шеган (danilo) wrote :

The easy thing we could do: run updateStatistics only on pofiles that have changed in the last week or so (should cover cases where we miss one or two daily runs).
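In Python, the idea might look something like the sketch below. The `POFile` class here is a minimal stand-in for illustration only, not the real Storm-backed Launchpad model; only the `updateStatistics` method name comes from the thread.

```python
from datetime import datetime, timedelta

class POFile:
    """Minimal stand-in for the real Launchpad POFile model object."""

    def __init__(self, name, date_changed):
        self.name = name
        self.date_changed = date_changed
        self.stats_updated = False

    def updateStatistics(self):
        # The real method recounts translated/untranslated messages.
        self.stats_updated = True

def update_recent_statistics(pofiles, days=7, now=None):
    """Refresh statistics only for POFiles changed in the last `days` days,
    instead of walking every POFile in the database."""
    now = now or datetime.utcnow()
    cutoff = now - timedelta(days=days)
    touched = [p for p in pofiles if p.date_changed >= cutoff]
    for pofile in touched:
        pofile.updateStatistics()
    return touched
```

A 7-day window is wide enough to cover a missed daily run or two, as the comment suggests.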

Revision history for this message
Stuart Bishop (stub) wrote :

@danilo - it is being run after staging has started up. The problem is we are approaching 24 hours to do the rollout and run all the batch jobs - we will either need to run the post-update batch jobs in parallel (possibly causing load or locking issues), or update less frequently than daily.

Revision history for this message
Robert Collins (lifeless) wrote :

@Danilo - is it still taking this long with recife, or have you fixed it in passing? I'm triaging this to High to reflect the priority it has; if it's already fixed, that's obviously even better ;)

Changed in launchpad:
importance: Undecided → High
Revision history for this message
Данило Шеган (danilo) wrote :

It's taking even longer today (not related to recife). We do have a pofile-stats-daily (which only runs over a subset of PO files), but that's disabled because of the DBLoopTuner stalls on replication slave rebuilds (bug 622670).

Revision history for this message
Данило Шеган (danilo) wrote :

Also bug 622668 about the actual pofile-stats-daily script being slow due to DBLoopTuner. It is also currently set to run over 7 days of PO files, but it should be enough to do it over 2 days of them if it runs daily (as was the plan).

I am pretty sure we should not aim to fix pofile-stats script performance which runs over all the PO files in our DB, because that simply can't scale.

The problem is that statistics get out of date in the first place even though we do try to keep them up-to-date after every update (we call pofile.updateStatistics after every update, and this might be fixed in recife branch because we are getting rid of multiple points where things are calculated: triggers, property validators and regular LP code). However, even fixing that won't be a full solution since we do need delayed statistics update for sharing PO files, which is where pofile-stats-daily is useful.

Revision history for this message
Stuart Bishop (stub) wrote : Re: [Bug 351011] Re: rosetta-pofile-stats.py needs optimizing

On Wed, Jan 12, 2011 at 1:52 AM, Данило Шеган <email address hidden> wrote:

> It's taking even longer today (not related to recife). We do have a
> pofile-stats-daily (which only runs over a subset of PO files), but
> that's disabled because of the DBLoopTuner stalls on replication slave
> rebuilds (bug 622670).

Given we normally only rebuild a slave once per month, disabling seems
unnecessary.

--
Stuart Bishop <email address hidden>
http://www.stuartbishop.net/

Revision history for this message
Robert Collins (lifeless) wrote :

This has been exacerbated by the fastdowntime process which interrupts all db connections when it is executed.

Revision history for this message
Curtis Hovey (sinzui) wrote :

I am making this critical because the dupe was critical.

Changed in launchpad:
importance: High → Critical
Revision history for this message
Curtis Hovey (sinzui) wrote :

The script is now run once a week. It takes between 2 and 3 days to complete. The log shows it is often blocked by "old xact process_death_row". While it sleeps for 10 minutes at a time, it is often blocked for about 100 minutes before it can proceed. In one example, it was blocked for 8 hours.
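The blocking behaviour described above can be sketched as a polling loop: sleep 10 minutes, re-check whether an old transaction still blocks us, repeat. This is an illustrative sketch of the pattern (not the actual DBLoopTuner code), with injectable clock/sleep so the timing is checkable; the 600-second sleep interval comes from the comment, everything else is assumed.

```python
import time

def wait_until_unblocked(is_blocked, sleep_seconds=600,
                         max_wait_seconds=8 * 3600,
                         clock=time.monotonic, sleep=time.sleep):
    """Poll until `is_blocked()` returns False, sleeping 10 minutes
    between checks. Gives up after `max_wait_seconds`.
    Returns the total simulated time spent waiting."""
    start = clock()
    while is_blocked():
        if clock() - start >= max_wait_seconds:
            break  # give up; the next weekly run will try again
        sleep(sleep_seconds)
    return clock() - start
```

With this pattern, a block that clears after ten checks costs 100 minutes of wall-clock time, matching the figure observed in the log.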

Revision history for this message
Curtis Hovey (sinzui) wrote :

Maybe we can queue all the locked operations so that the loop can continue. The queued items could be retried after the non-locked ones were completed, or just ignored (assuming they will be processed in the next run).
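A rough sketch of that defer-and-retry loop, assuming the update raises some "row is locked" error we can catch (the `RowLockedError` name and `try_update` callable are hypothetical, standing in for the real DB-level failure):

```python
from collections import deque

class RowLockedError(Exception):
    """Stand-in for a database 'row is locked' condition."""

def update_all(pofiles, try_update):
    """Process every pofile, deferring any whose row is locked.

    `try_update(pofile)` raises RowLockedError when the row cannot be
    locked. Deferred items are retried once after the unlocked ones
    finish; anything still locked is left for the next run."""
    deferred = deque()
    for pofile in pofiles:
        try:
            try_update(pofile)
        except RowLockedError:
            deferred.append(pofile)
    skipped = []
    while deferred:
        pofile = deferred.popleft()
        try:
            try_update(pofile)
        except RowLockedError:
            skipped.append(pofile)  # leave for the next run
    return skipped
```

This keeps the loop moving instead of sleeping behind a single long-lived lock; the trade-off is that a persistently locked row may not get its statistics updated until the following run.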

Airkm (airkm)
information type: Public → Private
William Grant (wgrant)
information type: Private → Public
Changed in launchpad:
importance: Critical → High