Language pack exporter runs long; triggers alerts

Bug #684664 reported by Jeroen T. Vermeulen
This bug affects 6 people
Affects: Launchpad itself
Status: Confirmed
Importance: High
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Full language-pack exports have gotten too slow again. A recent run terminated with the log message "Killed" (so no OOM kill; that would have produced a traceback). And we're getting regular alarms about the script not running, possibly just because of the script being too quiet.

A full pack for Natty is 78196 POFiles and takes about 17 hours. The ongoing run, now about 13 hours old, has spent about ¼ of its time using CPU and is currently running at about ⅓.

Trying to catch ongoing database activity is a bit haphazard, but so far Tom hasn't spotted any.

description: updated
summary: - Language pack exporter runs long, dies
+ Language pack exporter runs long; triggers alerts
Tom Haddon (mthaddon)
tags: added: canonical-losa-lp
Robert Collins (lifeless) wrote :

Doesn't OOM kill take down the process immediately, no signal?

I thought it was 'as vicious as it can get'.

Changed in launchpad:
status: New → Triaged
importance: Undecided → High
Данило Шеган (danilo) wrote :

This should be critical if it's indeed a problem. I am getting https://lpstats.canonical.com/graphs/LangpackRSS updated to include latest language pack generation runs as well, and if this continues happening (the last few weeks seem to have been quiet), I'll re-open and we'll have to investigate.

Данило Шеган (danilo) wrote :

Seems to be becoming more of an issue: pasting my comment from bug 794573.

Full language pack exports are starting to approach 24h again: the last run took 21h32min (/srv/launchpad.net-logs/scripts/loganberry/rosetta/language-pack-exporter.log-20110607.gz on devpad) for 84770 PO files, and the current one is on PO file 63690 of 84567 after 15h (the last one was at about the same place after 15h).

We used to be able to produce full language packs of ~65k PO files in 8h, if my memory serves me correctly. So, apart from the growth in scale, we also seem to be roughly two times slower.
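
A quick check of that arithmetic, using only the figures quoted in this comment. The helper name and the constant-rate projection for the current run are assumptions made for illustration; since the previous run was at about the same place after 15h and still took 21h32min, the projection is probably optimistic.

```python
def files_per_hour(po_files, hours):
    """Average export throughput in PO files per hour."""
    return po_files / hours


# "~65k PO files in 8h" is quoted from memory in the comment, so treat it as approximate.
earlier_rate = files_per_hour(65_000, 8)

# Last full run: 84770 PO files in 21h32min.
last_run_rate = files_per_hour(84_770, 21 + 32 / 60)

# How many times slower the last run was, per file.
slowdown = earlier_rate / last_run_rate

# Current run: PO file 63690 of 84567 after 15h.
current_rate = files_per_hour(63_690, 15)
remaining_hours = (84_567 - 63_690) / current_rate

# Assumption: the rate stays constant for the rest of the run.
projected_total_hours = 15 + remaining_hours

print(f"slowdown: {slowdown:.2f}x")
print(f"projected total: {projected_total_hours:.1f}h")
```

The per-file slowdown comes out a little above 2, which matches "roughly two times slower", and the constant-rate projection for the current run lands just under 20 hours.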

If we go over 24h it might turn out to be a bigger issue because sometimes different exports are scheduled for consecutive days, and that means either running two at the same time or failing due to the lock (not sure if we make the lock Ubuntu version specific).

William Grant (wgrant) wrote :

Query counts seem to scale roughly sensibly per template, except for one thing. There are hundreds of calls per template like this, each executing a (reasonably quick) query.

0-2@SQL-main-master
           SELECT DISTINCT POTemplate.source_file_format
             FROM TranslationTemplateItem
                  JOIN POTemplate
                    ON POTemplate.id = TranslationTemplateItem.potemplate
             WHERE TranslationTemplateItem.potmsgset = 7754677

[...]
  File "/srv/launchpad.net/codelines/trunk/lib/lp/translations/model/pofile.py", line 1662, in __init__
    self.messages = self._getMessages()
  File "/srv/launchpad.net/codelines/trunk/lib/lp/translations/model/pofile.py", line 1794, in _getMessages
    msgset.singular_text = row.potmsgset.singular_text
  File "/srv/launchpad.net/codelines/trunk/lib/lp/translations/model/potmsgset.py", line 241, in singular_text
    if self.uses_english_msgids:
  File "/srv/launchpad.net/codelines/trunk/lib/lp/translations/model/potmsgset.py", line 216, in uses_english_msgids
    self._conflictsExistingSourceFileFormats())
  File "/srv/launchpad.net/codelines/trunk/lib/lp/translations/model/potmsgset.py", line 187, in _conflictsExistingSourceFileFormats
    cur.execute(query)
[...]

Jeroen T. Vermeulen (jtv) wrote :

That should be very easily cacheable in principle, though it could be complicated with existing code structures. The exporter is checking whether the template (the whole template — it's just picking a POTMsgSet as a sample) is XPI or regular gettext. If it's XPI, everything's a bit different: the singular text (what gettext calls the msgid) then needs to be retrieved as the English (“en”) translation instead of simply as POTMsgSet.msgid_singular.
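
A minimal sketch of the caching idea, assuming the format check can be keyed by template rather than by message. Every name here (`FakeStore`, `TemplateFormatCache`, `singular_text`, the `XPI`/`PO` constants) is hypothetical and is not taken from the Launchpad code base; the fake store only counts queries so the effect is visible.

```python
# Hypothetical format markers; the real code has its own representation.
XPI = "XPI"
PO = "PO"


class FakeStore:
    """Stand-in for the database; counts the format queries it receives."""

    def __init__(self, formats_by_template):
        self.formats_by_template = formats_by_template
        self.query_count = 0

    def source_file_format(self, template_id):
        self.query_count += 1
        return self.formats_by_template[template_id]


class TemplateFormatCache:
    """Remembers, per template, whether its msgids are plain English text."""

    def __init__(self, store):
        self._store = store
        self._uses_english_msgids = {}

    def uses_english_msgids(self, template_id):
        # Query once per template instead of once per message.
        if template_id not in self._uses_english_msgids:
            fmt = self._store.source_file_format(template_id)
            # XPI templates use symbolic keys, not English text, as msgids.
            self._uses_english_msgids[template_id] = fmt != XPI
        return self._uses_english_msgids[template_id]


def singular_text(cache, template_id, msgid_singular, english_translation):
    """Pick the singular text the way the comment above describes."""
    if cache.uses_english_msgids(template_id):
        return msgid_singular
    # XPI: the English ("en") translation carries the real text.
    return english_translation


store = FakeStore({1: PO, 2: XPI})
cache = TemplateFormatCache(store)
texts = [singular_text(cache, 2, "some.key", "Open File") for _ in range(300)]
```

After the 300 lookups for the one XPI template, `store.query_count` is 1 rather than 300. Whether a per-template key fits the existing code structures is exactly the complication mentioned above, so this only illustrates the principle.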

Changed in launchpad:
status: Triaged → Fix Committed
assignee: nobody → miadawn79@gmail.com (miadawn79)
David Planella (dpm)
Changed in launchpad:
status: Fix Committed → Confirmed
assignee: miadawn79@gmail.com (miadawn79) → nobody
Tobias Scholz (scholzt7)
Changed in launchpad:
status: Confirmed → Opinion
assignee: nobody → Tobias Scholz (scholzt7)
Changed in launchpad:
status: Opinion → Confirmed
assignee: Tobias Scholz (scholzt7) → nobody
no longer affects: ubuntu