We had some more fun cascading scanner failures this evening.
2011-09-14 09:00:04 INFO Creating lockfile: /var/lock/launchpad-runscanbranches.lock
2011-09-14 09:00:11 INFO Running through Twisted.
2011-09-14 09:00:15 INFO Running BranchScanJob (ID 9734101).
2011-09-14 09:00:26 INFO Running BranchScanJob (ID 9734103).
2011-09-14 09:01:02 INFO Creating lockfile: /var/lock/launchpad-runscanbranches.lock
2011-09-14 09:02:02 INFO Creating lockfile: /var/lock/launchpad-runscanbranches.lock
2011-09-14 09:03:02 INFO Creating lockfile: /var/lock/launchpad-runscanbranches.lock
2011-09-14 09:04:02 INFO Creating lockfile: /var/lock/launchpad-runscanbranches.lock
2011-09-14 09:05:03 INFO Creating lockfile: /var/lock/launchpad-runscanbranches.lock
2011-09-14 09:05:29 INFO Job resulted in OOPS: OOPS-2083SMS11
2011-09-14 09:05:32 INFO Running BranchScanJob (ID 9734104).
Unhandled error in Deferred:
Unhandled Error
Traceback (most recent call last):
Failure: twisted.internet.error.ProcessTerminated: A process has ended with a probable error condition: process ended with exit code 42.
Unhandled Error
[... snip the usual traceback ...]
File "/srv/bzrsyncd.launchpad.net/production/launchpad-rev-13921/lib/lp/services/job/runner.py", line 441, in update
if response['success']:
exceptions.TypeError: 'NoneType' object is unsubscriptable
2011-09-14 09:05:32 INFO Running BranchScanJob (ID 9734109).
Unhandled error in Deferred:
Unhandled Error
Traceback (most recent call last):
Failure: twisted.internet.error.ProcessTerminated: A process has ended with a probable error condition: process ended with exit code 42.
[... snip ...]
gmb noticed that his lp:~gmb/launchpad/scratch hadn't scanned, so he pushed it again once the tracebacks had stopped. The tracebacks immediately started again. Examining the hanging and subsequent job IDs from the snippet above and from the second incident, we get <https://pastebin.canonical.com/52731/>. 9734103 and 9734574 are the two jobs that hung, and both are for branch 62300, which is lp:~gmb/launchpad/scratch.
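The hung job can also be picked out of the log mechanically. The following is only a rough sketch, not anything from the Launchpad tree: the function name find_slow_jobs and the 5 minute threshold are made up for illustration, and it assumes that the "Creating lockfile" lines come from other invocations of the script rather than from the job that is currently running.

```python
import re
from datetime import datetime, timedelta

# Matches lines like "2011-09-14 09:00:26 INFO Running BranchScanJob (ID 9734103)."
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) INFO (.*)$")
JOB_RE = re.compile(r"Running BranchScanJob \(ID (\d+)\)\.")


def find_slow_jobs(lines, threshold=timedelta(minutes=5)):
    """Return (job_id, elapsed) for jobs whose next logged event
    (ignoring lockfile lines) came at least `threshold` after they started.

    Hypothetical helper for reading the log excerpt in this report.
    """
    slow = []
    current = None  # (job_id, start_time) of the most recently started job
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # traceback text and other untimestamped lines
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        message = match.group(2)
        if message.startswith("Creating lockfile"):
            continue  # assumed not to indicate progress of the running job
        if current is not None:
            elapsed = stamp - current[1]
            if elapsed >= threshold:
                slow.append((current[0], elapsed))
            current = None
        job = JOB_RE.search(message)
        if job:
            current = (int(job.group(1)), stamp)
    return slow
```

Fed the timestamped lines from the excerpt above, this flags only job 9734103, which started at 09:00:26 and had no follow-up event until the OOPS at 09:05:29.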
Something about the job to bring that branch (which isn't new -- it was last scanned more than a year ago, and was originally older still) up to date causes the scanner to take more than 5 minutes. We may finally be able to track this down.
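Separately from the slow scan itself, the TypeError in the traceback comes from subscripting a response that is None. Only the line `if response['success']:` is taken from the traceback; the helper below is a hypothetical sketch of a defensive check, not the actual code in runner.py, and the idea that None is what arrives after the child process exits is an assumption.

```python
def job_succeeded(response):
    """Return True only if `response` is a dict reporting success.

    Hypothetical helper: a response of None (assumed here to be what
    the runner sees when the child process died before replying) is
    treated as a failure instead of being subscripted.
    """
    if response is None:
        return False
    return bool(response.get("success"))
```

With a check along these lines the job would be recorded as failed rather than raising an unhandled error in the Deferred, although that would not address why the scan takes so long in the first place.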