backend crashed with TypeError in read(): timeout must be a float or None

Bug #827859 reported by dmitrij-96@mail.ru
This bug affects 8 people
Affects: checkbox (Ubuntu)
Status: Fix Released
Importance: High
Assigned to: Daniel Manrique
Milestone:

Bug Description

ubuntu 11.10 during system checking

ProblemType: Crash
DistroRelease: Ubuntu 11.10
Package: checkbox 0.12.4
ProcVersionSignature: Ubuntu 3.0.0-8.11-generic 3.0.1
Uname: Linux 3.0.0-8-generic i686
Architecture: i386
Date: Wed Aug 17 12:12:44 2011
ExecutablePath: /usr/share/checkbox/backend
InstallationMedia: Ubuntu 11.10 "Oneiric Ocelot" - Alpha i386 (20110814)
InterpreterPath: /usr/bin/python2.7
PackageArchitecture: all
ProcCmdline: /usr/bin/python /usr/share/checkbox/backend --path=/usr/share/checkbox/scripts:/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games /tmp/checkboxkjcxXX/input /tmp/checkboxkjcxXX/output
ProcEnviron:
 PATH=(custom, no user)
 LANG=ru_RU.UTF-8
 SHELL=/bin/bash
PythonArgs: ['/usr/share/checkbox/backend', '--path=/usr/share/checkbox/scripts:/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games', '/tmp/checkboxkjcxXX/input', '/tmp/checkboxkjcxXX/output']
SourcePackage: checkbox
Title: backend crashed with TypeError in read(): timeout must be a float or None
UpgradeStatus: Upgraded to oneiric on 2011-08-17 (0 days ago)
UserGroups:

dmitrij-96@mail.ru (dmitrij-96) wrote :
tags: removed: need-duplicate-check
Changed in checkbox (Ubuntu):
status: New → Confirmed
Daniel Manrique (roadmr)
visibility: private → public
Daniel Manrique (roadmr) wrote :

Hi,

Thank you for taking the time to report this bug and helping to make Ubuntu better.

I'm unable to replicate the error you're getting. If possible, could you please provide the sequence of steps you followed to get this error, and also let me know if you're still experiencing it, i.e. is it reproducible every time?

I located the section of code that triggered the failure and I'll be submitting a fix to guard against this problem, but I'd be interested in seeing the real-world steps that led to the failure, as I can't see it happening under normal usage (and was, indeed, unable to recreate this behavior).

Changed in checkbox (Ubuntu):
importance: Undecided → High
Daniel Manrique (roadmr) wrote :

Setting to importance: High as it potentially "Has a severe impact on a small portion of Ubuntu users". Even if I was unable to reproduce it, the backend should not be prone to this kind of failure.

Daniel Manrique (roadmr) wrote :

The problem comes from this line in lib/process.py:

ready = select.select(tocheck, [], [], timeout)

According to the trace, an invalid timeout value was passed, causing select.select to raise an exception. The timeout value itself comes from the message representing a job, and it is not validated by the Job class. Valid values for timeout are non-negative integers or floats (zero included) and "None". Note that "None" and 0 have different meanings to select.select ("None" blocks until a descriptor is ready, 0 polls and returns immediately), so we can't just clobber "None" to zero.
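
The failure can be seen in isolation. This is a minimal sketch, not the code from lib/process.py: the pipe, the poll() helper and the bogus value "abc" are illustrative assumptions, chosen only to show how select.select reacts to valid and invalid timeouts.

```python
import os
import select

# A pipe gives select() a real descriptor to watch; nothing is ever written
# to it, so it is never reported as readable.
read_fd, write_fd = os.pipe()


def poll(timeout):
    """Return the readable descriptors, or the name of the exception raised.

    Hypothetical helper for this illustration only.
    """
    try:
        readable, _, _ = select.select([read_fd], [], [], timeout)
        return readable
    except (TypeError, ValueError) as error:
        return type(error).__name__


zero_result = poll(0)       # 0 means "poll once and return immediately"
float_result = poll(0.01)   # a short but valid timeout
bad_result = poll("abc")    # assumed bogus value: not a number

os.close(read_fd)
os.close(write_fd)
```

Passing None is deliberately left out of the sketch, since with an idle descriptor it would block forever.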

I'll propose an update to the Job class which preserves "None" but tries to convert anything else to a float, cleanly handling the case where a value can't be converted (meaning it was a non-numeric string or something equally bizarre).
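
The kind of guard described above might look roughly like the following. This is a sketch under stated assumptions, not the change that landed in checkbox/job.py: the function name normalize_timeout is hypothetical, and falling back to None (no timeout) for unusable values is an assumed policy, since the comment does not say which fallback the fix uses.

```python
import logging
import math


def normalize_timeout(value):
    """Return a timeout that is safe to hand to select.select().

    None is preserved, because it means "wait indefinitely". Anything else
    is converted to a float. Values that cannot be converted, or that are
    negative or not finite, fall back to None (an assumed policy).
    """
    if value is None:
        return None
    try:
        timeout = float(value)
    except (TypeError, ValueError):
        # Non-numeric strings, lists and similar values end up here.
        logging.warning("Ignoring invalid timeout %r", value)
        return None
    if not math.isfinite(timeout) or timeout < 0:
        # select.select() rejects negative, NaN and infinite timeouts.
        logging.warning("Ignoring out-of-range timeout %r", value)
        return None
    return timeout
```

With this helper, normalize_timeout("2.5") yields 2.5 and normalize_timeout(0) yields 0.0, while normalize_timeout("abc") and normalize_timeout(-1) both yield None. Keeping None and 0 distinct matters because they mean different things to select.select.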

Changed in checkbox (Ubuntu):
status: Confirmed → In Progress
assignee: nobody → Daniel Manrique (roadmr)
Daniel Manrique (roadmr) wrote :

Steps to replicate, thanks to Paul Hoell in bug 833747.

1- Install a language whose translations may contain non-ASCII characters (so far this has been observed in Russian and German):
   sudo apt-get install language-pack-de language-pack-gnome-de
2- Launch checkbox using this installed language:
   LANG=de_AT.UTF-8 LANGUAGE=de_AT:de checkbox-gtk
3- Start a test run with all the tests. It's OK if you skip them; the problem gets triggered either way.

Expected result:
- All tests run correctly and the run finishes.

Actual result:
- As seen in the checkbox log file, any test whose description contains non-ASCII characters generates an exception (excerpt below).
- As soon as the first job specifying "user" gets dispatched to the backend, the backend crashes with the trace seen here. Since it crashed in the middle of executing a job, the frontend doesn't notice this condition and will wait forever. As Paul observed, the window eventually becomes unresponsive.

I won't set the bugs as duplicates as they potentially describe different issues and have to be addressed separately:

1- This bug (backend crashes when timeout has an invalid value).
2- bug 833747 (test execution fails when the description has non-ascii characters).

Daniel Manrique (roadmr)
Changed in checkbox (Ubuntu):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package checkbox - 0.12.7

---------------
checkbox (0.12.7) oneiric; urgency=low

  New upstream release (LP: #850395):

  [Brendan Donegan]
  * Redirecting stderr to pipe to fix the gconf_resource script (LP: #832321)
  * Clear jobs directory when user selects No to recover question (LP: #836623)

  [Daniel Manrique]
  * checkbox/job.py: Guard against bogus timeout values (LP: #827859)
  * More explicit handling of string decoding/encoding, avoids problems with
    non-ascii characters (LP: #833747)
  * Changed architecture from all to any for checkbox base, to build
    architecture-specific binaries (LP: #833696)

  [Jeff Lane]
  * Several corrections necessary due to test name changes or typos found in
    job files

  [Marc Tardif]
  * Connecting hyper text widgets only once (LP: #827904)
  * Detecting MMC readers as OTHER instead of DISK (LP: #822948)
  * Validating the hostname in the SSL certificate (LP: #625076)
  * Validating the submission.xml (LP: #838123)
 -- Daniel Manrique <email address hidden> Fri, 14 Sep 2011 17:15:26 -0400

Changed in checkbox (Ubuntu):
status: Fix Committed → Fix Released