Unity fails to start sometimes in CI resulting in screen unlock failure [what(): bind: Address already in use]

Bug #1285215 reported by Omer Akram
This bug affects 1 person
Affects          Status         Importance   Assigned to       Milestone
Mir              Fix Released   Critical     Alberto Aguirre
mir (Ubuntu)     Fix Released   Medium       Unassigned
unity8 (Ubuntu)  Fix Released   Undecided    Unassigned

Bug Description

At times unity8 fails to start with the unlock_screen script that we run in our CI infrastructure, which results in test failures. When this happens, manually trying to start unity gives:

terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::system::system_error> >'
  what(): bind: Address already in use
Aborted (core dumped)

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: unity8 7.84+14.04.20140221-0ubuntu1
Uname: Linux 3.4.0-4-mako armv7l
ApportVersion: 2.13.2-0ubuntu5
Architecture: armhf
Date: Wed Feb 26 20:25:42 2014
InstallationDate: Installed on 2014-02-26 (0 days ago)
InstallationMedia: Ubuntu Trusty Tahr (development branch) - armhf (20140226.1)
SourcePackage: unity8
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
Omer Akram (om26er) wrote :

This log is printed:

Traceback (most recent call last):
  File "/home/phablet/bin/unlock_screen.py", line 98, in <module>
    unlock_screen()
  File "/home/phablet/bin/unlock_screen.py", line 58, in unlock_screen
    unity8 = introspection.get_proxy_object_for_existing_process(connection_name=conn)
  File "/usr/lib/python2.7/dist-packages/autopilot/introspection/__init__.py", line 173, in get_proxy_object_for_existing_process
    raise ProcessSearchError(message_string)
autopilot.introspection.ProcessSearchError: Search criteria (dbus bus = 'session', connection name = 'com.canonical.Shell.BottomBarVisibilityCommunicator', object path = '/com/canonical/Autopilot/Introspection') returned no results

Revision history for this message
Michał Sawicz (saviq) wrote :

Adding affects: mir to see whether something could happen there (abstract sockets were mentioned once to me).

Otherwise we might just remove the socket in our upstart job... Comments?

Changed in unity8 (Ubuntu):
status: New → Confirmed
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

I think this happens when you have a lingering socket file from the previous server:
    /tmp/mir_socket
or wherever it is.

Mir (server) doesn't contain any smarts to test the validity of existing socket files. I'm assuming the old server is dead. In that case we'd need a way to probe if a socket file is presently opened by any other (server) process. If not then delete it and try again.
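
One common way to do such a probe, sketched here in Python rather than in Mir's C++, is to try connecting to the socket path: if the kernel refuses the connection, no process is listening and the file is a leftover. This is only an illustration of the idea described above, not the fix that landed in Mir; the function name remove_if_stale is hypothetical, and there is an unavoidable race between the probe and the unlink if another server starts in between.

```python
import os
import socket
import stat


def remove_if_stale(path):
    """Remove `path` if it is a Unix socket file with no listener.

    Returns True if a stale socket file was removed, False otherwise.
    Hypothetical helper for illustration; not part of Mir.
    """
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return False  # nothing there, nothing to clean up
    if not stat.S_ISSOCK(mode):
        return False  # not a socket file; leave it alone

    probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        probe.connect(path)
    except ConnectionRefusedError:
        # The file exists but nobody is listening on it: stale.
        os.unlink(path)
        return True
    except OSError:
        # Permission problems and the like: do not guess, keep the file.
        return False
    finally:
        probe.close()
    return False  # connect succeeded, so a live server owns the socket
```

A server could call this on its socket path just before bind(), so that a file left behind by a crashed predecessor no longer triggers "Address already in use".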

summary: Unity fails to start sometimes in CI resulting in screen unlock failure
+ [ what(): bind: Address already in use]
Changed in mir (Ubuntu):
status: New → Triaged
Changed in mir:
status: New → Triaged
summary: Unity fails to start sometimes in CI resulting in screen unlock failure
- [ what(): bind: Address already in use]
+ [what(): bind: Address already in use]
Changed in mir:
importance: Undecided → Medium
Changed in mir (Ubuntu):
importance: Undecided → Medium
Revision history for this message
Michał Sawicz (saviq) wrote :

In that case making the unity8 task invalid.

Changed in unity8 (Ubuntu):
status: Confirmed → Invalid
Revision history for this message
kevin gunn (kgunn72) wrote :

Making this critical, as we just need to clean this one up.

Per mterry's suggestion: run
"fuser $MIR_SOCKET" and, if no one else is using it, delete it.

no matter what we need some mechanism to take care of stale/orphaned sockets
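
The fuser suggestion could be wrapped roughly as follows. This is a sketch under assumptions: it relies on the psmisc fuser exiting with status 0 when at least one process is using the file and non-zero otherwise, and the helper names socket_in_use and delete_if_unused are made up for illustration. The run parameter exists only so the command can be swapped out.

```python
import os
import subprocess


def socket_in_use(path, run=subprocess.run):
    """Return True if `fuser` reports a process using `path`.

    fuser exits 0 when it finds at least one user of the file and
    non-zero otherwise (including on fatal errors).
    """
    result = run(
        ["fuser", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def delete_if_unused(path, run=subprocess.run):
    """Delete `path` when it exists and fuser finds no process using it."""
    if os.path.exists(path) and not socket_in_use(path, run):
        os.unlink(path)
        return True
    return False
```

Note that fuser can only see processes the caller is allowed to inspect, so a job running as a different user than the server might wrongly conclude that the socket is unused.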

Changed in mir:
importance: Medium → Critical
Revision history for this message
kevin gunn (kgunn72) wrote :

from ubuntu-phone mailer
xnox said
"File socket should not be used at all, as they are known to be racy
and can easily end up stale, instead an abstract unix domain socket
should be used throughout.

This is not the first time, where stale, and not properly cleared file
sockets in mir are causing severe breakage on touch.

With an abstract socket, there is no file on a filesystem, and thus
when a process is gone the socket is also gone. You can see examples
of abstract sockets in: upstart, dbus-daemon, and plenty of other
system level daemons on ubuntu."
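
To make the abstract-socket point concrete, here is a minimal Linux-only sketch in Python. An abstract address is a name that starts with a NUL byte; nothing is created on the filesystem, and the name becomes free again as soon as the owning socket is closed or its process exits. The helper bind_abstract and the socket name used below are illustrative assumptions, not anything Mir defines.

```python
import socket


def bind_abstract(name):
    """Bind a listening Unix socket in the Linux abstract namespace.

    The leading NUL byte selects the abstract namespace, so no file
    is created and there is nothing to go stale on disk.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(b"\0" + name.encode())
    sock.listen(1)
    return sock


if __name__ == "__main__":
    first = bind_abstract("example_abstract_socket")
    try:
        bind_abstract("example_abstract_socket")
    except OSError as err:
        # While `first` is open the name is taken, as with file sockets.
        print("second bind failed:", err)
    first.close()
    # Once the owner is gone the name is free again, with no cleanup step.
    bind_abstract("example_abstract_socket").close()
```

The trade-off is that abstract names are not protected by filesystem permissions, so access control has to be handled some other way.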

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

In theory this could only occur if the first server has crashed or lost power (forced off). Was there a crash preceding this we need to fix?

Revision history for this message
kevin gunn (kgunn72) wrote :

Yes, separate from this there is a unity8/qt crash occurring.
However, this is a recurring theme, so we could alleviate some heartburn by providing some mechanism to detect and dispatch stale sockets.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package unity8 - 7.84+14.04.20140228-0ubuntu1

---------------
unity8 (7.84+14.04.20140228-0ubuntu1) trusty; urgency=low

  [ Michał Sawicz ]
  * Fix CardHeader title font weight.
  * Delete stale sockets. (LP: #1285215)

  [ Dmitrijs Ledkovs ]
  * Ship python3 autopilot modules.

  [ Albert Astals ]
  * Cleanup DashContent Remove unused signals and properties

  [ Michał Karnicki ]
  * Take it easy on the logging.
  * Fix CardHeader title font weight.

  [ Nick Dedekind ]
  * Added ability to change indicator profile in shell (env
    UNITY_INDICATOR_PROFILE)

  [ Andrea Cimitan ]
  * Rename PreviewRating to PreviewRatingInput
  * Adds PreviewRatingDisplay

  [ Daniel d'Andrada ]
  * DirectionalDragArea: Reset status if disabled while dragging (LP:
    #1276122)

  [ Dimitri John Ledkov ]
  * Ship python3 autopilot modules.
 -- Ubuntu daily release <email address hidden> Fri, 28 Feb 2014 10:48:06 +0000

Changed in unity8 (Ubuntu):
status: Invalid → Fix Released
Changed in mir:
assignee: nobody → Alberto Aguirre (albaguirre)
milestone: none → 0.1.9
status: Triaged → In Progress
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :

Fix committed into lp:mir/devel at revision None, scheduled for release in mir, milestone Unknown

Changed in mir:
status: In Progress → Fix Committed
Changed in mir:
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mir - 0.1.9+14.10.20140430.1-0ubuntu1

---------------
mir (0.1.9+14.10.20140430.1-0ubuntu1) utopic; urgency=medium

  [ Daniel van Vugt ]
  * New upstream release 0.1.9 (https://launchpad.net/mir/+milestone/0.1.9)
    - mirclient ABI unchanged, still at 7. Clients do not need rebuilding.
    - mirserver ABI bumped to 19. Shells need rebuilding.
    - More libmirserver class changes and reorganization, including;
      . Moving things from shell:: to scene::
      . Rewriting/refactoring surface factories.
    - Added an id() to Renderable.
    - Scene/Renderer interfaces:
      . Scene is no longer responsible for its own iteration (no for_each
        any more). Instead you should iterate over the list returned by
        Scene::generate_renderable_list().
    - Bugs fixed:
      . Stale socket issue. (LP: #1285215)
      . Qt render gets blocked on EGLSwapBuffers. (LP: #1292306)
      . Lock order violated found in helgrind (potential deadlock).
        (LP: #1296544)
      . [regression] SwitchingBundle in framedropping mode can hang.
        (LP: #1306464)
      . [DPMS] Display backlight turns back on almost immediately after
        being turned off. (LP: #1231857)
      . Wrong frame is seen on wake up/resume/unlock. (LP: #1233564)
      . Nested platform is not testable (LP: #1299101)
      . [regression] mir_demo_server_shell crashes on display resume.
        (LP: #1308941)
      . Multi-threaded composition is actually mostly serialized by
        SurfaceStack::guard. (LP: #1234018)
      . Mirscreencast slows down compositing and makes it very jerky.
        (LP: #1280938)
      . Mirscreencast can cause clients to render faster than the screen
        refresh rate. (LP: #1294361)
      . Screen turns on when a new session/surface appears. (LP: #1297876)
      . mir-doc package is >56MB in size, expands to >100MB of files.
        (LP: #1304998)
      . [regression] Clang: 'mir::test::doubles::MockSurface::visible'
        hides overloaded virtual function [-Woverloaded-virtual].
        (LP: #1301135)
      . [regression] GLRenderer* unit tests have recently become noisy.
        (LP: #1308905)
      . FocusController::set_focus_to() no longer seems to raise a session
        to the top. (LP: #1302689)

  [ Ubuntu daily release ]
  * New rebuild forced
 -- Ubuntu daily release <email address hidden> Wed, 30 Apr 2014 13:26:58 +0000

Changed in mir (Ubuntu):
status: Triaged → Fix Released