Qt app aborts if it cannot connect to Mir - QtMir rejecting the connection

Bug #1422797 reported by Ugo Riboni
26
This bug affects 3 people
Affects Status Importance Assigned to Milestone
qtmir (Ubuntu)
Invalid
High
Gerry Boland
ubuntu-app-launch (Ubuntu)
Invalid
Medium
Unassigned

Bug Description

When running several camera-app tests in a sequence, randomly we have the application crash.
Tracing the crash leads to a fatal error in qtubuntu when trying to connect to MIR. The stack trace follows below.

According to elopio this might be because the app has not exited yet from the previous tests when running the next one.
I am attaching the AP log as well.

#0 0xb62298e6 in ?? () from /lib/arm-linux-gnueabihf/libc.so.6
#1 0xb6237e5e in raise () from /lib/arm-linux-gnueabihf/libc.so.6
#2 0xb6238b4e in abort () from /lib/arm-linux-gnueabihf/libc.so.6
#3 0xb6bfe7d6 in qt_message_fatal (context=..., message=...) at global/qlogging.cpp:1340
#4 QMessageLogger::fatal (this=<optimized out>,
    msg=0xb3f7a02c "UbuntuClientIntegration: connection to Mir server failed. Check that a Mir server is\nrunning, and the correct socket is being used and is accessible. The shell may have\nrejected the incoming connectio"...) at global/qlogging.cpp:669
#5 0xb3f67356 in ?? () from /usr/lib/arm-linux-gnueabihf/qt5/plugins/platforms/libqpa-ubuntumirclient.so
#6 0xb3f69bc4 in ?? () from /usr/lib/arm-linux-gnueabihf/qt5/plugins/platforms/libqpa-ubuntumirclient.so
#7 0xb670620a in loadIntegration (argv=0xbeeed2a4, argc=@0xbeeed12c: 1, parameters=..., key=..., loader=<optimized out>) at kernel/qplatformintegrationfactory.cpp:64
#8 QPlatformIntegrationFactory::create (platform=..., paramList=..., argc=@0xbeeed12c: 1, argv=argv@entry=0xbeeed2a4, platformPluginPath=...) at kernel/qplatformintegrationfactory.cpp:81
#9 0xb670dc34 in init_platform (argv=0xbeeed2a4, argc=@0xbeeed12c: 1, platformThemeName=..., platformPluginPath=..., pluginArgument=...) at kernel/qguiapplication.cpp:980
#10 QGuiApplicationPrivate::createPlatformIntegration (this=<optimized out>) at kernel/qguiapplication.cpp:1128
#11 0xb670e396 in QGuiApplicationPrivate::createEventDispatcher (this=<optimized out>) at kernel/qguiapplication.cpp:1143
#12 0xb6d6df20 in QCoreApplication::init (this=this@entry=0xbeeed130) at kernel/qcoreapplication.cpp:734
#13 0xb6d6dfa6 in QCoreApplication::QCoreApplication (this=0xbeeed130, p=...) at kernel/qcoreapplication.cpp:657
#14 0xb670ee06 in QGuiApplication::QGuiApplication (this=0xbeeed130, argc=@0xbeeed12c: 1, argv=0xbeeed2a4, flags=328448) at kernel/qguiapplication.cpp:532
#15 0x00014480 in CameraApplication::CameraApplication(int&, char**) ()
#16 0x00016c82 in main ()

Revision history for this message
Ugo Riboni (uriboni) wrote :
Revision history for this message
Ugo Riboni (uriboni) wrote :

Attaching the click I had the crashes with

Revision history for this message
Ugo Riboni (uriboni) wrote :

This is a log from unity8 showing the same problem: http://pastebin.ubuntu.com/10274655/
Note: it is not from the same test run or same machine as the log attached above, but the symptoms of the crash are the same.

Changed in autopilot:
importance: Undecided → High
Revision history for this message
Christopher Lee (veebers) wrote :

I'm having trouble reproducing this error locally which is making it hard for me to attempt to debug or get a better picture of what's happening.

In the meantime, Ugo, are you able to provide an autopilot log that uses the -v flag to provide a little more insight to what's happening.

I swear that I've seen something like this before (at least the message "UbuntuClientIntegration: connection to Mir server failed.. . .") in the past when I've been manually launching applications from a terminal in the device. But, of course, I don't have a bug filed for this or any logs to relate to it so at this point this is merely a mention of this with no hard fact :-).

A couple of interesting lines in the log provided[1] specifically line: 349[2] which I don't see appear in unity8 log when the tests all pass (and the crash doesn't happen) so I'm tempted to say it appears when it does happen (speculation at this point) so maybe there is something happening there in the interaction between Mir and the Shell.
The other line that was pointed out to me is on line 8561[3] of that log.

Another thought, can you confirm which version of Ubuntu you are running as well as the device type you see it on?

It's not immediately obvious to me what might be happening as autopilot is stopping the application in the prescribed manner (not just killing it) and the fact that it doesn't happen reliably would suggest perhaps some sort of race condition.

I'll continue looking into this, but for the moment I don't have any further suggestions as to what might be happening.

[1] http://pastebin.ubuntu.com/10274655/
[2] "Couldn't find a .desktop file for "com.ubuntu.camera_camera" . Skipping..."
[3] "ApplicationManager REJECTED connection from app with pid 16790 as no desktop_file_hint specified"

Changed in autopilot:
status: New → Incomplete
Bill Filler (bfiller)
Changed in unity-mir (Ubuntu):
importance: Undecided → High
kevin gunn (kgunn72)
no longer affects: unity-mir (Ubuntu)
kevin gunn (kgunn72)
no longer affects: mir
summary: - MIR refuses the app to connect
+ Mir refuses the app to connect
Revision history for this message
Christopher Lee (veebers) wrote :

A quick note as per comment on IRC:

<bfiller> veebers: I believe what is happening is that a test is causing camera-app to crash, and while it is crashing (and apport running) all subsequent tests that are run during that time frame fail
...
<bfiller> veebers: yes, and figuring out why the camera-app is crashing in the first place is something my team needs to look into
<bfiller> veebers: one option would be to have autopilot abort running the rest of the tests when it detects a crash
<bfiller> as currently it's quite hard to tell which test actually crashed
<bfiller> (and the tests run after fail for the reasons mentioned above)
<veebers> bfiller: right makes sense. Autopilot run currently has the --failfast option which might be of help, (-ff, --failfast Stop the test run on the first error or failure.)

Revision history for this message
I Ahmad (iahmad) wrote :
Revision history for this message
I Ahmad (iahmad) wrote :
Revision history for this message
Christopher Lee (veebers) wrote :
Download full text (4.4 KiB)

Late in the day I was able to check out some logs, here are my (raw) thoughts and observations regarding what happened in case they might be of use.

Logs found here: http://rtm-dashboard.ci.ubuntu.com/smokeng/utopic/touch_stable/krillin/243:20150224:20150216-fe747ac/349/camera_app/

[camera_app.tests.test_focus.TestFocus.test_focus_invalid]
Passes, but there is the message in the log:
"""
06:43:26.599 ERROR _launcher:206 - Timed out waiting for Application with app_id 'com.ubuntu.camera_camera_3.0.0.516' to stop.
"""

[camera_app.tests.test_flash.TestCameraFlash.test_flash_hdr_mutually_exclusive]
Happens shortly afterward with the error message:

"""
Traceback (most recent call last):
  File "/home/phablet/autopilot/camera_app/tests/test_flash.py", line 24, in setUp
    super(TestCameraFlash, self).setUp()
  File "/home/phablet/autopilot/camera_app/tests/__init__.py", line 43, in setUp
    self.launch_click_installed()
  File "/home/phablet/autopilot/camera_app/tests/__init__.py", line 73, in launch_click_installed
    emulator_base=CameraCustomProxyObjectBase)
  File "/usr/lib/python3/dist-packages/autopilot/testcase.py", line 323, in launch_click_package
    return launcher.launch(package_id, app_name, app_uris)
  File "/usr/lib/python3/dist-packages/autopilot/application/_launcher.py", line 283, in launch
    return super().launch(app_id, app_uris)
  File "/usr/lib/python3/dist-packages/autopilot/application/_launcher.py", line 140, in launch
    state.get('message', '')
  File "/usr/lib/python3/dist-packages/autopilot/application/_launcher.py", line 233, in _check_status_error
    raise RuntimeError(': '.join(message_parts))
RuntimeError: Timed out while waiting for application to launch
"""

Due to there being a screenshot with the camera running either the app is still running from previous test or has been started again.

[camera_app.tests.test_focus.TestFocus.test_focus_valid_and_disappear]
Is the next test to run, it too shows a screenshot that looks the same as the previous test so it's either the still running up or it's been launched again.

It also fails with the message:
"""
Traceback (most recent call last):
  File "/home/phablet/autopilot/camera_app/tests/test_focus.py", line 24, in setUp
    super(TestFocus, self).setUp()
  File "/home/phablet/autopilot/camera_app/tests/__init__.py", line 43, in setUp
    self.launch_click_installed()
  File "/home/phablet/autopilot/camera_app/tests/__init__.py", line 73, in launch_click_installed
    emulator_base=CameraCustomProxyObjectBase)
  File "/usr/lib/python3/dist-packages/autopilot/testcase.py", line 323, in launch_click_package
    return launcher.launch(package_id, app_name, app_uris)
  File "/usr/lib/python3/dist-packages/autopilot/application/_launcher.py", line 283, in launch
    return super().launch(app_id, app_uris)
  File "/usr/lib/python3/dist-packages/autopilot/application/_launcher.py", line 140, in launch
    state.get('message', '')
  File "/usr/lib/python3/dist-packages/autopilot/application/_launcher.py", line 233, in _check_status_error
    raise RuntimeError(': '.join(message_parts))
RuntimeError: Timed out while waiting for application to launch
"""

[camera_app....

Read more...

Revision history for this message
Christopher Lee (veebers) wrote :

Attaching the 2 logs mentioned in my previous comment

Revision history for this message
Christopher Lee (veebers) wrote :
kevin gunn (kgunn72)
Changed in qtmir (Ubuntu):
status: New → Invalid
Revision history for this message
kevin gunn (kgunn72) wrote : Re: Qtmir aborts if it cannot connect to Mir

renamed based on discussion with dobey
can we make Qtmir log a message with qCritical() and exit with QApplication.exit()

summary: - Mir refuses the app to connect
+ Qtmir aborts if it cannot connect to Mir
Changed in qtmir (Ubuntu):
status: Invalid → New
importance: Undecided → High
Revision history for this message
kevin gunn (kgunn72) wrote :

like dobey said, if you just run qmlscene you can see the result

phablet@ubuntu-phablet:~$ qmlscene
[1424984233.100272] Loader: Loading modules from: /usr/lib/arm-linux-gnueabihf/mir/client-platform/
[1424984233.101127] Loader: Loading module: /usr/lib/arm-linux-gnueabihf/mir/client-platform/mesa.so
[1424984233.102225] Loader: Loading module: /usr/lib/arm-linux-gnueabihf/mir/client-platform/android.so
[1424984233.103658] Loader: Loading module: /usr/lib/arm-linux-gnueabihf/mir/client-platform/dummy.so
UbuntuClientIntegration: connection to Mir server failed. Check that a Mir server is
running, and the correct socket is being used and is accessible. The shell may have
rejected the incoming connection, so check its log file
Aborted (core dumped)

@gerry, is there any reason we shouldn't/couldn't change ?

no longer affects: autopilot
Revision history for this message
Gerry Boland (gerboland) wrote :

QtMir has strict policy that every app that tries to connect to Mir must:

1. either be launched via upstart-app-launch [unsupported hack: launched with --desktop_file_hint=/path/to/desktopfile.desktop]. Upstart notifies QtMir that a process with a particular appId is about to start. QtMir then searches for a desktop file for that appId, which it associates with the process when it connects to Mir.

2. be only instance of a process with that appId running

The first point is there to ensure that every process which draws a UI must have metadata associated with it, metadata like name, icon, etc, which the shell requires to have app appear in launcher correctly. Alternative is the current situation, where we have BAMF to heuristically match processes to desktop files - which is undesirable.

The second point can be relaxed for desktop apps, but will probably be maintained for phablet.

I'll try to determine if an app crashing correctly notifies QtMir, so that if a new instance of it appears, QtMir will accept it. That's a guess at the problem however, I'd need to reproduce.

summary: - Qtmir aborts if it cannot connect to Mir
+ Qt app aborts if it cannot connect to Mir - QtMir rejecting the
+ connection
Changed in qtmir:
importance: Undecided → High
assignee: nobody → Gerry Boland (gerboland)
Revision history for this message
Gerry Boland (gerboland) wrote :
Download full text (3.3 KiB)

Am update with my findings so far:

I can occasionally reproduce QtMir rejecting the camera app connection.
It does this as it is not expecting the camera process to connect to it/mir. Why not? Upstart/Ubuntua-app-launch should have notified it that camera app is starting. But it appears some notifications are getting confused. Before the rejection, QtMir prints:

qtmir.applications: ApplicationManager::onProcessStarting - appId= "com.ubuntu.camera_camera"
qtmir.applications: ApplicationManager::onProcessStarting application already found with appId "com.ubuntu.camera_camera"
qtmir.applications: ApplicationManager::onProcessStarting - appId= "com.ubuntu.camera_camera"
qtmir.applications: ApplicationManager::onProcessStarting application already found with appId "com.ubuntu.camera_camera"
qtmir.applications: ApplicationManager::onResumeRequested - appId= "com.ubuntu.camera_camera"
qtmir.applications: ApplicationManager::onProcessStarting - appId= "com.ubuntu.camera_camera"
qtmir.applications: ApplicationManager::onProcessStarting application already found with appId "com.ubuntu.camera_camera"
qtmir.applications: ApplicationManager::onResumeRequested - appId= "com.ubuntu.camera_camera"

The onProcessStarting print is due to a message "starting" from Ubuntu-app-launch. UAL appears to send 3 starting messages! - but in fact the process has stopped (well disconnected from Mir), due to the next lines of output:

qtmir.mir: SessionListener::stopping - this= SessionListener(0xb193b7ac) session= 0x1a419dc
qtmir.applications: ApplicationManager::onSessionStopping - sessionName= QtUbuntu
qtmir.applications: ApplicationManager::remove - appId= "com.ubuntu.camera_camera"

Watching dbus traffic from Upstart, I noticed this error coincides with these events:

signal sender=:1.0 -> dest=(null destination) serial=1646 path=/com/ubuntu/Upstart; interface=com.ubuntu.Upstart0_6; member=EventEmitted
   string "starting"
   array [
      string "JOB=application-click"
      string "INSTANCE=com.ubuntu.camera_camera_3.0.0.516"
      string "UBUNTU_APP_LAUNCH_ARCH=arm-linux-gnueabihf"
   ]

signal sender=:1.0 -> dest=(null destination) serial=1655 path=/com/ubuntu/Upstart; interface=com.ubuntu.Upstart0_6; member=EventEmitted
   string "started"
   array [
      string "JOB=application-click"
      string "INSTANCE=com.ubuntu.camera_camera_3.0.0.516"
      string "UBUNTU_APP_LAUNCH_ARCH=arm-linux-gnueabihf"
   ]
signal sender=:1.0 -> dest=(null destination) serial=1672 path=/com/ubuntu/Upstart; interface=com.ubuntu.Upstart0_6; member=EventEmitted
   string "stopping"
   array [
      string "JOB=application-click"
      string "INSTANCE=com.ubuntu.camera_camera_3.0.0.516"
      string "RESULT=ok"
      string "UBUNTU_APP_LAUNCH_ARCH=arm-linux-gnueabihf"
   ]

**No Stopped!!!**

signal sender=:1.0 -> dest=(null destination) serial=1708 path=/com/ubuntu/Upstart; interface=com.ubuntu.Upstart0_6; member=EventEmitted
   string "starting"
   array [
      string "JOB=application-click"
      string "INSTANCE=com.ubuntu.camera_camera_3.0.0.516"
      string "UBUNTU_APP_LAUNCH_ARCH=arm-linux-gnueabihf"
   ]

Normally these are the sequence of events: starting, started, s...

Read more...

Revision history for this message
Gerry Boland (gerboland) wrote :

Note I generally can reproduce this with a newly flashed device and the first run of the camera-app test suite. Further runs tend to not exhibit the bug.

Ted Gould (ted)
Changed in ubuntu-app-launch (Ubuntu):
status: New → Confirmed
importance: Undecided → Medium
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in qtmir (Ubuntu):
status: New → Confirmed
Revision history for this message
Gerry Boland (gerboland) wrote :

Almost a year later, I've not heard of this issue cropping up recently. I'm marking as invalid, but if anyone discovers this again, please re-open and I'll look again.

Changed in qtmir:
status: New → Invalid
Changed in qtmir (Ubuntu):
status: Confirmed → Invalid
Changed in ubuntu-app-launch (Ubuntu):
status: Confirmed → Invalid
Michał Sawicz (saviq)
Changed in qtmir (Ubuntu):
assignee: nobody → Gerry Boland (gerboland)
no longer affects: qtmir
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.