scoperegistry uses 45% CPU after applying updates from the store, draining battery quickly

Bug #1470750 reported by Jean-Baptiste Lallement
56
This bug affects 10 people
Affects Status Importance Assigned to Milestone
Canonical System Image
Fix Released
Critical
Alejandro J. Cura
unity-scopes-api (Ubuntu)
Fix Released
Critical
Paweł Stołowski

Bug Description

current build number: 44
device name: arale
channel: ubuntu-touch/rc-proposed/meizu.en
last update: 2015-07-02 07:10:44
version version: 44
version ubuntu: 20150702
version device: 20150608-6e66f3c
version custom: 20150602-731-5-32

Suddenly this morning, the battery started discharging very quickly, 'top' revealed that scoperegistry was using around 45% CPU steadily.

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
 2207 phablet 20 0 377996 9576 6200 S 46,0 0,5 67:34.54 scoperegistry

Around the same time that the battery started dropping, I applied all the updates from the store, in the list of apps to update were:
- Terminal
- BBC Sport
- BBC
- Euronews
- Engadget
- Cnet

I pressed 'Update all' and waited until it's done then closed system-settings.

There is no crash file corresponding to this event.

I reproduced it on ubuntu-touch/rc-proposed/bq-aquaris.en/krillin #55

Related branches

Revision history for this message
Jean-Baptiste Lallement (jibel) wrote :
description: updated
description: updated
summary: - arale - scoperegistry uses 50% CPU, draining battery quickly
+ scoperegistry uses 50% CPU after applying updates from the store,
+ draining battery quickly
description: updated
summary: - scoperegistry uses 50% CPU after applying updates from the store,
+ scoperegistry uses 45% CPU after applying updates from the store,
draining battery quickly
description: updated
Revision history for this message
Pat McGowan (pat-mcgowan) wrote :

also reported by bq user

Changed in canonical-devices-system-image:
assignee: nobody → Alejandro J. Cura (alecu)
importance: Undecided → High
status: New → Confirmed
Changed in unity-scopes-api (Ubuntu):
importance: Undecided → Critical
Revision history for this message
Michi Henning (michihenning) wrote :

A good first step would be to attach with gdb to the registry and get a stack for all threads. We need to figure out roughly where in the code something is spinning.

Revision history for this message
Paweł Stołowski (stolowski) wrote :

Also, do you still have any registry log file from the time it happened? Relevant files are here: /home/phablet/.local/share/unity-scopes/unconfined/Registry/logs

Revision history for this message
Paweł Stołowski (stolowski) wrote :

Ok, I was able to reproduce by applying all available updates on top of image 55.

Never mind the question about logs, nothing unusual there. Some of the updates are for scopes and the logs show that they are uninstalled and the installed again, e.g.

[2015-07-02 14:54:53.960533] INFO: Registry: ScopesWatcher: scope: "com.canonical.scopes.bbc_bbc" uninstalled from: "/home/phablet/.local/share/unity-scopes//com.canonical.scopes.bbc_bbc_1.8.3"
[2015-07-02 14:54:53.982678] INFO: Registry: ScopesWatcher: scope: "com.canonical.scopes.bbc_bbc" installed to: "/home/phablet/.local/share/unity-scopes//com.canonical.scopes.bbc_bbc_1.8.4"

I collected backtrace of all threads, unfortunately the overlay ppa doesn't offer packages with debug symbols... I'll try to rebuild a version with symbols on my phone.

After reproducing it, scope-registry keeps consuming around 50% of the cpu but apart from that it seems to be working correctly (scopes are started on demand and they work).

Revision history for this message
Paweł Stołowski (stolowski) wrote :

It turned out the debug package for libunity-scopes3 is available here: http://ppa.launchpad.net/ci-train-ppa-service/stable-phone-overlay/ubuntu/pool/main/u/unity-scopes-api/

Attaching backtrace for all threads again, with more symbols.

Revision history for this message
Paweł Stołowski (stolowski) wrote :

When using strace against scoperegistry in this state, the output gets *flooded* with the following until I interrupt strace:

[pid 10491] futex(0x443a14, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 10491] futex(0x443a3c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 244392527, {633437444, 854775807}, ffffffff) = -1 ETIMEDOUT (Connection timed out)

Now, googling for it reveals several problems like this (for various programs) caused by the leap second (it so happens that leap seconds were added on Jun 30/Jul 1 this year). Could this be the issue?

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in unity-scopes-api (Ubuntu):
status: New → Confirmed
Revision history for this message
Michi Henning (michihenning) wrote :

This is almost certainly caused by zmq. If the scope disappears unexpectedly, and there is a message from the registry to the scope pending, zmq by default tries to re-connect to the peer once every millisecond. If you look at the registry, you won't see anything unusual because all its threads will be exactly where you'd expect them to be. The re-connect spinning happens inside one of zmq's threads.

I've come across this once before. I'll try and dredge up the details. Basically, what I did was to add a reaper mechanism to the outgoing connection pool that trashes the socket if the request fails.

I'm wondering whether, possibly, this is happening on a oneway request from the registry to the scope?

There is a way to set a zmq socket option that adjusts the retry interval to something less aggressive. But setting that wouldn't fix the problem; instead, the registry would still be trying to re-connect indefinitely, just less often. zmq does not allow the number of retries to be restricted to some limit. As far as I know, the only way to stop the problem is to trash the offending socket.

It would be good to know what invocation is in flight when we enter that state.

Revision history for this message
Michi Henning (michihenning) wrote :

If the leap second is indeed responsible, the problem should go away after a reboot.

Revision history for this message
Michi Henning (michihenning) wrote :

The fix for the re-connection problem we had previously was applied at revision 504.

Revision history for this message
Paweł Stołowski (stolowski) wrote :

The problem can be reproduced by just installing & uninstalling scopes on the phone with the install_loop script from the following bug: https://bugs.launchpad.net/ubuntu/+source/unity-scopes-api/+bug/1450493/+attachment/4419415/+files/install_loop.zip

Revision history for this message
Marcus Tomlinson (marcustomlinson) wrote :

The CPU spike is triggered upon uninstalling a scope. A single install / uninstall cycle reproduces this issue.

Revision history for this message
Marcus Tomlinson (marcustomlinson) wrote :

I was able to recreate this on OTA-4 on the Nexus.

phablet@ubuntu-phablet:~$ system-image-cli -i
current build number: 20
device name: mako
channel: ubuntu-touch/stable/ubuntu
last update: 2015-07-03 14:43:04
version version: 20
version ubuntu: 20150611.3
version device: 20150210
version custom: 20150617

Revision history for this message
Alfonso Sanchez-Beato (alfonsosanchezbeato) wrote :

I have seen this happening in OTA-4 krillin.

Revision history for this message
Marcus Tomlinson (marcustomlinson) wrote :

Ok, we've narrowed this issue down to a single change we made to delay notifying the dash when a scope is removed. Makes sense that it was triggered upon uninstalling a scope. A fix is in the pipeline.

Changed in unity-scopes-api (Ubuntu):
assignee: nobody → Pawel Stolowski (stolowski)
Changed in unity-scopes-api (Ubuntu):
status: Confirmed → In Progress
Changed in canonical-devices-system-image:
status: Confirmed → In Progress
milestone: none → ww28-2015
Changed in canonical-devices-system-image:
importance: High → Critical
Changed in canonical-devices-system-image:
status: In Progress → Fix Released
Changed in canonical-devices-system-image:
status: Fix Released → In Progress
Changed in canonical-devices-system-image:
status: In Progress → Fix Released
Changed in unity-scopes-api (Ubuntu):
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.