unity8-dash has a thread that is polling rather rapidly on epoll wait

Bug #1364464 reported by Colin Ian King on 2014-09-02
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
unity-scopes-api (Ubuntu)
Critical
Michi Henning
Utopic
Undecided
Unassigned
unity-scopes-shell (Ubuntu)
Undecided
Unassigned
unity8 (Ubuntu)
Undecided
Unassigned

Bug Description

one of the threads to unity8-dash is creating quite a few wakeups per second:

# eventstat 60 1

 Event/s PID Task Init Function Callback
   13.02 2812 scopes_ng::Scop hrtimer_start_range_ns hrtimer_wakeup

process 2812 is a thread of unity8-dash

attaching strace to the thread we see:

root@ubuntu-phablet:/proc# strace -p 2812
Process 2812 attached
clock_gettime(CLOCK_MONOTONIC, {925, 47970238}) = 0
epoll_wait(57, {}, 256, 115) = 0
clock_gettime(CLOCK_MONOTONIC, {925, 165670469}) = 0
socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC, 0) = 50
fcntl64(50, F_GETFL) = 0x2 (flags O_RDWR)
fcntl64(50, F_SETFL, O_RDWR|O_NONBLOCK) = 0
connect(50, {sa_family=AF_LOCAL, sun_path="/run/user/32011/zmq/priv/clickscope"}, 110) = -1 ENOENT (No such file or directory)
close(50) = 0
clock_gettime(CLOCK_MONOTONIC, {925, 174626930}) = 0
epoll_wait(57, {}, 256, 39) = 0
clock_gettime(CLOCK_MONOTONIC, {925, 215561930}) = 0
socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC, 0) = 50
fcntl64(50, F_GETFL) = 0x2 (flags O_RDWR)
fcntl64(50, F_SETFL, O_RDWR|O_NONBLOCK) = 0
connect(50, {sa_family=AF_LOCAL, sun_path="/run/user/32011/zmq/priv/clickscope"}, 110) = -1 ENOENT (No such file or directory)
close(50) = 0
clock_gettime(CLOCK_MONOTONIC, {925, 222932007}) = 0
epoll_wait(57, {}, 256, 65) = 0
clock_gettime(CLOCK_MONOTONIC, {925, 290266777}) = 0
socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC, 0) = 50
fcntl64(50, F_GETFL) = 0x2 (flags O_RDWR)
fcntl64(50, F_SETFL, O_RDWR|O_NONBLOCK) = 0

..ad infinitum...

the epoll_wait() is sleeping a very short while before a connect to a clickscope named socket path is attempted over and over again. Is this rapid polling really necessary? It's contributing to 0.50%-1.00% of the CPU load of the phone.

Related branches

Albert Astals Cid (aacid) wrote :

scopes_ng:: is most probably unity-scopes-shell

Michał Sawicz (saviq) wrote :

Yeah, this is between -api (most probably there - zmq) and -shell.

Changed in unity8 (Ubuntu):
status: New → Incomplete
Colin Ian King (colin-king) wrote :

In 60 seconds, one can observe:

% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
 44.51 2.250000 2139 1052 poll
 28.03 1.416574 366 3871 epoll_wait
 24.33 1.230000 10250 120 rt_sigtimedwait
  2.77 0.140000 70000 2 restart_syscall
  0.26 0.013026 2 6306 fcntl64
  0.04 0.002036 1 3153 3153 connect
  0.01 0.000655 0 3213 close
  0.01 0.000607 0 3213 socket
  0.01 0.000529 0 7134 clock_gettime
  0.01 0.000520 1 1026 write
  0.01 0.000352 0 1038 16 read
  0.00 0.000246 0 985 recvfrom
  0.00 0.000098 1 120 madvise
  0.00 0.000000 0 54 ioctl
  0.00 0.000000 0 6 gettimeofday
  0.00 0.000000 0 120 clone
  0.00 0.000000 0 25 mprotect
  0.00 0.000000 0 1 writev
  0.00 0.000000 0 1 sched_yield
  0.00 0.000000 0 120 rt_sigprocmask
  0.00 0.000000 0 15 5 futex
  0.00 0.000000 0 6 bind
  0.00 0.000000 0 6 getsockname
  0.00 0.000000 0 12 sendto
  0.00 0.000000 0 77 13 recvmsg
  0.00 0.000000 0 120 set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00 5.054643 31796 3187 total

The 3153 failed connects are to named sockets that don't exist. That's a rate of 52 PER SECOND. This is rather ridiculous, we are meant to be trying to make a low power phone last and not drain the battery on this kind of rapid polling.

Changed in unity-scopes-api (Ubuntu):
assignee: nobody → Michi Henning (michihenning)
Michi Henning (michihenning) wrote :

This is caused by Zmq retrying behind the scenes. There are ways to mitigate this, but I'd like to learn a bit more first, if possible. Can you tell me how I can reproduce this and what you did to the phone before you noticed the problem? Install or uninstall scopes or similar?

Colin Ian King (colin-king) wrote :

Test scenario:

Clean install with developer mode so I can access the device via adb shell. Device is on the lock screen, sitting idle, so essentially is should be doing not a lot.

I ran eventstat to see which processes are generating the most wakeups:

root@ubuntu-phablet:~# eventstat 60 1
 Event/s PID Task Init Function Callback
   87.00 0 [swapper/0] hrtimer_start_range_ns tick_sched_timer
   39.05 3759 scopes_ng::Scop hrtimer_start_range_ns hrtimer_wakeup
   16.68 5 [kworker/u:0] hwmsen_work_func hwmsen_poll
   10.00 3951 ubuntu-location hrtimer_start_range_ns hrtimer_wakeup

Then I strace'd PID 3759,
strace -p 3759 -e connect

Changed in unity-scopes-shell (Ubuntu):
status: New → Invalid
Changed in unity-scopes-api (Ubuntu):
status: New → In Progress
Changed in unity8 (Ubuntu):
status: Incomplete → Invalid
Michi Henning (michihenning) wrote :

Thanks for the info! The linked branch mitigates this be eliminating reconnects for twoway invocations completely, and limiting reconnection attempts for oneway invocations to at most one per second. This is an interim fix, and I'm working on a solution that will eliminate the reconnection attempts completely once they have failed a number of times, but that's a non-trivial change. For now, this will certainly help to reduce battery drain by a lot.

Colin Ian King (colin-king) wrote :

Great! Thanks!

Michi Henning (michihenning) wrote :

Thanks for the heads-up! This would have gone unnoticed for a long time without your diligence!

Changed in unity-scopes-api (Ubuntu):
status: In Progress → Fix Committed
Thomas Strehl (strehl-t) on 2014-10-13
Changed in unity-scopes-api (Ubuntu):
importance: Undecided → Critical
tags: added: rtm14
Changed in unity-scopes-api (Ubuntu):
status: Fix Committed → Fix Released
Changed in unity-scopes-api (Ubuntu Utopic):
status: New → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers