snap install seems to be polling at 3000 nanoseconds in a pselect call

Bug #1792959 reported by Colin Ian King on 2018-09-17
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
snapd
High
Unassigned

Bug Description

I observed snap install was eating a lot of CPU across threads, attaching strace -f in a snap install I observed *lots* of polling pselect calls:

[pid 27055] pselect6(0, NULL, NULL, NULL, {tv_sec=0, tv_nsec=3000}, NULL <unfinished ...>
[pid 27055] <... pselect6 resumed> ) = 0 (Timeout)
[pid 27055] pselect6(0, NULL, NULL, NULL, {tv_sec=0, tv_nsec=3000}, NULL <unfinished ...>
[pid 27055] <... pselect6 resumed> ) = 0 (Timeout)
[pid 27049] pselect6(0, NULL, NULL, NULL, {tv_sec=0, tv_nsec=3000}, NULL <unfinished ...>
[pid 27049] <... pselect6 resumed> ) = 0 (Timeout)
[pid 27049] pselect6(0, NULL, NULL, NULL, {tv_sec=0, tv_nsec=3000}, NULL <unfinished ...>
[pid 27049] <... pselect6 resumed> ) = 0 (Timeout)

Honestly, not much can happen in 3000 nanoseconds, light travels less that 900 meters in the kind of duration. Polling on pselect is generating a lot of work for the scheduler. Can this pselect timeout be increased to something sensible, like 10th of a second so we don't eat up so much CPU. Is this some kind of broken spin lock?

Michael Vogt (mvo) on 2018-11-15
Changed in snapd:
importance: Undecided → High
Michael Vogt (mvo) wrote :

Thanks for your bugreport.

The 300 nanoseconds apparently comes from a usleep(3) in GO runtime (runtime/proc.go:runqgrab to be precise). I have not looked into the details why this is happening yet, the internet (https://stackoverflow.com/questions/35155119/how-to-optimize-golang-program-that-spends-most-time-in-runtime-osyield-and-runt) indicates that we might trigger too much GC for some reason. But it might also be a red-herring. The next step here is probably to run pprof to see what triggers the usleep(3).

Changed in snapd:
status: New → Triaged
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers