Comment 12 for bug 1800101

Ansis Atteka (ansisatteka) wrote :

I am also seeing high CPU utilization by the gnome-shell process, especially when using the Chrome web browser and when moving or resizing windows. Chrome's CPU usage goes up too. This happens consistently on 18.04, and the system becomes barely responsive to user input.

Based on strace output of the gnome-shell process, two behaviors caught my attention:

1. gnome-shell calls clock_gettime() very frequently (4375 calls in the sample below), and futex() accounts for most of the system call time (69%).
2. Most of gnome-shell's recvmsg() calls return an error (346 of 390).

sudo strace -c -f -p `pidof gnome-shell`
strace: Process 2028 attached with 18 threads
^Cstrace: Process 2028 detached
strace: Process 2141 detached
...
strace: Process 2566 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 69.38    0.047219        1312        36         3 futex
 18.60    0.012661           3      4375           clock_gettime
  5.18    0.003527           9       390       346 recvmsg
  2.05    0.001398          10       137           poll
  1.86    0.001263           4       315           getpid
  1.25    0.000851          12        69           writev
  0.84    0.000570           9        62           write
  0.52    0.000357           2       156           sched_yield
  0.23    0.000156           4        37           read
  0.06    0.000038          38         1           restart_syscall
  0.03    0.000019           6         3           nanosleep
------ ----------- ----------- --------- --------- ----------------
100.00    0.068059                  5581       349 total
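
To put the summary in perspective, here is a small Python sketch that computes the error rate per system call and the number of recvmsg() calls per poll() call. The numbers are copied from the table above; the helper name error_rate is only for illustration.

```python
# (syscall, calls, errors) copied from the strace -c summary above.
rows = [
    ("futex", 36, 3),
    ("clock_gettime", 4375, 0),
    ("recvmsg", 390, 346),
    ("poll", 137, 0),
]


def error_rate(calls, errors):
    """Return the fraction of calls that ended with an error."""
    return errors / calls if calls else 0.0


for name, calls, errors in rows:
    print(f"{name:15s} calls={calls:5d} errors={errors:4d} "
          f"rate={error_rate(calls, errors):.1%}")

# How many recvmsg() calls were made per poll() call in this sample.
calls_by_name = {name: calls for name, calls, _ in rows}
recvmsg_per_poll = calls_by_name["recvmsg"] / calls_by_name["poll"]
print(f"recvmsg per poll: {recvmsg_per_poll:.2f}")
```

In this sample almost nine out of ten recvmsg() calls fail, and there are nearly three recvmsg() calls for every poll().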

The high CPU usage seems to originate in NVIDIA's shared library (a binary distributed in the libnvidia-gl-390:amd64 Debian package), which the gnome-shell process loads.

# sudo strace -k -o /tmp/aaa -f -p `pidof gnome-shell`

The following stack traces show that these clock_gettime() calls originate in the NVIDIA library:

2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=821854879}) = 0
 > /lib/x86_64-linux-gnu/libc-2.27.so(syscall+0x19) [0x11b839]
 > /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.390.77(vk_icdGetInstanceProcAddr+0x5865) [0xab445]
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=824319178}) = 0
 > /lib/x86_64-linux-gnu/libc-2.27.so(syscall+0x19) [0x11b839]
 > /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.390.77(vk_icdGetInstanceProcAddr+0x5865) [0xab445]
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=826767633}) = 0
 > /lib/x86_64-linux-gnu/libc-2.27.so(syscall+0x19) [0x11b839]
 > /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.390.77(vk_icdGetInstanceProcAddr+0x5865) [0xab445]
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=829303953}) = 0
 > /lib/x86_64-linux-gnu/libc-2.27.so(syscall+0x19) [0x11b839]
 > /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.390.77(vk_icdGetInstanceProcAddr+0x5865) [0xab445]
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=831753193}) = 0
 > /lib/x86_64-linux-gnu/libc-2.27.so(syscall+0x19) [0x11b839]
 > /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.390.77(vk_icdGetInstanceProcAddr+0x5865) [0xab445]
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=834216603}) = 0
 > /lib/x86_64-linux-gnu/libc-2.27.so(syscall+0x19) [0x11b839]
 > /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.390.77(vk_icdGetInstanceProcAddr+0x5865) [0xab445]
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=836654560}) = 0
 > /lib/x86_64-linux-gnu/libc-2.27.so(syscall+0x19) [0x11b839]
 > /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.390.77(vk_icdGetInstanceProcAddr+0x5865) [0xab445]
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=839135214}) = 0
 > /lib/x86_64-linux-gnu/libc-2.27.so(syscall+0x19) [0x11b839]
 > /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.390.77(vk_icdGetInstanceProcAddr+0x5865) [0xab445]
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=841593660}) = 0
 > /lib/x86_64-linux-gnu/libc-2.27.so(syscall+0x19) [0x11b839]
 > /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.390.77(vk_icdGetInstanceProcAddr+0x5865) [0xab445]
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=844017483}) = 0
 > /lib/x86_64-linux-gnu/libc-2.27.so(syscall+0x19) [0x11b839]
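
The timestamps in the trace above also show how often the call is made. The following Python sketch extracts the CLOCK_MONOTONIC values and computes the gaps between consecutive calls. The function names are hypothetical, and the figures describe only this short sample, taken while strace itself was slowing the process down.

```python
import re

# The ten clock_gettime() lines from the trace above (stack frames omitted).
TRACE = """\
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=821854879}) = 0
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=824319178}) = 0
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=826767633}) = 0
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=829303953}) = 0
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=831753193}) = 0
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=834216603}) = 0
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=836654560}) = 0
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=839135214}) = 0
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=841593660}) = 0
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=844017483}) = 0
"""

PATTERN = re.compile(r"tv_sec=(\d+), tv_nsec=(\d+)")


def timestamps_ns(trace):
    """Return every timestamp in the trace as integer nanoseconds."""
    return [int(sec) * 1_000_000_000 + int(nsec)
            for sec, nsec in PATTERN.findall(trace)]


def intervals_ms(trace):
    """Return the gaps between consecutive timestamps in milliseconds."""
    ts = timestamps_ns(trace)
    return [(later - earlier) / 1e6 for earlier, later in zip(ts, ts[1:])]


gaps = intervals_ms(TRACE)
mean_gap = sum(gaps) / len(gaps)
print(f"min {min(gaps):.2f} ms, max {max(gaps):.2f} ms, mean {mean_gap:.2f} ms")
print(f"about {1000 / mean_gap:.0f} calls per second")
```

The calls are spaced roughly 2.5 ms apart, which works out to about 400 calls per second from this one thread.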

As for recvmsg(): while it is not necessarily a bug, it is questionable that user-space code, having just called recvmsg() on a non-blocking Unix domain socket and received EAGAIN, calls recvmsg() again without first going through poll(). Here is the snippet that I see repeatedly in the strace output:

2028 [00007f7e93d77567] recvmsg(5<UNIX:[36182->37095]>, {msg_namelen=0}, 0) = -1 EAGAIN (Resource temporarily unavailable)
 > /lib/x86_64-linux-gnu/libpthread-2.27.so(recvmsg+0x47) [0x12567]
 > /usr/lib/x86_64-linux-gnu/libxcb.so.1.1.0(xcb_wait_for_special_event+0x4a8) [0xd888]
 > /usr/lib/x86_64-linux-gnu/libxcb.so.1.1.0(xcb_poll_for_reply64+0x178) [0xe358]
 > /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0(_XFreeX11XCBStructure+0x849) [0x3dd79]
 > /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0(_XFreeX11XCBStructure+0x9ae) [0x3dede]
 > /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0(_XEventsQueued+0x5d) [0x3e1cd]
 > /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0(XPending+0x5d) [0x2fd3d]
 > /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.3(g_main_context_prepare+0x1c8) [0x4ba98]
 > /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.3(g_main_context_dispatch+0x3cb) [0x4c46b]
 > /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.3(g_main_loop_run+0xc2) [0x4c8d2]
 > /usr/lib/x86_64-linux-gnu/libmutter-2.so.0.0.0(meta_run+0x2c) [0x9824c]
 > /usr/bin/gnome-shell(_init+0x88c) [0x248c]
 > /lib/x86_64-linux-gnu/libc-2.27.so(__libc_start_main+0xe7) [0x21b97]
 > /usr/bin/gnome-shell(_init+0x9ca) [0x25ca]
2028 [00007f7e93d77567] recvmsg(5<UNIX:[36182->37095]>, {msg_namelen=0}, 0) = -1 EAGAIN (Resource temporarily unavailable)
 > /lib/x86_64-linux-gnu/libpthread-2.27.so(recvmsg+0x47) [0x12567]
 > /usr/lib/x86_64-linux-gnu/libxcb.so.1.1.0(xcb_wait_for_special_event+0x4a8) [0xd888]
 > /usr/lib/x86_64-linux-gnu/libxcb.so.1.1.0(xcb_poll_for_reply64+0x90) [0xe270]
 > /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0(_XFreeX11XCBStructure+0x969) [0x3de99]
 > /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0(_XEventsQueued+0x5d) [0x3e1cd]
 > /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0(XPending+0x5d) [0x2fd3d]
 > /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.3(g_main_context_prepare+0x1c8) [0x4ba98]
 > /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.3(g_main_context_dispatch+0x3cb) [0x4c46b]
 > /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.3(g_main_loop_run+0xc2) [0x4c8d2]
 > /usr/lib/x86_64-linux-gnu/libmutter-2.so.0.0.0(meta_run+0x2c) [0x9824c]
 > /usr/bin/gnome-shell(_init+0x88c) [0x248c]
 > /lib/x86_64-linux-gnu/libc-2.27.so(__libc_start_main+0xe7) [0x21b97]
 > /usr/bin/gnome-shell(_init+0x9ca) [0x25ca]
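
For comparison, here is a minimal Python sketch of the pattern I would expect: after EAGAIN on a non-blocking Unix domain socket, wait in poll() until the socket is readable before calling recvmsg() again. This is not how libxcb or libX11 are implemented; the helper recv_when_ready and the socketpair demo are made up for illustration.

```python
import select
import socket


def recv_when_ready(sock, timeout_ms=1000, bufsize=4096):
    """Read from a non-blocking socket, waiting in poll() after EAGAIN.

    Returns the received bytes, or None if nothing arrived before the timeout.
    """
    poller = select.poll()
    poller.register(sock, select.POLLIN)
    while True:
        try:
            data, _ancdata, _flags, _addr = sock.recvmsg(bufsize)
            return data
        except BlockingIOError:
            # EAGAIN: nothing is queued. Sleep in poll() until the socket
            # becomes readable instead of calling recvmsg() again right away.
            if not poller.poll(timeout_ms):
                return None


# Demo on a connected pair of Unix domain sockets.
a, b = socket.socketpair()
a.setblocking(False)

print(recv_when_ready(a, timeout_ms=10))  # nothing sent yet, so this times out
b.send(b"event")
print(recv_when_ready(a))
```

With this pattern, each EAGAIN is followed by at most one poll() before the next recvmsg(), so the error count cannot exceed the poll() count by much.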