I am also seeing high CPU utilization by the gnome-shell process, especially when using the Chrome web browser and moving or resizing windows. Chrome's CPU usage goes up too. The issue happens consistently on 18.04, and the system becomes barely responsive to user input.
Based on strace output of the gnome-shell process, here is the behavior that caught my attention:
1. gnome-shell invokes the clock_gettime() and futex() system calls excessively.
2. Most of gnome-shell's recvmsg() system calls return errors (346 of 390 calls in the summary below).
sudo strace -c -f -p `pidof gnome-shell`
strace: Process 2028 attached with 18 threads
^Cstrace: Process 2028 detached
strace: Process 2141 detached
...
strace: Process 2566 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 69.38    0.047219        1312        36         3 futex
 18.60    0.012661           3      4375           clock_gettime
  5.18    0.003527           9       390       346 recvmsg
  2.05    0.001398          10       137           poll
  1.86    0.001263           4       315           getpid
  1.25    0.000851          12        69           writev
  0.84    0.000570           9        62           write
  0.52    0.000357           2       156           sched_yield
  0.23    0.000156           4        37           read
  0.06    0.000038          38         1           restart_syscall
  0.03    0.000019           6         3           nanosleep
------ ----------- ----------- --------- --------- ----------------
100.00    0.068059                  5581       349 total
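To make the two observations concrete, here is a minimal Python sketch that computes each syscall's share of the total call count and its error ratio from the summary rows above. The helper name parse_summary is hypothetical, and the sketch assumes a row with five fields has an empty "errors" column, as in the table above.

```python
# Sketch: derive call share and error ratio from `strace -c` summary rows.
# The rows are copied from the summary above; parse_summary is a
# hypothetical helper, not part of strace.

SUMMARY = """\
69.38 0.047219 1312 36 3 futex
18.60 0.012661 3 4375 clock_gettime
5.18 0.003527 9 390 346 recvmsg
2.05 0.001398 10 137 poll
1.86 0.001263 4 315 getpid
1.25 0.000851 12 69 writev
0.84 0.000570 9 62 write
0.52 0.000357 2 156 sched_yield
0.23 0.000156 4 37 read
0.06 0.000038 38 1 restart_syscall
0.03 0.000019 6 3 nanosleep
"""


def parse_summary(text):
    """Return {syscall: {"calls": int, "errors": int}} for each data row."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) not in (5, 6):
            continue  # skip anything that is not a data row
        # Assumption: 5 fields means the "errors" column was blank (no errors).
        errors = int(fields[4]) if len(fields) == 6 else 0
        stats[fields[-1]] = {"calls": int(fields[3]), "errors": errors}
    return stats


stats = parse_summary(SUMMARY)
total_calls = sum(row["calls"] for row in stats.values())

for name, row in sorted(stats.items(), key=lambda kv: -kv[1]["calls"]):
    share = row["calls"] / total_calls
    error_ratio = row["errors"] / row["calls"]
    print(f"{name:16s} calls={row['calls']:5d} "
          f"share={share:6.1%} errors={error_ratio:6.1%}")
```

With these rows, clock_gettime accounts for roughly 78% of all calls, and roughly 89% of the recvmsg calls fail.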
It seems that the high CPU usage originates from NVIDIA's shared library (a binary distributed with the libnvidia-gl-390:amd64 Debian package). The gnome-shell process loads this library.
# sudo strace -k -o /tmp/aaa -f -p `pidof gnome-shell`
The stack traces below show that the calls to the clock_gettime() system function originate from the NVIDIA library:
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=821854879}) = 0
 > /lib/x86_64-linux-gnu/libc-2.27.so(syscall+0x19) [0x11b839]
 > /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.390.77(vk_icdGetInstanceProcAddr+0x5865) [0xab445]
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=824319178}) = 0
 > /lib/x86_64-linux-gnu/libc-2.27.so(syscall+0x19) [0x11b839]
 > /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.390.77(vk_icdGetInstanceProcAddr+0x5865) [0xab445]
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=826767633}) = 0
 > /lib/x86_64-linux-gnu/libc-2.27.so(syscall+0x19) [0x11b839]
 > /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.390.77(vk_icdGetInstanceProcAddr+0x5865) [0xab445]
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=829303953}) = 0
 > /lib/x86_64-linux-gnu/libc-2.27.so(syscall+0x19) [0x11b839]
 > /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.390.77(vk_icdGetInstanceProcAddr+0x5865) [0xab445]
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=831753193}) = 0
 > /lib/x86_64-linux-gnu/libc-2.27.so(syscall+0x19) [0x11b839]
 > /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.390.77(vk_icdGetInstanceProcAddr+0x5865) [0xab445]
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=834216603}) = 0
 > /lib/x86_64-linux-gnu/libc-2.27.so(syscall+0x19) [0x11b839]
 > /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.390.77(vk_icdGetInstanceProcAddr+0x5865) [0xab445]
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=836654560}) = 0
 > /lib/x86_64-linux-gnu/libc-2.27.so(syscall+0x19) [0x11b839]
 > /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.390.77(vk_icdGetInstanceProcAddr+0x5865) [0xab445]
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=839135214}) = 0
 > /lib/x86_64-linux-gnu/libc-2.27.so(syscall+0x19) [0x11b839]
 > /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.390.77(vk_icdGetInstanceProcAddr+0x5865) [0xab445]
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=841593660}) = 0
 > /lib/x86_64-linux-gnu/libc-2.27.so(syscall+0x19) [0x11b839]
 > /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.390.77(vk_icdGetInstanceProcAddr+0x5865) [0xab445]
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=844017483}) = 0
 > /lib/x86_64-linux-gnu/libc-2.27.so(syscall+0x19) [0x11b839]
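The spacing between these calls can be read off the timestamps. The following is a rough Python sketch that extracts the ten CLOCK_MONOTONIC values from the trace above and computes the gaps between them. It assumes the ten calls were consecutive in the trace file; the helper names are hypothetical.

```python
import re

# Syscall lines copied from the trace above (stack frames omitted).
TRACE = """\
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=821854879}) = 0
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=824319178}) = 0
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=826767633}) = 0
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=829303953}) = 0
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=831753193}) = 0
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=834216603}) = 0
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=836654560}) = 0
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=839135214}) = 0
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=841593660}) = 0
2028 clock_gettime(CLOCK_MONOTONIC, {tv_sec=14361, tv_nsec=844017483}) = 0
"""

PATTERN = re.compile(
    r"clock_gettime\(CLOCK_MONOTONIC, \{tv_sec=(\d+), tv_nsec=(\d+)\}\)"
)


def timestamps_ns(text):
    """Return each reported monotonic time as integer nanoseconds."""
    return [int(sec) * 1_000_000_000 + int(nsec)
            for sec, nsec in PATTERN.findall(text)]


def intervals_ms(stamps):
    """Return the gaps between successive timestamps in milliseconds."""
    return [(later - earlier) / 1e6
            for earlier, later in zip(stamps, stamps[1:])]


stamps = timestamps_ns(TRACE)
gaps = intervals_ms(stamps)
mean_gap = sum(gaps) / len(gaps)

print("gaps (ms):", [round(gap, 3) for gap in gaps])
print(f"mean gap: {mean_gap:.3f} ms, about {1000 / mean_gap:.0f} calls/s")
```

Under that assumption the calls are about 2.4 to 2.5 ms apart, which works out to roughly 400 calls per second from this one call site.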
As for recvmsg(): while it is not necessarily a bug, it is questionable that user-space code, having just called recvmsg() on a non-blocking Unix domain socket and received EAGAIN, needs to call recvmsg() again without going through the poll() system call. Here is a snippet that I repeatedly see in the strace output:
2028 [00007f7e93d77567] recvmsg(5<UNIX:[36182->37095]>, {msg_namelen=0}, 0) = -1 EAGAIN (Resource temporarily unavailable)
 > /lib/x86_64-linux-gnu/libpthread-2.27.so(recvmsg+0x47) [0x12567]
 > /usr/lib/x86_64-linux-gnu/libxcb.so.1.1.0(xcb_wait_for_special_event+0x4a8) [0xd888]
 > /usr/lib/x86_64-linux-gnu/libxcb.so.1.1.0(xcb_poll_for_reply64+0x178) [0xe358]
 > /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0(_XFreeX11XCBStructure+0x849) [0x3dd79]
 > /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0(_XFreeX11XCBStructure+0x9ae) [0x3dede]
 > /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0(_XEventsQueued+0x5d) [0x3e1cd]
 > /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0(XPending+0x5d) [0x2fd3d]
 > /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.3(g_main_context_prepare+0x1c8) [0x4ba98]
 > /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.3(g_main_context_dispatch+0x3cb) [0x4c46b]
 > /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.3(g_main_loop_run+0xc2) [0x4c8d2]
 > /usr/lib/x86_64-linux-gnu/libmutter-2.so.0.0.0(meta_run+0x2c) [0x9824c]
 > /usr/bin/gnome-shell(_init+0x88c) [0x248c]
 > /lib/x86_64-linux-gnu/libc-2.27.so(__libc_start_main+0xe7) [0x21b97]
 > /usr/bin/gnome-shell(_init+0x9ca) [0x25ca]
2028 [00007f7e93d77567] recvmsg(5<UNIX:[36182->37095]>, {msg_namelen=0}, 0) = -1 EAGAIN (Resource temporarily unavailable)
 > /lib/x86_64-linux-gnu/libpthread-2.27.so(recvmsg+0x47) [0x12567]
 > /usr/lib/x86_64-linux-gnu/libxcb.so.1.1.0(xcb_wait_for_special_event+0x4a8) [0xd888]
 > /usr/lib/x86_64-linux-gnu/libxcb.so.1.1.0(xcb_poll_for_reply64+0x90) [0xe270]
 > /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0(_XFreeX11XCBStructure+0x969) [0x3de99]
 > /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0(_XEventsQueued+0x5d) [0x3e1cd]
 > /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0(XPending+0x5d) [0x2fd3d]
 > /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.3(g_main_context_prepare+0x1c8) [0x4ba98]
 > /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.3(g_main_context_dispatch+0x3cb) [0x4c46b]
 > /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.5600.3(g_main_loop_run+0xc2) [0x4c8d2]
 > /usr/lib/x86_64-linux-gnu/libmutter-2.so.0.0.0(meta_run+0x2c) [0x9824c]
 > /usr/bin/gnome-shell(_init+0x88c) [0x248c]
 > /lib/x86_64-linux-gnu/libc-2.27.so(__libc_start_main+0xe7) [0x21b97]
 > /usr/bin/gnome-shell(_init+0x9ca) [0x25ca]
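For comparison, the pattern I would expect looks roughly like the Python sketch below: once recvmsg() on a non-blocking Unix domain socket reports EAGAIN, the caller goes back to poll() and only reads again after the socket is reported readable. This is only an illustration of that pattern on a socketpair, not a description of how libxcb or libX11 are implemented; the function names drain and wait_and_drain are hypothetical. It relies on select.poll, so it assumes a Unix-like system.

```python
import select
import socket


def drain(sock):
    """Call recvmsg() until it reports EAGAIN; return the chunks read."""
    chunks = []
    while True:
        try:
            data, _ancdata, _flags, _addr = sock.recvmsg(4096)
        except BlockingIOError:
            # EAGAIN/EWOULDBLOCK: nothing left to read. Stop here instead
            # of retrying, and let the caller go back to poll().
            break
        if not data:
            break  # peer closed the connection
        chunks.append(data)
    return chunks


def wait_and_drain(sock, timeout_ms):
    """poll() for readability first, and read only if data is pending."""
    poller = select.poll()
    poller.register(sock, select.POLLIN)
    if not poller.poll(timeout_ms):
        return []  # timed out: no recvmsg() call is issued at all
    return drain(sock)


# Demonstration on a local Unix domain socket pair.
reader, writer = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
reader.setblocking(False)

first = wait_and_drain(reader, 10)     # nothing sent yet, poll() times out
writer.sendall(b"hello")
second = wait_and_drain(reader, 1000)  # data pending, read until EAGAIN

print("before send:", first)
print("after send:", b"".join(second))

reader.close()
writer.close()
```

In this sketch each wake-up costs one poll() plus one trailing recvmsg() that returns EAGAIN, rather than several back-to-back failing recvmsg() calls.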