htop is blank when using in focal in wsl1

Bug #1871129 reported by Patrick Wu
This bug affects 39 people
Affects               Status        Importance  Assigned to  Milestone
Ubuntu WSL            Fix Released  Medium      Unassigned
glibc (Ubuntu)        Fix Released  Undecided   Unassigned
glibc (Ubuntu Focal)  Fix Released  Undecided   Unassigned
htop (Ubuntu)         Invalid       Undecided   Unassigned

Bug Description

[Impact]

Programs that previously used the nanosleep syscall were switched to clock_nanosleep by the glibc 2.31 update. They broke in WSL1 because clock_nanosleep with CLOCK_REALTIME returns EINVAL there.

[Test Case]

Run sleep in WSL1 on Windows 10 version 2004 or older.
With the fixed glibc it works correctly; with the unfixed glibc in Focal it fails.

Run the following program in WSL1 and on a real kernel under strace, and observe that the fallback is applied only when it is needed:

#include <time.h>
#include <stdio.h>

int main (void) {
    struct timespec ts, rem;

    /* Absolute sleep: the deadline is now + 1.5s on CLOCK_REALTIME. */
    clock_gettime(CLOCK_REALTIME, &ts);
    printf("Sleep 1.5s with TIMER_ABSTIME\n");
    ts.tv_nsec += 500000000L;
    if (ts.tv_nsec >= 1000000000L) {
        /* Carry into seconds; tv_nsec must stay below 1e9. */
        ts.tv_nsec -= 1000000000L;
        ts.tv_sec += 1;
    }
    ts.tv_sec += 1;
    clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, &ts, &rem);

    /* Relative sleep on CLOCK_REALTIME. */
    printf("sleep 1.2s\n");
    ts.tv_sec = 1;
    ts.tv_nsec = 200000000L;
    clock_nanosleep(CLOCK_REALTIME, 0, &ts, &rem);

    /* Relative sleep on CLOCK_MONOTONIC; no fallback should be needed. */
    printf("sleep 1.2s with CLOCK_MONOTONIC\n");
    ts.tv_sec = 1;
    ts.tv_nsec = 200000000L;
    clock_nanosleep(CLOCK_MONOTONIC, 0, &ts, &rem);

    /* Invalid: negative tv_sec. */
    printf("invalid sleep -1s (+200ms)\n");
    ts.tv_nsec = 200000000L;
    ts.tv_sec = -1;
    clock_nanosleep(CLOCK_REALTIME, 0, &ts, &rem);

    /* Invalid: negative tv_nsec. */
    printf("invalid sleep 1s (-200ms)\n");
    ts.tv_nsec = -200000000L;
    ts.tv_sec = 1;
    clock_nanosleep(CLOCK_REALTIME, 0, &ts, &rem);

    /* Invalid: tv_nsec of one second or more. */
    printf("invalid sleep 0s (+1.200ms as nsec)\n");
    ts.tv_nsec = 1200000000L;
    ts.tv_sec = 1;
    clock_nanosleep(CLOCK_REALTIME, 0, &ts, &rem);

    return 0;
}

[Regression Potential]

The fix falls back to monotonic time and also converts the clock_nanosleep call to not use TIMER_ABSTIME in the fallback.
If the conversion from absolute to relative time is incorrect, it may result in a very long sleep, effectively hanging the calling process.
A mistake in the checks in the fallback could also crash the calling program.
The program in the [Test Case] section aims to cover all branches of the fix.

[Original Bug Text]

Right now I am trying Ubuntu 20.04 on WSL and I noticed that when I run htop in WSL 1st generation, it is completely blank:

https://user-images.githubusercontent.com/15316889/78563857-31697b00-784e-11ea-9f21-338a6cf8cb23.gif

The previous version (htop 2.1.0) works without any issue.

I am using htop 2.2.0-2build1 on Ubuntu 20.04 on WSL1 on Windows 10 build 19592.1001.

Rafael David Tinoco (rafaeldtinoco) wrote :

Yep, I was able to confirm this behavior.. will check it soon.

Changed in htop (Ubuntu):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
importance: Undecided → Critical
importance: Critical → Medium
status: New → Confirmed
Rafael David Tinoco (rafaeldtinoco) wrote :

The issue happens because WSL is currently not POSIX compliant: any glibc call that uses CLOCK_REALTIME (such as clock_gettime() or clock_nanosleep()) fails with EINVAL (-1). A glibc change, rather than the different htop version, has likely exposed this.

Upstream related bugs:

https://github.com/microsoft/WSL/issues/2503

https://github.com/microsoft/WSL/issues/4898 <- opened and being worked

--

htop strace output:

12324 read(4, "htop\0", 4096) = 5
12324 read(4, "", 4091) = 0
12324 close(4) = 0
12324 getdents64(3, /* 0 entries */, 32768) = 0
12324 close(3) = 0
12324 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=75000000}, 0x7ffff6dce7b0) = -1 EINVAL (Invalid argument)
12324 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=75000000}, 0x7ffff6dce7b0) = -1 EINVAL (Invalid argument)
12324 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=75000000}, 0x7ffff6dce7b0) = -1 EINVAL (Invalid argument)
12324 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=75000000}, 0x7ffff6dce7b0) = -1 EINVAL (Invalid argument)
12324 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=75000000}, 0x7ffff6dce7b0) = -1 EINVAL (Invalid argument)
12324 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=75000000}, 0x7ffff6dce7b0) = -1 EINVAL (Invalid argument)
12324 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=75000000}, 0x7ffff6dce7b0) = -1 EINVAL (Invalid argument)
12324 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=75000000}, 0x7ffff6dce7b0) = -1 EINVAL (Invalid argument)
12324 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=75000000}, 0x7ffff6dce7b0) = -1 EINVAL (Invalid argument)
... <indefinitely>

--

As a workaround, software can use the monotonic clock instead, but changing the clocks used by specific glibc functions is likely a no-go at this point in time (especially if just for WSL).

Changed in glibc (Ubuntu):
status: New → Confirmed
importance: Undecided → Medium
Changed in htop (Ubuntu):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Rafael David Tinoco (rafaeldtinoco) wrote :

Commit to blame in glibc:

commit 3537ecb49cf7177274607004c562d6f9ecc99474
Author: Adhemerval Zanella <email address hidden>
Date: Tue Nov 5 21:37:44 2019 +0000

    Refactor nanosleep in terms of clock_nanosleep

    The generic version is straightforward. For Hurd, its nanosleep
    implementation is moved to clock_nanosleep with adjustments from
    generic unix implementation.

    The generic clock_nanosleep unix version is also removed since
    it calls nanosleep.

    Checked on x86_64-linux-gnu and powerpc64le-linux-gnu.

    Reviewed-by: Florian Weimer <email address hidden>

diff --git a/posix/nanosleep.c b/posix/nanosleep.c
index d8564c7119..ed41c8cce7 100644
--- a/posix/nanosleep.c
+++ b/posix/nanosleep.c
@@ -24,10 +24,13 @@ int
 __nanosleep (const struct timespec *requested_time,
             struct timespec *remaining)
 {
- __set_errno (ENOSYS);
- return -1;
+ int ret = __clock_nanosleep (CLOCK_REALTIME, 0, requested_time, remaining);
+ if (ret != 0)
+ {
+ __set_errno (ret);
+ return -1;
+ }
+ return 0;
 }
-stub_warning (nanosleep)
-
-hidden_def (__nanosleep)
+libc_hidden_def (__nanosleep)
 weak_alias (__nanosleep, nanosleep)

Rafael David Tinoco (rafaeldtinoco) wrote :

On our side we *could* change this clock from CLOCK_REALTIME to CLOCK_MONOTONIC but.... =) that does not seem the right way to fix WSL1, which does not behave properly with REALTIME clocks. Let's wait for the upstream bug for now...

Changed in htop (Ubuntu):
status: Confirmed → Opinion
Changed in wsl (Ubuntu):
status: New → Confirmed
Changed in glibc (Ubuntu):
status: Confirmed → Invalid
Changed in htop (Ubuntu):
status: Opinion → Invalid
importance: Medium → Undecided
Changed in glibc (Ubuntu):
importance: Medium → Undecided
Changed in wsl (Ubuntu):
importance: Undecided → Medium
Patrick Wu (callmepk)
affects: wsl (Ubuntu) → ubuntuwsl
Patrick Wu (callmepk) wrote :

Thanks for the update. I will also try to report in the WSL bug tracker to see whether there will be a fix for WSL1 in the future.

Patrick Wu (callmepk) wrote :

Looks like the fix is coming but not yet released: https://github.com/microsoft/WSL/issues/4898#issuecomment-603384784

Rafael David Tinoco (rafaeldtinoco) wrote :

This is a "hotfix" for the issue. I'm preparing a PPA so people can use it but it won't be the final fix. It is more likely that WSL1 will have to support CLOCK_REALTIME eventually =).

Rafael David Tinoco (rafaeldtinoco) wrote :

PPA: https://launchpad.net/~rafaeldtinoco/+archive/ubuntu/lp1871129

Reminder: this is just a temporary workaround while we wait for the upstream fix.

Rafael David Tinoco (rafaeldtinoco) wrote :

rafaeldtinoco@wsl1:~$ sudo apt-mark hold libc6
libc6 set on hold.

rafaeldtinoco@wsl1:~$ dpkg -l libc6
hi libc6:amd64 2.31-0ubuntu8+lp1871129~1 amd64 GNU C Library: Shared libraries

Not sure how much time upstream will take to fix this bug, so make sure to mark libc6 as "hold" so it does not get upgraded. The version I'm using for the mitigation is: 2.31-0ubuntu8+lp1871129~1. It is the latest for today but won't be if there is some stable release upgrade for glibc (libc6) landing in the archive.

I confirm "htop" works well with this mitigation.

o/

Changed in ubuntuwsl:
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Hayden Barnes (haydenb) wrote :

I am changing the status of this bug to In Progress because a patch for this on WSL 1 is now available on Insiders Fast Ring and will be backported to existing Windows 10 builds in the coming weeks.

We are tracking general issues on this at https://bugs.launchpad.net/ubuntuwsl/+bug/1871240

Changed in ubuntuwsl:
status: Confirmed → In Progress
Changed in ubuntuwsl:
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Balint Reczey (rbalint)
Changed in glibc (Ubuntu):
status: Invalid → Fix Committed
Balint Reczey (rbalint)
description: updated
Robie Basak (racb) wrote : Please test proposed package

Hello Patrick, or anyone else affected,

Accepted glibc into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/glibc/2.31-0ubuntu9.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in glibc (Ubuntu Focal):
status: New → Fix Committed
tags: added: verification-needed verification-needed-focal
Balint Reczey (rbalint) wrote :

fixed in glibc 2.31-0ubuntu11 in 20.10

Changed in glibc (Ubuntu):
status: Fix Committed → Fix Released
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (glibc/2.31-0ubuntu9.1)

All autopkgtests for the newly accepted glibc (2.31-0ubuntu9.1) for focal have finished running.
The following regressions have been reported in tests triggered by the package:

prometheus-blackbox-exporter/0.13.0+ds-2 (armhf, amd64, ppc64el, s390x, arm64)
prometheus-pushgateway/1.0.0+ds-1 (armhf, amd64, ppc64el, s390x, arm64)
systemd/245.4-4ubuntu3.2 (s390x, amd64, ppc64el)
gfs2-utils/unknown (amd64)
hugo/0.68.3-1 (armhf, amd64, ppc64el, s390x, arm64)
grubzfs-testsuite/0.4.10 (amd64)
glibc/2.31-0ubuntu9.1 (armhf)
badger/2.0.1-3 (armhf, amd64, ppc64el, s390x, arm64)
resource-agents/1:4.5.0-2ubuntu2 (armhf)
etcd/3.2.26+dfsg-6 (amd64, ppc64el)
postgresql-multicorn/1.3.4-31-g9ff7875-3 (armhf, amd64, ppc64el, s390x)
gfs2-utils/3.2.0-3 (ppc64el, s390x, arm64)
scipy/1.3.3-3build1 (ppc64el)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/focal/update_excuses.html#glibc

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Balint Reczey (rbalint) wrote :

Tested with 2.31-0ubuntu9.1 on Focal running on Windows 10 build 19041 in WSL1:

ubuntu@DESKTOP-GOODMAK:~$ ./a.out
Sleep 1.5s with TIMER_ABSTIME
sleep 1.2s
sleep 1.2s with CLOCK_MONOTONIC
invalid sleep -1s (+200ms)
invalid sleep 1s (-200ms)
invalid sleep 0s (+1.200ms as nsec)
ubuntu@DESKTOP-GOODMAK:~$ strace ./a.out
execve("./a.out", ["./a.out"], 0x7ffffa6eaab0 /* 19 vars */) = 0
brk(NULL) = 0x7ffff34aa000
arch_prctl(0x3001 /* ARCH_??? */, 0x7ffffb11f4e0) = -1 EINVAL (Invalid argument)
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=29824, ...}) = 0
mmap(NULL, 29824, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fe759f58000
close(3) = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360q\2\0\0\0\0\0"..., 832) = 832
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
pread64(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32, 848) = 32
pread64(3, "\4\0\0\0\24\0\0\0\3\0\0\0GNU\0\363\377?\332\200\270\27\304d\245n\355Y\377\t\334"..., 68, 880) = 68
fstat(3, {st_mode=S_IFREG|0755, st_size=2029224, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe759f90000
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
pread64(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32, 848) = 32
pread64(3, "\4\0\0\0\24\0\0\0\3\0\0\0GNU\0\363\377?\332\200\270\27\304d\245n\355Y\377\t\334"..., 68, 880) = 68
mmap(NULL, 2036952, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fe759d60000
mprotect(0x7fe759d85000, 1847296, PROT_NONE) = 0
mmap(0x7fe759d85000, 1540096, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x25000) = 0x7fe759d85000
mmap(0x7fe759efd000, 303104, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x19d000) = 0x7fe759efd000
mmap(0x7fe759f48000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1e7000) = 0x7fe759f48000
mmap(0x7fe759f4e000, 13528, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fe759f4e000
close(3) = 0
arch_prctl(ARCH_SET_FS, 0x7fe759f91380) = 0
mprotect(0x7fe759f48000, 12288, PROT_READ) = 0
mprotect(0x7fe759f97000, 4096, PROT_READ) = 0
mprotect(0x7fe759f8d000, 4096, PROT_READ) = 0
munmap(0x7fe759f58000, 29824) = 0
clock_gettime(CLOCK_REALTIME, {tv_sec=1600117863, tv_nsec=617308900}) = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0), ...}) = 0
brk(NULL) = 0x7ffff34aa000
brk(0x7ffff34cb000) = 0x7ffff34cb000
write(1, "Sleep 1.5s with TIMER_ABSTIME\n", 30Sleep 1.5s with TIMER_ABSTIME
) = 30
clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, {tv_sec=1600117865, tv_nsec=117308900}, 0x7ffffb11f4a0) = -1 E...


tags: added: verification-done verification-done-focal
removed: verification-needed verification-needed-focal
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package glibc - 2.31-0ubuntu9.1

---------------
glibc (2.31-0ubuntu9.1) focal; urgency=medium

  [ Michael Hudson-Doyle ]
  * Mark tst-getpw as XFAIL on arm64. (LP: #1869364)

  [ Matthias Klose ]
  * Copy the fully conditionalized x86 variant for math-vector-fortran.h
    to /usr/include/finclude. On all architectures. (LP: #1879092)

  [ Balint Reczey ]
  * debian/gbp.conf: Add initial configuration
  * debian/control.in/main: Add Vcs-* pointing to Ubuntu packaging repository
  * debian/debhelper.in/libc.preinst: Fix setting LDCONFIG_NOTRIGGER
    (LP: #1889190)
  * Fall back to calling nanosleep syscall when __clock_nanosleep returns
    EINVAL due to CLOCK_REALTIME not being supported (LP: #1871129)
  * debian/testsuite-xfail-debian.mk: XFAIL tst-getpw on armhf, too
    (LP: #1869364)
  * XFAIL stdlib/tst-getrandom (LP: #1891403)

  [ Dimitri John Ledkov ]
  * debian/patches/powerpc: Cherrypick upstream patches to support POWER10
    optimized library loading. LP: #1887989

 -- Balint Reczey <email address hidden> Mon, 17 Aug 2020 22:02:52 +0200

Changed in glibc (Ubuntu Focal):
status: Fix Committed → Fix Released
Brian Murray (brian-murray) wrote : Update Released

The verification of the Stable Release Update for glibc has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Patrick Wu (callmepk)
Changed in ubuntuwsl:
status: In Progress → Fix Released