glibc __read_chk not a cancellation point

Bug #2007796 reported by jandryuk
This bug affects 1 person
Affects          Status        Importance  Assigned to  Milestone
glibc (Ubuntu)   Fix Released  Undecided   Unassigned
  Jammy          Fix Released  Undecided   Unassigned
  Kinetic        Fix Released  Undecided   Unassigned

Bug Description

[Impact]
I'm working with Xen and libxenstore. libxenstore, when using a "watch", spawns a pthread (read_thread). When libxenstore shuts down, it pthread_cancel()s and pthread_join()s the "watch" thread.

That thread never exits and the process shutdown hangs.

read_thread is sitting in __read_chk(). In glibc 2.35, __read_chk is not a cancellation point, so the thread never reacts to the cancellation.

Upstream glibc fixed it in 2.36 in https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=dc30acf20bd635d71cd4c84100e842fdf0429e48
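
For anyone who wants to check a system without Xen, here is a minimal sketch of the same pattern. It is not libxenstore code and not the upstream test: blocked_reader, reader_args and cancel_blocked_reader are hypothetical names, and the pipe stands in for the xenstore socket. It assumes that a build with optimization and _FORTIFY_SOURCE routes a read() with a run-time length into __read_chk(); pthread_timedjoin_np() is a GNU extension used here so that the sketch reports the hang instead of hanging itself.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for libxenstore's read_thread arguments. */
struct reader_args {
    int fd;      /* read end of a pipe that never receives data */
    size_t len;  /* run-time length; fortified builds may then use __read_chk() */
};

static void *blocked_reader(void *p)
{
    struct reader_args *a = p;
    char buf[64];
    size_t n = a->len < sizeof buf ? a->len : sizeof buf;

    (void)read(a->fd, buf, n);  /* blocks until data, EOF, or cancellation */
    return NULL;
}

/* Returns 1 if the reader was cancelled within timeout_sec seconds,
 * 0 if it was not (the symptom in this report), -1 on setup failure. */
int cancel_blocked_reader(int timeout_sec)
{
    int fds[2];
    pthread_t tid;
    void *res = NULL;
    struct timespec pause = { 0, 100 * 1000 * 1000 };  /* 100 ms */
    struct timespec deadline;
    struct reader_args args;
    int rc;

    if (pipe(fds) != 0)
        return -1;
    args.fd = fds[0];
    args.len = 64;
    if (pthread_create(&tid, NULL, blocked_reader, &args) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    nanosleep(&pause, NULL);    /* give the thread time to block in read() */
    pthread_cancel(tid);

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_sec;
    rc = pthread_timedjoin_np(tid, &res, &deadline);

    close(fds[1]);              /* EOF wakes a reader that ignored the cancel */
    if (rc != 0)
        pthread_join(tid, &res);
    close(fds[0]);

    return rc == 0 && res == PTHREAD_CANCELED;
}
```

Calling cancel_blocked_reader(2) from main() and printing the result should be enough: a return of 0 would match the hang described above, 1 means the cancellation was honoured.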

Here's the 2.35 disassembly - the lack of __pthread_enable_asynccancel() indicates the missing cancellation support:
(gdb) disassemble
Dump of assembler code for function __read_chk:
   0x00007ffff7ea04d0 <+0>: endbr64
   0x00007ffff7ea04d4 <+4>: cmp %rcx,%rdx
   0x00007ffff7ea04d7 <+7>: ja 0x7ffff7ea0504 <__read_chk+52>
   0x00007ffff7ea04d9 <+9>: xor %eax,%eax
   0x00007ffff7ea04db <+11>: syscall
=> 0x00007ffff7ea04dd <+13>: cmp $0xfffffffffffff000,%rax
   0x00007ffff7ea04e3 <+19>: ja 0x7ffff7ea04f0 <__read_chk+32>
   0x00007ffff7ea04e5 <+21>: ret
   0x00007ffff7ea04e6 <+22>: cs nopw 0x0(%rax,%rax,1)
   0x00007ffff7ea04f0 <+32>: mov 0xe3919(%rip),%rdx # 0x7ffff7f83e10
   0x00007ffff7ea04f7 <+39>: neg %eax
   0x00007ffff7ea04f9 <+41>: mov %eax,%fs:(%rdx)
   0x00007ffff7ea04fc <+44>: mov $0xffffffffffffffff,%rax
   0x00007ffff7ea0503 <+51>: ret
   0x00007ffff7ea0504 <+52>: push %rax
   0x00007ffff7ea0505 <+53>: call 0x7ffff7ea00b0 <__GI___chk_fail>
End of assembler dump.
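
Read back into C, the listing amounts to roughly the following. This is a sketch derived from the disassembly above, not the glibc source: read_chk_sketch is a hypothetical name, abort() stands in for __chk_fail(), and syscall() stands in for the inlined system call plus the errno handling at <+32> to <+51>. The point is that nothing brackets the system call to enable cancellation.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical C rendering of the 2.35 __read_chk listing. */
ssize_t read_chk_sketch(int fd, void *buf, size_t nbytes, size_t buflen)
{
    if (nbytes > buflen)  /* cmp %rcx,%rdx ; ja <+52> */
        abort();          /* stands in for the call to __chk_fail() */

    /* xor %eax,%eax ; syscall: system call number 0 is read on x86-64.
     * On failure syscall() sets errno and returns -1, like <+32>..<+51>. */
    return syscall(SYS_read, fd, buf, nbytes);
}
```

For a valid descriptor and a large enough buffer it behaves like read(), for example returning 2 after two bytes were written to a pipe.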

[Test procedure]

The patch includes a test for this that is run at build time.

[Regression potential]

Besides the usual risks with any glibc update, this could surface race conditions at thread shutdown in user applications that were so far hidden by the missing cancellation point.
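
As an illustration of the kind of code this concerns: a thread that holds a resource across a blocking read() needs a cleanup handler once that read() can actually be cancelled. The sketch below is an assumption about typical application code, not something taken from an affected package; release_buffer, reader and cancel_with_cleanup are hypothetical names.

```c
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

static int cleanup_ran;  /* set by the handler, read only after the join */

/* Cleanup handler: runs on cancellation or on pthread_cleanup_pop(1). */
static void release_buffer(void *p)
{
    free(p);
    cleanup_ran = 1;
}

static void *reader(void *arg)
{
    int fd = *(int *)arg;
    char *buf = malloc(256);

    if (buf == NULL)
        return NULL;
    pthread_cleanup_push(release_buffer, buf);
    (void)read(fd, buf, 256);  /* cancellation point: may never return */
    pthread_cleanup_pop(1);    /* normal path: run the handler as well */
    return NULL;
}

/* Returns 1 if the thread was cancelled and its buffer was released,
 * 0 otherwise, -1 on setup failure. */
int cancel_with_cleanup(void)
{
    int fds[2];
    pthread_t tid;
    void *res = NULL;

    cleanup_ran = 0;
    if (pipe(fds) != 0)
        return -1;
    if (pthread_create(&tid, NULL, reader, &fds[0]) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    pthread_cancel(tid);       /* acted on when the thread reaches read() */
    pthread_join(tid, &res);
    close(fds[0]);
    close(fds[1]);

    return res == PTHREAD_CANCELED && cleanup_ran;
}
```

Without the push/pop pair the buffer would leak on cancellation, which is the sort of latent problem that stays invisible as long as the thread is never actually cancelled.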

Revision history for this message
jandryuk (jandryuk) wrote :
Simon Chopin (schopin)
no longer affects: glibc (Ubuntu Kinetic)
no longer affects: glibc (Ubuntu Lunar)
Changed in glibc (Ubuntu):
status: New → Fix Released
Simon Chopin (schopin)
Changed in glibc (Ubuntu Jammy):
status: New → In Progress
description: updated
Changed in glibc (Ubuntu Kinetic):
status: New → Fix Released
Revision history for this message
Brian Murray (brian-murray) wrote : Please test proposed package

Hello jandryuk, or anyone else affected,

Accepted glibc into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/glibc/2.35-0ubuntu3.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in glibc (Ubuntu Jammy):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-jammy
Revision history for this message
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (glibc/2.35-0ubuntu3.2)

All autopkgtests for the newly accepted glibc (2.35-0ubuntu3.2) for jammy have finished running.
The following regressions have been reported in tests triggered by the package:

adsys/0.9.2~22.04.1 (armhf)
bbmap/38.95+dfsg-1 (armhf)
boost1.74/1.74.0-14ubuntu3 (armhf)
breezy/3.2.1+bzr7585-1build1 (armhf)
casync/2+20201210-1build1 (ppc64el)
cmake/3.22.1-1ubuntu1.22.04.1 (amd64, arm64, armhf)
cython/0.29.28-1ubuntu3 (i386)
dbus-test-runner/16.10.0~bzr100+repack1-4.1 (armhf)
exim4/4.95-4ubuntu2.2 (ppc64el)
fwupd/1.7.9-1~22.04.1 (armhf)
glib-networking/2.72.0-1 (i386)
golang-github-influxdata-tail/1.0.0+git20180327.c434825-4 (ppc64el)
golang-gogoprotobuf/1.3.2-1 (arm64)
grubzfs-testsuite/0.4.15build1 (amd64)
gtk+3.0/3.24.33-1ubuntu2 (i386)
gyoto/1.4.4-7build1 (amd64, arm64, s390x)
ksh93u+m/1.0.0~beta.2-1 (ppc64el)
libassuan/2.5.5-1build1 (amd64)
libflame/5.2.0-3ubuntu3 (s390x)
libite/2.5.1-1 (s390x)
linux-aws-5.19/5.19.0-1027.28~22.04.1 (amd64, arm64)
linux-azure-5.19/5.19.0-1027.30~22.04.2 (amd64)
linux-gcp-5.19/5.19.0-1026.28~22.04.1 (arm64)
linux-gke/5.15.0-1034.39 (arm64)
linux-hwe-5.19/5.19.0-44.45~22.04.1 (amd64)
linux-lowlatency-hwe-5.19/5.19.0-1026.27~22.04.1 (amd64)
linux-oracle-5.19/5.19.0-1025.28~22.04.1 (amd64)
mercurial/6.1.1-1ubuntu1 (amd64)
mtail/3.0.0~rc48-3 (ppc64el)
mutter/42.5-0ubuntu1 (amd64)
mypy/0.942-1ubuntu1 (armhf)
mysql-8.0/8.0.33-0ubuntu0.22.04.2 (s390x)
netgen/6.2.2006+really6.2.1905+dfsg-5build1 (armhf)
notcurses/3.0.6+dfsg.1-1 (armhf)
pango1.0/1.50.6+ds-2ubuntu1 (i386)
phcpack/2.4.85+dfsg-5build1 (arm64)
pinentry/1.1.1-1build2 (i386)
puma/5.5.2-2ubuntu2 (amd64, arm64)
pycryptodome/3.11.0+dfsg1-3build1 (i386)
pygobject/3.42.1-0ubuntu1 (i386)
python-evdev/1.4.0+dfsg-1build2 (i386)
python-llfuse/1.3.8+dfsg-2build1 (armhf)
python-lz4/3.1.3+dfsg-1build3 (i386)
python3.10/3.10.6-1~22.04.2ubuntu1.1 (arm64)
pyyaml/5.4.1-1ubuntu1 (i386)
r-bioc-rhdf5/2.38.0+dfsg-2 (amd64)
r-cran-randomfieldsutils/1.1.0-1 (armhf)
r-cran-spc/1:0.6.5-1 (armhf)
rsync/3.2.7-0ubuntu0.22.04.2 (i386)
ruby-nio4r/2.5.8-2 (amd64)
ruby-nokogiri/1.13.1+dfsg-2 (amd64)
ruby-prof/1.3.1-2build2 (amd64, ppc64el)
rustc/1.65.0+dfsg0ubuntu1-0ubuntu0.22.04.1 (arm64)
seqkit/2.1.0+ds-1 (arm64, s390x)
stunnel4/3:5.63-1build1 (i386)
swtpm/0.6.3-0ubuntu3.2 (s390x)
systemd/249.11-0ubuntu3.9 (amd64, arm64, ppc64el, s390x)
taptempo/1.4.5-1 (amd64)
texinfo/6.8-4build1 (armhf)
ubiquity/22.04.19 (amd64, arm64, armhf, ppc64el)
udisks2/2.9.4-1ubuntu2 (arm64)
utox/0.18.1-1build1 (armhf)
vlc/3.0.16-1build7 (i386)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/jammy/update_excuses.html#glibc

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Revision history for this message
jandryuk (jandryuk) wrote : Re: [Bug 2007796] Re: glibc __read_chk not a cancellation point

Hi, I upgraded my system to Kinetic in ~March, so I no longer have a setup
to test with. Kinetic's newer glibc with the change identified here works
for me.


Revision history for this message
Simon Chopin (schopin) wrote :

Since I see evidence of the added regression test being successfully run in the package build logs, I'm marking this as verified.

tags: added: verification-done verification-done-jammy
removed: verification-needed verification-needed-jammy
Revision history for this message
Brian Murray (brian-murray) wrote : Please test proposed package

Hello jandryuk, or anyone else affected,

Accepted glibc into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/glibc/2.35-0ubuntu3.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

tags: added: verification-needed verification-needed-jammy
removed: verification-done verification-done-jammy
Revision history for this message
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (glibc/2.35-0ubuntu3.2)

All autopkgtests for the newly accepted glibc (2.35-0ubuntu3.2) for jammy have finished running.
The following regressions have been reported in tests triggered by the package:

linux-aws-6.2/6.2.0-1007.7~22.04.1 (arm64)
linux-azure-6.2/6.2.0-1007.7~22.04.1 (arm64)
linux-oracle-5.19/5.19.0-1025.28~22.04.1 (arm64)
mysql-8.0/8.0.33-0ubuntu0.22.04.2 (s390x)
ubiquity/22.04.19 (amd64, arm64, armhf, ppc64el)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/jammy/update_excuses.html#glibc

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Revision history for this message
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (glibc/2.35-0ubuntu3.3)

All autopkgtests for the newly accepted glibc (2.35-0ubuntu3.3) for jammy have finished running.
The following regressions have been reported in tests triggered by the package:

adsys/0.9.2~22.04.1 (arm64)
datefudge/1.24 (arm64)
dbus/1.12.20-2ubuntu4.1 (armhf)
dpdk/21.11.3-0ubuntu0.22.04.1 (arm64)
ganeti/3.0.2-1ubuntu1 (armhf)
gcc-9/9.5.0-1ubuntu1~22.04 (armhf)
gjs/1.72.4-0ubuntu0.22.04.1 (amd64)
golang-github-canonical-go-dqlite/1.10.1-1 (arm64)
golang-github-influxdata-tail/1.0.0+git20180327.c434825-4 (amd64)
golang-github-xenolf-lego/4.1.3-3ubuntu1.22.04.1 (armhf)
golang-gogoprotobuf/1.3.2-1 (amd64, s390x)
google-osconfig-agent/20230504.00-0ubuntu1~22.04.0 (armhf)
gvfs/1.48.2-0ubuntu1 (amd64)
hilive/2.0a-3build3 (arm64)
hyphy/2.5.36+dfsg-1 (amd64)
kmediaplayer/5.92.0-0ubuntu1 (armhf)
libclass-methodmaker-perl/2.24-2build2 (armhf)
libflame/5.2.0-3ubuntu3 (arm64)
libimage-sane-perl/5-1build3 (s390x)
libuev/2.4.0-1.1 (s390x)
linux-aws-6.2/6.2.0-1009.9~22.04.2 (arm64)
linux-azure-5.19/5.19.0-1027.30~22.04.2 (arm64)
linux-azure-6.2/6.2.0-1009.9~22.04.2 (arm64)
linux-gcp-5.19/5.19.0-1030.32~22.04.1 (arm64)
linux-gke/5.15.0-1039.44 (arm64)
linux-hwe-5.19/5.19.0-50.50 (arm64)
linux-lowlatency/5.15.0-79.88 (arm64)
linux-nvidia-6.2/6.2.0-1006.6~22.04.2 (arm64)
linux-nvidia-tegra/5.15.0-1015.15 (arm64)
linux-oem-5.17/5.17.0-1035.36 (amd64)
linux-oracle-5.19/5.19.0-1027.30 (arm64)
linux-xilinx-zynqmp/5.15.0-1023.27 (arm64)
log4cxx/0.12.1-4 (armhf)
netplan.io/0.105-0ubuntu2~22.04.3 (arm64)
notcurses/3.0.6+dfsg.1-1 (armhf)
php-luasandbox/4.0.2-3build1 (ppc64el)
prometheus/2.31.2+ds1-1ubuntu1.22.04.2 (s390x)
qcustomplot/2.0.1+dfsg1-5 (armhf)
r-cran-mice/3.14.0-1 (armhf)
r-cran-randomfields/3.3.14-1 (armhf)
r-cran-sys/3.4-1 (s390x)
ruby-rblineprof/0.3.7-2build3 (arm64)
rustc/1.66.1+dfsg0ubuntu1-0ubuntu0.22.04.1 (arm64)
samba/2:4.15.13+dfsg-0ubuntu1.2 (s390x)
stunnel4/3:5.63-1build1 (ppc64el)
swtpm/0.6.3-0ubuntu3.2 (arm64)
tmux/3.2a-4ubuntu0.2 (ppc64el)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/jammy/update_excuses.html#glibc

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Revision history for this message
Simon Chopin (schopin) wrote :

Marking as verified as the regression test has been run in the latest package build according to the logs.

tags: added: verification-done verification-done-jammy
removed: verification-needed verification-needed-jammy
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package glibc - 2.35-0ubuntu3.3

---------------
glibc (2.35-0ubuntu3.3) jammy; urgency=medium

  * Drop SVE patches due to kernel-related performance regression
  * Fix the armhf stripping exception for ld.so (LP: #1927192)

glibc (2.35-0ubuntu3.2) jammy; urgency=medium

  * d/rules.d/debhelper.mk: fix permissions of libc.so (LP: #1989082)
  * Cherry-picks from upstream:
    - d/p/lp1999551/*: arm64 memcpy optimization (LP: #1999551)
    - d/p/lp1995362*.patch: Fix ldd segfault with missing libs (LP: #1995362)
    - d/p/lp2007796*: Fix missing cancellation point in pthread (LP: #2007796)
    - d/p/lp2007599*: add new tunables for s390x (LP: #2007599)
    - d/p/lp2011421/*: Fix crash on TDX-enabled platforms (LP: #2011421)
    - d/p/lp1992159*: Fix socket.h headers for non-GNU compilers (LP: #1992159)

 -- Simon Chopin <email address hidden> Wed, 26 Jul 2023 10:27:54 +0200

Changed in glibc (Ubuntu Jammy):
status: Fix Committed → Fix Released
Revision history for this message
Steve Langasek (vorlon) wrote : Update Released

The verification of the Stable Release Update for glibc has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
