CentOS builds of Skydive SEGV on startup

Bug #1940862 reported by Stig Telfer
This bug affects 1 person
Affects  Status        Importance  Assigned to  Milestone
kolla    Fix Released  Low         Unassigned

Bug Description

All CentOS 8 builds of Skydive appear to segfault immediately on startup, and this seems to have been happening since train-centos8.

Example backtrace from running Skydive within the CentOS8 container environment:

# /usr/bin/skydive
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0xe5 pc=0x7efee038ebd0]

runtime stack:
runtime.throw(0x3b21109, 0x2a)
        /var/lib/jenkins/.gimme/versions/go1.11.13.linux.amd64/src/runtime/panic.go:608 +0x72
runtime.sigpanic()
        /var/lib/jenkins/.gimme/versions/go1.11.13.linux.amd64/src/runtime/signal_unix.go:374 +0x2f2

goroutine 1 [syscall, locked to thread]:
runtime.cgocall(0x2dde8c0, 0xc000501408, 0x29)
        /var/lib/jenkins/.gimme/versions/go1.11.13.linux.amd64/src/runtime/cgocall.go:128 +0x5e fp=0xc0005013d0 sp=0xc000501398 pc=0x404fbe
os/user._Cfunc_mygetpwuid_r(0x0, 0xc0003ecd20, 0x7a7ad30, 0x400, 0xc0003c22d8, 0x0)
        _cgo_gotypes.go:173 +0x4d fp=0xc000501408 sp=0xc0005013d0 pc=0x220495d
os/user.lookupUnixUid.func1.1(0xc000000000, 0xc0003ecd20, 0x7a7ad30, 0x400, 0xc0003c22d8, 0x45bf30)
        /var/lib/jenkins/.gimme/versions/go1.11.13.linux.amd64/src/os/user/cgo_lookup_unix.go:100 +0x135 fp=0xc000501448 sp=0xc000501408 pc=0x2205f45
os/user.lookupUnixUid.func1(0x3496b80)
        /var/lib/jenkins/.gimme/versions/go1.11.13.linux.amd64/src/os/user/cgo_lookup_unix.go:100 +0x50 fp=0xc000501488 sp=0xc000501448 pc=0x2205fe0
os/user.retryWithBuffer(0xc00036d770, 0xc000501560, 0xc00036d770, 0x0)
        /var/lib/jenkins/.gimme/versions/go1.11.13.linux.amd64/src/os/user/cgo_lookup_unix.go:253 +0x3e fp=0xc0005014e0 sp=0xc000501488 pc=0x220542e
os/user.lookupUnixUid(0x0, 0x0, 0x0, 0x0)
        /var/lib/jenkins/.gimme/versions/go1.11.13.linux.amd64/src/os/user/cgo_lookup_unix.go:96 +0x130 fp=0xc000501598 sp=0xc0005014e0 pc=0x2204db0
os/user.current(0xc000501600, 0x10, 0xc0005015f8)
        /var/lib/jenkins/.gimme/versions/go1.11.13.linux.amd64/src/os/user/cgo_lookup_unix.go:49 +0x27 fp=0xc0005015c8 sp=0xc000501598 pc=0x2204c47
os/user.Current.func1()
        /var/lib/jenkins/.gimme/versions/go1.11.13.linux.amd64/src/os/user/lookup.go:11 +0x22 fp=0xc0005015f0 sp=0xc0005015c8 pc=0x2205da2
sync.(*Once).Do(0x72cd0e0, 0x3bc73d8)
        /var/lib/jenkins/.gimme/versions/go1.11.13.linux.amd64/src/sync/once.go:44 +0xb3 fp=0xc000501620 sp=0xc0005015f0 pc=0x468623
os/user.Current(0xc0003de4f0, 0xb, 0xc0003de4f0)
        /var/lib/jenkins/.gimme/versions/go1.11.13.linux.amd64/src/os/user/lookup.go:11 +0x3d fp=0xc000501650 sp=0xc000501620 pc=0x220454d
k8s.io/klog.init.1()
        /<email address hidden>/klog_file.go:58 +0x37 fp=0xc0005016a8 sp=0xc000501650 pc=0x236efe7
k8s.io/klog.init()
        <autogenerated>:1 +0x172 fp=0xc0005016d8 sp=0xc0005016a8 pc=0x236fec2
k8s.io/apimachinery/pkg/labels.init()
        <autogenerated>:1 +0x6e fp=0xc000501718 sp=0xc0005016d8 pc=0x2376f5e
k8s.io/apimachinery/pkg/apis/meta/v1.init()
        <autogenerated>:1 +0x85 fp=0xc000501880 sp=0xc000501718 pc=0x23f7405
k8s.io/api/core/v1.init()
        <autogenerated>:1 +0x7b fp=0xc000501f30 sp=0xc000501880 pc=0x25e03ab
github.com/skydive-project/skydive/topology/probes/k8s.init()
        <autogenerated>:1 +0x5a fp=0xc000501f48 sp=0xc000501f30 pc=0x2b5975a
github.com/skydive-project/skydive/topology/probes/istio.init()
        <autogenerated>:1 +0x54 fp=0xc000501f58 sp=0xc000501f48 pc=0x2d1f674
github.com/skydive-project/skydive/analyzer.init()
        <autogenerated>:1 +0x7a fp=0xc000501f68 sp=0xc000501f58 pc=0x2db48ba
github.com/skydive-project/skydive/cmd/allinone.init()
        <autogenerated>:1 +0x75 fp=0xc000501f78 sp=0xc000501f68 pc=0x2db6905
github.com/skydive-project/skydive/cmd/skydive.init()
        <autogenerated>:1 +0x5c fp=0xc000501f88 sp=0xc000501f78 pc=0x2dda98c
main.init()
        <autogenerated>:1 +0x45 fp=0xc000501f98 sp=0xc000501f88 pc=0x2ddaad5
runtime.main()
        /var/lib/jenkins/.gimme/versions/go1.11.13.linux.amd64/src/runtime/proc.go:189 +0x1bd fp=0xc000501fe0 sp=0xc000501f98 pc=0x43104d
runtime.goexit()
        /var/lib/jenkins/.gimme/versions/go1.11.13.linux.amd64/src/runtime/asm_amd64.s:1333 +0x1 fp=0xc000501fe8 sp=0xc000501fe0 pc=0x45f9b1

The stack trace generated when running it within gdb:

#0 0x00007fbedc2a2bd0 in __nss_readline () from /lib64/libnss_files.so.2
#1 0x00007fbedc29fc1f in internal_getent () from /lib64/libnss_files.so.2
#2 0x00007fbedc29fff3 in _nss_files_getpwuid_r () from /lib64/libnss_files.so.2
#3 0x0000000002f62acd in getpwuid_r ()
#4 0x0000000002dde8e6 in mygetpwuid_r (result=<optimized out>, buflen=<optimized out>, buf=<optimized out>, pwd=<optimized out>, uid=<optimized out>)
    at /var/lib/jenkins/.gimme/versions/go1.11.13.linux.amd64/src/os/user/cgo_lookup_unix.go:28
#5 _cgo_a84f89c9c806_Cfunc_mygetpwuid_r (v=0xc0004a1408) at cgo-gcc-prolog:138
#6 0x000000000045f130 in runtime.asmcgocall ()
#7 0x000000000045bfd3 in runtime.newdefer.func2 ()
#8 0x0000000000000040 in ?? ()
#9 0x00000000038220a0 in ?? ()
#10 0x000000000045ab01 in runtime.(*mcache).nextFree.func1 ()
#11 0x000000c0003fd3c0 in ?? ()
#12 0x0000000000000c70 in ?? ()
#13 0x000000c000000300 in ?? ()
#14 0x000000000045d946 in runtime.systemstack ()
#15 0x0000000000434010 in ?? ()
#16 0x000000000045d7d9 in runtime.rt0_go ()
#17 0x0000000002f01d10 in .annobin_libc_tls.c_end ()
#18 0x00007ffd9b020988 in ?? ()
#19 0x0000000002f01d10 in .annobin_libc_tls.c_end ()
#20 0x000000000045d7e0 in runtime.rt0_go ()
#21 0x0000000000000002 in ?? ()
#22 0x00007ffd9b81fb08 in ?? ()
#23 0x0000000000000002 in ?? ()
#24 0x00007ffd9b81fb08 in ?? ()
#25 0x0000000000400640 in __rela_iplt_start ()
#26 0x0000000002f016d1 in __libc_start_main ()
#27 0x0000000000401f5e in _start ()
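
Both traces end in the same lookup: k8s.io/klog's init calls os/user.Current, which in a cgo build goes through getpwuid_r and into libnss_files.so.2. A minimal Go sketch that exercises only that call path is shown below. The helper name currentUsername and the "uid-" fallback are hypothetical, introduced here for illustration; they are not part of Skydive or klog. A Go error check cannot intercept a SIGSEGV raised inside C code, so this is a way to exercise the lookup in isolation, not a workaround.

```go
package main

import (
	"fmt"
	"os"
	"os/user"
	"strconv"
)

// currentUsername is a hypothetical helper (not from Skydive or klog).
// It calls os/user.Current, the same standard-library entry point that
// appears in the Go backtrace, and falls back to a string built from the
// numeric uid when the lookup returns an error or an empty name.
func currentUsername() string {
	u, err := user.Current()
	if err == nil && u.Username != "" {
		return u.Username
	}
	// Fallback for environments where the passwd lookup fails cleanly.
	// This does not help if the C lookup itself crashes the process.
	return "uid-" + strconv.Itoa(os.Getuid())
}

func main() {
	fmt.Println("current user:", currentUsername())
}
```

Building this with the same Go toolchain and link mode as the failing binary, then running it inside the CentOS 8 container, might help show whether the crash is specific to Skydive or affects any similarly built program that calls os/user.Current.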

Running the same binary outside the container, in a CentOS 8.4 environment, works (it prints its usage message).

The same binary also works in Ubuntu containers.

I have tested the following combinations:

- train-wallaby centos8 containers (fails)
- victoria ubuntu (works)
- train centos7 (works)

Mark Goddard (mgoddard) wrote :

The backtrace mentions k8s (k8s.io/klog and the k8s probe init functions), in case that is relevant.

Changed in kolla:
importance: Undecided → Low

OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/kolla/+/846392

Changed in kolla:
status: New → In Progress

OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla (stable/yoga)

Fix proposed to branch: stable/yoga
Review: https://review.opendev.org/c/openstack/kolla/+/846257

OpenStack Infra (hudson-openstack) wrote : Change abandoned on kolla (stable/yoga)

Change abandoned by "Daniel Stoye <email address hidden>" on branch: stable/yoga
Review: https://review.opendev.org/c/openstack/kolla/+/846257
Reason: wait till merged in master

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla (master)

Reviewed: https://review.opendev.org/c/openstack/kolla/+/846392
Committed: https://opendev.org/openstack/kolla/commit/1f1214d4fe496ca015497d7e2fb5aa994c805736
Submitter: "Zuul (22348)"
Branch: master

commit 1f1214d4fe496ca015497d7e2fb5aa994c805736
Author: Daniel Stoye <email address hidden>
Date: Fri Jun 17 16:23:11 2022 +0200

    Bump skydive version to 0.28

    Skydive versions prior to 0.28.0 panic on newer versions of libc.
    Fixed upstream in 0.28, see: https://github.com/skydive-project/skydive/issues/2329
    This should be backported to at least xena and yoga, as skydive is
    currently not working with centos 8 on these releases.

    Closes-Bug: #1940862
    Change-Id: I177949b9319a977c9cd9121eb28b710256b72a5a

Changed in kolla:
status: In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/kolla/+/846554

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/kolla/+/846257
Committed: https://opendev.org/openstack/kolla/commit/66eec33571fdcc6e6d95a284e11925d626329bf1
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit 66eec33571fdcc6e6d95a284e11925d626329bf1
Author: Daniel Stoye <email address hidden>
Date: Fri Jun 17 16:23:11 2022 +0200

    Bump skydive version to 0.28

    Skydive versions prior to 0.28.0 panic on newer versions of libc.
    Fixed upstream in 0.28, see: https://github.com/skydive-project/skydive/issues/2329
    This should be backported to at least xena and yoga, as skydive is
    currently not working with centos 8 on these releases.

    Closes-Bug: #1940862
    Change-Id: I177949b9319a977c9cd9121eb28b710256b72a5a
    (cherry picked from commit 1f1214d4fe496ca015497d7e2fb5aa994c805736)

tags: added: in-stable-yoga

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/kolla/+/846554
Committed: https://opendev.org/openstack/kolla/commit/9adffe329cd296edb8ecc0a49f503e28b1ee5623
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 9adffe329cd296edb8ecc0a49f503e28b1ee5623
Author: Daniel Stoye <email address hidden>
Date: Fri Jun 17 16:23:11 2022 +0200

    Bump skydive version to 0.28

    Skydive versions prior to 0.28.0 panic on newer versions of libc.
    Fixed upstream in 0.28, see: https://github.com/skydive-project/skydive/issues/2329
    This should be backported to at least xena and yoga, as skydive is
    currently not working with centos 8 on these releases.

    Closes-Bug: #1940862
    Change-Id: I177949b9319a977c9cd9121eb28b710256b72a5a
    (cherry picked from commit 1f1214d4fe496ca015497d7e2fb5aa994c805736)

tags: added: in-stable-xena

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla 13.2.0

This issue was fixed in the openstack/kolla 13.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla 14.2.0

This issue was fixed in the openstack/kolla 14.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/kolla/+/853245

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/kolla/+/853245
Committed: https://opendev.org/openstack/kolla/commit/a7b00b08c3dc9488a308c2939bb6dd2c531bc879
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit a7b00b08c3dc9488a308c2939bb6dd2c531bc879
Author: Daniel Stoye <email address hidden>
Date: Fri Jun 17 16:23:11 2022 +0200

    Bump skydive version to 0.28

    Skydive versions prior to 0.28.0 panic on newer versions of libc.
    Fixed upstream in 0.28, see: https://github.com/skydive-project/skydive/issues/2329
    This should be backported to at least xena and yoga, as skydive is
    currently not working with centos 8 on these releases.

    Closes-Bug: #1940862
    Change-Id: I177949b9319a977c9cd9121eb28b710256b72a5a
    (cherry picked from commit 1f1214d4fe496ca015497d7e2fb5aa994c805736)
    (cherry picked from commit 9adffe329cd296edb8ecc0a49f503e28b1ee5623)

tags: added: in-stable-wallaby

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla 12.5.0

This issue was fixed in the openstack/kolla 12.5.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla 15.0.0.0rc1

This issue was fixed in the openstack/kolla 15.0.0.0rc1 release candidate.
