kea-ctrl-agent segfault in ppc64el dep8 test

Bug #2055151 reported by Andreas Hasenack
This bug affects 1 person
Affects: isc-kea (Ubuntu)
Status: Fix Released
Importance: High
Assigned to: Andreas Hasenack

Bug Description

https://autopkgtest.ubuntu.com/results/autopkgtest-noble/noble/ppc64el/i/isc-kea/20240227_095821_776a7@/log.gz

241s ## With no /etc/kea/kea-api-password, the service must not be running
241s × kea-ctrl-agent.service - Kea Control Agent
241s Loaded: loaded (/usr/lib/systemd/system/kea-ctrl-agent.service; enabled; preset: enabled)
241s Active: failed (Result: core-dump) since Tue 2024-02-27 09:50:35 UTC; 46ms ago
241s Duration: 1.145s
241s Condition: start condition unmet at Tue 2024-02-27 09:50:35 UTC; 45ms ago
241s └─ ConditionFileNotEmpty=/etc/kea/kea-api-password was not met
241s Docs: man:kea-ctrl-agent(8)
241s Process: 2265 ExecStart=/usr/sbin/kea-ctrl-agent -c /etc/kea/kea-ctrl-agent.conf (code=dumped, signal=SEGV)
241s Main PID: 2265 (code=dumped, signal=SEGV)
241s CPU: 6ms
241s
241s Feb 27 09:50:34 autopkgtest systemd[1]: Started kea-ctrl-agent.service - Kea Control Agent.
241s Feb 27 09:50:34 autopkgtest kea-ctrl-agent[2265]: 2024-02-27 09:50:34.681 INFO [kea-ctrl-agent.dctl/2265.114279535068448] DCTL_STARTING Control-agent starting, pid: 2265, version: 2.4.1 (stable)
241s Feb 27 09:50:34 autopkgtest kea-ctrl-agent[2265]: INFO CTRL_AGENT_HTTP_SERVICE_STARTED HTTP service bound to address 127.0.0.1:8000
241s Feb 27 09:50:34 autopkgtest kea-ctrl-agent[2265]: INFO DCTL_CONFIG_COMPLETE server has completed configuration: listening on 127.0.0.1, port 8000, control sockets: d2 dhcp4 dhcp6, requires basic HTTP authentication, 0 lib(s):
241s Feb 27 09:50:34 autopkgtest kea-ctrl-agent[2265]: INFO CTRL_AGENT_STARTED Kea Control Agent version 2.4.1 started
241s Feb 27 09:50:35 autopkgtest systemd[1]: Stopping kea-ctrl-agent.service - Kea Control Agent...
241s Feb 27 09:50:35 autopkgtest systemd[1]: kea-ctrl-agent.service: Main process exited, code=dumped, status=11/SEGV
241s Feb 27 09:50:35 autopkgtest systemd[1]: kea-ctrl-agent.service: Failed with result 'core-dump'.
241s Feb 27 09:50:35 autopkgtest systemd[1]: Stopped kea-ctrl-agent.service - Kea Control Agent.
241s Feb 27 09:50:35 autopkgtest systemd[1]: kea-ctrl-agent.service - Kea Control Agent was skipped because of an unmet condition check (ConditionFileNotEmpty=/etc/kea/kea-api-password).
241s ## ERROR, service is failed
241s autopkgtest [09:50:36]: test kea-ctrl-agent-debconf: -----------------------]
241s kea-ctrl-agent-debconf FAIL non-zero exit status 1

The smoke test is also failing, likely because of the same issue:
473s autopkgtest [09:54:28]: test smoke-tests: [-----------------------
473s + debconf-set-selections
473s + dpkg-reconfigure kea-ctrl-agent
474s + kea_password_file=/etc/kea/kea-api-password
474s + '[' -s /etc/kea/kea-api-password ']'
474s + for f in kea-dhcp4.kea-dhcp4.pid kea-dhcp6.kea-dhcp6.pid kea-ctrl-agent.kea-ctrl-agent.pid kea-dhcp-ddns.kea-dhcp-ddns.pid
474s + test -f /run/kea/kea-dhcp4.kea-dhcp4.pid
474s + for f in kea-dhcp4.kea-dhcp4.pid kea-dhcp6.kea-dhcp6.pid kea-ctrl-agent.kea-ctrl-agent.pid kea-dhcp-ddns.kea-dhcp-ddns.pid
474s + test -f /run/kea/kea-dhcp6.kea-dhcp6.pid
474s + for f in kea-dhcp4.kea-dhcp4.pid kea-dhcp6.kea-dhcp6.pid kea-ctrl-agent.kea-ctrl-agent.pid kea-dhcp-ddns.kea-dhcp-ddns.pid
474s + test -f /run/kea/kea-ctrl-agent.kea-ctrl-agent.pid
474s + for f in kea-dhcp4.kea-dhcp4.pid kea-dhcp6.kea-dhcp6.pid kea-ctrl-agent.kea-ctrl-agent.pid kea-dhcp-ddns.kea-dhcp-ddns.pid
474s + test -f /run/kea/kea-dhcp-ddns.kea-dhcp-ddns.pid
474s + for socket in kea-ddns-ctrl-socket kea4-ctrl-socket kea6-ctrl-socket
474s + test -S /run/kea/kea-ddns-ctrl-socket
474s + for socket in kea-ddns-ctrl-socket kea4-ctrl-socket kea6-ctrl-socket
474s + test -S /run/kea/kea4-ctrl-socket
474s + for socket in kea-ddns-ctrl-socket kea4-ctrl-socket kea6-ctrl-socket
474s + test -S /run/kea/kea6-ctrl-socket
474s + test -f /run/lock/kea/logger_lockfile
474s + kea-dhcp4 -t /etc/kea/kea-dhcp4.conf
474s + kea-dhcp6 -t /etc/kea/kea-dhcp6.conf
474s + auth_params=
474s + basic_auth_params=
474s + '[' -s /etc/kea/kea-api-password ']'
474s ++ cat /etc/kea/kea-api-password
474s + auth_params='--auth-user kea-api --auth-password oyfunXELI1RCrfeOJejd'
474s ++ cat /etc/kea/kea-api-password
474s + basic_auth_params='-u kea-api:oyfunXELI1RCrfeOJejd'
474s ++ curl -u kea-api:oyfunXELI1RCrfeOJejd -s -X POST -H 'Content-Type: application/json' -d '{ "command": "version-get", "service": [ "dhcp4" ] }' 127.0.0.1:8000
474s ++ jq -r '.[0].text'
474s + TEST_KEA_VERSION=2.4.1
474s + check_kea_version 2.4.1
474s + CHECKED_VERSION=2.4.1
474s + [[ ! 2.4.1 =~ [0-9]+(\.[0-9]+){2} ]]
474s ++ echo
474s ++ kea-shell --service dhcp4 --host 127.0.0.1 --port 8000 --auth-user kea-api --auth-password oyfunXELI1RCrfeOJejd version-get
474s ++ jq -r '.[0].text'
475s jq: parse error: Invalid numeric literal at line 1, column 7
475s + TEST_KEA_VERSION=
475s autopkgtest [09:54:30]: test smoke-tests: -----------------------]
476s smoke-tests FAIL non-zero exit status 5
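
The trace shows the first version lookup (via curl) passing the regex check, while the second one (via kea-shell) ends with an empty TEST_KEA_VERSION after jq fails. A minimal sketch of a check that reports the empty string explicitly is below. Only the function name and the regex come from the trace; the body and the messages are assumptions, not the actual test script.

```shell
# Hypothetical reconstruction of a version check like the one in the trace.
# Only the name and the regex are taken from the log; the rest is assumed.
check_kea_version() {
    checked_version="${1:-}"
    # Same pattern as in the trace: three dot-separated numeric fields.
    if ! printf '%s' "$checked_version" | grep -Eq '[0-9]+(\.[0-9]+){2}'; then
        echo "ERROR: unexpected version string: '${checked_version}'" >&2
        return 1
    fi
    echo "version OK: ${checked_version}"
}
```

Called as `check_kea_version "$TEST_KEA_VERSION"`, this would fail with a visible message when the agent is down and jq produces nothing, instead of leaving only the jq parse error in the log.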


Paride Legovini (paride) wrote :

I also happened to notice:

474s ++ jq -r '.[0].text'
475s jq: parse error: Invalid numeric literal at line 1, column 7

which is maybe a side effect of the crash?

Andreas Hasenack (ahasenack) wrote :

> which is maybe a side effect of the crash?

Yeah, that's what I thought; who knows what jq got on stdin.

Andreas Hasenack (ahasenack) wrote :

We were able to reproduce this with the -2 build, but not the -1 one. A simple "dpkg-reconfigure kea-ctrl-agent" on ppc64el crashes with a segfault:

[Fri Mar 1 13:34:41 2024] kea-ctrl-agent[6366]: segfault (11) at 0 nip 63821b5fbafc lr 63821b5fbad8 code 1 in libkea-asiolink.so.56.0.0[63821b5d0000+70000]
[Fri Mar 1 13:34:41 2024] kea-ctrl-agent[6366]: code: ebd40030 38628460 4bff064d 60000000 e9230000 4800001c 60000000 60000000
[Fri Mar 1 13:34:41 2024] kea-ctrl-agent[6366]: code: 60000000 e9290010 2c290000 41820014 <e9490000> 7c3e5000 4082ffec e9290008
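
The kernel line gives enough to place the faulting instruction relative to the library mapping: nip is the instruction address, and the bracketed pair is the start address and size of the mapping. A small sketch of the arithmetic follows, using only values from the log (the variable names are mine). The result is an offset from the start of the mapping, which equals a file offset only if the mapping begins at file offset 0. That is an assumption; the log does not confirm it.

```shell
# Values copied from the kernel log line above.
nip=0x63821b5fbafc        # faulting instruction address
map_base=0x63821b5d0000   # start of the libkea-asiolink.so.56.0.0 mapping
map_size=0x70000          # size of that mapping

# Offset of the faulting instruction from the start of the mapping.
offset=$(( $nip - $map_base ))
printf 'offset into mapping: 0x%x\n' "$offset"

# Sanity check: the offset should fall inside the mapping.
if [ "$offset" -lt $(( $map_size )) ]; then
    echo "nip is inside the mapping"
fi
```

The "at 0" in the same log line is the faulting address, which points to a NULL pointer read. With matching debug symbols installed, an offset like this could be given to a tool such as addr2line to look up a source location, subject to the file-offset assumption above.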

Andreas Hasenack (ahasenack) wrote :

The other daemons are also crashing, and it happens on shutdown. Whether they are stopped via systemctl stop or started in the foreground and interrupted with ctrl-c, they segfault.
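
A shutdown crash like this can be checked outside systemd with a small helper that starts a process, sends it SIGTERM, and classifies the exit status. This is a rough sketch, not part of the package's tests. The helper name and the messages are made up for illustration, and it relies on the usual shell convention that a process killed by signal N is reported as status 128 + N.

```shell
# Start a command, send it SIGTERM after a delay, and report how it ended.
# Usage: term_and_report DELAY_SECONDS COMMAND [ARGS...]
# (hypothetical helper, not taken from the isc-kea test suite)
term_and_report() {
    delay="$1"
    shift
    "$@" &
    pid=$!
    sleep "$delay"
    kill -TERM "$pid" 2>/dev/null
    wait "$pid"
    status=$?
    case "$status" in
        0)   echo "clean exit" ;;
        139) echo "killed by SIGSEGV" ;;   # 128 + 11
        143) echo "killed by SIGTERM" ;;   # 128 + 15, no handler ran
        *)   echo "exit status $status" ;;
    esac
}
```

For example, `term_and_report 2 /usr/sbin/kea-ctrl-agent -c /etc/kea/kea-ctrl-agent.conf` would start the agent the same way the unit file does and then terminate it. On an affected build, the "killed by SIGSEGV" branch is the one to watch for.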

Andreas Hasenack (ahasenack) wrote :

Maybe boost related? Anyway, caught this backtrace:
(gdb) r -c /etc/kea/kea-ctrl-agent.conf
Starting program: /usr/sbin/kea-ctrl-agent -c /etc/kea/kea-ctrl-agent.conf
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/powerpc64le-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff67ef0e0 (LWP 69142)]
[New Thread 0x7ffff5fdf0e0 (LWP 69143)]
[New Thread 0x7ffff57cf0e0 (LWP 69144)]
[New Thread 0x7ffff4fbf0e0 (LWP 69145)]
2024-03-01 14:07:19.107 INFO [kea-ctrl-agent.dctl/69140.140737354122528] DCTL_STARTING Control-agent starting, pid: 69140, version: 2.4.1 (stable)
INFO CTRL_AGENT_HTTP_SERVICE_STARTED HTTP service bound to address 127.0.0.1:8000
INFO DCTL_CONFIG_COMPLETE server has completed configuration: listening on 127.0.0.1, port 8000, control sockets: d2 dhcp4 dhcp6, requires basic HTTP authentication, 0 lib(s):
INFO CTRL_AGENT_STARTED Kea Control Agent version 2.4.1 started

Thread 1 "kea-ctrl-agent" received signal SIGTERM, Terminated.
Download failed: Invalid argument. Continuing without source file ./misc/../sysdeps/unix/sysv/linux/epoll_wait.c.
0x00007ffff7081444 in epoll_wait (epfd=<optimized out>, events=0x7fffffffde78, maxevents=<optimized out>, timeout=<optimized out>)
    at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
warning: 30 ../sysdeps/unix/sysv/linux/epoll_wait.c: No such file or directory
(gdb) c
Continuing.

Thread 1 "kea-ctrl-agent" received signal SIGSEGV, Segmentation fault.
0x00007ffff7bebafc in boost::asio::detail::call_stack<boost::asio::detail::thread_context, boost::asio::detail::thread_info_base>::contains (k=0x1000b6330)
    at /usr/include/boost/asio/detail/call_stack.hpp:98
98 if (elem->key_ == k)
(gdb) bt full
#0 0x00007ffff7bebafc in boost::asio::detail::call_stack<boost::asio::detail::thread_context, boost::asio::detail::thread_info_base>::contains (k=0x1000b6330)
    at /usr/include/boost/asio/detail/call_stack.hpp:98
        elem = 0x0
        elem = <optimized out>
#1 boost::asio::detail::scheduler::compensating_work_started (this=0x1000b6330) at /usr/include/boost/asio/detail/impl/scheduler.ipp:330
        this_thread = <optimized out>
        this_thread = <optimized out>
#2 boost::asio::detail::epoll_reactor::perform_io_cleanup_on_block_exit::~perform_io_cleanup_on_block_exit (this=<optimized out>, this=<optimized out>)
    at /usr/include/boost/asio/detail/impl/epoll_reactor.ipp:751
No locals.
#3 boost::asio::detail::epoll_reactor::descriptor_state::perform_io (events=<optimized out>, this=0x1000d3730) at /usr/include/boost/asio/detail/impl/epoll_reactor.ipp:803
        io_cleanup = {reactor_ = 0x1000d2040, ops_ = {<boost::asio::detail::noncopyable> = {<No data fields>}, front_ = 0x0, back_ = 0x0}, first_op_ = 0x0}
        descriptor_lock = <optimized out>
        io_cleanup = <optimized out>
        descriptor_lock = <optimized out>
        j = <optimized out>
        op = <optimized out>
        status = <optimized out>
#4 boost::asio::detail::epoll_reactor::descriptor_state::do_complete (bytes_transferred=<optimized out>, ec=..., base=0x1000d3730, owner=0x1000b6330)
    at /usr/include/boost/asio/detail/impl/epoll_reactor.ipp:813
        op = ...


Andreas Hasenack (ahasenack) wrote :

A rebuild of 2.4.1-1 with noble release produced binaries that worked.

A rebuild of the same 2.4.1-1 with noble-proposed produced binaries that crashed on shutdown.

Andreas Hasenack (ahasenack) wrote :
tags: added: server-todo
Changed in isc-kea (Ubuntu):
assignee: nobody → Andreas Hasenack (ahasenack)
status: New → Triaged
Andreas Hasenack (ahasenack) wrote :

I'm looking at this again.

Changed in isc-kea (Ubuntu):
status: Triaged → In Progress
Andreas Hasenack (ahasenack) wrote :

I had a ppc64el machine where I reproduced this segfault before. I just logged in again, updated the environment to latest noble proposed, and did a new rebuild. Those new binaries didn't segfault anymore.

I'm now doing a "proper" build in a ppa, and will run the DEP8 tests there (or locally, if the infra is overwhelmed). If that confirms the "fix", I'll upload.

Andreas Hasenack (ahasenack) wrote :

Sorry, wrong analysis: the rebuild is still segfaulting.

Andreas Hasenack (ahasenack) wrote :

A rebuild of the package in noble release (not proposed) currently also segfaults on ppc64el :/

Andreas Hasenack (ahasenack) wrote :

When using the shutdown command via the API, kea-ctrl-agent does not segfault:

# kea-shell --auth-user kea-api --auth-password $(cat /etc/kea/kea-api-password) --host localhost shutdown
[ { "result": 0, "text": "Control Agent is shutting down" } ]

I had the server running inside gdb:
(...)
DEBUG CTRL_AGENT_RUN_EXIT application is exiting the event loop
INFO DCTL_SHUTDOWN Control-agent has shut down, pid: 460849, version: 2.4.1
[Thread 0x7ffff4fcf0e0 (LWP 460854) exited]
[Thread 0x7ffff57df0e0 (LWP 460853) exited]
[Thread 0x7ffff5fef0e0 (LWP 460852) exited]
[Thread 0x7ffff67ff0e0 (LWP 460851) exited]
[Inferior 1 (process 460849) exited normally]
(gdb)

So, normal exit. But when reacting to the TERM signal, it segfaults:
Thread 1 "kea-ctrl-agent" received signal SIGTERM, Terminated.
Download failed: Invalid argument. Continuing without source file ./misc/../sysdeps/unix/sysv/linux/epoll_wait.c.
0x00007ffff7091444 in epoll_wait (epfd=<optimized out>, events=0x7fffffffde88, maxevents=<optimized out>, timeout=<optimized out>)
    at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
warning: 30 ../sysdeps/unix/sysv/linux/epoll_wait.c: No such file or directory
(gdb) c
Continuing.

Thread 1 "kea-ctrl-agent" received signal SIGSEGV, Segmentation fault.
0x00007ffff7bebafc in boost::asio::detail::call_stack<boost::asio::detail::thread_context, boost::asio::detail::thread_info_base>::contains (k=0x1000b6330)
    at /usr/include/boost/asio/detail/call_stack.hpp:98
warning: Source file is more recent than executable.
98 if (elem->key_ == k)
(gdb)
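
For comparison with the SIGTERM path, the clean shutdown above can also be driven over HTTP, in the same way the smoke test's curl call sends version-get. The sketch below builds the JSON body and wraps the curl call. The function names are hypothetical. The address, port, user name, and password file path are the ones that appear in this report's logs. Leaving out "service" mirrors the kea-shell call above, which names no service and is answered by the Control Agent itself.

```shell
# Build the JSON body for a Kea command. With no service given, the body
# carries only the command, as in the kea-shell shutdown call above.
kea_payload() {
    command="${1:-}"
    service="${2:-}"
    if [ -n "$service" ]; then
        printf '{ "command": "%s", "service": [ "%s" ] }\n' "$command" "$service"
    else
        printf '{ "command": "%s" }\n' "$command"
    fi
}

# Hypothetical wrapper around the curl invocation seen in the smoke test.
# Assumes the password file and listening address from this report.
kea_api() {
    curl -s -u "kea-api:$(cat /etc/kea/kea-api-password)" \
        -X POST -H 'Content-Type: application/json' \
        -d "$(kea_payload "$@")" 127.0.0.1:8000
}
```

With these, `kea_api shutdown` would request the API shutdown that exits normally, and `kea_api version-get dhcp4` would reproduce the request from the smoke test.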

Andreas Hasenack (ahasenack) wrote :

I think a build with an older boost worked:
--- a/debian/control
+++ b/debian/control
@@ -19,8 +19,8 @@ Build-Depends:
  docbook-xsl,
  elinks,
  flex,
- libboost-dev,
- libboost-system-dev,
+ libboost1.74-dev,
+ libboost-system1.74-dev,
  liblog4cplus-dev,
  libpq-dev,
  libssl-dev,

Locally I don't get a segfault anymore. I'm now waiting for a DEP8 run on ppc64el to confirm.

That being said, I don't know yet if that's a viable solution, i.e., I don't know what the lifecycle of lib boost versions is in the archive. Maybe this 1.74 is about to be removed, or is in universe, I haven't checked any of that yet.

The other tip I got recently was to try a build with fewer optimizations, specifically -O2 instead of -O3. I remember other ppc64el-specific bugs caused by -O3 in the past. I'll try that next, with the current libboost version instead of the older 1.74 one.

Sergio Durigan Junior (sergiodj) wrote :

After some more debugging/testing, it seems like the problem is indeed caused by compiler optimization. More specifically, LTO. Disabling it seems to have done the trick, and I can't reproduce the segmentation fault anymore.

Basically:

ifeq ($(DEB_HOST_ARCH),ppc64el)
export DEB_BUILD_MAINT_OPTIONS += optimize=-lto
endif

in d/rules.

Paride Legovini (paride) wrote :

Thanks Sergio for thinking of LTO!

I opened:

https://salsa.debian.org/debian/isc-kea/-/merge_requests/54

We can't check from the CI logs that it actually disables LTO, but the diff _really_ seems right.
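
One way to check the effect locally, independent of the CI logs, is to ask dpkg-buildflags for the flags it would emit with the same DEB_BUILD_MAINT_OPTIONS setting and look for -flto. The helper below is only a sketch. Its name is made up, and it assumes LTO shows up in the flags as -flto or -flto=<something>.

```shell
# Return success if the given compiler flag string enables LTO.
# (hypothetical helper; matches "-flto" and "-flto=..." as whole words)
has_lto() {
    case " ${1:-} " in
        *" -flto "*|*" -flto="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use on a Debian/Ubuntu build host (not run here):
#   flags=$(DEB_BUILD_MAINT_OPTIONS=optimize=-lto dpkg-buildflags --get CFLAGS)
#   if has_lto "$flags"; then echo "LTO still enabled"; else echo "LTO disabled"; fi
```

The same check could be applied to the compiler command lines in a full build log, which is closer to what the package build actually does.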

Andreas Hasenack (ahasenack) wrote :
Changed in isc-kea (Ubuntu):
status: In Progress → Fix Committed
Changed in isc-kea (Ubuntu):
milestone: none → ubuntu-24.04-beta
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package isc-kea - 2.4.1-3build1

---------------
isc-kea (2.4.1-3build1) noble; urgency=medium

  * No-change rebuild for CVE-2024-3094

 -- Steve Langasek <email address hidden> Sun, 31 Mar 2024 18:32:35 +0000

Changed in isc-kea (Ubuntu):
status: Fix Committed → Fix Released