omshell returns inconsistent results or segfaults

Bug #1916931 reported by Bill MacAllister
This bug affects 17 people
Affects: isc-dhcp (Ubuntu)
Status: Confirmed
Importance: Medium
Assigned to: Unassigned
Milestone: none

Bug Description

I have just built an Ubuntu 20.04 server and installed isc-dhcp-server
4.4.1 on it, and I am seeing inconsistent results from omshell.
Initially omshell returns data as expected, but when I exit and re-enter
omshell, connections fail.

Here is the initial, working, session:

# omshell
> server localhost
> port 7911
> key omapi_key <the key>
> connect
obj: <null>
> new failover-state
obj: failover-state
> set name = "dhcp-failover"
obj: failover-state
name = "dhcp-failover"
> open
obj: failover-state
name = "dhcp-failover"
partner-address = c0:9d:e9:76:e9:55:00:00
partner-port = 00:00:02:07
local-address = 10:9d:e9:76:e9:55:00:00
local-port = 00:00:02:07
max-outstanding-updates = 00:00:00:0a
mclt = 00:00:01:2c
load-balance-max-secs = 00:00:00:03
load-balance-hba =
ff:ff:ff:ff:ff:ff:ff:ff:ff:ff:ff:ff:ff:ff:ff:ff:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
partner-state = 00:00:00:02
local-state = 00:00:00:02
partner-stos = 60:36:d0:68
local-stos = 60:36:8b:3b
hierarchy = 00:00:00:01
last-packet-sent = 00:00:00:00
last-timestamp-received = 00:00:00:00
skew = 00:00:00:00
max-response-delay = 00:00:00:3c
cur-unacked-updates = 00:00:00:00
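
omshell prints each attribute as colon-separated hex bytes, which makes the session above hard to read at a glance. The following is a minimal sketch for decoding them, assuming the four-byte values are big-endian (network order) unsigned integers. The helper name `decode_omapi_int` is made up for illustration and is not part of omshell or any library.

```python
def decode_omapi_int(value: str) -> int:
    """Decode colon-separated hex bytes such as '00:00:02:07'.

    Assumption: the bytes form a big-endian (network order) unsigned integer.
    """
    raw = bytes.fromhex(value.replace(":", ""))
    return int.from_bytes(raw, byteorder="big")


# Values copied from the working session output above.
SAMPLE = {
    "partner-port": "00:00:02:07",
    "mclt": "00:00:01:2c",
    "max-response-delay": "00:00:00:3c",
    "local-state": "00:00:00:02",
}

if __name__ == "__main__":
    for name, value in SAMPLE.items():
        print(f"{name} = {decode_omapi_int(value)}")
```

Under that assumption partner-port decodes to 519, mclt to 300, and max-response-delay to 60.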

Here is what I see when the connect fails. Well, just hangs really.

# omshell
> server localhost
> port 7911
> key omapi_key <the key>
> connect

And then I hit ctrl-c to break out and tried again:

# omshell
> server localhost
> port 7911
> key omapi_key <the key>
> connect
Segmentation fault (core dumped)

Note, the peer to this server is still running Ubuntu 18.04 with
isc-dhcp-server 4.3.5. Running the exact same commands on the peer
works reliably. (They are using the same python script to drive
omshell.) The DHCP server on the new system appears to be working
just fine as reported by omshell on the peer and systemctl.
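
Because a failing connect can either hang or segfault, a script that drives omshell has to handle both outcomes. The following is a rough sketch, not the reporter's script. It assumes omshell reads its commands from standard input; the function names, the 10-second timeout, and the status strings are all invented for illustration.

```python
import signal
import subprocess


def build_failover_query(key_name, key_secret, failover_name,
                         server="localhost", port=7911):
    """Return the omshell commands from the sessions above as one string."""
    commands = [
        f"server {server}",
        f"port {port}",
        f"key {key_name} {key_secret}",
        "connect",
        "new failover-state",
        f'set name = "{failover_name}"',
        "open",
    ]
    return "\n".join(commands) + "\n"


def describe_exit(returncode):
    """Turn a subprocess return code into a short status string."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX a negative return code means the child died from a signal.
        return "killed by " + signal.Signals(-returncode).name
    return f"exited with status {returncode}"


def run_omshell(script, timeout=10.0, omshell="omshell"):
    """Feed the script to omshell and return (status, stdout)."""
    try:
        proc = subprocess.run([omshell], input=script, capture_output=True,
                              text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so a hung
        # connect does not leave a stray omshell process behind.
        return "timed out", ""
    return describe_exit(proc.returncode), proc.stdout
```

A call such as `run_omshell(build_failover_query("omapi_key", "<the key>", "dhcp-failover"))` would then report a hang as "timed out" and a crash as "killed by SIGSEGV". A normal exit only means the process did not crash; the output still has to be checked for the expected attributes.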

I wondered whether the problem could be the mismatched versions
of isc-dhcp-server, so I shut down isc-dhcp-server on the 18.04 system
and got the same results.

I also tried using a python script with the pypureomapi module to
try and determine if the problem was in omshell or the server. I
got very similar results when I attempted to get information about
the failover state of the server. Interestingly, interrogating
the server about host information seems to work just fine.

This is a critical bug since I don't see how to fail over a DHCP
server that is running isc-dhcp-server on 20.04 without being able
to issue OMAPI commands.

I am attaching apport output to this bug report.

Tags: focal
Bill MacAllister (oill) wrote :

I imported version 4.4.2 from upstream and built for focal. The problem persists. OMAPI failover queries work sometimes and fail most of the time.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in isc-dhcp (Ubuntu):
status: New → Confirmed
tags: added: focal
Paweł Moll (pawel-moll) wrote :

I've noticed that the omshell as shipped in the package is doing bad things memory access wise...

$ valgrind /usr/bin/omshell
==735304== Memcheck, a memory error detector
==735304== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==735304== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==735304== Command: /usr/bin/omshell
==735304==
==735304== Conditional jump or move depends on uninitialised value(s)
==735304== at 0x48760C0: irs_resconf_load (in /usr/lib/x86_64-linux-gnu/libirs-export.so.161.0.1)
==735304== by 0x155D67: dhcp_dns_client_setservers (isclib.c:51)
==735304== by 0x15620B: dns_client_init (isclib.c:381)
==735304== by 0x15620B: dns_client_init (isclib.c:354)
==735304== by 0x1562BB: dhcp_context_create (isclib.c:246)
==735304== by 0x114C15: dhcpctl_initialize (dhcpctl.c:45)
==735304== by 0x11374B: main (omshell.c:118)
==735304==
> connect
==735304== Invalid read of size 8
==735304== at 0x14EC6B: omapi_one_dispatch (dispatch.c:676)
==735304== by 0x14F462: omapi_wait_for_completion (dispatch.c:455)
==735304== by 0x114D62: dhcpctl_connect (dhcpctl.c:115)
==735304== by 0x113F23: main (omshell.c:420)
==735304== Address 0x7010cf0 is 32 bytes inside a block of size 96 free'd
==735304== at 0x483CA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==735304== by 0x14A39F: dfree (alloc.c:203)
==735304== by 0x14A39F: omapi_object_dereference (alloc.c:709)
==735304== by 0x14E1DA: omapi_io_dereference (dispatch.c:39)
==735304== by 0x14E1DA: omapi_unregister_io_object (dispatch.c:410)
==735304== by 0x14B217: omapi_disconnect.part.0 (connection.c:528)
==735304== by 0x14B4C4: omapi_disconnect (connection.c:465)
==735304== by 0x14B4C4: omapi_connection_connect_internal (connection.c:728)
==735304== by 0x14B61C: omapi_connection_connect (connection.c:615)
==735304== by 0x14EC4E: omapi_one_dispatch (dispatch.c:693)
==735304== by 0x14F462: omapi_wait_for_completion (dispatch.c:455)
==735304== by 0x114D62: dhcpctl_connect (dhcpctl.c:115)
==735304== by 0x113F23: main (omshell.c:420)
==735304== Block was alloc'd at
==735304== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==735304== by 0x149F11: dmalloc (alloc.c:71)
==735304== by 0x14A094: omapi_object_allocate (alloc.c:543)
==735304== by 0x14DDDD: omapi_io_allocate (dispatch.c:39)
==735304== by 0x14DDDD: omapi_register_io_object (dispatch.c:223)
==735304== by 0x14BCA7: omapi_connect_list (connection.c:219)
==735304== by 0x14C012: omapi_connect (connection.c:96)
==735304== by 0x146BD6: omapi_protocol_connect (protocol.c:55)
==735304== by 0x114D21: dhcpctl_connect (dhcpctl.c:106)
==735304== by 0x113F23: main (omshell.c:420)
==735304==
dhcpctl_connect: no more

When I build ISC's dhcpd from their git repository manually, not using the source package, it works fine. I haven't got to the bottom of it, though.

Daniel C (daniel314) wrote :

I also noticed problems with omshell after upgrading to Ubuntu 20.04, and after finding this bug report I can confirm that compiling the ISC sources (w/o the Ubuntu/Debian patches) does resolve the problem(s) reported here.

Andrea (aturbiglio) wrote :

I confirm the bug in our environment with Ubuntu 20.04 and isc-dhcp-server 4.4.1.
omshell segfaults at random:
[Tue Sep 28 11:05:22 2021] omshell[4604]: segfault at 0 ip 000055623cdd06dc sp 00007ffd5a2c7c78 error 4 in omshell[55623cd97000+45000]
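
The kernel log line above carries more detail than it first appears to. A minimal sketch of pulling it apart follows, assuming the usual Linux x86 "segfault at" format and the x86 page-fault error-code bits (bit 0: page was present, bit 1: write access, bit 2: user mode). The function and field names are hypothetical.

```python
import re

SEGFAULT_RE = re.compile(
    r"(?P<comm>[^\s\[]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+)"
    r" ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<error>[0-9a-f]+)"
    r" in (?P<image>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Parse a kernel 'segfault at' line; return a dict, or None if no match."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        return None
    ip = int(match["ip"], 16)
    base = int(match["base"], 16)
    error = int(match["error"], 16)
    return {
        "command": match["comm"],
        "pid": int(match["pid"]),
        "fault_address": int(match["addr"], 16),
        # Offset of the faulting instruction inside the mapping named
        # in the log line (not necessarily an offset into the file).
        "offset_in_mapping": ip - base,
        "page_present": bool(error & 1),
        "write_access": bool(error & 2),
        "user_mode": bool(error & 4),
    }


if __name__ == "__main__":
    line = ("[Tue Sep 28 11:05:22 2021] omshell[4604]: segfault at 0 "
            "ip 000055623cdd06dc sp 00007ffd5a2c7c78 error 4 "
            "in omshell[55623cd97000+45000]")
    print(parse_segfault(line))
```

For the line reported here, the fault address is 0 and error 4 decodes to a user-mode read of a page that was not present, which is consistent with a read through a null pointer. The faulting instruction sits at offset 0x396dc within the omshell mapping.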

Fabian Goebel (fabiangoebel) wrote :

I can also confirm this bug with ubuntu 20.04 and isc-dhcp-server 4.4.1

I use isc-dhcp-server as part of a Foreman infrastructure, and with this bug it is not possible to automate isc-dhcp-server from Foreman.

The problem is solved for the moment because I compile isc-dhcp-server from source and install it over the package, but I don't think this should be the long-term solution.

For anyone interested, this is my current workaround: https://community.theforeman.org/t/foreman-proxy-failed-to-add-dhcp-reservation-for-new-vm/27091/3

Brendan Holmes (whiling) wrote :

I can confirm this bug with this package on Ubuntu 20.04 too; it does not happen when built from source.

Massimiliano Ballerini (mballerini) wrote :

I updated isc-dhcp-server to 4.4.1 from apt, and omshell crashes 100% of the time with my configuration.

I compiled isc-dhcp-server 4.4.3 from source and have been using the omshell 4.4.3 binary with isc-dhcp-server 4.4.1 without problems for 3 months now. The bug was probably either fixed in 4.4.3 or is a problem with how the 4.4.1 package was compiled.

Lukas Märdian (slyon) wrote :

Could somebody confirm if this issue is still happening with omshell v4.4.3 from the Ubuntu Kinetic package?

I wonder if this is a problem with the build environment (LTO, etc...), the Debian/Ubuntu distro patches, or if it was fixed upstream in between 4.4.1->4.4.3?

Changed in isc-dhcp (Ubuntu):
status: Confirmed → Incomplete
importance: Undecided → Medium
Launchpad Janitor (janitor) wrote :

[Expired for isc-dhcp (Ubuntu) because there has been no activity for 60 days.]

Changed in isc-dhcp (Ubuntu):
status: Incomplete → Expired
Bill MacAllister (oill) wrote :

Hmm, why would an unresolved bug be expired? The problem has been confirmed by more than one of us. I just finished testing with 22.04 and can report that the problem persists.

Seth Arnold (seth-arnold) wrote :

Bill, Lukas asked a question in comment #10 and set the bug to 'incomplete', hoping to get feedback from someone who could reproduce the problem. If you can provide an answer, please do set the bug back to 'confirmed' when answering.

Thanks

Bill MacAllister (oill) wrote :

> Bill, Lukas asked a question in comment #10 and set the bug to 'incomplete', hoping
> to get feedback from someone who could reproduce the problem. If you can provide an
> answer, please do set the bug back to 'confirmed' when answering.

I don't agree with Lukas' action. The bug is still valid for Jammy, the LTS version.
If Lukas wants to provide a package to test on Jammy I can do that, but setting up
a DHCP peer environment in a version of Ubuntu that I don't have a build environment
for will take quite a bit of effort. It is particularly onerous because I only use
LTS versions of Ubuntu, so even if the Kinetic package works it doesn't solve my problem
for another 2 years.

> Could somebody confirm if this issue is still happening with omshell v4.4.3 from
> the Ubuntu Kinetic package?

I suppose there is an alternative. I'll see if I can backport the 4.4.3 version from
Kinetic to Jammy.

I can confirm that the bug exists in 4.4.1 on Debian 11, Bullseye.

Bill

Bill MacAllister (oill) wrote :

Apologies for the poorly wrapped response.

I pulled down the isc-dhcp source package from Kinetic and built it for Jammy. My initial test showed that this does indeed seem to fix the problem.

There was an issue with 'gbp buildpackage' deleting some files in the source repository that caused the initial build to fail:

deleted: config.guess
deleted: config.sub
deleted: keama/tests/samples/example.conf.orig
deleted: keama/tests/samples/test-a6.conf.orig
deleted: keama/tests/samples/vmnet8.conf.orig

I just 'git commit'ed the deletions and the package built successfully.

Bill

Changed in isc-dhcp (Ubuntu):
status: Expired → New
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in isc-dhcp (Ubuntu):
status: New → Confirmed
Stefan Lasiewski (stefanlasiewski) wrote :

> Could somebody confirm if this issue is still happening with omshell v4.4.3 from the Ubuntu Kinetic package?

Unfortunately, the package requires a new version of libc. It's not a simple upgrade.

```
# apt list --upgradable
Listing... Done
isc-dhcp-server/kinetic 4.4.3-2ubuntu4 amd64 [upgradable from: 4.4.1-2.1ubuntu5.20.04.5]
#
# apt -q install isc-dhcp-server
Reading package lists...
Building dependency tree...
Reading state information...
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 isc-dhcp-server : Depends: libc6 (>= 2.36) but 2.31-0ubuntu9.9 is to be installed
E: Unable to correct problems, you have held broken packages.
#
```

Stefan Lasiewski (stefanlasiewski) wrote :

Thanks to Massimiliano's lead, I was able to work around this bug by replacing /usr/bin/omshell with a version freshly compiled from source. Note that I've only been using my custom version for about 20 minutes, so YMMV.

I really dislike replacing packages with homemade utilities, as they are a pain to maintain, so I chose to replace only /usr/bin/omshell instead of the whole package. Here are the steps I took:

```
systemctl stop isc-dhcp-server
curl -OL https://github.com/isc-projects/dhcp/archive/refs/tags/v4_4_3.tar.gz
tar zxf v4_4_3.tar.gz
cd dhcp-4_4_3
./configure
make
# NO make install
# Instead, replace the 1 utility that is causing us problems
mv /usr/bin/omshell /usr/bin/omshell.bad
cp dhcpctl/omshell /usr/bin/omshell
```

I will point out that the default version links against many more libraries than my custom version, so presumably one of these system libraries is responsible for the bug, as suggested at https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=995242#40

```
# ldd /usr/bin/omshell
        linux-vdso.so.1 (0x00007ffc617e4000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f62c6330000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f62c67fc000)
# ldd /usr/bin/omshell.bad
        linux-vdso.so.1 (0x00007ffe1ebf8000)
        libirs-export.so.161 => /lib/x86_64-linux-gnu/libirs-export.so.161 (0x00007fe6b1d18000)
        libdns-export.so.1109 => /lib/x86_64-linux-gnu/libdns-export.so.1109 (0x00007fe6b1ae3000)
        libisc-export.so.1105 => /lib/x86_64-linux-gnu/libisc-export.so.1105 (0x00007fe6b1a6c000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe6b187a000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fe6b1857000)
        libisccfg-export.so.163 => /lib/x86_64-linux-gnu/libisccfg-export.so.163 (0x00007fe6b1828000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fe6b1820000)
        libcrypto.so.1.1 => /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 (0x00007fe6b1549000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fe6b1daa000)
```
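
The difference between the two `ldd` listings above can also be computed rather than read off by eye. This is a small sketch using the output exactly as posted; the function names are hypothetical, and the result only lists which shared libraries the packaged binary loads in addition, not which one (if any) is at fault.

```python
def linked_libraries(ldd_output):
    """Return the first field of every non-empty line of ldd output."""
    return {line.split()[0] for line in ldd_output.splitlines() if line.strip()}


def extra_libraries(packaged, rebuilt):
    """Libraries loaded by the packaged binary but not by the rebuilt one."""
    return sorted(linked_libraries(packaged) - linked_libraries(rebuilt))


# ldd output for the rebuilt /usr/bin/omshell, as posted above.
REBUILT = """\
linux-vdso.so.1 (0x00007ffc617e4000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f62c6330000)
/lib64/ld-linux-x86-64.so.2 (0x00007f62c67fc000)
"""

# ldd output for the packaged binary (/usr/bin/omshell.bad), as posted above.
PACKAGED = """\
linux-vdso.so.1 (0x00007ffe1ebf8000)
libirs-export.so.161 => /lib/x86_64-linux-gnu/libirs-export.so.161 (0x00007fe6b1d18000)
libdns-export.so.1109 => /lib/x86_64-linux-gnu/libdns-export.so.1109 (0x00007fe6b1ae3000)
libisc-export.so.1105 => /lib/x86_64-linux-gnu/libisc-export.so.1105 (0x00007fe6b1a6c000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe6b187a000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fe6b1857000)
libisccfg-export.so.163 => /lib/x86_64-linux-gnu/libisccfg-export.so.163 (0x00007fe6b1828000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fe6b1820000)
libcrypto.so.1.1 => /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 (0x00007fe6b1549000)
/lib64/ld-linux-x86-64.so.2 (0x00007fe6b1daa000)
"""

if __name__ == "__main__":
    for name in extra_libraries(PACKAGED, REBUILT):
        print(name)
```

The extra entries include the `*-export` libraries, among them `libirs-export.so.161`, which also appears in the valgrind trace earlier in this report.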

I tried recompiling isc-dhcp-4.4.1 from Ubuntu's source, but it had the exact same error.
