Bionic: Luminous radosgw incompatible with libssl1.1

Bug #1822872 reported by Kellen Renshaw on 2019-04-02
This bug affects 1 person
Affects          Importance   Assigned to
ceph (Ubuntu)    High         Unassigned
Bionic           High         Eric Desrochers

Bug Description

[Impact]

Since the introduction of OpenSSL 1.1.1 in 18.04 LTS:
https://launchpad.net/bugs/1797386

This breaks the HTTPS service of Ceph clusters: radosgw fails to start when civetweb is configured for SSL.

# logs:
2019-04-02 16:40:14.846313 7ff8c1736000 0 starting handler: civetweb
2019-04-02 16:40:14.846397 7ff8c1736000 0 civetweb: 0x56114520d620: load_dll: libcrypto.so.1.1: cannot find CRYPTO_num_locks
2019-04-02 16:40:14.846424 7ff8c1736000 -1 ERROR: failed run
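
A quick way to spot this failure is to search the radosgw log for the civetweb load_dll message. The following is only a minimal sketch: the function name civetweb_ssl_load_failed is made up for illustration, and the log path in the usage line is an assumption that depends on how logging is configured on the node.

```shell
# Report whether a radosgw log shows civetweb failing to resolve an OpenSSL symbol.
# Usage: civetweb_ssl_load_failed <logfile>
# (hypothetical helper name, not part of Ceph or civetweb)
civetweb_ssl_load_failed() {
    # Matches lines such as:
    #   civetweb: 0x...: load_dll: libcrypto.so.1.1: cannot find CRYPTO_num_locks
    grep -Eq 'civetweb: .*load_dll: lib(ssl|crypto)\.so[^:]*: cannot find ' "$1"
}
```

Example (log path assumed): civetweb_ssl_load_failed /var/log/ceph/ceph-client.rgw.`hostname -s`.log && echo "civetweb could not load OpenSSL"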

[Test Case]

1) Generate a self-signed certificate or use an existing SSL certificate that is already in place.

If you want to create a PEM file for civetweb, instructions can be found here:
https://github.com/civetweb/civetweb/blob/master/docs/OpenSSL.md

** Note: "CivetWeb requires one certificate file in PEM format" **
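
For a throwaway test certificate, something along these lines may be enough. This is a sketch under assumptions, not the procedure from the civetweb documentation: the function name make_civetweb_pem, the file names, the key size, and the validity period are all illustrative choices.

```shell
# Create a self-signed certificate and combine it with its key into one PEM file.
# Usage: make_civetweb_pem <output_dir> <common_name>
# (hypothetical helper; file names server.key/.crt/.pem are illustrative)
make_civetweb_pem() {
    local dir=$1 cn=$2
    # Unencrypted 2048-bit RSA key plus a certificate valid for 365 days.
    openssl req -x509 -newkey rsa:2048 -nodes \
        -keyout "$dir/server.key" -out "$dir/server.crt" \
        -days 365 -subj "/CN=$cn" 2>/dev/null || return 1
    # civetweb reads the certificate and the private key from a single file.
    cat "$dir/server.crt" "$dir/server.key" > "$dir/server.pem"
    chmod 600 "$dir/server.key" "$dir/server.pem"
}
```

Example: make_civetweb_pem /etc/ssl "`hostname -f`" would produce /etc/ssl/server.pem, which can then be referenced by ssl_certificate in step 3.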

2) Enable logging and debugging in "/etc/ceph/ceph.conf"

Example:
------
log to syslog = true
err to syslog = true
clog to syslog = true
debug rgw = 10/5
debug civetweb = 1/10
------

http://docs.ceph.com/docs/mimic/rados/troubleshooting/log-and-debug/

3) From the radosgw node, modify "/etc/ceph/ceph.conf" as follows:
rgw_frontends = civetweb port=443s ssl_certificate=/<path_to_PEM_FILE>/<PEM_FILE>

4) Restart the daemon:
systemctl restart ceph-radosgw@rgw.`hostname -s`

5) Check the logs:
2019-04-10 12:02:53.535133 7fcd20c4e000 0 civetweb: 0x562d710ed620: load_dll: libcrypto.so.1.1: cannot find CRYPTO_num_locks

6) Check the radosgw status; with the affected package, it should have FAILED to start:
systemctl status ceph-radosgw@rgw.`hostname -s`

With the fixed package, radosgw should be 'Active' and have a LISTEN port on 443, as follows:

$ netstat -anputa | grep LISTEN | grep 443 # or any port mentioned in the configuration above.
tcp 0 0 0.0.0.0:443 0.0.0.0:* LISTEN 10153/radosgw
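
The listening check can be scripted so it gives a pass/fail result. The helper below is a sketch; radosgw_listening_on is a made-up name, and it only inspects text in the netstat-style format shown above.

```shell
# Succeed if the netstat-style output on stdin shows radosgw listening on <port>.
# Usage: netstat -lntp | radosgw_listening_on 443
# (hypothetical helper name)
radosgw_listening_on() {
    awk -v port="$1" '
        /LISTEN/ && /radosgw/ {
            # Look for a local-address field that ends in ":<port>".
            for (i = 1; i <= NF; i++)
                if ($i ~ (":" port "$")) found = 1
        }
        END { exit !found }'
}
```

Example: sudo netstat -lntp | radosgw_listening_on 443 && echo "radosgw is listening on 443"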

[Potential Regression]

* The same downgrade approach was taken for 'nodejs' via LP: #1798367

* The proposed packages have been tested on at least 2 different Ceph clusters impacted by the issue, and at various levels (no package update problem, radosgw now works fine when civetweb is configured over SSL, ...)

* Nothing can be worse than the current situation, considering that civetweb is non-functional when SSL is in use, due to the incompatibility with 1.1, and causes the radosgw daemon to fail.

* libssl1.0 and libssl1.1 are coinstallable ABIs so it shouldn't be a problem here.

* See the IRC discussion (xnox/jamespage) in comment #11

* All autopkgtest 'passed'
http://autopkgtest.ubuntu.com/packages/ceph

[Other Information]

* Adding OpenSSL 1.1 support has been explored and turned out to be non-trivial:
https://github.com/civetweb/civetweb/pull/384/commits
https://github.com/civetweb/civetweb/commit/adac9c916fa892ec5edce7b565803f1e62d304a2
https://github.com/civetweb/civetweb/commit/5d83900fd29fb6fa1cd604676cb0562dc984dcc9

http://docs.ceph.com/docs/bobtail/radosgw/troubleshooting/

See the IRC discussion in comment #11

[Original Description]

Bionic's radosgw package (version 12.2.11-0ubuntu0.18.04.1) can't run on Bionic, because the version of civetweb in Luminous is incompatible with libssl1.1, but it's built against libssl1.1.

This has been known about upstream for a while now, and as noted in the bug-tracker (https://tracker.ceph.com/issues/20696), it can be fixed by building Luminous in an environment that has only libssl1.0 available (or, in a more invasive manner, by incorporating a newer civetweb). A patch is in the tracker.ceph.com issue.
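
To see the incompatibility directly, one can compare which symbols the two libcrypto versions export. This is a sketch only: exports_symbol is a hypothetical helper that reads `nm -D --defined-only` output, and the library paths in the usage lines are assumptions for an amd64 Bionic install.

```shell
# Succeed if `nm -D --defined-only` output on stdin defines the symbol <name>.
# Usage: nm -D --defined-only <library> | exports_symbol CRYPTO_num_locks
# (hypothetical helper name)
exports_symbol() {
    # The symbol name is the last field; strip any "@VERSION" suffix before comparing.
    awk -v sym="$1" '
        { name = $NF; sub(/@.*/, "", name); if (name == sym) found = 1 }
        END { exit !found }'
}
```

Example (paths assumed):
nm -D --defined-only /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0 | exports_symbol CRYPTO_num_locks && echo "1.0: exported"
nm -D --defined-only /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 | exports_symbol CRYPTO_num_locks || echo "1.1: not exported"

Judging by the load_dll error above, the second command would be expected to report the symbol as missing.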

Eric Desrochers (slashd) wrote :

Here's what has been brought to my attention by someone impacted by the problem:
"
This is breaking our test cluster's https service, and blocks upgrading our production cluster to 18.04.

2019-04-02 16:40:14.846313 7ff8c1736000 0 starting handler: civetweb
2019-04-02 16:40:14.846397 7ff8c1736000 0 civetweb: 0x56114520d620: load_dll: libcrypto.so.1.1: cannot find CRYPTO_num_locks
2019-04-02 16:40:14.846424 7ff8c1736000 -1 ERROR: failed run
"

tags: added: sts
Eric Desrochers (slashd) wrote :

The following has also been brought to my attention:
"
I can confirm that adjusting the Build-Depends to build against the older libssl (from 1.1 to 1.0) works (per the patch you can see I added to the upstream issue).
"

Eric Desrochers (slashd) wrote :

Here's my thought process about this:

----
Package "12.2.11-0ubuntu0.18.04.1" uses: civetweb version "1.8"

Confirmation:
src/civetweb/include/civetweb.h:#define CIVETWEB_VERSION "1.8"
-----

While I'm sure that downgrading libssl in 'Build-Depends' works for that particular case, I am concerned about the potential regressions that downgrading libssl may introduce in Ceph, since Bionic Ceph has been built/tested against libssl 1.1. We would need to be very careful if we go that route IMHO.

For the moment, I see 3 options:
1) Downgrade libssl Build-Depends from 1.1 to 1.0 in order to make civetweb work, at the risk of introducing (or not) a potential Ceph regression / undesired Ceph behaviour change / ... (tbd)
2) Upgrade civetweb to adapt to 1.1 by identifying the right commits/patchset :

From what I read so far, it seems like there might be good potential candidates:
https://github.com/civetweb/civetweb/pull/384/commits
https://github.com/civetweb/civetweb/commit/adac9c916fa892ec5edce7b565803f1e62d304a2
https://github.com/civetweb/civetweb/commit/5d83900fd29fb6fa1cd604676cb0562dc984dcc9

3) Upgrade the civetweb version in the ceph source package to a version where libssl 1.1 is fully supported. (if doable/compatible/...)

Currently, option 2) is definitely my favourite approach.

I am not ruling out option 1), but I would prefer spending time investigating how feasible it is to backport the libssl 1.1 adaptation/fixes patchset into 1.8, and/or evaluating an upgrade of civetweb in Ceph from v1.8 to <RECENT_VERSION_INCLUDING_WHAT_IT_NEEDS>.

And of course, I would appreciate having the Openstack team's opinion about my chain of thought here.

Regards,
Eric

Eric Desrochers (slashd) wrote :

Additional information:

https://github.com/civetweb/civetweb/blob/master/RELEASE_NOTES.md

----------
Release Notes v1.10
Objectives: OpenSSL 1.1 support, add server statistics and diagnostic data
......
* OpenSSL 1.1 support
----------

Eric Desrochers (slashd) wrote :

# Ceph: debian/control (just like Ceph upstream)
Build-Depends: cmake (>= 3.5),
               cpio,
               cryptsetup-bin | cryptsetup,
               cython,
               cython3,
               debhelper (>= 9),
               ....
               libsnappy-dev,
=> libssl-dev,

$ apt-cache policy libssl-dev
libssl-dev:
  Installed: (none)
  Candidate: 1.1.0g-2ubuntu4.3
  Version table:
     1.1.0g-2ubuntu4.3 500
        500 http://us.archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages
     1.1.0g-2ubuntu4 500
        500 http://us.archive.ubuntu.com/ubuntu bionic/main amd64 Packages

$ apt-cache policy libssl1.0-dev
libssl1.0-dev:
  Installed: (none)
  Candidate: 1.0.2n-1ubuntu5.3
  Version table:
     1.0.2n-1ubuntu5.3 500
        500 http://us.archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages
     1.0.2n-1ubuntu5 500
        500 http://us.archive.ubuntu.com/ubuntu bionic/main amd64 Packages

Eric Desrochers (slashd) on 2019-04-03
Changed in ceph (Ubuntu Bionic):
status: New → Confirmed
importance: Undecided → Medium
Changed in ceph (Ubuntu):
status: New → Fix Released
Eric Desrochers (slashd) wrote :

Note that disco and cosmic use "1.10", which seems to contain the necessary 1.1 support.

* civetweb.h:
#define CIVETWEB_VERSION "1.10"
#define CIVETWEB_VERSION_MAJOR (1)
#define CIVETWEB_VERSION_MINOR (10)
#define CIVETWEB_VERSION_PATCH (0)
#define CIVETWEB_VERSION_RELEASED

James Page (james-page) wrote :

I think we probably just want to pick the civetweb fix that resolves the compatibility problem

Eric Desrochers (slashd) on 2019-04-10
description: updated
description: updated
Eric Desrochers (slashd) on 2019-04-12
description: updated
Dimitri John Ledkov (xnox) wrote :

In bionic we ship two OpenSSL: 1.0.2 and 1.1.0, both in main with security support. The latter is getting upgraded from 1.1.0 to 1.1.1.

If ceph in bionic only supports 1.0.2 abi, it should build-depend on libssl1.0-dev and use that both at build time and runtime.

If ceph in bionic supports 1.1.0 abi, it would build-depend on libssl-dev and use that both at build time and runtime.

libssl1.0 and libssl1.1 are coinstallable ABIs.
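
The build-dependency switch described above could be tried locally along these lines. This is only a sketch and not the actual SRU patch: use_libssl10_builddep is a made-up helper, it assumes GNU sed, and it assumes the dependency sits on its own line of debian/control as in the Build-Depends excerpt quoted earlier in this bug. Any matching runtime dependency change is not covered.

```shell
# Rewrite the libssl-dev build dependency to libssl1.0-dev in a control file.
# Usage: use_libssl10_builddep debian/control
# (hypothetical helper; requires GNU sed for -i and -E)
use_libssl10_builddep() {
    # Only touch an entry that is exactly "libssl-dev", optionally followed by a comma.
    sed -i -E 's/^([[:space:]]*)libssl-dev(,?)[[:space:]]*$/\1libssl1.0-dev\2/' "$1"
}
```

Example: run use_libssl10_builddep debian/control in an unpacked ceph source tree, then rebuild the package in a clean Bionic environment.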

Eric Desrochers (slashd) wrote :

# IRC Discussion on freenode (#ubuntu-devel)
[11:46:46] <xnox> slashd, jamespage - what am i missing about https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1822872 ? =)
[11:46:47] <ubottu>Launchpad bug 1822872 in ceph (Ubuntu Bionic) "Bionic: Luminous radosgw incompatible with libssl1.1" [Medium,Confirmed]
[11:47:10] <xnox> slashd, jamespage - libssl/libcrypto 1.0 and 1.1 are coinstallable and both are support in bionic, in main from now and until forever.
[11:47:13] <xnox> so what's broken?
[11:48:41] <xnox> slashd, jamespage - sound slike load_dll should be dll opening libcrypto.so.1.0 if that's what it expects?
[11:50:30] <xnox> slashd, jamespage - imho we should builddepend on libssl1.0 and set CIVETWEB_SSL_SSL_LIB and CIVETWEB_SSL_CRYPTO_LIB to versioned sonames of libssl.so.1.0 and libcrypto.so.1.0
[11:52:38] <jamespage> xnox, slashd: tbh I think that's fine
[11:52:57] <jamespage> I'm easy either way - slashd are you ok to SRU that?
[11:53:21] <slashd> jamespage, yeah I can SRU the libssl-dev downgrade to 1.0
[11:53:31] <xnox>jamespage, well reading the code, it sounds slightly harder. cause WITH_RADOSGW tries to build with SSL_INCLUDE_DIR
[11:54:35] <xnox>slashd, if it works. cause it does look that radosgw, rgw, civetweb all need to use libssl1.0-dev then.
[11:55:11] <xnox>slashd, ah, and that's all that does ssl there, so it's fine.
[12:04:48] <xnox> slashd, and we reverted and forced to use libssl1.0-dev with nodejs 8 in bionic
[12:05:01] <xnox> slashd, and one should use libssl-dev (aka 1.1) with nodejs in disco.

Eric Desrochers (slashd) on 2019-04-15
description: updated
Eric Desrochers (slashd) on 2019-04-15
description: updated
Eric Desrochers (slashd) on 2019-04-15
description: updated
Eric Desrochers (slashd) on 2019-04-15
Changed in ceph (Ubuntu Bionic):
status: Confirmed → In Progress
assignee: nobody → Eric Desrochers (slashd)
Eric Desrochers (slashd) on 2019-04-15
description: updated
Eric Desrochers (slashd) on 2019-04-16
description: updated
description: updated
Eric Desrochers (slashd) on 2019-04-16
description: updated
Eric Desrochers (slashd) on 2019-04-16
Changed in ceph (Ubuntu Bionic):
importance: Medium → High
Eric Desrochers (slashd) wrote :

Backporting the civetweb fix to resolve the compatibility problem turned out to be non-trivial, as the fix was made on top of changes that are not yet present in v1.8.

Eric Desrochers (slashd) on 2019-04-17
description: updated
Eric Desrochers (slashd) wrote :

An impacted user who tested a test package [12.2.11-0ubuntu0.18.04.1+testpkg15042019b2] containing the fix reported the following:

"
Hi,
..
All three machines are now running 12.2.11-0ubuntu0.18.04.1+testpkg15042019b2
...
and yes, they are all listening on the https port, and some light testing with s3cmd suggests the S3 service is properly operational again.
..
"

Changed in ceph (Ubuntu):
importance: Undecided → High
Eric Desrochers (slashd) wrote :

Uploaded to the bionic upload queue. Waiting for SRU verification team approval in order to start building in bionic-proposed.

- Eric

Eric Desrochers (slashd) on 2019-04-26
description: updated

Hello Kellen, or anyone else affected,

Accepted ceph into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ceph/12.2.11-0ubuntu0.18.04.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.
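
For testers, enabling -proposed might look roughly like the sketch below. The helper name proposed_source_line and the sources.list.d file name in the usage lines are assumptions; the wiki page linked above remains the reference, including its advice on pinning so that only selected packages come from -proposed.

```shell
# Print the apt source line for the -proposed pocket of a given Ubuntu release.
# Usage: proposed_source_line bionic
# (hypothetical helper name)
proposed_source_line() {
    printf 'deb http://archive.ubuntu.com/ubuntu/ %s-proposed restricted main multiverse universe\n' "$1"
}
```

Example (file name assumed):
proposed_source_line bionic | sudo tee /etc/apt/sources.list.d/ubuntu-proposed.list
sudo apt update
sudo apt install -t bionic-proposed radosgw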

Changed in ceph (Ubuntu Bionic):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-bionic
Eric Desrochers (slashd) wrote :

[VERIFICATION BIONIC]

Here's what has been brought to my attention from an impacted user testing the package found in bionic-proposed:

"
Hi,
I've tested these packages, and they seem to work OK.
I've done:
check bucket contents uploaded before upgrade still OK after upgrade
list, get, put, retrieve objects (check with md5sum), including small files and large (21G).
"

- Eric
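
A put/get round trip with a checksum comparison, similar to the user's test above, could be sketched as follows. The function names are made up, s3cmd is assumed to be already configured against the radosgw endpoint, and the bucket is assumed to exist.

```shell
# Succeed if two files have the same MD5 checksum.
# Usage: same_md5 <original> <downloaded>
same_md5() {
    [ "$(md5sum < "$1")" = "$(md5sum < "$2")" ]
}

# Upload a file, fetch it back, and compare checksums.
# Usage: s3_round_trip <local_file> <s3://bucket>
# (hypothetical helper; assumes a working s3cmd configuration)
s3_round_trip() {
    local src=$1 bucket=$2 out rc
    out=$(mktemp) || return 1
    s3cmd put "$src" "$bucket/" &&
        s3cmd get --force "$bucket/$(basename "$src")" "$out" &&
        same_md5 "$src" "$out"
    rc=$?
    rm -f "$out"
    return $rc
}
```

Example (bucket name assumed): s3_round_trip ./testfile s3://testbucket && echo "round trip OK"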

tags: added: verification-done-bionic
removed: verification-needed-bionic
Eric Desrochers (slashd) on 2019-04-30
description: updated
Eric Desrochers (slashd) wrote :

[VERIFICATION BIONIC - Part 2] ^

The entire test cluster has been upgraded from bionic-proposed (ceph). The test cluster has an RGW on every node, and the packages have tightly versioned dependencies, so the upgrade also covered the OSDs, mons, and mgrs.

Eric Desrochers (slashd) wrote :

[VERIFICATION BIONIC]

I have deployed a Ceph cluster using juju deploy and then have updated the entire cluster[1] to the ceph packages found in bionic-proposed (built against libssl1.0.0).

On the rgw node, I have set up an SSL certificate and instructed civetweb in /etc/ceph/ceph.conf to use SSL[2].

radosgw is now running just fine[3][4] and civetweb is listening on port 443 as it should[5].

[1] Ceph cluster:
.......
Unit         Workload  Agent  Machine  Public address  Ports   Message
ceph-mon/0*  active    idle   0        10.5.0.4                Unit is ready and clustered
ceph-osd/0*  active    idle   1        10.5.0.5                Unit is ready (1 OSD)
ceph-osd/1   active    idle   2        10.5.0.27               Unit is ready (1 OSD)
ceph-osd/2   active    idle   3        10.5.0.6                Unit is ready (1 OSD)
ceph-rgw/0*  active    idle   4        10.5.0.18       80/tcp  Unit is ready
......

[2] /etc/ceph/ceph.conf
[client.rgw.<HOSTNAME>]
......
rgw_frontends = civetweb port=443s ssl_certificate=/etc/ssl/server.pem
.......

[3] sudo systemctl status ceph-radosgw@rgw.`hostname -s`
● <email address hidden> - Ceph rados gateway
   Loaded: loaded (/lib/systemd/system/ceph-radosgw@.service; indirect; vendor preset: enabled)
   Active: active (running) since Wed 2019-05-01 19:51:55 UTC; 10min ago
 Main PID: 4225 (radosgw)
    Tasks: 580
   CGroup: /system.slice/system-ceph\<email address hidden>
           └─4225 /usr/bin/radosgw -f --cluster ceph --name client.rgw.juju-521d82-default-4 --setuser ceph --setgroup

May 01 19:59:59 juju-521d82-default-4 radosgw[4225]: 2019-05-01 19:59:59.208671 7f19095f0700 2 RGWDataChangesLog::Cha
May 01 20:00:21 juju-521d82-default-4 radosgw[4225]: 2019-05-01 20:00:21.208946 7f19095f0700 2 RGWDataChangesLog::Cha
May 01 20:00:43 juju-521d82-default-4 radosgw[4225]: 2019-05-01 20:00:43.209214 7f19095f0700 2 RGWDataChangesLog::Cha
May 01 20:01:05 juju-521d82-default-4 radosgw[4225]: 2019-05-01 20:01:05.209332 7f19095f0700 2 RGWDataChangesLog::Cha
May 01 20:01:27 juju-521d82-default-4 radosgw[4225]: 2019-05-01 20:01:27.209500 7f19095f0700 2 RGWDataChangesLog::Cha
May 01 20:01:49 juju-521d82-default-4 radosgw[4225]: 2019-05-01 20:01:49.209716 7f19095f0700 2 RGWDataChangesLog::Cha
May 01 20:01:56 juju-521d82-default-4 radosgw[4225]: 2019-05-01 20:01:56.129879 7f1907ded700 2 object expiration: sta
May 01 20:02:11 juju-521d82-default-4 radosgw[4225]: 2019-05-01 20:02:11.209902 7f19095f0700 2 RGWDataChangesLog::Cha
May 01 20:02:12 juju-521d82-default-4 radosgw[4225]: 2019-05-01 20:02:12.346598 7f1907ded700 2 object expiration: sto
May 01 20:02:33 juju-521d82-default-4 radosgw[4225]: 2019-05-01 20:02:33.210102 7f19095f0700 2 RGWDataChangesLog::Cha

[4] logs
May 1 19:51:56 juju-521d82-default-4 radosgw: 2019-05-01 19:51:56.115874 7f1924299000 0 starting handler: civetweb
May 1 19:51:56 juju-521d82-default-4 radosgw: 2019-05-01 19:51:56.186842 7f1924299000 1 mgrc service_daemon_register rgw.juju-521d82-default-4 metadata {arch=x86_64,ceph_version=ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stab...


description: updated

The verification of the Stable Release Update for ceph has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ceph - 12.2.11-0ubuntu0.18.04.2

---------------
ceph (12.2.11-0ubuntu0.18.04.2) bionic; urgency=medium

  * d/control: Use openssl1.0 at build and runtime as
    civetweb v1.8 is incompatible with 1.1 abi. It is
    only compatible starting with civetweb v1.10 and late.
    (LP: #1822872)

 -- Eric Desrochers <email address hidden> Fri, 26 Apr 2019 08:17:04 -0400

Changed in ceph (Ubuntu Bionic):
status: Fix Committed → Fix Released