updating user-keys after deployment doesn't get pushed to related application

Bug #1810917 reported by Jeff Hillman
This bug affects 2 people
Affects                     Status        Importance  Assigned to  Milestone
Gnocchi Charm               Fix Released  Undecided   Unassigned
OpenStack Ceph-Proxy Charm  Fix Released  High        Pen Gale

Bug Description

After a deployment from a juju bundle, if the user-keys are changed (or were entered incorrectly the first time around), running 'juju config ceph-proxy user-keys="client.something:new-ceph-key"' does not push the new key down to the applications related to ceph-proxy.

For a specific example, running

juju config ceph-proxy user-keys="client.glance:ABCDEFGHIJKLMNOPQRSTUV=="

does not change the key in /etc/ceph/ceph.client.glance.keyring on the glance server(s).

This is true for any related application, in our case glance, cinder-ceph and gnocchi.

We have to go in and change this manually. Also, jujud-unit-glance-X doesn't appear to be monitoring this file: if we change it manually, juju never reverts it to what it should have been.

This file is generated from the relation to ceph-proxy, so it appears the relation update or config-changed is broken for that relation.
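To confirm the symptom on a unit, the key stored in the keyring file can be compared with the one set in the config. The following is a minimal sketch, assuming the keyring uses the usual INI-style layout (a [client.glance] section with a "key = ..." entry). The helper name keyring_key and the sample file contents are illustrative and not taken from any charm.

```python
import configparser


def keyring_key(keyring_text, client):
    """Return the key stored for `client` in keyring text, or None."""
    # Interpolation is disabled so the key is read back verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(keyring_text)
    if not parser.has_section(client):
        return None
    return parser.get(client, "key", fallback=None)


# Hypothetical contents of /etc/ceph/ceph.client.glance.keyring
sample = "[client.glance]\n\tkey = OLDKEYOLDKEYOLDKEYOLDKEY==\n"
# The value passed to 'juju config ceph-proxy user-keys=...'
expected = "ABCDEFGHIJKLMNOPQRSTUV=="

stale = keyring_key(sample, "client.glance") != expected
print("keyring is stale" if stale else "keyring matches config")
```

On a real unit the text would be read from the keyring file instead of the sample string.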

Tags: cpe-onsite
Jeff Hillman (jhillman) wrote :

Subscribed field-high

Jeff Hillman (jhillman) wrote :

Approved by Nobuto

Ryan Beisner (1chb1n)
Changed in charm-ceph-proxy:
assignee: nobody → Andrew McLeod (admcleod)
importance: Undecided → Medium
milestone: none → 19.04
importance: Medium → High
Andrew McLeod (admcleod) wrote :

A resolution has been suggested, which involves adding these two lines:

https://github.com/openstack/charm-ceph-proxy/blob/master/hooks/ceph_hooks.py#L115-L116

to this config-changed handler:

https://github.com/openstack/charm-ceph-proxy/blob/master/hooks/ceph_hooks.py#L122

I will attempt to test this now
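The linked lines are not reproduced in this thread, so the following is only a sketch of the general idea: a config-changed handler that republishes the keys on every client relation, so related units see the new value. All names here (relations, publish_keys, config_changed, parse_user_keys) are hypothetical stand-ins, not the charm's code, and the whitespace-separated "name:key" format for multiple entries is an assumption.

```python
# Stand-ins for a charm's config and relation data; none of these names
# come from charm-ceph-proxy or charm-helpers.
relations = {"client:1": {}, "client:2": {}}
config = {"user-keys": "client.glance:OLDKEY=="}


def parse_user_keys(value):
    """Turn 'name:key name2:key2' into a dict (separator is assumed)."""
    return dict(item.split(":", 1) for item in value.split())


def publish_keys():
    """Write the current keys into every client relation."""
    keys = parse_user_keys(config["user-keys"])
    for rid in relations:
        relations[rid]["keys"] = dict(keys)


def config_changed():
    # Without this call, related units keep whatever was published
    # when the relation was first joined.
    publish_keys()


publish_keys()  # simulates the relation being joined at deploy time
config["user-keys"] = "client.glance:ABCDEFGHIJKLMNOPQRSTUV=="
config_changed()
print(relations["client:1"]["keys"]["client.glance"])
```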

Changed in charm-ceph-proxy:
status: New → In Progress
Andrew McLeod (admcleod) wrote :

Could we please see the bundle used for the deployment?

Jeff Hillman (jhillman) wrote :

Not easily. This is in a secure environment, and I can't pull files off. If there are specific sections you want, I can retype them, but a full bundle will take me a while to retype.

Just let me know, and if it's the full bundle you need, I will provide it.

Andrew McLeod (admcleod) wrote :

We have verified that there is a problem. It seems to be an issue with updating the actual keyring file. Continuing to investigate.

Andrew McLeod (admcleod) wrote :

The problem is specifically with charm-helpers. We've identified the issue and are working on a patch, to be followed by testing and approval.
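The charm-helpers patch itself is not shown in this thread. As an illustration only, here is one way a keyring writer can fail to pick up a changed key (returning early when the file already exists), next to a variant that rewrites the file whenever its content differs. Both functions and the render helper are hypothetical and are not the charm-helpers implementation.

```python
import os
import tempfile


def write_keyring_if_exists_skip(path, content):
    """Failure mode: an existing file is never refreshed."""
    if os.path.exists(path):
        return False
    with open(path, "w") as f:
        f.write(content)
    return True


def write_keyring_if_changed(path, content):
    """Rewrite the file whenever its content differs from `content`."""
    if os.path.exists(path):
        with open(path) as f:
            if f.read() == content:
                return False
    with open(path, "w") as f:
        f.write(content)
    return True


def render(client, key):
    """Render a keyring body (INI-style layout is assumed)."""
    return "[{}]\n\tkey = {}\n".format(client, key)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ceph.client.glance.keyring")
    write_keyring_if_changed(path, render("client.glance", "OLDKEY=="))
    new = render("client.glance", "ABCDEFGHIJKLMNOPQRSTUV==")
    skipped = write_keyring_if_exists_skip(path, new)  # stale key stays
    updated = write_keyring_if_changed(path, new)      # key is replaced
    with open(path) as f:
        final = f.read()
```

Comparing content before writing keeps the operation idempotent, so a hook can call it on every run without touching the file unnecessarily.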

Pen Gale (pengale) wrote :

I'm picking this up. Andrew has a fix, and the remaining work is to address review comments and discussion, and fix a failing test. PR on github: https://github.com/juju/charm-helpers/pull/265

Changed in charm-ceph-proxy:
assignee: Andrew McLeod (admcleod) → Pete Vander Giessen (petevg)
Pen Gale (pengale) wrote :

Charm helpers fixes merged. Next step is to get the right charms updated, as well as drop in functional tests to catch regressions, or adjacent bugs.

Pen Gale (pengale) wrote :

Gerrit reviews for the charmhelpers sync here: https://review.openstack.org/631269 (cinder-ceph), https://review.openstack.org/631672 (glance), https://review.openstack.org/631673 (cinder-backup), https://review.openstack.org/631675 (cinder), https://review.openstack.org/631676 (nova-compute)

Jeff Hillman (jhillman) wrote :

Is there a review going for gnocchi?

Pen Gale (pengale) wrote :

@jhillman: We didn't catch the call to ensure_ceph_keyring in gnocchi.

Since it's a reactive charm, we'll have to cut a release of charmhelpers, and then rebuild the charm.

I think that most of the people who can do a charmhelpers release are flying back from the product Sprint right now. I'll make a note to poke them after they get back.

Everything else is merged, and available on Next.

Corey Bryant (corey.bryant) wrote :

FYI, for the smoke failure: this is 'gnocchi status' failing. I thought I knew what the fix was for this, but I was wrong and have had a hard time debugging the issue. 'gnocchi status' is returning 500, which could mean many things. Unfortunately, even with all logging/debug on for apache2 and gnocchi, there are not many details available. It seems that gnocchi-api is segfaulting. I'd like to get a core dump to see a traceback, but I haven't been able to get one yet. I tried several releases and even the stable charm and still hit it. It's possible that a recent stable point release caused a regression, but I still need a traceback or a hint to figure out what's failing.

Corey Bryant (corey.bryant) wrote :

James Page was able to get a traceback from bionic-queens: http://paste.ubuntu.com/p/yMXhpP2jwg/

Corey Bryant (corey.bryant) wrote :

I confirmed that the path in the traceback above isn't taken for the 2 deployments I've been poking at (queens and rocky). Still digging, but no tracebacks yet.

Corey Bryant (corey.bryant) wrote :

gnocchi smoke (bionic-queens) was last successful against 4.2.5-0ubuntu1 (which is the latest version for queens) in this review: https://review.openstack.org/#/c/626199/ What has changed since?

Corey Bryant (corey.bryant) wrote :

mod-wsgi hasn't changed since bionic was released. apache2 was last changed in Nov 2018. Smoke fails for the stable charm too.

Corey Bryant (corey.bryant) wrote :

'gnocchi status' seems to work consistently with python2-gnocchi.

Corey Bryant (corey.bryant) wrote :

gnocchi status --debug: https://paste.ubuntu.com/p/vc5TDs6gyz/

I've tried all sorts of timeout and size options from here with no luck: https://modwsgi.readthedocs.io/en/develop/configuration-directives/WSGIDaemonProcess.html

Note that there were a few very random times that 'gnocchi status' was successful.

Corey Bryant (corey.bryant) wrote :

Note also that setting threads=1 and processes=1 in /etc/apache2/sites-enabled/gnocchi-api.conf doesn't help, though it does reduce resource usage on the gnocchi unit.

Corey Bryant (corey.bryant) wrote :

I went through all the package changes listed here for Dec and Jan and nothing jumps out as something that could affect gnocchi: https://lists.ubuntu.com/archives/bionic-changes/

One possibility is that a difference in behavior in the new versions of keystone or ceph is regressing the gnocchi status path.

Frode Nordahl (fnordahl) wrote :

I have caught a gdb backtrace from the Apache WSGI process after setting threads=1 and processes=1.

Although I have installed the libpython3.6 debug symbols package, the frames do not appear to contain useful information.

The complete backtrace is included below:
Thread 4 "(wsgi:gnocchi-a" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fdc337d8700 (LWP 25792)]
0x00007fdc361ba9ee in _PyCFunction_FastCallDict () from target:/usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
(gdb) bt
#0 0x00007fdc361ba9ee in _PyCFunction_FastCallDict () from target:/usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
#1 0x00007fdc3620fb61 in _PyObject_FastCallDict () from target:/usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
#2 0x00007fdc361f50e5 in ?? () from target:/usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
#3 0x00007fdc28405ffc in __Pyx_PyObject_Call (func=0x7fdc28b31c60, arg=0x7fdbf3424b88, kw=0x7fdbf3406f78)
    at ./obj-x86_64-linux-gnu/src/pybind/rados3/pyrex/rados.c:74295
#4 0x00007fdc2848ac14 in __pyx_pf_5rados_8requires_7wrapper_validate_func (__pyx_v_kwargs=0x7fdbf3406f78, __pyx_v_args=0x7fdbf3424b88,
    __pyx_self=<optimized out>) at ./obj-x86_64-linux-gnu/src/pybind/rados3/pyrex/rados.c:5406
#5 __pyx_pw_5rados_8requires_7wrapper_1validate_func (__pyx_self=<optimized out>, __pyx_args=0x7fdbf3424b88, __pyx_kwds=<optimized out>)
    at ./obj-x86_64-linux-gnu/src/pybind/rados3/pyrex/rados.c:4829
#6 0x00007fdc3620fa99 in _PyObject_FastCallDict () from target:/usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
#7 0x00007fdc36210126 in _PyObject_FastCallKeywords () from target:/usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
#8 0x00007fdc36122ae8 in ?? () from target:/usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
#9 0x00007fdc36127e65 in _PyEval_EvalFrameDefault () from target:/usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
#10 0x00007fdc3612263f in ?? () from target:/usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
#11 0x00007fdc36122d1e in ?? () from target:/usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
#12 0x00007fdc36127e65 in _PyEval_EvalFrameDefault () from target:/usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
#13 0x00007fdc361214a3 in ?? () from target:/usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
#14 0x00007fdc36122eeb in ?? () from target:/usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
#15 0x00007fdc36129092 in _PyEval_EvalFrameDefault () from target:/usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
#16 0x00007fdc3612263f in ?? () from target:/usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
#17 0x00007fdc36122d1e in ?? () from target:/usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
#18 0x00007fdc36129092 in _PyEval_EvalFrameDefault () from target:/usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
#19 0x00007fdc3612263f in ?? () from target:/usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
#20 0x00007fdc361230fe in PyEval_EvalCodeEx () from target:/usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
#21 0x00007fdc361e3863 in ?? () from target:/usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
#22 0x00007fdc362102d8 in PyObject_Call () from target:/usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
#23 0x00007fdc36127fb5 in _PyEval_EvalFrameDefault () from targ...

Frode Nordahl (fnordahl) wrote :

Moving tracking of the gnocchi python-rados traceback to bug 1813582

Changed in charm-gnocchi:
status: New → In Progress
milestone: none → 19.04
Vern Hart (vern) wrote :

Trying to decipher what is going on with this bug. Comments 15 thru 25 seem to be about an issue addressed by bug 1813582.

As far as I can tell, this bug is blocked by 1813582 but that bug is fix-committed. If so, are we ready to re-test the fix for this bug?

Are there plans to backport this fix to current stable (18.11) charms?

Pen Gale (pengale)
Changed in charm-ceph-proxy:
status: In Progress → Fix Committed
Changed in charm-gnocchi:
status: In Progress → Fix Committed
David Ames (thedac)
Changed in charm-ceph-proxy:
status: Fix Committed → Fix Released
Changed in charm-gnocchi:
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Change abandoned on charm-gnocchi (master)

Change abandoned by "James Page <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/charm-gnocchi/+/632578
Reason: This review is > 12 weeks without comment, and failed testing the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.
