ceph-mgr dashboard incompatible with cython >= 0.29 (disco)
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
ceph (Ubuntu) | | High | James Page |
Disco | | High | James Page |
Eoan | | High | James Page |
Bug Description
[Impact]
The ceph-mgr daemon is unable to load additional modules due to a new check in cython >= 0.29. This limits the functionality of the manager.
[Test Case]
Deploy ceph
Check /var/log/
Errors about loading rados module in subprocesses will be seen.
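A minimal sketch of the log check above, in Python. It assumes the log contains the message quoted in the original report ("Interpreter change detected"); the function name and the sample lines are hypothetical, and the log path is left for the caller to supply.

```python
from typing import Iterable, List

# Substring taken from the error message quoted in the original report.
MARKER = "Interpreter change detected"


def find_interpreter_errors(lines: Iterable[str]) -> List[str]:
    """Return the log lines that contain the cython interpreter-check error."""
    return [line.rstrip("\n") for line in lines if MARKER in line]


# Hypothetical sample lines; a real run would iterate over an open log file.
sample = [
    "mgr load Loading python module 'dashboard'\n",
    "Module 'dashboard' has failed dependency: Interpreter change detected - "
    "this module can only be loaded into one interpreter per process.\n",
]
hits = find_interpreter_errors(sample)
print(len(hits))
```

An empty result on an affected system would suggest the message is logged elsewhere, not that the bug is absent.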
[Regression Potential]
The upstream fix works around this issue by overriding the check that cython performs; the code works in a subprocess when loaded multiple times. Regression potential is low; cython may produce a longer-term fix, at which point we can drop this patch.
[Original Bug Report]
If Ubuntu is really committed to ceph as I think I've been reading: Notice the ceph dashboard went entirely broken in a major regression of the disco upgrade. It won't load at all in 13.2.4+
The detail is that ceph-mgr (and lots of ceph) relied on a non-feature in cython, to do with sub-interpreters, that went away in cython 0.29. The ceph folks responded with a hack/workaround to avoid the bug being noticed, and a requirement of the package for an earlier version of cython. This was done some weeks and months ago. Actually fixing the problem is a major project the ceph maintainers are struggling to engage, perhaps waiting for later versions of cython to provide a different way forward.
However, as of today, on disco this error message remains:
Module 'dashboard' has failed dependency: Interpreter change detected - this module can only be loaded into one interpreter per process.
The ceph primary development platform is Debian, on which the workaround has been available for some time.
However, in our Ubuntu case, a major feature of a core package (web health/
I urge quick attention to the necessary backports.
https:/
http://
http://
James Page (james-page) wrote : | #1 |
Changed in ceph (Ubuntu): | |
status: | New → Triaged |
importance: | Undecided → High |
assignee: | nobody → James Page (james-page) |
Changed in ceph (Ubuntu Eoan): | |
status: | Triaged → Fix Released |
Changed in ceph (Ubuntu Disco): | |
status: | New → Triaged |
importance: | Undecided → High |
assignee: | nobody → James Page (james-page) |
James Page (james-page) wrote : | #2 |
Cosmic and earlier are not impacted, as this only occurs with cython >= 0.29.
Neither are the UCA backports to bionic, which also has an older cython.
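As a rough illustration of the version boundary described above, the following Python sketch classifies a cython version string as affected or not. The function names are hypothetical, and only the leading major.minor part of the string is compared, so pre-release suffixes are ignored.

```python
import re
from typing import Tuple

# Threshold taken from the bug title: cython >= 0.29 triggers the check.
FIRST_AFFECTED = (0, 29)


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '0.29.2'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_affected(cython_version: str) -> bool:
    """True when the given cython version is 0.29 or newer."""
    return major_minor(cython_version) >= FIRST_AFFECTED
```

The version strings passed in would come from whatever cython the ceph package was built against; none are assumed here.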
Changed in ceph (Ubuntu Disco): | |
status: | Triaged → In Progress |
summary: |
- ceph-mgr Dashboard entirely broken in Disco
+ ceph-mgr dashboard incompatible with cython >= 0.29 (disco)
James Page (james-page) wrote : | #3 |
Test packages in:
https:/
The initial fix revealed some Python 3 syntax issues, which will be fixed at the same time.
Harry Coin (hcoin) wrote : | #4 |
Thanks for the effort. I see effort for ceph v12. Notice that for disco:
ceph -v
ceph version 13.2.4 (b10be4d44915a4
with /etc/apt/
/etc/apt/
deb http://
deb http://
deb http://
deb http://
deb http://
deb http://
deb http://
deb http://
deb http://
deb http://
Harry Coin (hcoin) wrote : | #5 |
Notice that we need the binaries to work in disco, which has ceph v13; your PPA is only v12. Also, it's not enough for users to be forced to build from source in Eoan (and what does Nautilus have to do with the manager web server?)
Here we are about a month after the initial report of this regression and it's still totally non-functional.
Harry Coin (hcoin) wrote : | #6 |
Looking at the ppa I see nothing that fixes this bug in Eoan, nor disco.
I think you should change 'fix released' in Eoan to 'still broken'.
James Page (james-page) wrote : | #7 |
description: | updated |
Harry Coin (hcoin) wrote : Re: [Bug 1832105] Re: ceph-mgr dashboard incompatible with cython >= 0.29 (disco) | #8 |
FYI, I'm attempting as suggested to use Nautilus via Eoan. I've learned
that if you have IPv6 enabled in disco's ceph.conf, none of the OSDs will
load in eoan / nautilus until you add ms_bind_ipv4 = false to ceph.conf.
Also the dashboard remains broken in eoan / ceph nautilus at least as
far as the simple 'do-release-upgrade --devel' provides. I wonder if
the dashboard really was tested before the announced 'fix released' was
posted for eoan.
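For anyone checking their own ceph.conf for the ms_bind_ipv4 situation described above, here is a minimal Python sketch. It assumes the options live in the [global] section, that only the literal values true/false are used, and that an unset ms_bind_ipv4 behaves as enabled; the function name is hypothetical.

```python
import configparser


def ipv6_without_ipv4_disabled(conf_text: str) -> bool:
    """True when [global] sets ms_bind_ipv6 = true but not ms_bind_ipv4 = false."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("global"):
        return False
    # Option names may be written with spaces, dashes or underscores; normalise.
    opts = {
        key.replace(" ", "_").replace("-", "_"): value.strip().lower()
        for key, value in parser.items("global")
    }
    ipv6_on = opts.get("ms_bind_ipv6", "false") == "true"
    ipv4_off = opts.get("ms_bind_ipv4", "true") == "false"
    return ipv6_on and not ipv4_off
```

A True result only flags the combination reported in this comment; it does not prove the OSDs will fail to start.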
I don't know all of the causes for the dashboard being broken but one
of them is systemd appears to create manager services for the hostname
and for the hostname.
module enable dashboard --force" fails to create a manager with a
working dashboard instance.
Here we see a little example of why our linux world faces problems in
acceptance. It's one thing for a release to offer a new feature that's
somewhat broken. It's a whole other thing for a major user-facing
feature (dashboard) of an enterprise/core system (fail-tolerant storage)
next release to obviously never have been tested beforehand and ship
broken. You want to trust that doesn't happen and not be nervous when
doing release upgrades.
You can understand how that could happen in an entirely community
supported distro but I've seen it in both RHEL (viz: freeipa) and
Ubuntu/ceph.
I appreciate the suggested 'solution' to move to the next version
development set to be released in 4 months. But then that not only
doesn't restore the desired module but brings the whole cluster offline
until a non-documented flag gets set (ms_bind_ipv4 isn't documented that
I could find, ms_bind_ipv6 is.)
I'm sharing this experience not to complain as such but for
information. Ubuntu ships with so many notifications about available
upgrades of security and other sorts every log in one feels they must be
ready for prime time or Canonical wouldn't have pushed them out. Then a
big stopper like this happens.
On 7/12/19 8:33 AM, James Page wrote:
> Sorry wrong PPA:
>
> https:/
> service/
James Page (james-page) wrote : | #9 |
@hcoin
I don't think anyone is recommending the solution to this bug is to use an unreleased development version of Ubuntu which by its nature has not been through full testing. That's just a part of the process - to be able to update a released version of Ubuntu we have to evidence that the same software bug has been fixed in the development release; otherwise when users upgrade to the new release, they regress the fix to this issue.
You'll note activity on this bug - it's moving forward and we will provide stable release updates for the fixes into 19.04 (Disco) which *is* the solution to this bug.
If you encounter separate issues please feel free to raise new bugs against the ceph package.
Harry Coin (hcoin) wrote : | #10 |
Here's some help for others facing this:
If the ceph dashboard was working before upgrading to disco (which
killed it in a regression), then your hope to get it working via upgrade
to nautilus (owing to 'fixed-released' advertising in the bug report)
was to move to ceph v14/nautilus available in ubuntu-eoan.
After 'do-release-upgrade --devel' to eoan / ceph nautilus on every
system running ceph do:
systemctl status ceph<esc> and make sure there is only one entry there
for every osd/mon/mgr/mds. On my system there were entries there with
the hostname and with the hostname.domainname as well.
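The duplicate check described above can be sketched in Python. This assumes unit names follow the templated form `ceph-<daemon>@<instance>.service`; the function name and the host names in the usage below are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Matches templated units such as 'ceph-mgr@node1.service'.
UNIT_RE = re.compile(r"^(ceph-(?:osd|mon|mgr|mds))@(.+)\.service$")


def duplicate_instances(units: Iterable[str]) -> Dict[str, List[str]]:
    """Group units whose instance names share the same short hostname."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for unit in units:
        match = UNIT_RE.match(unit)
        if match is None:
            continue  # not a templated ceph daemon unit
        daemon, instance = match.groups()
        short = instance.split(".", 1)[0]  # 'node1.example.com' -> 'node1'
        groups[f"{daemon}@{short}"].append(unit)
    # Keep only the daemons that appear under more than one instance name.
    return {key: names for key, names in groups.items() if len(names) > 1}
```

For example, passing `["ceph-mgr@node1.service", "ceph-mgr@node1.example.com.service"]` would report both units under `ceph-mgr@node1`, which is the hostname versus hostname.domainname duplication mentioned above.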
There are a number of other instructions involved in getting nautilus
running, see them here:
http://
Also, one of the OSDs that was not managed by LVM was ignored and not
started. I 'replaced it' with itself and it started backfilling normally.
On the systems meant to run the dashboard, this is now necessary:
apt install ceph-mgr-dashboard
The following will tell you 'the module is already enabled'.
And when you think you're done and ready to log in ... the screen
accepts your password then does nothing further other than redisplay the
login screen. If you put in the wrong password, it tells you. The
correct password does nothing. So, __on every instance of ceph mgr
even the ones you are not using __ you have to
ceph mgr module disable dashboard
then edit
/usr/share/
and change line 186 from self.lastUpdate = int(time.
to
self.lastUpdate = int(time.time())
Be sure to use spaces and not tabs.
Then
ceph mgr module enable dashboard
ceph dashboard ac-user-
And then, you get to where you were before the update to disco with a
working dashboard. Hopefully this saved you a day or two.
I'm no longer able/interested to test whether mimic's dashboard works in
disco, sorry. If I'd somehow known an official release would break a
major user facing function on something as central to operations as ceph
I would have skipped disco entirely and waited for eoan.
Hello Harry, or anyone else affected,
Accepted ceph into disco-proposed. The package will build now and be available at https:/
Please help us by testing this new package. See https:/
If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-
Further information regarding the verification process can be found at https:/
N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.
Changed in ceph (Ubuntu Disco): | |
status: | In Progress → Fix Committed |
tags: | added: verification-needed verification-needed-disco |
Rgpublic (rgpublic) wrote : | #12 |
Okay, I added the proposed repo and pinned the packages as described and then did this:
apt install ceph-mgr/
Afterwards, the error message disappeared: "HEALTH_OK". I only did this on the server where the mgr is currently active (ceph -s displayed which server is the active mgr). If I stop the service so that the active mgr changes to some other server where I didn't yet install the proposed packages, the error message appears again.
Summary: The proposed packages seem to be working as intended. The error message disappears. I can still access the Ceph filesystem and I can now access the Ceph dashboard again - everything seems to be working normally. Big thank you to everyone working on this.
One question: If the packages appear on the final non-proposed repository... What would I need to do to switch over to them so everything is back to normal?
James Page (james-page) wrote : | #13 |
ubuntu@
cluster:
id: 57558fde-
health: HEALTH_OK
services:
mon: 3 daemons, quorum juju-05ad98-
mgr: juju-05ad98-
osd: 3 osds: 3 up, 3 in
data:
pools: 3 pools, 44 pgs
objects: 1 objects, 14 B
usage: 3.0 GiB used, 27 GiB / 30 GiB avail
pgs: 44 active+clean
ubuntu@
ceph-mon:
Installed: 13.2.6-
Candidate: 13.2.6-
Version table:
*** 13.2.6-
500 http://
100 /var/lib/
13.
500 http://
13.
500 http://
13.
500 http://
ubuntu@
ceph-mgr:
Installed: 13.2.6-
Candidate: 13.2.6-
Version table:
*** 13.2.6-
500 http://
100 /var/lib/
13.
500 http://
13.
500 http://
13.
500 http://
tags: |
added: verification-done verification-done-disco removed: verification-needed verification-needed-disco |
James Page (james-page) wrote : | #14 |
@rgpublic
Once the update is released, you'll just need to upgrade your installed packages to pick up the new version; if you have already installed from disco-proposed then you won't get an update, as the binary is identical to the released update version.
Rgpublic (rgpublic) wrote : | #15 |
@james-page: Thanks a lot for the clarification! I already assumed that, but it's very good to know for sure what will happen.
Launchpad Janitor (janitor) wrote : | #16 |
This bug was fixed in the package ceph - 13.2.6-
---------------
ceph (13.2.6-
* d/p/bug1832105.patch: Cherry pick fix to avoid cython interpreter
check raising import error when loading ceph mgr modules
(LP: #1832105).
* d/p/mgr-*.patch: Misc fixes to resolve Python 3 syntax issues
(LP: #1835354).
-- James Page <email address hidden> Fri, 12 Jul 2019 12:03:05 +0100
Changed in ceph (Ubuntu Disco): | |
status: | Fix Committed → Fix Released |
The verification of the Stable Release Update for ceph has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
Eoan has Nautilus which has the required fix to the build process for the newer cython.