VM data loss after host reboot

Bug #1968369 reported by Felipe Alencastro
This bug affects 5 people
Affects             Status        Importance  Assigned to  Milestone
Ceph                New           Undecided   Unassigned
Ceph Monitor Charm  Fix Released  Critical    Unassigned
charms.ceph         Fix Released  Critical    Unassigned

Bug Description

After a nova-compute host reboot, all VMs on that host experience data loss and fail to boot. This is also reproducible by running kill -9 on an instance PID.

The issue is the same as https://bugs.launchpad.net/nova/+bug/1773449/ and the culprit seems to be Ceph renaming "osd blacklist" to "osd blocklist" as of Ceph Pacific (https://docs.ceph.com/en/latest/releases/pacific/).

Ceph-mon log:
2022-04-04T14:29:12.758+0000 7fc12a7fc700 0 log_channel(audit) log [INF] : from='client.? 100.126.4.26:0/1783852411' entity='client.nova-compute-prod-1b' cmd=[{"prefix": "osd blocklist", "blocklistop": "add", "addr": "100.126.4.26:0/716520341"}]: access denied
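
To find which client keys are affected, the denied entries can be picked out of the mon audit log. The following is a minimal sketch that assumes the log line format shown above; the function name denied_command is hypothetical and not part of Ceph or the charms.

```python
import json
import re

# Matches the tail of a mon audit entry: entity, JSON command list, verdict.
DENIED = re.compile(
    r"entity='(?P<entity>[^']+)' cmd=(?P<cmd>\[.*\]): access denied"
)


def denied_command(line):
    """Return (entity, command prefix) for a denied audit entry, else None."""
    match = DENIED.search(line)
    if match is None:
        return None
    commands = json.loads(match.group("cmd"))
    return match.group("entity"), commands[0]["prefix"]


# The audit entry quoted in this report, split only for readability.
line = (
    "2022-04-04T14:29:12.758+0000 7fc12a7fc700 0 log_channel(audit) log [INF] : "
    "from='client.? 100.126.4.26:0/1783852411' "
    "entity='client.nova-compute-prod-1b' "
    'cmd=[{"prefix": "osd blocklist", "blocklistop": "add", '
    '"addr": "100.126.4.26:0/716520341"}]: access denied'
)
print(denied_command(line))
```

An entity that shows up here with the prefix "osd blocklist" is a key whose mon caps still only allow the old command name.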

Ceph 16.2.6
Ceph-mon charm: 61
OpenStack Xena

Changed in charm-ceph-mon:
status: New → Confirmed
Changed in charm-ceph-mon:
importance: Undecided → Critical
Billy Olsen (billy-olsen) wrote :

Per the release notes [0], Ceph renamed the blacklist commands and permissions to blocklist.

[0] https://ceph.io/en/news/blog/2021/v16-2-0-pacific-released/#upgrade-from-pre-nautilus-releases-(like-mimic-or-luminous)

Changed in charms.ceph:
status: New → Confirmed
importance: Undecided → Critical
Changed in charms.ceph:
status: Confirmed → In Progress
Changed in charm-ceph-mon:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-ceph-mon (master)

Reviewed: https://review.opendev.org/c/openstack/charm-ceph-mon/+/837642
Committed: https://opendev.org/openstack/charm-ceph-mon/commit/444f91f559bad93ad7091871715257504942bf25
Submitter: "Zuul (22348)"
Branch: master

commit 444f91f559bad93ad7091871715257504942bf25
Author: Luciano Lo Giudice <email address hidden>
Date: Tue Apr 12 22:05:33 2022 -0300

    Update the charm to use the latest changes in charms.ceph

    Change-Id: I7aee1d27021e259367d6fe88002f996ab62a61c3
    Closes-Bug: #1968369

Changed in charm-ceph-mon:
status: In Progress → Fix Committed
Changed in charms.ceph:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to charms.ceph (master)

Reviewed: https://review.opendev.org/c/openstack/charms.ceph/+/837640
Committed: https://opendev.org/openstack/charms.ceph/commit/5745ed3ba856b34b0094cbc8e6aed312b142c019
Submitter: "Zuul (22348)"
Branch: master

commit 5745ed3ba856b34b0094cbc8e6aed312b142c019
Author: Luciano Lo Giudice <email address hidden>
Date: Tue Apr 12 21:46:04 2022 -0300

    Add permission in mon key for newly named command

    Starting with Pacific, the 'osd blacklist' command was renamed to
    'osd blocklist'. This patchset changes the allowed commands to
    reflect this change.

    Change-Id: If2169734f67d21c1c7c1b75677f14ebd0ea054ae
    Closes-Bug: #1968369

Changed in charm-ceph-mon:
milestone: none → 22.04
Changed in charm-ceph-mon:
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charms.ceph (stable/pacific)

Fix proposed to branch: stable/pacific
Review: https://review.opendev.org/c/openstack/charms.ceph/+/845849

OpenStack Infra (hudson-openstack) wrote : Fix merged to charms.ceph (stable/pacific)

Reviewed: https://review.opendev.org/c/openstack/charms.ceph/+/845849
Committed: https://opendev.org/openstack/charms.ceph/commit/d43881c20f4312ebc559e6d8b616c7479c29f7c5
Submitter: "Zuul (22348)"
Branch: stable/pacific

commit d43881c20f4312ebc559e6d8b616c7479c29f7c5
Author: Luciano Lo Giudice <email address hidden>
Date: Tue Apr 12 21:46:04 2022 -0300

    Add permission in mon key for newly named command

    Starting with Pacific, the 'osd blacklist' command was renamed to
    'osd blocklist'. This patchset changes the allowed commands to
    reflect this change.

    Change-Id: If2169734f67d21c1c7c1b75677f14ebd0ea054ae
    Closes-Bug: #1968369
    (cherry picked from commit 5745ed3ba856b34b0094cbc8e6aed312b142c019)

tags: added: in-stable-pacific
Changed in charm-ceph-mon:
status: Fix Released → Confirmed
Felipe Alencastro (falencastro) wrote :

This is still not working in pacific/edge charm version 113. I see the patch applied on all my units; however, the new permissions weren't assigned to existing auth entries:

ubuntu@juju-2752e1-0-lxd-0:~$ sudo ceph auth ls 2>/dev/null | egrep "(blocklist|blacklist)"
 caps: [mon] allow r; allow command "osd blacklist"
 caps: [mon] allow r; allow command "osd blacklist"
 caps: [mon] allow r; allow command "osd blacklist"
 caps: [mon] allow r; allow command "osd blacklist"

ubuntu@juju-2752e1-0-lxd-0:~$ grep -C1 blocklist /var/lib/juju/agents/unit-ceph-mon-0/charm/lib/charms_ceph/broker.py
    return ['mon', ('allow r, allow command "osd blacklist"'
                    ', allow command "osd blocklist"'),

As it stands, the fix only works for newly created pools/clients.
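
For existing keys, the missing permission would have to be added to the stored mon caps. Below is a minimal sketch of that transformation, assuming the cap strings look like the `ceph auth ls` output above and that a semicolon is an acceptable separator between grants, as in that output. The helper names and the "allow rwx" OSD cap in the usage lines are placeholders, not values taken from this deployment. Note that `ceph auth caps` replaces all caps of an entity, so the non-mon caps have to be restated alongside the updated mon cap.

```python
OLD_CAP = 'allow command "osd blacklist"'
NEW_CAP = 'allow command "osd blocklist"'


def with_blocklist(mon_caps):
    """Append the renamed command if the key only carries the old one."""
    if OLD_CAP not in mon_caps or NEW_CAP in mon_caps:
        return mon_caps  # nothing to do, or already updated
    return mon_caps + "; " + NEW_CAP


def auth_caps_argv(entity, mon_caps, other_caps):
    """Build a `ceph auth caps` argv; other_caps maps daemon type to caps."""
    argv = ["ceph", "auth", "caps", entity, "mon", with_blocklist(mon_caps)]
    for daemon, caps in other_caps.items():
        argv += [daemon, caps]
    return argv


# Placeholder OSD caps: read the real ones from `ceph auth get <entity>` first.
argv = auth_caps_argv(
    "client.nova-compute-prod-1b",
    'allow r; allow command "osd blacklist"',
    {"osd": "allow rwx"},
)
print(argv)
```

The sketch only builds the argument list; whether to run it, and with which OSD caps, depends on what each key currently holds.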

Changed in charms.ceph:
status: Fix Released → Confirmed
Alan Baghumian (alanbach) wrote :

I confirm. I just upgraded my home lab's Ceph Pacific charm from stable to edge (rev. 112) and see the same result:

$ juju ssh ceph-mon/leader 'sudo ceph auth ls 2>/dev/null | egrep "(blocklist|blacklist)"'
 caps: [mon] allow r; allow command "osd blacklist"
 caps: [mon] allow r; allow command "osd blacklist"
 caps: [mon] allow r; allow command "osd blacklist"

No change was made to the existing OSDs.

Alan Baghumian (alanbach) wrote :

Looking at the source code, it seems like these are getting applied to new pools only.

Perhaps this should be a "feature request" if existing pools need to follow suit.

Chris MacNaughton (chris.macnaughton) wrote :

I think the best approach may be to re-process all broker requests in the upgrade-charm hook, which should allow us to pick up updates like this there, and ensure that the latest permissions are updated onto keys.
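
The idea can be sketched roughly as follows. This is an illustration only, not the charm's actual code: stored_requests and process_request are hypothetical stand-ins for whatever the charm uses to persist and handle broker requests, and the sketch assumes that replaying a request is idempotent.

```python
def reprocess_broker_requests(stored_requests, process_request):
    """Replay every stored request so its key picks up current permissions."""
    responses = {}
    for request_id, request in stored_requests.items():
        responses[request_id] = process_request(request)
    return responses


# Hypothetical handler: rebuilds the mon cap string the way the patched
# broker.py quoted in this report does for new clients.
def process_request(request):
    mon_caps = ('allow r, allow command "osd blacklist"'
                ', allow command "osd blocklist"')
    return {"client": request["client"], "mon": mon_caps}


stored_requests = {"req-1": {"client": "client.nova-compute-prod-1b"}}
responses = reprocess_broker_requests(stored_requests, process_request)
print(responses["req-1"]["mon"])
```

Called from an upgrade-charm hook, a loop of this shape would let keys created before the fix receive the same caps as newly created ones.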

OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-ceph-mon (master)
Changed in charm-ceph-mon:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charms.ceph (master)
Changed in charms.ceph:
status: Confirmed → In Progress
Changed in charm-ceph-mon:
status: In Progress → Fix Released
Changed in charms.ceph:
status: In Progress → Fix Released