Jewel -> Luminous upgrade actions: filestore -> bluestore and msgr1 -> msgr2 upgrade path

Bug #1710477 reported by Dmitrii Shcherbakov on 2017-08-13
This bug affects 2 people
Affects:
  OpenStack ceph-mon charm
  OpenStack ceph-osd charm

Bug Description

Given that Ceph Luminous will ship with the Pike Ubuntu Cloud Archive, a natural question is how to upgrade to Pike, covering not only the OpenStack packages but also the ceph packages.

ceph 12.1.2-0ubuntu2

There are several major changes coming, including:

* a change of the wire protocol

    TYPE_LEGACY = 1, ///< legacy msgr1 protocol (ceph jewel and older)
    TYPE_MSGR2 = 2, ///< msgr2 protocol (new in ceph kraken)

* a change of the default backing store from FileStore to BlueStore

There are several migration strategies available, but it is possible to run a cluster in mixed FileStore and BlueStore mode and perform a rolling upgrade.

The upgrade procedure from Jewel or Kraken is described at the link below. It includes steps that would be good to incorporate into charm actions, which could later be used in a more complex upgrade script.
"1. Ensure that the sortbitwise flag is enabled:

# ceph osd set sortbitwise


5. Set the noout flag for the duration of the upgrade. (Optional but recommended.):

# ceph osd set noout


6. Upgrade monitors by installing the new packages and restarting the monitor daemons. Note that, unlike prior releases, the ceph-mon daemons must be upgraded first <---- !!!
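
As a rough illustration of the ordering constraint in the steps quoted above, here is a minimal Python sketch. It only covers the steps quoted in this report (set sortbitwise, set noout, monitors first). The helper names and the "upgrade <unit>" step strings are hypothetical and are not part of any charm; the unit names follow the usual Juju "application/number" form.

```python
# Hypothetical planning helper; it only builds a list of steps and runs nothing.
PRE_UPGRADE_COMMANDS = [
    ["ceph", "osd", "set", "sortbitwise"],
    ["ceph", "osd", "set", "noout"],
]


def order_units_for_upgrade(units):
    """Return units with ceph-mon units first, keeping relative order otherwise."""
    mons = [u for u in units if u.startswith("ceph-mon/")]
    others = [u for u in units if not u.startswith("ceph-mon/")]
    return mons + others


def upgrade_plan(units):
    """Build a human-readable plan: cluster flags first, then units, mons first."""
    steps = [" ".join(cmd) for cmd in PRE_UPGRADE_COMMANDS]
    steps += ["upgrade " + unit for unit in order_units_for_upgrade(units)]
    return steps


if __name__ == "__main__":
    for step in upgrade_plan(["ceph-osd/0", "ceph-mon/0", "ceph-osd/1"]):
        print(step)
```

A real upgrade script would replace the "upgrade <unit>" placeholders with whatever mechanism the charms end up exposing.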


Dmitrii Shcherbakov (dmitriis) wrote :

For a converged architecture where nova-compute resides on the same host as ceph-osd, there is another potential problem: if nova-compute is upgraded before ceph-osd, the ceph packages will be upgraded as well by `apt upgrade`, before the ceph-mons are upgraded on other units (although an OSD process may keep running the old version until it is explicitly restarted).

The same applies to charm-neutron-gateway and charm-ceph-osd if they are collocated on the same logical host.

Even though this is not directly related to the ceph-osd charm, it is an important side effect to keep in mind.

The converged architecture issue shouldn't be too problematic as the Ubuntu packages for Ceph intentionally do *not* restart running ceph* services on upgrade specifically to assist with not taking down the world.
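
To make the "packages upgraded, daemon still old" situation concrete, here is a small Python sketch that compares the major release of an installed package against that of a running daemon. It is only an illustration: the function names are hypothetical, and how the two version strings are obtained is left out (the sample strings below are made up apart from the 12.1.2-0ubuntu2 package version mentioned in the description).

```python
import re


def major_release(version_text):
    """Extract the major version number from text containing 'X.Y.Z'."""
    match = re.search(r"(\d+)\.\d+\.\d+", version_text)
    if match is None:
        raise ValueError("no version found in %r" % version_text)
    return int(match.group(1))


def needs_restart(installed, running):
    """True when the installed package is a different major release than the daemon."""
    return major_release(installed) != major_release(running)


if __name__ == "__main__":
    # Illustrative inputs: a Luminous package next to a still-running Jewel OSD.
    print(needs_restart("12.1.2-0ubuntu2", "ceph version 10.2.7"))
```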

James Page (james-page) wrote :

I think there are a few potential bugs here; specifically, please can we break out:

1) feature to convert an existing filestore based cluster to bluestore

this should be end-user driven via actions or suchlike.

2) a change of a wire protocol

How is this implemented in the cluster? I can't see anything explicit in the release notes, so I'm assuming it has something to do with the minimum OSD version required?

3) ceph osd set sortbitwise

That should be part of the upgrade process right now; it's probably OK generally without it, as most clusters will already be in this state.

James Page (james-page) on 2017-09-04
Changed in charm-ceph-osd:
status: New → Incomplete
Changed in charm-ceph-mon:
status: New → Incomplete
Dmitrii Shcherbakov (dmitriis) wrote :

1. Agreed. A change of a storage backend does not affect what gets transferred over the wire so this can be an action-based feature.

2. Looking at this again, I can see
  typedef enum {
    TYPE_NONE = 0,
    TYPE_LEGACY = 1, ///< legacy msgr1 protocol (ceph jewel and older)
    TYPE_MSGR2 = 2, ///< msgr2 protocol (new in ceph kraken)
  } type_t;
  static const type_t TYPE_DEFAULT = TYPE_LEGACY;

So it's still TYPE_LEGACY as a default and I can only see plumbing code for msgr2:

doc/dev/msgr2.rst|1| msgr2 protocol
src/msg/|74| } else if (strncmp("msgr2:", s, 6) == 0) {
src/msg/msg_types.h|208| TYPE_MSGR2 = 2, ///< msgr2 protocol (new in ceph kraken)
src/msg/msg_types.h|215| case TYPE_MSGR2: return "msgr2";
src/pybind/|125| for t in ('legacy:', 'msgr2:'):

So, it doesn't look like the default protocol has been changed.
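
For readers less familiar with the C++ above, the effect of TYPE_DEFAULT can be sketched in Python. This is an illustration of the idea only, not Ceph's actual parser: the constants mirror the enum quoted above, the "legacy:" and "msgr2:" prefixes come from the grep output, and the function name and sample addresses are made up.

```python
# Values mirror the type_t enum quoted above.
TYPE_NONE, TYPE_LEGACY, TYPE_MSGR2 = 0, 1, 2
TYPE_DEFAULT = TYPE_LEGACY

PREFIXES = {"legacy:": TYPE_LEGACY, "msgr2:": TYPE_MSGR2}


def split_addr(text):
    """Return (type, address); an address without a prefix gets TYPE_DEFAULT."""
    for prefix, addr_type in PREFIXES.items():
        if text.startswith(prefix):
            return addr_type, text[len(prefix):]
    return TYPE_DEFAULT, text


if __name__ == "__main__":
    print(split_addr("10.0.0.1:6789"))
    print(split_addr("msgr2:10.0.0.1:6789"))
```

Under this reading, an existing configuration that never mentions a protocol keeps using the legacy type.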

As per release notes:
"Upgrade monitors by installing the new packages and restarting the monitor daemons. Note that, unlike prior releases, the ceph-mon daemons ***must*** be upgraded first"

I think that's the only consideration to be taken into account for now.

Apart from that, I've found the following:
"I hope to work on the msgr2 protocol change (which will enable encryption
on the wire) during the next cycle, but I definitely can't promise it'll
happen by luminous."

So, msgr2-related changes might include configurable encryption of all the storage traffic.

3. If it's idempotent I think it's better to set it in any case just to make sure.
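
If a script wants to report whether the flag was already present before setting it, something like the following Python sketch could work. It assumes the JSON form of `ceph osd dump` exposes the flags as a comma-separated string under a "flags" key; that layout is an assumption here and should be checked against the release in use. The function name and the sample input are hypothetical.

```python
import json


def sortbitwise_is_set(osd_dump_json):
    """Check for 'sortbitwise' in the (assumed) comma-separated 'flags' field."""
    flags = json.loads(osd_dump_json).get("flags", "")
    return "sortbitwise" in flags.split(",")


if __name__ == "__main__":
    sample = json.dumps({"flags": "noout,sortbitwise"})  # made-up input
    print(sortbitwise_is_set(sample))
```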


So the bottom line is that it seems like we only need two things for this bug:

1. document ceph-mon-first upgrade requirement;
2. provide actions to do per-unit filestore -> bluestore upgrades so that a more advanced script can loop over that.
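
The "more advanced script" in point 2 could be little more than a loop over the ceph-osd units. The Python sketch below only builds the command lines and runs nothing. The action name `migrate-to-bluestore` is purely hypothetical, since no such action exists at the time of writing, and the use of `juju run-action <unit> <action>` assumes a Juju 2.x client.

```python
def bluestore_action_commands(units, action="migrate-to-bluestore"):
    """Build one 'juju run-action' command per ceph-osd unit.

    The default action name is hypothetical; nothing is executed here.
    """
    return [
        ["juju", "run-action", unit, action]
        for unit in units
        if unit.startswith("ceph-osd/")
    ]


if __name__ == "__main__":
    for cmd in bluestore_action_commands(["ceph-mon/0", "ceph-osd/0", "ceph-osd/1"]):
        print(" ".join(cmd))
```

Running the units one at a time, and waiting for the cluster to recover between them, would keep this in line with the rolling, mixed FileStore/BlueStore approach mentioned in the description.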

James Page (james-page) wrote :

1) ok that's a great release note and addition to the ceph-mon/ceph-osd docs
2) that can happen next development cycle (not for 17.08)

Changed in charm-ceph-osd:
status: Incomplete → New
Changed in charm-ceph-mon:
status: Incomplete → New
James Page (james-page) on 2017-09-21
Changed in charm-ceph-mon:
status: New → Triaged
Changed in charm-ceph-osd:
status: New → Triaged
importance: Undecided → Wishlist
Changed in charm-ceph-mon:
importance: Undecided → Wishlist
Ante Karamatić (ivoks) on 2017-09-27
tags: added: cpe-onsite
removed: cpec