Upgrade from Nautilus to Octopus does not restart services, leaving them running Nautilus versions

Bug #1943854 reported by Chris Johnston
This bug affects 1 person
Affects                  Status   Importance  Assigned to  Milestone
Ceph Monitor Charm       Triaged  Medium      Unassigned
OpenStack Ceph-FS Charm  Triaged  Medium      Unassigned

Bug Description

$ juju export-bundle >> ~/juju_export_bundle_before_ceph_upgrade.txt # [1]
$ juju status >> ~/juju_status_before_ceph_upgrade.txt # [1]
$ juju run -u ceph-mon/leader -- sudo ceph -s && juju run -u ceph-mon/leader -- sudo ceph versions
  cluster:
    id: f2b72582-1703-11ec-82f3-fa163e15a8b3
    health: HEALTH_OK

  services:
    mon: 1 daemons, quorum juju-0d931d-ck-3 (age 66m)
    mgr: juju-0d931d-ck-3(active, since 66m)
    mds: ceph-fs:1 {0=juju-0d931d-ck-1=up:active} 2 up:standby
    osd: 3 osds: 3 up (since 65m), 3 in (since 65m)

  task status:
    scrub status:
        mds.juju-0d931d-ck-1: idle

  data:
    pools: 2 pools, 16 pgs
    objects: 22 objects, 2.2 KiB
    usage: 3.0 GiB used, 27 GiB / 30 GiB avail
    pgs: 16 active+clean

{
    "mon": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 1
    },
    "mgr": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 1
    },
    "osd": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 3
    },
    "mds": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 3
    },
    "overall": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 8
    }
}
$ juju upgrade-charm ceph-mon --revision 58
Added charm-store charm "ceph-mon", revision 58 in channel stable, to the model
Leaving endpoints in "alpha": admin, bootstrap-source, client, cluster, mds, mon, nrpe-external-master, osd, prometheus, public, radosgw, rbd-mirror
$ juju run -u ceph-mon/leader -- sudo ceph -s && juju run -u ceph-mon/leader -- sudo ceph versions
  cluster:
    id: f2b72582-1703-11ec-82f3-fa163e15a8b3
    health: HEALTH_OK

  services:
    mon: 1 daemons, quorum juju-0d931d-ck-3 (age 77m)
    mgr: juju-0d931d-ck-3(active, since 77m)
    mds: ceph-fs:1 {0=juju-0d931d-ck-1=up:active} 2 up:standby
    osd: 3 osds: 3 up (since 76m), 3 in (since 76m)

  task status:
    scrub status:
        mds.juju-0d931d-ck-1: idle

  data:
    pools: 2 pools, 16 pgs
    objects: 22 objects, 2.2 KiB
    usage: 3.0 GiB used, 27 GiB / 30 GiB avail
    pgs: 16 active+clean

{
    "mon": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 1
    },
    "mgr": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 1
    },
    "osd": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 3
    },
    "mds": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 3
    },
    "overall": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 8
    }
}
$ juju config ceph-mon source=cloud:bionic-ussuri
$ juju run -u ceph-mon/leader -- sudo ceph -s && juju run -u ceph-mon/leader -- sudo ceph versions
  cluster:
    id: f2b72582-1703-11ec-82f3-fa163e15a8b3
    health: HEALTH_OK

  services:
    mon: 1 daemons, quorum juju-0d931d-ck-3 (age 85m)
    mgr: juju-0d931d-ck-3(active, since 6m)
    mds: ceph-fs:1 {0=juju-0d931d-ck-1=up:active} 2 up:standby
    osd: 3 osds: 3 up (since 84m), 3 in (since 84m)

  task status:
    scrub status:
        mds.juju-0d931d-ck-1: idle

  data:
    pools: 2 pools, 16 pgs
    objects: 22 objects, 2.2 KiB
    usage: 3.0 GiB used, 27 GiB / 30 GiB avail
    pgs: 16 active+clean

{
    "mon": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 1
    },
    "mgr": {
        "ceph version 15.2.13 (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) octopus (stable)": 1
    },
    "osd": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 3
    },
    "mds": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 3
    },
    "overall": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 7,
        "ceph version 15.2.13 (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) octopus (stable)": 1
    }
}
$ juju status >> ~/juju_status_after_ceph_mon_upgrade.txt # [2]
$ juju export-bundle >> ~/juju_export_bundle_after_ceph_mon_upgrade.txt # [2]

### Note:
The 'mon' version still reports 14.2.18, while the mgr now reports 15.2.13.
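One way to confirm that the packages were upgraded but the running daemon was not restarted (a sketch only; the mon hostname is taken from the output above, and the expected results are inferred rather than captured from this deployment):

$ juju run -u ceph-mon/leader -- dpkg -l ceph-mon    # installed package should already show 15.2.x
$ juju run -u ceph-mon/leader -- sudo ceph tell mon.juju-0d931d-ck-3 version    # running daemon still reports 14.2.18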

[1] https://pastebin.canonical.com/p/ZWKMMRJN9F/
[2] https://pastebin.canonical.com/p/yN4SqPthwm/

Chris Johnston (cjohnston) wrote :

Manually restarting ceph-mon results in the new version running:
$ juju run -u ceph-mon/0 -- sudo systemctl restart ceph-mon.target
$ juju run -u ceph-mon/leader -- sudo ceph -s && juju run -u ceph-mon/leader -- sudo ceph versions
  cluster:
    id: f2b72582-1703-11ec-82f3-fa163e15a8b3
    health: HEALTH_WARN
            client is using insecure global_id reclaim
            mon is allowing insecure global_id reclaim
            2 pools have too few placement groups

  services:
    mon: 1 daemons, quorum juju-0d931d-ck-3 (age 114s)
    mgr: juju-0d931d-ck-3(active, since 108s)
    mds: ceph-fs:1 {0=juju-0d931d-ck-1=up:active} 2 up:standby
    osd: 3 osds: 3 up (since 99m), 3 in (since 99m)

  task status:
    scrub status:
        mds.juju-0d931d-ck-1: idle

  data:
    pools: 3 pools, 17 pgs
    objects: 22 objects, 2.7 KiB
    usage: 3.0 GiB used, 27 GiB / 30 GiB avail
    pgs: 17 active+clean

  io:
    client: 170 B/s wr, 0 op/s rd, 0 op/s wr

{
    "mon": {
        "ceph version 15.2.13 (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) octopus (stable)": 1
    },
    "mgr": {
        "ceph version 15.2.13 (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) octopus (stable)": 1
    },
    "osd": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 3
    },
    "mds": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 3
    },
    "overall": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 6,
        "ceph version 15.2.13 (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) octopus (stable)": 2
    }
}

Chris Johnston (cjohnston) wrote :

Also seeing very similar results with ceph-fs:

$ juju status >> ~/juju_status_before_ceph_fs_upgrade.txt # [3]
$ juju export-bundle >> ~/juju_export_bundle_before_ceph_fs_upgrade.txt # [3]
$ juju upgrade-charm ceph-fs --revision 43
Added charm-store charm "ceph-fs", revision 43 in channel stable, to the model
Adding endpoint "certificates" to default space "alpha"
Leaving endpoints in "alpha": ceph-mds, public
$ juju config ceph-fs source=cloud:bionic-ussuri
$ juju run -u ceph-mon/leader -- sudo ceph -s && juju run -u ceph-mon/leader -- sudo ceph versions
  cluster:
    id: f2b72582-1703-11ec-82f3-fa163e15a8b3
    health: HEALTH_WARN
            client is using insecure global_id reclaim
            mon is allowing insecure global_id reclaim
            3 OSD(s) reporting legacy (not per-pool) BlueStore omap usage stats
            2 pools have too few placement groups

  services:
    mon: 1 daemons, quorum juju-0d931d-ck-3 (age 20m)
    mgr: juju-0d931d-ck-3(active, since 20m)
    mds: ceph-fs:1 {0=juju-0d931d-ck-0=up:active} 2 up:standby
    osd: 3 osds: 3 up (since 9m), 3 in (since 118m)

  task status:
    scrub status:
        mds.juju-0d931d-ck-0: idle

  data:
    pools: 3 pools, 17 pgs
    objects: 22 objects, 2.7 KiB
    usage: 3.0 GiB used, 27 GiB / 30 GiB avail
    pgs: 17 active+clean

{
    "mon": {
        "ceph version 15.2.13 (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) octopus (stable)": 1
    },
    "mgr": {
        "ceph version 15.2.13 (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) octopus (stable)": 1
    },
    "osd": {
        "ceph version 15.2.13 (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) octopus (stable)": 3
    },
    "mds": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 1,
        "ceph version 15.2.13 (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) octopus (stable)": 2
    },
    "overall": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 1,
        "ceph version 15.2.13 (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) octopus (stable)": 7
    }
}

### Note:
It looks like one of the three mds units didn't get properly restarted. This appears to be the active mds.
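
A likely manual workaround, mirroring the ceph-mon restart in the previous comment (a sketch only; which ceph-fs unit runs on juju-0d931d-ck-0 is an assumption and should be checked in juju status first, and restarting the active mds will fail it over to a standby):

$ juju run -u ceph-fs/0 -- sudo systemctl restart ceph-mds.target    # assumes ceph-fs/0 is the unit on juju-0d931d-ck-0
$ juju run -u ceph-mon/leader -- sudo ceph versions    # confirm all mds daemons now report 15.2.13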

$ juju status >> ~/juju_status_after_ceph_fs_upgrade.txt # [4]
$ juju export-bundle >> ~/juju_export_bundle_after_ceph_fs_upgrade.txt # [4]

$ sudo ceph fs status
ceph-fs - 0 clients
=======
RANK  STATE   MDS               ACTIVITY    DNS  INOS
 0    active  juju-0d931d-ck-0  Reqs: 0 /s  10   13
POOL              TYPE      USED   AVAIL
ceph-fs_metadata  metadata  1536k  8693M
ceph-fs_data      data      0      8693M
STANDBY MDS
juju-0d931d-ck-1
juju-0d931d-ck-2
VERSION                                                                             DAEMONS
ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)   juju-0d931d-ck-0
ceph version 15.2.13 (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) octopus (stable)    juju-0d931d-ck-1, juju-0d931d-ck-2

[3] https://pastebin.canonical.com/p/TqqGxQ2n9z/
[4] https://pastebin.canonical.com/p/CcH6PMkp8G/

summary:
- Upgrade from Nautilus to Octopus does not restart services, leaving the
-   versions running Nautilus versions
+ Upgrade from Nautilus to Octopus does not restart services, leaving them
+   running Nautilus versions
tags: added: openstack-upgrade
Changed in charm-ceph-mon:
importance: Undecided → Medium
Changed in charm-ceph-fs:
importance: Undecided → Medium
Changed in charm-ceph-mon:
status: New → Triaged
Changed in charm-ceph-fs:
status: New → Triaged