during upgrade to Luminous the charm updated the kv store to say the upgrade was finished before all OSDs were up/in

Bug #1821028 reported by dongdong tao
This bug affects 4 people
Affects: Ceph OSD Charm
Status: Fix Released
Importance: High
Assigned to: Corey Bryant

Bug Description

When we do an upgrade from Jewel to Luminous:

The charm creates the mon kv entry saying the upgrade is complete prior to the OSDs being up/in, and the next unit then fires off its own upgrade.

This can cause multiple OSD nodes to be down within a short window, which makes some PGs unavailable and causes many slow requests.

I think we should have a mechanism to ensure that the "upgrade_done" key is only inserted into the mon kv once all the OSDs on a node are completely up/in.
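
A minimal sketch of that kind of gate, assuming the charm can shell out to the ceph CLI and already knows its local OSD ids (the get_local_osd_ids helper mentioned later in this bug is assumed here); "ceph osd dump --format=json" reports an up/in flag for every OSD:

    import json
    import subprocess
    import time

    def local_osds_up_and_in(local_osd_ids, timeout=600, interval=10):
        """Poll 'ceph osd dump' until every local OSD reports up and in,
        or the timeout expires. Returns True on success."""
        deadline = time.time() + timeout
        while time.time() < deadline:
            dump = json.loads(subprocess.check_output(
                ['ceph', 'osd', 'dump', '--format=json']))
            states = {o['osd']: (o['up'], o['in']) for o in dump['osds']}
            if all(states.get(osd_id) == (1, 1) for osd_id in local_osd_ids):
                return True
            time.sleep(interval)
        return False

    # Only write the 'upgrade_done' key to the mon kv store once this returns True.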

Tags: ceph-upgrade
dongdong tao (taodd)
Changed in charm-ceph-osd:
assignee: nobody → dongdong tao (taodd)
description: updated
Revision history for this message
Alex Kavanagh (ajkavanagh) wrote :

Hello. How did you do the upgrade from Jewel to Luminous? I'm assuming that you are on Ubuntu Xenial and upgraded to OpenStack xenial-pike or later? Or did you upgrade Xenial to Bionic?

Please could you state the Ubuntu version (and version changes)?
Please could you state the OpenStack version (if using Ceph with OpenStack)?

Please could you include the Juju bundle (or fragment).

Please could you add any relevant Juju debug-log files (for the osd(s) in question).

Thanks.

Changed in charm-ceph-osd:
status: New → Incomplete
Revision history for this message
Sandor Zeestraten (szeestraten) wrote :

Hi, I would like to add to this as we are prepping an upgrade from Jewel to Luminous.

As mentioned above, the ceph-osd charm's rolling upgrade process is awfully aggressive, as it restarts all OSD processes on one unit and then quickly jumps to the next unit, repeating the process.

We have somewhat dense nodes with 12-24 OSDs per node, and this can impact client traffic.

OS: Ubuntu 16.04
From: Jewel 10.2.11 (source: cloud:xenial-newton)
To: Luminous 12.2.12 (source: cloud:xenial-pike)
ceph-mon charm rev: 291
ceph-osd charm rev: 42

A workaround we are looking at is to stop all jujud-unit-ceph-osd services on all units, then manually start them on each node to control the pace of the upgrade.

It would be better to be able to control this manually via the charm, or to add some more logic that backs off from rolling to the next unit depending on the status of the cluster.

Revision history for this message
James Page (james-page) wrote :

The upgrade from Jewel->Luminous does not require a directory permissions upgrade, so all of the OSD services on each unit are restarted at the same time:

  https://opendev.org/openstack/charm-ceph-osd/src/branch/master/lib/ceph/utils.py#L2385

As the up/in transition is asynchronous to the restart of the service (i.e. systemd won't wait until Ceph reports up/in), the unit marks itself as done, which triggers the next unit to start upgrading.

Changed in charm-ceph-osd:
status: Incomplete → Confirmed
importance: Undecided → Medium
importance: Medium → High
status: Confirmed → Triaged
James Page (james-page)
Changed in charm-ceph-osd:
assignee: dongdong tao (taodd) → nobody
James Page (james-page)
Changed in charm-ceph-osd:
milestone: none → 20.01
Ryan Beisner (1chb1n)
tags: added: ceph-upgrade
Revision history for this message
Ryan Beisner (1chb1n) wrote :

Flagging back to the New state. Please provide specific next steps toward resolution so that this can be triaged. Thank you.

Changed in charm-ceph-osd:
status: Triaged → New
Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

charm-ceph-osd should wait until the OSD logs report something like the following:

    osd.4 147 state: booting -> active

I'd suggest waiting for /var/log/ceph/ceph-osd.$ID.log to have a line matching "booting -> active" with a timestamp after the restart.
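
A rough sketch of that check, assuming the default Ceph log timestamp format of "YYYY-MM-DD HH:MM:SS.micros" at the start of each line (the path template, restart time and function name here are illustrative, not the actual charm code):

    from datetime import datetime

    def osd_booted_to_active(osd_id, restart_time,
                             log_template='/var/log/ceph/ceph-osd.{}.log'):
        """Return True if the OSD log has a 'booting -> active' line
        stamped after restart_time (a datetime)."""
        with open(log_template.format(osd_id)) as log:
            for line in log:
                if 'booting -> active' not in line:
                    continue
                # e.g. '2020-03-17 09:27:08.123456 ... state: booting -> active'
                stamp_str = ' '.join(line.split()[:2]).split('.')[0]
                stamp = datetime.strptime(stamp_str, '%Y-%m-%d %H:%M:%S')
                if stamp > restart_time:
                    return True
        return False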

Changed in charm-ceph-osd:
status: New → Triaged
James Page (james-page)
Changed in charm-ceph-osd:
milestone: 20.01 → 20.05
Revision history for this message
James Page (james-page) wrote :

I'm not a huge fan of the approach proposed in #5 - we should just use the admin socket of the OSD to determine status:

$ ceph daemon /var/run/ceph/ceph-osd.1.asok status

{
    "cluster_fsid": "b6e69f5a-6445-11ea-9213-00163e5f3d2d",
    "osd_fsid": "476f5b3d-184a-489e-8aa9-374cf89016b7",
    "whoami": 1,
    "state": "active",
    "oldest_map": 1,
    "newest_map": 123,
    "num_pgs": 91
}

This is a local check only, but it will determine the status of the daemon accurately.
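
A minimal sketch of how a charm might consume that, shelling out to the ceph daemon command shown above and parsing the JSON reply (the function name is illustrative, not the actual charm code):

    import json
    import subprocess

    def get_osd_state(osd_id):
        """Query the local admin socket for one OSD and return its
        'state' field, e.g. 'booting' or 'active'."""
        asok = '/var/run/ceph/ceph-osd.{}.asok'.format(osd_id)
        out = subprocess.check_output(['ceph', 'daemon', asok, 'status'])
        return json.loads(out)['state']

    # e.g. after restarting osd.1, wait until get_osd_state(1) == 'active'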

Revision history for this message
James Page (james-page) wrote :

A context manager that uses 'get_local_osd_ids', records the status of each OSD pre-restart, and then waits for that status to return post-restart would be a nice solution to this issue.
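
A hedged sketch of that context manager idea, reusing the hypothetical get_osd_state() helper from the previous comment and assuming a get_local_osd_ids() helper exists as mentioned (the real fix landed in charms.ceph; this is only illustrative):

    import time
    from contextlib import contextmanager

    @contextmanager
    def maintain_osd_state(osd_ids, timeout=600, interval=10):
        """Record each OSD's state before a restart and, on exit, wait
        for every OSD to return to that state (or raise on timeout)."""
        before = {osd: get_osd_state(osd) for osd in osd_ids}  # pre-restart state
        yield
        deadline = time.time() + timeout
        pending = set(osd_ids)
        while pending and time.time() < deadline:
            pending = {osd for osd in pending
                       if get_osd_state(osd) != before[osd]}
            if pending:
                time.sleep(interval)
        if pending:
            raise RuntimeError('OSDs did not recover their state: %s' % pending)

    # Usage (illustrative):
    #   with maintain_osd_state(get_local_osd_ids()):
    #       restart_osd_services()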

Changed in charm-ceph-osd:
assignee: nobody → Corey Bryant (corey.bryant)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-ceph-osd (master)

Fix proposed to branch: master
Review: https://review.opendev.org/713449

Changed in charm-ceph-osd:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-ceph-osd (master)

Reviewed: https://review.opendev.org/713449
Committed: https://git.openstack.org/cgit/openstack/charm-ceph-osd/commit/?id=cb0f757f185565bcec1179087be037d596e9885a
Submitter: Zuul
Branch: master

commit cb0f757f185565bcec1179087be037d596e9885a
Author: Corey Bryant <email address hidden>
Date: Tue Mar 17 09:27:08 2020 -0400

    Maintain OSD state on upgrade

    Sync charms.ceph

    Ensure each OSD reaches its pre-restart state before proceeding
    after restart. This prevents the charm from finalizing the upgrade
    prior to OSDs recovering after upgrade. For example, if the state
    is 'active' prior to restart, then it must reach 'active' after
    restart, at which point the upgrade will be allowed to complete.

    Change-Id: I1067a8cdd1e2b706db07f194eca6fb2efeccb817
    Depends-On: https://review.opendev.org/#/c/713743/
    Closes-Bug: #1821028

Changed in charm-ceph-osd:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-ceph-osd (stable/20.02)

Fix proposed to branch: stable/20.02
Review: https://review.opendev.org/716928

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-ceph-osd (stable/20.02)

Reviewed: https://review.opendev.org/716928
Committed: https://git.openstack.org/cgit/openstack/charm-ceph-osd/commit/?id=e03a664f681834d740492cfbbe6e4b96ec19e57a
Submitter: Zuul
Branch: stable/20.02

commit e03a664f681834d740492cfbbe6e4b96ec19e57a
Author: Corey Bryant <email address hidden>
Date: Tue Mar 17 09:27:08 2020 -0400

    Maintain OSD state on upgrade

    Sync charms.ceph

    Ensure each OSD reaches its pre-restart state before proceeding
    after restart. This prevents the charm from finalizing the upgrade
    prior to OSDs recovering after upgrade. For example, if the state
    is 'active' prior to restart, then it must reach 'active' after
    restart, at which point the upgrade will be allowed to complete.

    Change-Id: I1067a8cdd1e2b706db07f194eca6fb2efeccb817
    Depends-On: https://review.opendev.org/#/c/713743/
    Closes-Bug: #1821028
    (cherry picked from commit cb0f757f185565bcec1179087be037d596e9885a)

David Ames (thedac)
Changed in charm-ceph-osd:
status: Fix Committed → Fix Released