charm upgrade from 241 to 261 fails with TypeError: unsupported operand type(s) for -: 'NoneType' and 'int'

Bug #1770740 reported by Sandor Zeestraten
This bug affects 2 people
Affects: Ceph OSD Charm
Status: Fix Released
Importance: High
Assigned to: Alex Kavanagh

Bug Description

# Versions
juju 2.3.4
MAAS 2.1.3
ceph-osd rev. 241 and 261

# Issue
After upgrading from rev. 241 to 261 in one of our deployments, one of many ceph-osd units ended up with a hook failure:

INFO juju-log upgrade position: None
DEBUG config-changed Traceback (most recent call last):
DEBUG config-changed File "/var/lib/juju/agents/unit-ceph-osd-7/charm/hooks/config-changed", line 562, in <module>
DEBUG config-changed hooks.execute(sys.argv)
DEBUG config-changed File "/var/lib/juju/agents/unit-ceph-osd-7/charm/hooks/charmhelpers/core/hookenv.py", line 800, in execute
DEBUG config-changed self._hooks[hook_name]()
DEBUG config-changed File "/var/lib/juju/agents/unit-ceph-osd-7/charm/hooks/charmhelpers/contrib/hardening/harden.py", line 79, in _harden_inner2
DEBUG config-changed return f(*args, **kwargs)
DEBUG config-changed File "/var/lib/juju/agents/unit-ceph-osd-7/charm/hooks/config-changed", line 351, in config_changed
DEBUG config-changed check_for_upgrade()
DEBUG config-changed File "/var/lib/juju/agents/unit-ceph-osd-7/charm/hooks/config-changed", line 118, in check_for_upgrade
DEBUG config-changed upgrade_key='osd-upgrade')
DEBUG config-changed File "lib/ceph/utils.py", line 1924, in roll_osd_cluster
DEBUG config-changed osd_sorted_list[position - 1].name))
DEBUG config-changed TypeError: unsupported operand type(s) for -: 'NoneType' and 'int'

# Logs
Excerpt from unit-ceph-osd.log: https://pastebin.com/wKi53b31
Haven't checked the code but here are the OSDs from `lsblk`: https://pastebin.com/eHMYk96z

Alex Kavanagh (ajkavanagh) wrote :

So the problem is in this function in lib/ceph/utils.py:

def get_upgrade_position(osd_sorted_list, match_name):
    """Return the upgrade position for the given osd.

    :param osd_sorted_list: list. Osds sorted
    :param match_name: str. The osd name to match
    :returns: int. The position or None if not found
    """
    for index, item in enumerate(osd_sorted_list):
        if item.name == match_name:
            return index
    return None

It returns None when no OSD in the list matches. The roll_osd_cluster() function needs to check for a None return value and do the right thing, which is to raise an error, as the unit should be blocked.
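A minimal sketch of that kind of check, not the merged fix itself. The `Osd` tuple and the `previous_osd_name()` helper are hypothetical names used only for illustration; `get_upgrade_position()` is reproduced from above with a shortened docstring. The point is that the caller tests for None before doing `position - 1`, so a missing OSD produces a clear error rather than a TypeError.

```python
from collections import namedtuple

# Hypothetical stand-in for the charm's OSD objects; only `.name` is used here.
Osd = namedtuple("Osd", ["name"])


def get_upgrade_position(osd_sorted_list, match_name):
    """Return the index of match_name in osd_sorted_list, or None if absent."""
    for index, item in enumerate(osd_sorted_list):
        if item.name == match_name:
            return index
    return None


def previous_osd_name(osd_sorted_list, my_name):
    """Hypothetical helper: name of the OSD just before my_name in the list.

    Raises ValueError when my_name is not in the list, instead of letting
    `None - 1` fail with a TypeError. Returns None for the first entry.
    """
    position = get_upgrade_position(osd_sorted_list, my_name)
    if position is None:
        raise ValueError(
            "{} not found in the sorted OSD list".format(my_name))
    if position == 0:
        # Nothing precedes the first entry.
        return None
    return osd_sorted_list[position - 1].name
```

A caller could catch the ValueError, log it, and set a blocked status, which is the behaviour described above; how the charm actually reports that state is not shown here.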

Changed in charm-ceph-osd:
assignee: nobody → Alex Kavanagh (ajkavanagh)
status: New → In Progress
importance: Undecided → High
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-ceph-osd (master)

Fix proposed to branch: master
Review: https://review.openstack.org/569085

OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-ceph-osd (master)

Reviewed: https://review.openstack.org/569085
Committed: https://git.openstack.org/cgit/openstack/charm-ceph-osd/commit/?id=8b2303e8636186a79b1bdbd845848cd05daea9c7
Submitter: Zuul
Branch: master

commit 8b2303e8636186a79b1bdbd845848cd05daea9c7
Author: Alex Kavanagh <email address hidden>
Date: Thu May 17 11:44:23 2018 +0100

    Fix Traceback issue when ceph-osd upgrade fails

    Bug/1770740 surfaced an issue where get_upgrade_position() returns None
    but the calling function expects an exception to be thrown if the
    "None" condition exists. This just fixes the code so that the Traceback
    is stopped and the appropriate error/message is logged for the
    condition.

    Change-Id: Ib7d1fdc8f91bc992ccf618ef6f57e99bb90c2dbc
    Partial-Bug: #1770740

Changed in charm-ceph-osd:
status: In Progress → Fix Committed
James Page (james-page)
Changed in charm-ceph-osd:
milestone: none → 18.05
David Ames (thedac)
Changed in charm-ceph-osd:
status: Fix Committed → Fix Released
Paul Goins (vultaire) wrote :

Unfortunately, it looks like this change was reverted in changeset 2c3eae272fd85619f80a0af9cd6c9bd87472e912; my best guess is that along with other intentional changes, an old copy of the function was accidentally included in the commit.

I encountered this bug in cs:ceph-osd-270, which I think is the second-to-last release of the stable/18.08 series.

Looking at the current trunk, I think this bug is present there as well.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-ceph-osd (master)

Fix proposed to branch: master
Review: https://review.opendev.org/695165

Alex Kavanagh (ajkavanagh) wrote :

Paul, good catch, thanks! It wasn't committed to charms.ceph and so it got overwritten. I've now addressed this in: https://review.opendev.org/#/c/695163/.

Changed in charm-ceph-osd:
status: Fix Released → In Progress
milestone: 18.05 → 20.01
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-ceph-osd (master)

Reviewed: https://review.opendev.org/695165
Committed: https://git.openstack.org/cgit/openstack/charm-ceph-osd/commit/?id=b15a23ef6a26a4b4ece03020e7076d2bc7573937
Submitter: Zuul
Branch: master

commit b15a23ef6a26a4b4ece03020e7076d2bc7573937
Author: Alex Kavanagh <email address hidden>
Date: Wed Nov 20 10:26:28 2019 +0000

    Re-fix Traceback issue when ceph-osd upgrade fails

    This was originally fixed in Ib7d1fdc8f91bc992ccf618ef6f57e99bb90c2dbc
    but unfortunately wasn't also added to the charms.ceph library. Thus,
    this is a re-application of that fix; the charms.ceph fix is in [1].

    Bug/1770740 surfaced an issue where get_upgrade_position() returns None
    but the calling function expects an exception to be thrown if the
    "None" condition exists. This just fixes the code so that the Traceback
    is stopped and the appropriate error/message is logged for the
    condition.

    [1] https://review.opendev.org/#/c/695163/
        I16539b2bc35104eed54033bebb1154cad8a5cf0f

    Change-Id: Ieee8d13f25027ad540a23a6428c2226b6c20999a
    Partial-Bug: #1770740

James Page (james-page)
Changed in charm-ceph-osd:
milestone: 20.01 → 20.05
David Ames (thedac)
Changed in charm-ceph-osd:
milestone: 20.05 → 20.08
Changed in charm-ceph-osd:
milestone: 20.08 → 20.02
status: In Progress → Fix Released