Swift-proxy charm needs to be able to handle 'None' devices

Bug #1708310 reported by Jill Rouleau
20
This bug affects 4 people
Affects Status Importance Assigned to Milestone
OpenStack Swift Proxy Charm
Fix Released
Medium
Billy Olsen

Bug Description

If 'None' devices exist in a builder file hooks will error.

2017-08-02 01:04:16 WARNING juju-log Sync rings called by non-leader - skipping
2017-08-02 01:04:16 INFO config-changed Traceback (most recent call last):
2017-08-02 01:04:16 INFO config-changed File "/var/lib/juju/agents/unit-swift-proxy-11/charm/hooks/config-changed", line 738, in <module>
2017-08-02 01:04:16 INFO config-changed main()
2017-08-02 01:04:16 INFO config-changed File "/var/lib/juju/agents/unit-swift-proxy-11/charm/hooks/config-changed", line 734, in main
2017-08-02 01:04:16 INFO config-changed assess_status(CONFIGS)
2017-08-02 01:04:16 INFO config-changed File "/var/lib/juju/agents/unit-swift-proxy-11/charm/hooks/lib/swift_utils.py", line 1179, in assess_status
2017-08-02 01:04:16 INFO config-changed assess_status_func(configs, check_services)()
2017-08-02 01:04:16 INFO config-changed File "/var/lib/juju/agents/unit-swift-proxy-11/charm/hooks/charmhelpers/contrib/openstack/utils.py", line 1820, in _assess_status_func
2017-08-02 01:04:16 INFO config-changed state, message = _determine_os_workload_status(*args, **kwargs)
2017-08-02 01:04:16 INFO config-changed File "/var/lib/juju/agents/unit-swift-proxy-11/charm/hooks/charmhelpers/contrib/openstack/utils.py", line 1225, in _determine_os_workload_status
2017-08-02 01:04:16 INFO config-changed state, message, lambda: charm_func(configs))
2017-08-02 01:04:16 INFO config-changed File "/var/lib/juju/agents/unit-swift-proxy-11/charm/hooks/charmhelpers/contrib/openstack/utils.py", line 1353, in _ows_check_charm_func
2017-08-02 01:04:16 INFO config-changed charm_state, charm_message = charm_func_with_configs()
2017-08-02 01:04:16 INFO config-changed File "/var/lib/juju/agents/unit-swift-proxy-11/charm/hooks/charmhelpers/contrib/openstack/utils.py", line 1225, in <lambda>
2017-08-02 01:04:16 INFO config-changed state, message, lambda: charm_func(configs))
2017-08-02 01:04:16 INFO config-changed File "/var/lib/juju/agents/unit-swift-proxy-11/charm/hooks/lib/swift_utils.py", line 1153, in customer_check_assess_status
2017-08-02 01:04:16 INFO config-changed if not has_minimum_zones(rings):
2017-08-02 01:04:16 INFO config-changed File "/var/lib/juju/agents/unit-swift-proxy-11/charm/hooks/lib/swift_utils.py", line 1124, in has_minimum_zones
2017-08-02 01:04:16 INFO config-changed zones = [dev['zone'] for dev in builder['devs']]
2017-08-02 01:04:16 INFO config-changed TypeError: 'NoneType' object has no attribute '__getitem__'

swift-ring-builder now creates these devices as a normal operation when devices have been removed from the ring (in our case, after removing and redeploying a node).
https://github.com/openstack/swift/blob/master/swift/common/ring/builder.py#L341
http://pastebin.ubuntu.com/25229652/

The charm needs to be able to handle the existence of these 'None' devices when parsing builder files.

Revision history for this message
Drew Freiberger (afreiberger) wrote :

After discussing this further, it appears that the way to recreate this issue is to remove a dev or host from a builder file. Anywhere we've had this happen has been related to removing a node or a bad disk from swift.

Changed in charm-swift-proxy:
status: New → Triaged
importance: Undecided → High
importance: High → Medium
assignee: nobody → Billy Olsen (billy-olsen)
milestone: none → 17.08
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-swift-proxy (master)

Fix proposed to branch: master
Review: https://review.openstack.org/498114

Ryan Beisner (1chb1n)
tags: added: backport-potential uosci
James Page (james-page)
Changed in charm-swift-proxy:
milestone: 17.08 → 17.11
Revision history for this message
Edward Hope-Morley (hopem) wrote :

@afreiberger i've had a look at the proposed patch and while it seems fine it has got me thinking about the fact that the charms do not yet natively support removing devices from rings (see bug 1448190) yet you have run into this issue as a result of manually removing a device from a builder/ring. I would encourage you to exercise extreme caution when doing this since the swift-proxy application is maintaining/managing the ring and builder files and is responsible for ensuring their consistency across the entire cluster - something that if broken will very likely result in data loss at some point. Also, the swift-storage charms maintain records of devices that have been configured for use with Swift on the nodes they run on in which, again, could run you into trouble if e.g. you re-added a device and perhaps expect it to be reformatted but the charm thinks that device was never removed so may refuse to do so. Of course this is mostly conjecture based a somewhat distant memory from when these areas were implemented but I just wanted to point this out in case you were not aware of how these areas could be impacted by the action of removing disks.

Revision history for this message
Edward Hope-Morley (hopem) wrote :

Just to close the loop, we now have a spec under to review to implement missing features in the swift charms - https://review.openstack.org/#/c/504191/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-swift-proxy (master)

Reviewed: https://review.openstack.org/498114
Committed: https://git.openstack.org/cgit/openstack/charm-swift-proxy/commit/?id=ac553419f4757112344a302acbb010838e340abb
Submitter: Jenkins
Branch: master

commit ac553419f4757112344a302acbb010838e340abb
Author: Billy Olsen <email address hidden>
Date: Fri Aug 25 15:48:33 2017 -0700

    Handle holes in swift rings

    The Swift rings may have "holes" for devices which no longer
    exist, but the code does not handle the holes in general. Holes
    appear in the Swift rings as None objects in the devices list.

    This change adds checks to the places in the code which are loading
    the devices from the Swift ring to account for None. Generally,
    this is just checking the truthy value of a device.

    However, this change also removes the next device id calculation
    from the add_to_ring function, deferring the device id selection
    to the Swift RingBuilder. Upon examining the Swift RingBuilder
    code, it was seen that the RingBuilder will automatically calculate
    a device id for a new device when no id is specified. The Swift
    algorithm considers factors other than a hole in the ring (e.g.
    out of order devices) which were missing from the charm's code.

    Change-Id: Ibb0908338ac958ebf1adf17c12f9484f6963c695
    Closes-Bug: #1708310
    Co-Authored-By: Drew Freiberger <email address hidden>"

Changed in charm-swift-proxy:
status: In Progress → Fix Committed
tags: added: sts
James Page (james-page)
Changed in charm-swift-proxy:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.