Ceph monitors not removed when deleting controller

Bug #1435824 reported by Jon Skarpeteig
This bug affects 1 person
Affects: Fuel for OpenStack
Status: Fix Released
Importance: High
Assigned to: Stanislav Makar

Bug Description

ceph --admin-daemon /var/run/ceph/ceph-mon.node-16.asok mon_status

Reports:

"mons": [
            { "rank": 0,
              "name": "node-10",
              "addr": "10.0.0.4:6789\/0"},
            { "rank": 1,
              "name": "node-15",
              "addr": "10.0.0.9:6789\/0"},
            { "rank": 2,
              "name": "node-16",
              "addr": "10.0.0.10:6789\/0"}]}}

node-10 and node-15 are still listed, although both have been deleted through the Fuel UI.

This in turn makes the rados commands run by Puppet time out when replacement controller nodes are deployed.
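
To make the symptom concrete, here is a minimal sketch that compares the reported "mons" list with the nodes that still exist and builds the matching `ceph mon remove <name>` commands. The helper names `stale_monitors` and `removal_commands` are hypothetical, the list of live nodes is an assumption (only node-16 is known to remain), and the merged fix in fuel-astute may well do this differently.

```python
import json

# The "mons" list as reported in this bug (node-10 and node-15 were already deleted).
MONS_JSON = r'''
[{"rank": 0, "name": "node-10", "addr": "10.0.0.4:6789\/0"},
 {"rank": 1, "name": "node-15", "addr": "10.0.0.9:6789\/0"},
 {"rank": 2, "name": "node-16", "addr": "10.0.0.10:6789\/0"}]
'''


def stale_monitors(mons, live_nodes):
    """Return, in rank order, the monitor names whose node no longer exists."""
    live = set(live_nodes)
    ordered = sorted(mons, key=lambda m: m["rank"])
    return [m["name"] for m in ordered if m["name"] not in live]


def removal_commands(names):
    """Build (but do not run) one `ceph mon remove <name>` command per monitor."""
    return [["ceph", "mon", "remove", name] for name in names]


if __name__ == "__main__":
    mons = json.loads(MONS_JSON)
    # Assumption: node-16 is the only controller left in the environment.
    stale = stale_monitors(mons, live_nodes=["node-16"])
    for cmd in removal_commands(stale):
        print(" ".join(cmd))
```

The commands are only printed, not executed, so the sketch can be tried without a Ceph cluster.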

Changed in fuel:
importance: Undecided → Medium
Stanislav Makar (smakar)
Changed in fuel:
assignee: nobody → Fuel Library Team (fuel-library)
Stanislav Makar (smakar) wrote :

Thanks for the bug report.
Please provide the Fuel version. It would also help if you could upload a diagnostic snapshot.

Changed in fuel:
status: New → Incomplete
Jon Skarpeteig (jskarpet) wrote :
Vladimir Kuklin (vkuklin) wrote :

Stas, I think we can work on this issue without a diagnostic snapshot.

Changed in fuel:
status: Incomplete → Confirmed
milestone: none → 6.1
importance: Medium → High
Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Stanislav Makar (smakar)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-astute (master)

Fix proposed to branch: master
Review: https://review.openstack.org/172457

Changed in fuel:
status: Confirmed → In Progress
Stanislav Makar (smakar) wrote :

The above patch is a temporary solution, because we do not yet have a granular way to remove nodes.
As soon as we have one, we will do this as a separate task.

What is left: updating the config file (/etc/ceph/ceph.conf) on the remaining nodes.

Stanislav Makar (smakar) wrote :

However, if a new node is deployed after the deletion, ceph.conf will be fixed.
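
For the remaining ceph.conf step, a rough sketch of pruning deleted monitors from the config text could look like this. It assumes the monitors are listed in `mon_initial_members` and `mon_host` as space- or comma-separated values; the sample file contents and the helper `prune_option` are hypothetical and not taken from Fuel.

```python
# Hypothetical ceph.conf contents; a real file will contain more settings.
SAMPLE_CONF = """[global]
mon_initial_members = node-10 node-15 node-16
mon_host = 10.0.0.4 10.0.0.9 10.0.0.10
"""


def prune_option(conf_text, option, removed):
    """Drop the `removed` values from one list-valued option line.

    Values may be separated by spaces or commas; the rewritten line
    always uses single spaces. All other lines are left untouched.
    """
    removed = set(removed)
    out = []
    for line in conf_text.splitlines():
        key, sep, value = line.partition("=")
        # Ceph accepts spaces and underscores interchangeably in option names.
        if sep and key.strip().replace(" ", "_") == option:
            kept = [v for v in value.replace(",", " ").split() if v not in removed]
            line = f"{key.rstrip()} = {' '.join(kept)}"
        out.append(line)
    # Keep the trailing newline if the input had one.
    return "\n".join(out) + ("\n" if conf_text.endswith("\n") else "")


if __name__ == "__main__":
    text = prune_option(SAMPLE_CONF, "mon_initial_members", ["node-10", "node-15"])
    text = prune_option(text, "mon_host", ["10.0.0.4", "10.0.0.9"])
    print(text, end="")
```

The function works on text only, so writing the result back to /etc/ceph/ceph.conf (and keeping a backup) is left to the caller.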

Changed in fuel:
assignee: Stanislav Makar (smakar) → Aleksandr Didenko (adidenko)
Changed in fuel:
assignee: Aleksandr Didenko (adidenko) → Sergii Golovatiuk (sgolovatiuk)
Changed in fuel:
assignee: Sergii Golovatiuk (sgolovatiuk) → Stanislav Makar (smakar)
Changed in fuel:
assignee: Stanislav Makar (smakar) → Vladimir Sharshov (vsharshov)
Changed in fuel:
assignee: Vladimir Sharshov (vsharshov) → Stanislav Makar (smakar)
Stanislav Makar (smakar) wrote :

A custom ISO is building: http://jenkins-product.srt.mirantis.net:8080/view/custom_iso/job/custom_6.1_iso/1169/
It would be good to test it before merging.

Vladimir Sharshov (vsharshov) wrote :

Main test scenario:

- deploy an HA cluster (Ubuntu or CentOS) with 3 controllers and 2 Ceph nodes (Glance, Cinder);
- after a successful deployment, mark one of the controllers for deletion;
- run the deletion.

Expected result:

- the node is removed successfully;
- the Astute log contains these messages:
- - Removing ceph mons ["node-<X>"] from cluster
- - Ceph mons are left in cluster: ["node-<X>, ..., node-<X>"]
- the shell command `ceph -f json mon dump`, run on one of the remaining controllers, returns information only about the remaining controllers, e.g.:
{"epoch":5,"fsid":"c7fc3eaf-ead2-4bd4-ae3a-61f5db6a236c","modified":"2015-05-07 11:03:06.153854","created":"0.000000","mons":[{"rank":0,"name":"node-6","addr":"192.168.0.11:6789\/0"}],"quorum":[0]}
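
The last check in this scenario could be automated roughly as follows. This is only a sketch: it parses the example `ceph -f json mon dump` output quoted in this comment, the helper names are hypothetical, and the list of remaining controllers is assumed to be known to the tester.

```python
import json

# Example `ceph -f json mon dump` output quoted in the test scenario.
MON_DUMP = (
    r'{"epoch":5,"fsid":"c7fc3eaf-ead2-4bd4-ae3a-61f5db6a236c",'
    r'"modified":"2015-05-07 11:03:06.153854","created":"0.000000",'
    r'"mons":[{"rank":0,"name":"node-6","addr":"192.168.0.11:6789\/0"}],'
    r'"quorum":[0]}'
)


def monitor_names(dump_json):
    """Return the sorted monitor names found in a mon dump JSON string."""
    return sorted(m["name"] for m in json.loads(dump_json)["mons"])


def unexpected_monitors(dump_json, remaining_controllers):
    """Return monitors that are still in the monmap but should be gone."""
    return sorted(set(monitor_names(dump_json)) - set(remaining_controllers))


if __name__ == "__main__":
    # Assumption: node-6 is the only controller left after the deletion.
    leftovers = unexpected_monitors(MON_DUMP, ["node-6"])
    print("stale monitors:", leftovers)
```

An empty result means the monmap contains only monitors that run on remaining controllers.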

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-astute (master)

Reviewed: https://review.openstack.org/172457
Committed: https://git.openstack.org/cgit/stackforge/fuel-astute/commit/?id=80d33dd44ec09340c275ea5ee7200d1ef47dd467
Submitter: Jenkins
Branch: master

commit 80d33dd44ec09340c275ea5ee7200d1ef47dd467
Author: Stanislav Makar <email address hidden>
Date: Fri Apr 10 13:56:11 2015 +0000

    Fix the problem with removing ceph-mon from cluster

    It is a temporary solution due to we do not have granular way for node
    removal. As soon as we have it we will do it as separate task.

    Change-Id: Ib77227c94a3634f7c91a4d7af4c1ff11ef525553
    Closes-bug: #1435824

Changed in fuel:
status: In Progress → Fix Committed
tags: added: on-verification
tags: removed: on-verification
Dmytro Iurchenko (diurchenko) wrote :

Verified on Ubuntu. Monitors are removed and added correctly, matching the corresponding operations on the controllers.

Stanislav Makar (smakar)
Changed in fuel:
status: Fix Committed → Fix Released