The ceph cluster gets stuck when multiple mons are deployed.

Bug #1824783 reported by wangwei
Affects        Status        Importance  Assigned to  Milestone
kolla          Fix Released  Medium      Unassigned
kolla (Train)  Fix Released  Medium      Unassigned

Bug Description

When I deployed Ceph with kolla and kolla-ansible (Rocky version), I set up five mon nodes:

```
[storage-mon]
ceph-node1
ceph-node2
ceph-node3
ceph-node4
ceph-node5
```

Kolla-ansible gets stuck while performing this task:

```
TASK [ceph : Getting ceph mgr keyring]

```

I checked the status of the mon; it was stuck:

```
(ceph-mon)[root@ceph-node1 /]# ceph -s
[errno 110] error connecting to the cluster

```

This is the log of the first mon:

```
2019-04-15 14:13:35.265201 7f3b1fd43000 0 mon.ceph-node1@-1(probing) e0 my rank is now 0 (was -1)
2019-04-15 14:13:42.364548 7f3b1236a700 0 mon.ceph-node1@0(probing) e0 ms_verify_authorizer bad authorizer from mon 192.168.10.13:6789/0
2019-04-15 14:13:42.365666 7f3b15b71700 1 mon.ceph-node1@0(probing) e0 adding peer 192.168.10.13:6789/0 to list of hints
2019-04-15 14:13:42.684252 7f3b1336c700 0 mon.ceph-node1@0(probing) e0 ms_verify_authorizer bad authorizer from mon 192.168.10.15:6789/0
2019-04-15 14:13:42.690717 7f3b15b71700 1 mon.ceph-node1@0(probing) e0 adding peer 192.168.10.15:6789/0 to list of hints
2019-04-15 14:13:42.708722 7f3b12b6b700 0 mon.ceph-node1@0(probing) e0 ms_verify_authorizer bad authorizer from mon 192.168.10.12:6789/0
2019-04-15 14:13:42.709649 7f3b15b71700 1 mon.ceph-node1@0(probing) e0 adding peer 192.168.10.12:6789/0 to list of hints
2019-04-15 14:13:42.733018 7f3b1236a700 0 mon.ceph-node1@0(probing) e0 ms_verify_authorizer bad authorizer from mon 192.168.10.14:6789/0
2019-04-15 14:13:42.739142 7f3b15b71700 1 mon.ceph-node1@0(probing) e0 adding peer 192.168.10.14:6789/0 to list of hints
2019-04-15 14:13:43.267431 7f3b15b71700 1 mon.ceph-node1@0(probing) e0 learned initial mon ceph-node2 addr 192.168.10.12:6789/0
2019-04-15 14:13:43.267443 7f3b15b71700 0 mon.ceph-node1@0(probing) e0 my rank is now 1 (was 0)
2019-04-15 14:13:43.271262 7f3b12b6b700 0 -- 192.168.10.11:6789/0 >> 192.168.10.13:6789/0 conn(0x5583a20e0000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=12 cs=1 l=0).process missed message? skipped from seq 0 to 970548727
2019-04-15 14:13:43.271461 7f3b15b71700 1 mon.ceph-node1@1(probing) e0 learned initial mon ceph-node3 addr 192.168.10.13:6789/0
2019-04-15 14:13:43.271470 7f3b15b71700 0 mon.ceph-node1@1(probing) e0 my rank is now 2 (was 1)
2019-04-15 14:13:43.276008 7f3b1336c700 0 -- 192.168.10.11:6789/0 >> 192.168.10.12:6789/0 conn(0x5583a2075800 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=11 cs=1 l=0).process missed message? skipped from seq 0 to 1302247443
2019-04-15 14:13:43.276346 7f3b1336c700 0 -- 192.168.10.11:6789/0 >> 192.168.10.13:6789/0 conn(0x5583a20b4800 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=14 cs=1 l=0).process missed message? skipped from seq 0 to 807601511
2019-04-15 14:13:43.276459 7f3b15b71700 0 log_channel(cluster) log [INF] : mon.ceph-node1 calling monitor election
2019-04-15 14:13:43.276560 7f3b15b71700 1 mon.ceph-node1@2(electing).elector(0) init, first boot, initializing epoch at 1
2019-04-15 14:13:43.277178 7f3b12b6b700 0 mon.ceph-node1@2(electing) e0 ms_verify_authorizer bad authorizer from mon 192.168.10.15:6789/0
2019-04-15 14:13:43.277403 7f3b1336c700 0 mon.ceph-node1@2(electing) e0 ms_verify_authorizer bad authorizer from mon 192.168.10.14:6789/0
2019-04-15 14:13:43.280056 7f3b12b6b700 0 -- 192.168.10.11:6789/0 >> 192.168.10.15:6789/0 conn(0x5583a20e3000 :6789 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 2 vs existing csq=0 existing_state=STATE_CONNECTING_WAIT_CONNECT_REPLY
2019-04-15 14:13:43.280079 7f3b12b6b700 0 -- 192.168.10.11:6789/0 >> 192.168.10.15:6789/0 conn(0x5583a20e3000 :6789 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept we reset (peer sent cseq 2, 0x5583a20fb800.cseq = 0), sending RESETSESSION
2019-04-15 14:13:43.280299 7f3b1336c700 0 -- 192.168.10.11:6789/0 >> 192.168.10.14:6789/0 conn(0x5583a2106000 :6789 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 2 vs existing csq=0 existing_state=STATE_CONNECTING_WAIT_CONNECT_REPLY
2019-04-15 14:13:43.280338 7f3b1336c700 0 -- 192.168.10.11:6789/0 >> 192.168.10.14:6789/0 conn(0x5583a2106000 :6789 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept we reset (peer sent cseq 2, 0x5583a20fa000.cseq = 0), sending RESETSESSION
2019-04-15 14:13:43.283014 7f3b12b6b700 0 -- 192.168.10.11:6789/0 >> 192.168.10.15:6789/0 conn(0x5583a20e3000 :6789 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 0 vs existing csq=0 existing_state=STATE_CONNECTING_WAIT_CONNECT_REPLY
2019-04-15 14:13:43.283333 7f3b1336c700 0 -- 192.168.10.11:6789/0 >> 192.168.10.14:6789/0 conn(0x5583a2106000 :6789 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 0 vs existing csq=0 existing_state=STATE_CONNECTING_WAIT_CONNECT_REPLY
2019-04-15 14:13:43.293830 7f3b1236a700 0 -- 192.168.10.11:6789/0 >> 192.168.10.15:6789/0 conn(0x5583a20fb800 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=15 cs=1 l=0).process missed message? skipped from seq 0 to 1798268226
2019-04-15 14:13:43.294164 7f3b15b71700 1 mon.ceph-node1@2(electing) e0 learned initial mon ceph-node5 addr 192.168.10.15:6789/0
2019-04-15 14:13:43.294069 7f3b1336c700 0 -- 192.168.10.11:6789/0 >> 192.168.10.14:6789/0 conn(0x5583a20fa000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=9 cs=1 l=0).process missed message? skipped from seq 0 to 69241847
2019-04-15 14:13:43.294603 7f3b15b71700 1 mon.ceph-node1@2(probing) e0 learned initial mon ceph-node4 addr 192.168.10.14:6789/0
2019-04-15 14:13:43.296143 7f3b15b71700 0 log_channel(cluster) log [INF] : mon.ceph-node1 calling monitor election
2019-04-15 14:13:43.296201 7f3b15b71700 1 mon.ceph-node1@2(electing).elector(0) init, first boot, initializing epoch at 1
2019-04-15 14:13:48.299861 7f3b15b71700 0 log_channel(cluster) log [INF] : mon.ceph-node1 calling monitor election
2019-04-15 14:13:48.300003 7f3b15b71700 1 mon.ceph-node1@2(electing).elector(0) init, first boot, initializing epoch at 1
2019-04-15 14:13:53.304789 7f3b15b71700 0 log_channel(cluster) log [INF] : mon.ceph-node1 calling monitor election
2019-04-15 14:13:53.304845 7f3b15b71700 1 mon.ceph-node1@2(electing).elector(0) init, first boot, initializing epoch at 1
2019-04-15 14:13:58.309354 7f3b15b71700 0 log_channel(cluster) log [INF] : mon.ceph-node1 calling monitor election
2019-04-15 14:13:58.309428 7f3b15b71700 1 mon.ceph-node1@2(electing).elector(0) init, first boot, initializing epoch at 1
2019-04-15 14:14:03.314816 7f3b15b71700 0 log_channel(cluster) log [INF] : mon.ceph-node1 calling monitor election
2019-04-15 14:14:03.314882 7f3b15b71700 1 mon.ceph-node1@2(electing).elector(0) init, first boot, initializing epoch at 1
2019-04-15 14:14:08.319605 7f3b15b71700 0 log_channel(cluster) log [INF] : mon.ceph-node1 calling monitor election
2019-04-15 14:14:08.319667 7f3b15b71700 1 mon.ceph-node1@2(electing).elector(0) init, first boot, initializing epoch at 1
2019-04-15 14:14:13.324962 7f3b15b71700 0 log_channel(cluster) log [INF] : mon.ceph-node1 calling monitor election
2019-04-15 14:14:13.325064 7f3b15b71700 1 mon.ceph-node1@2(electing).elector(0) init, first boot, initializing epoch at 1
2019-04-15 14:14:18.329602 7f3b15b71700 0 log_channel(cluster) log [INF] : mon.ceph-node1 calling monitor election
2019-04-15 14:14:18.329730 7f3b15b71700 1 mon.ceph-node1@2(electing).elector(0) init, first boot, initializing epoch at 1
2019-04-15 14:14:23.335145 7f3b15b71700 0 log_channel(cluster) log [INF] : mon.ceph-node1 calling monitor election
2019-04-15 14:14:23.335462 7f3b15b71700 1 mon.ceph-node1@2(electing).elector(0) init, first boot, initializing epoch at 1

```
The log shows that the mon is stuck in the election state.
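As an aside (the exact daemon name and container name here are assumptions, not taken from the report): when `ceph -s` times out like this, each mon's state can still be read locally through its admin socket with `ceph daemon mon.<id> mon_status`. A tiny filter for the `state` field might look like:

```shell
#!/bin/sh
# Hedged sketch: extract the "state" field from mon_status JSON on stdin.
# A healthy quorum reports "leader" or "peon"; a cluster stuck in election
# keeps reporting "electing" or "probing".
mon_state() {
    sed -n 's/.*"state": *"\([a-z]*\)".*/\1/p'
}

# Inside the mon container one would run something like:
#   ceph daemon mon."$(hostname -s)" mon_status | mon_state
# Here we feed it a canned sample so the sketch is self-contained:
echo '{ "name": "ceph-node1", "state": "electing", "rank": 2 }' | mon_state
```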

I debugged the deployment process of ceph-deploy; its mon initialization differs from kolla's.

- ceph-deploy
```
ceph-mon --cluster ceph --mkfs -i ceph-node1 --keyring /var/lib/ceph/tmp/ceph-ceph-node1.mon.keyring --setuser 167 --setgroup 167
```

- kolla (kolla/docker/ceph/ceph-mon/extend_start.sh)

```
KEYRING_TMP="/tmp/ceph.mon.keyring"

# Generate keyring for current monitor
ceph-authtool --create-keyring "${KEYRING_TMP}" --import-keyring "${KEYRING_ADMIN}"
ceph-authtool "${KEYRING_TMP}" --import-keyring "${KEYRING_MON}"
mkdir -p "${MON_DIR}"
ceph-mon --mkfs -i "${HOSTNAME}" --monmap "${MONMAP}" --keyring "${KEYRING_TMP}"
rm "${KEYRING_TMP}"
```

When the --monmap option is removed, the cluster deployment is back to normal.
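The logic of the fix can be sketched as follows (a minimal sketch under my own naming, not the actual kolla patch): pass `--monmap` to `ceph-mon --mkfs` only when a mon joins an existing cluster, and omit it on the very first bootstrap, matching what ceph-deploy does.

```shell
#!/bin/sh
# Hedged sketch of the fix: the function name and arguments are
# hypothetical; only the --mkfs/--monmap distinction comes from the bug.
build_mkfs_args() {
    # $1: "initial" for a brand-new cluster, "join" when adding a mon
    # $2: hostname  $3: keyring path  $4: monmap path
    args="--mkfs -i $2 --keyring $3"
    if [ "$1" != "initial" ]; then
        # Only a mon added to an existing cluster is seeded with its monmap
        args="$args --monmap $4"
    fi
    echo "$args"
}

build_mkfs_args initial ceph-node1 /tmp/ceph.mon.keyring /etc/ceph/monmap
build_mkfs_args join    ceph-node6 /tmp/ceph.mon.keyring /etc/ceph/monmap
```

On first bootstrap each mon then discovers its peers through `mon initial members`/`mon host` in ceph.conf and they elect a quorum, instead of every node believing the pre-built monmap unconditionally.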

Revision history for this message
wangwei (wangwei-david) wrote :

https://review.openstack.org/652606

This commit fixes this problem.

Revision history for this message
Mark Goddard (mgoddard) wrote :

Could be related to or duplicate of https://bugs.launchpad.net/kolla/+bug/1814172?

Revision history for this message
wangwei (wangwei-david) wrote :

@Mark Goddard (mgoddard)

Yes, it seems to be the same bug.

I debugged the deployment process of ceph-deploy.

1. When executing the cluster initialization command, the command does not contain a monmap:

```
[root@ceph-node1 ceph-deploy]# ceph-deploy new ceph-node1 ceph-node2
...
[root@ceph-node1 ceph-deploy]# ceph-deploy mon create-initial

[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy mon create-initial
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : create-initial
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f36f6646878>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] func : <function mon at 0x7f36f6ed5848>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] keyrings : None
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph-node1 ceph-node2
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-node1 ...
[ceph-node1][DEBUG ] connected to host: ceph-node1
[ceph-node1][DEBUG ] detect platform information from remote host
[ceph-node1][DEBUG ] detect machine type
[ceph-node1][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO ] distro info: CentOS Linux 7.4.1708 Core
[ceph-node1][DEBUG ] determining if provided host has same hostname in remote
[ceph-node1][DEBUG ] get remote short hostname
[ceph-node1][DEBUG ] deploying mon to ceph-node1
[ceph-node1][DEBUG ] get remote short hostname
[ceph-node1][DEBUG ] remote hostname: ceph-node1
[ceph-node1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-node1][DEBUG ] create the mon path if it does not exist
[ceph-node1][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-node1/done
[ceph-node1][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-ceph-node1/done
[ceph-node1][INFO ] creating keyring file: /var/lib/ceph/tmp/ceph-ceph-node1.mon.keyring
[ceph-node1][DEBUG ] create the monitor keyring file

#########################mkfs command #########################
[ceph-node1][INFO ] Running command: ceph-mon --cluster ceph --mkfs -i ceph-node1 --keyring /var/lib/ceph/tmp/ceph-ceph-node1.mon.keyring --setuser 167 --setgroup 167
##############################################################

[ceph-node1][INFO ] unlinking keyring file /var/lib/ceph/tmp/ceph-ceph-node1.mon.keyring
[ceph-node1][DEBUG ] create a done file to avoid re-doing the mon deployment
[ceph-node1][DEBUG ] create the init path if it does not exist
[ceph-node1][INFO ] Running command: systemctl enable ceph.target
[ceph-node1][INFO ] Running command: systemctl enable ceph-mon@ceph-node1
[ceph-no...


Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla (master)

Reviewed: https://review.opendev.org/652606
Committed: https://git.openstack.org/cgit/openstack/kolla/commit/?id=285814353af29395932e15d78c6a154796978a21
Submitter: Zuul
Branch: master

commit 285814353af29395932e15d78c6a154796978a21
Author: wangwei <email address hidden>
Date: Mon Apr 15 18:36:40 2019 +0900

    Fix multiple mon deployment failures

    When 4 or 5 mons are deployed, all mons will always be in the election
    state, and the ceph cluster cannot be accessed.

    After comparing with ceph-deploy, I found that when the cluster was
    initialized, it did not use monmap. When adding a new mon node, it
    needs to use monmap. According to the above situation, I fixed the
    problem.

    Change-Id: Idbfc6741d371adcb06ed97d578d75d3bce862743
    Closes-bug: #1824783

Changed in kolla:
status: New → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/660864

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla (stable/rocky)

Reviewed: https://review.opendev.org/660864
Committed: https://git.openstack.org/cgit/openstack/kolla/commit/?id=63d63ab97c87d4470be1cd72575869c32e1cb9c9
Submitter: Zuul
Branch: stable/rocky

commit 63d63ab97c87d4470be1cd72575869c32e1cb9c9
Author: wangwei <email address hidden>
Date: Mon Apr 15 18:36:40 2019 +0900

    Fix multiple mon deployment failures

    When 4 or 5 mons are deployed, all mons will always be in the election
    state, and the ceph cluster cannot be accessed.

    After comparing with ceph-deploy, I found that when the cluster was
    initialized, it did not use monmap. When adding a new mon node, it
    needs to use monmap. According to the above situation, I fixed the
    problem.

    Change-Id: Idbfc6741d371adcb06ed97d578d75d3bce862743
    Closes-bug: #1824783
    (cherry picked from commit 285814353af29395932e15d78c6a154796978a21)

tags: added: in-stable-rocky
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla (stable/stein)

Reviewed: https://review.opendev.org/660863
Committed: https://git.openstack.org/cgit/openstack/kolla/commit/?id=877df152ddc6dcf83e5a92964a275de2dfc1444b
Submitter: Zuul
Branch: stable/stein

commit 877df152ddc6dcf83e5a92964a275de2dfc1444b
Author: wangwei <email address hidden>
Date: Mon Apr 15 18:36:40 2019 +0900

    Fix multiple mon deployment failures

    When 4 or 5 mons are deployed, all mons will always be in the election
    state, and the ceph cluster cannot be accessed.

    After comparing with ceph-deploy, I found that when the cluster was
    initialized, it did not use monmap. When adding a new mon node, it
    needs to use monmap. According to the above situation, I fixed the
    problem.

    Change-Id: Idbfc6741d371adcb06ed97d578d75d3bce862743
    Closes-bug: #1824783
    (cherry picked from commit 285814353af29395932e15d78c6a154796978a21)

tags: added: in-stable-stein
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla 7.0.3

This issue was fixed in the openstack/kolla 7.0.3 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla 8.0.0.0rc2

This issue was fixed in the openstack/kolla 8.0.0.0rc2 release candidate.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla 9.0.0.0rc1

This issue was fixed in the openstack/kolla 9.0.0.0rc1 release candidate.
