Ceph-mon on one of the storage master nodes fails after provisioning

Bug #1416163 reported by Vinod Nair
This bug affects 1 person
Affects            Status         Importance  Assigned to              Milestone
Juniper Openstack  Fix Committed  Critical    saravanan purushothaman
R2.0               Fix Committed  High        Jeya ganesh babu J
R2.1               Fix Committed  Critical    saravanan purushothaman
R2.20              Fix Committed  Undecided   saravanan purushothaman

Bug Description

One of the ceph-mon daemons on one of the storage master nodes fails. This happens after provisioning with 2.10 build 16
on an HA cluster.

ceph -s
2015-01-29 15:32:39.664947 7f83ac31d700 0 -- :/1030201 >> 13.1.0.1:6789/0 pipe(0x7f83a800e0d0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f83a800e360).fault
    cluster e9b5da88-7c2b-41d5-a29e-8f1c5bd6d82c
     health HEALTH_WARN 1 mons down, quorum 1,2,3,4,5,6,7 cs-scale-2,cs-scale-3,cs-scale-4,cs-scale-5,cs-scale-6,cs-scale-7,cs-scale-8; clock skew detected on mon.cs-scale-3, mon.cs-scale-4, mon.cs-scale-5, mon.cs-scale-6, mon.cs-scale-7
     monmap e8: 8 mons at {cs-scale-1=13.1.0.1:6789/0,cs-scale-2=13.1.0.2:6789/0,cs-scale-3=13.1.0.3:6789/0,cs-scale-4=13.1.0.4:6789/0,cs-scale-5=13.1.0.5:6789/0,cs-scale-6=13.1.0.6:6789/0,cs-scale-7=13.1.0.7:6789/0,cs-scale-8=13.1.0.8:6789/0}, election epoch 36, quorum 1,2,3,4,5,6,7 cs-scale-2,cs-scale-3,cs-scale-4,cs-scale-5,cs-scale-6,cs-scale-7,cs-scale-8
     osdmap e156: 20 osds: 20 up, 20 in
      pgmap v5937: 1500 pgs, 4 pools, 245 GB data, 38039 objects
            488 GB used, 15883 GB / 16371 GB avail
                1500 active+clean
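
The status output above reports "1 mons down" without naming the monitor. Comparing the monmap members with the quorum list gives the name. The following is a minimal sketch, assuming the monmap line has the same layout as in this report; `mons_out_of_quorum` is a hypothetical helper, not part of any Ceph tooling.

```python
import re

# The monmap line copied from the `ceph -s` output in this report.
MONMAP_LINE = (
    "monmap e8: 8 mons at {cs-scale-1=13.1.0.1:6789/0,cs-scale-2=13.1.0.2:6789/0,"
    "cs-scale-3=13.1.0.3:6789/0,cs-scale-4=13.1.0.4:6789/0,"
    "cs-scale-5=13.1.0.5:6789/0,cs-scale-6=13.1.0.6:6789/0,"
    "cs-scale-7=13.1.0.7:6789/0,cs-scale-8=13.1.0.8:6789/0}, "
    "election epoch 36, quorum 1,2,3,4,5,6,7 "
    "cs-scale-2,cs-scale-3,cs-scale-4,cs-scale-5,cs-scale-6,cs-scale-7,cs-scale-8"
)


def mons_out_of_quorum(monmap_line):
    """Return monitor names that are in the monmap but not in the quorum."""
    match = re.search(r"\{(.*?)\}", monmap_line)
    if match is None:
        raise ValueError("no {name=addr,...} member list found")
    # Each member entry looks like "name=addr:port/nonce"; keep the name.
    all_mons = [entry.split("=", 1)[0] for entry in match.group(1).split(",")]
    # The quorum names are the last whitespace-separated token on the line.
    in_quorum = set(monmap_line.strip().rsplit(" ", 1)[1].split(","))
    return [name for name in all_mons if name not in in_quorum]


print(mons_out_of_quorum(MONMAP_LINE))
```

For this report the result is cs-scale-1, the node on which ceph-mon is started by hand below.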

root@cs-scale-1:/opt/contrail/utils# /usr/bin/ceph-mon --cluster=ceph -i cs-scale-1 -f
*** Caught signal (Segmentation fault) **
 in thread 7f872c7f77c0
 ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578)
 1: /usr/bin/ceph-mon() [0x8cf0b5]
 2: (()+0xfcb0) [0x7f872bf60cb0]
 3: (get_str_map_key(std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, std::string const&, std::string const*)+0x37) [0x859547]
 4: (LogMonitor::update_from_paxos(bool*)+0x803) [0x66da23]
 5: (PaxosService::refresh(bool*)+0x357) [0x5d2a47]
 6: (Monitor::refresh_from_paxos(bool*)+0x273) [0x57c373]
 7: (Monitor::init_paxos()+0xd5) [0x57c685]
 8: (Monitor::preinit()+0x7f3) [0x580b53]
 9: (main()+0x2559) [0x552ab9]
 10: (__libc_start_main()+0xed) [0x7f872a69376d]
 11: /usr/bin/ceph-mon() [0x5571c9]
2015-01-29 15:33:35.435896 7f872c7f77c0 -1 *** Caught signal (Segmentation fault) **
 in thread 7f872c7f77c0

 ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578)
 1: /usr/bin/ceph-mon() [0x8cf0b5]
 2: (()+0xfcb0) [0x7f872bf60cb0]
 3: (get_str_map_key(std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, std::string const&, std::string const*)+0x37) [0x859547]
 4: (LogMonitor::update_from_paxos(bool*)+0x803) [0x66da23]
 5: (PaxosService::refresh(bool*)+0x357) [0x5d2a47]
 6: (Monitor::refresh_from_paxos(bool*)+0x273) [0x57c373]
 7: (Monitor::init_paxos()+0xd5) [0x57c685]
 8: (Monitor::preinit()+0x7f3) [0x580b53]
 9: (main()+0x2559) [0x552ab9]
 10: (__libc_start_main()+0xed) [0x7f872a69376d]
 11: /usr/bin/ceph-mon() [0x5571c9]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

     0> 2015-01-29 15:33:35.435896 7f872c7f77c0 -1 *** Caught signal (Segmentation fault) **
 in thread 7f872c7f77c0

 ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578)
 1: /usr/bin/ceph-mon() [0x8cf0b5]
 2: (()+0xfcb0) [0x7f872bf60cb0]
 3: (get_str_map_key(std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, std::string const&, std::string const*)+0x37) [0x859547]
 4: (LogMonitor::update_from_paxos(bool*)+0x803) [0x66da23]
 5: (PaxosService::refresh(bool*)+0x357) [0x5d2a47]
 6: (Monitor::refresh_from_paxos(bool*)+0x273) [0x57c373]
 7: (Monitor::init_paxos()+0xd5) [0x57c685]
 8: (Monitor::preinit()+0x7f3) [0x580b53]
 9: (main()+0x2559) [0x552ab9]
 10: (__libc_start_main()+0xed) [0x7f872a69376d]
 11: /usr/bin/ceph-mon() [0x5571c9]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Segmentation fault
root@cs-scale-1:/opt/contrail/utils#
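
The backtrace is printed innermost frame first, so the call chain reads more easily when reversed. The following is a rough sketch that pulls the symbol names out of the frames above; the regular expression and the `call_chain` helper are illustrative assumptions that only target this trace layout, and frames without a symbol (1, 2 and 11) are skipped.

```python
import re

# Frames copied from the backtrace in this report.
BACKTRACE = """\
 1: /usr/bin/ceph-mon() [0x8cf0b5]
 2: (()+0xfcb0) [0x7f872bf60cb0]
 3: (get_str_map_key(std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, std::string const&, std::string const*)+0x37) [0x859547]
 4: (LogMonitor::update_from_paxos(bool*)+0x803) [0x66da23]
 5: (PaxosService::refresh(bool*)+0x357) [0x5d2a47]
 6: (Monitor::refresh_from_paxos(bool*)+0x273) [0x57c373]
 7: (Monitor::init_paxos()+0xd5) [0x57c685]
 8: (Monitor::preinit()+0x7f3) [0x580b53]
 9: (main()+0x2559) [0x552ab9]
 10: (__libc_start_main()+0xed) [0x7f872a69376d]
 11: /usr/bin/ceph-mon() [0x5571c9]
"""

# Matches "<n>: (<symbol>+0x<offset>) [0x<address>]".
FRAME = re.compile(r"^\s*(\d+): \((.*)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]$")


def call_chain(text):
    """Return symbol names from outermost caller to innermost callee."""
    names = []
    for line in text.splitlines():
        match = FRAME.match(line)
        if match is None:
            continue  # frame has no "(symbol+offset)" part
        # Drop the argument list; an empty name means the symbol is unknown.
        name = match.group(2).split("(", 1)[0]
        if name:
            names.append(name)
    return names[::-1]


print(" -> ".join(call_chain(BACKTRACE)))
```

The chain ends in get_str_map_key, called from LogMonitor::update_from_paxos while the monitor is still in Monitor::preinit, which is why the daemon dies at startup.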

Vinod Nair (vinodnair)
summary: - Storage: Ceph-mon on one of the storage master fails
+ Storage: Ceph-mon on one of the storage master fails after provisioning
Vinod Nair (vinodnair)
tags: added: blocker
tags: added: releasenote
summary: - Storage: Ceph-mon on one of the storage master fails after provisioning
+ Ceph-mon on one of the storage master fails after provisioning
Changed in juniperopenstack:
importance: High → Critical
summary: - Ceph-mon on one of the storage master fails after provisioning
+ Ceph-mon on one of the storage master nodes fails after provisioning
tags: removed: releasenote
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/7237
Committed: http://github.org/Juniper/contrail-packaging/commit/33c5270eecd387b6d7dae5c648cfc20cf7533fe9
Submitter: Zuul
Branch: R2.1

commit 33c5270eecd387b6d7dae5c648cfc20cf7533fe9
Author: spuru <email address hidden>
Date: Mon Feb 9 21:10:15 2015 -0800

Closes-Bug: #1416163
Issue: wrong syslog facility argument passed on to channel call
Fixed the above mentioned issue.
Tests: long running FIO test

Change-Id: I22ff746349e7678a340c17c8522be127f2e152f4

OpenContrail Admin (ci-admin-f) wrote : R2.0
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/8029
Committed: http://github.org/Juniper/contrail-packaging/commit/773231cf081cc75154ceb91796b4523f50fcc95e
Submitter: Zuul
Branch: R2.0

commit 773231cf081cc75154ceb91796b4523f50fcc95e
Author: Jeya ganesh babu J <email address hidden>
Date: Tue Mar 3 17:41:40 2015 -0800

Storage package bug fix merge from 2.10

Closes-Bug: #1410486
Closes-Bug: #1416163
qemu-utils package dependency added to be installed on openstack
nodes to support boot volume creation
Included fix for Ceph mon log which was passing a wrong argument
to a channel call. Use ceph-0.87-2 which is built with the fix

Change-Id: I671cf25544ffb1f479fb3ecdb51e933bad3e5565

information type: Proprietary → Public
OpenContrail Admin (ci-admin-f) wrote : master

Review in progress for https://review.opencontrail.org/9427
Submitter: saravanan purushothaman (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : R2.20

Review in progress for https://review.opencontrail.org/9428
Submitter: saravanan purushothaman (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/9428
Committed: http://github.org/Juniper/contrail-packaging/commit/75a777e246204c70a1cb82e433ce23d506c5cfa3
Submitter: Zuul
Branch: R2.20

commit 75a777e246204c70a1cb82e433ce23d506c5cfa3
Author: spuru <email address hidden>
Date: Wed Apr 22 19:15:31 2015 -0700

Closes-Bug: #1416163
Issue: wrong syslog facility argument passed on to channel call
Fixed the above mentioned issue.
Tests: long running FIO test

Change-Id: Ie520b60e7dcdee3ca48225ebd0467f6ae55b3d3c

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/9427
Committed: http://github.org/Juniper/contrail-packaging/commit/a3d242782befbf7ec21290471fe7c9fd26224b6f
Submitter: Zuul
Branch: master

commit a3d242782befbf7ec21290471fe7c9fd26224b6f
Author: spuru <email address hidden>
Date: Wed Apr 22 19:06:37 2015 -0700

Closes-Bug: #1416163
Issue: wrong syslog facility argument passed on to channel call
Fixed the above mentioned issue.
Tests: long running FIO test

Change-Id: Idb843090a2801c84eaf41387204560f449d2b002

Changed in juniperopenstack:
status: New → Fix Committed