Ceph backend is killed by SIGSEGV under load

Bug #1541026 reported by Mikhail Chernik
This bug affects 1 person
Affects: Mirantis OpenStack
Status: Invalid
Importance: Medium
Assigned to: MOS Ceph

Bug Description

While running tests at scale (environment with 200 nodes), several Cinder tests failed because cinder-volume was killed by SIGSEGV in librbd.so.

Environment: MOS 8.0 build 482, Neutron-VLAN+DVR, 3 controllers, 180 computes, 20 Ceph+compute nodes

Cinder log: http://paste.openstack.org/show/485744/

root@node-95:~# dmesg | grep seg
[480235.420378] cinder-volume[35184]: segfault at 0 ip 00007fb8b937d70a sp 00007fb829ffabb0 error 4 in librbd.so.1.0.0[7fb8b9234000+52b000]
[570389.086761] cinder-volume[8696]: segfault at 0 ip 00007fb8b937d70a sp 00007fb7b97f9bb0 error 4 in librbd.so.1.0.0[7fb8b9234000+52b000]
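
The two kernel lines above can be decoded a little further. The following is a minimal sketch, not part of the original report: the function name `parse_segfault` and the field names in its result are made up for illustration. It assumes the x86 segfault line format shown above and the standard x86 page-fault error code bits (bit 0: page present, bit 1: write access, bit 2: user mode).

```python
import re

# Matches the kernel's x86 segfault line as it appears in the dmesg output above.
SEGFAULT_RE = re.compile(
    r"(?P<comm>[\w.-]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Decode one segfault line; return None if the line does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)  # the kernel prints the error code in hex
    return {
        "process": m["comm"],
        "pid": int(m["pid"]),
        "fault_addr": int(m["addr"], 16),
        "library": m["lib"],
        # Offset of the faulting instruction from the start of the mapping.
        "offset": ip - base,
        "user_mode": bool(err & 4),
        "write": bool(err & 2),
        "page_present": bool(err & 1),
    }


if __name__ == "__main__":
    line = (
        "[480235.420378] cinder-volume[35184]: segfault at 0 "
        "ip 00007fb8b937d70a sp 00007fb829ffabb0 error 4 "
        "in librbd.so.1.0.0[7fb8b9234000+52b000]"
    )
    info = parse_segfault(line)
    print(info["library"], hex(info["offset"]), info["fault_addr"])
```

For both crashes the instruction pointer and mapping base are identical, so the faulting instruction sits at the same offset (0x14970a) from the start of the librbd.so.1.0.0 mapping. The fault address 0 with error 4 indicates a user-mode read of an unmapped page at address 0, which is consistent with a NULL pointer dereference. With debug symbols matching this exact librbd build, that offset could probably be resolved to a source location with a tool such as addr2line; the report does not say whether such symbols are available.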

Diagnostic snapshot: http://mos-scale-share.mirantis.com/fuel-snapshot-2016-02-02_21-23-53.tar.xz

Roman Podoliaka (rpodolyaka) wrote :

Mikhail, please provide a diagnostic snapshot.

Changed in mos:
assignee: nobody → Mikhail Chernik (mchernik)
status: New → Incomplete
description: updated
Mikhail Chernik (mchernik) wrote :

Added link to diagnostic snapshot to bug description

Changed in mos:
assignee: Mikhail Chernik (mchernik) → Roman Podoliaka (rpodolyaka)
Roman Podoliaka (rpodolyaka) wrote :

Ceph team, please take a look at this.

Changed in mos:
assignee: Roman Podoliaka (rpodolyaka) → MOS Ceph (mos-ceph)
importance: Undecided → Medium
milestone: none → 9.0
status: Incomplete → Confirmed
tags: added: area-ceph
removed: ceph
Alexei Sheplyakov (asheplyakov) wrote :

> cinder-volume[8696]: segfault at 0 ip 00007fb8b937d70a sp 00007fb7b97f9bb0 error 4 in librbd.so.1.0.0[7fb8b9234000+52b000]

This is an error in an application (presumably a part of Cinder) using librbd. Unfortunately, the report lacks the information needed to debug the problem, such as:

1) cluster configuration
   a) ceph -s
   b) the content of /etc/ceph/*
2) the exact steps to reproduce the error, preferably as a sequence of shell commands
3) core dump of the crashed process
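
Item 1 of the list above could be gathered with something like the following. This is a minimal sketch under stated assumptions: the function `collect_ceph_info`, its parameters, and the output file name `ceph-status.txt` are hypothetical, and the script assumes it runs on a node where the `ceph` CLI is installed and can reach the cluster. It does not cover items 2 and 3.

```python
import glob
import os
import shutil
import subprocess


def collect_ceph_info(out_dir, status_cmd=("ceph", "-s"), conf_glob="/etc/ceph/*"):
    """Save the cluster status and a copy of the Ceph config files in out_dir.

    Returns the base names of the copied files. Raises FileNotFoundError
    if the status command is not installed on this node.
    """
    os.makedirs(out_dir, exist_ok=True)

    # 1a) cluster status; stderr is kept too, since a failing command is also useful
    result = subprocess.run(list(status_cmd), capture_output=True, text=True, check=False)
    with open(os.path.join(out_dir, "ceph-status.txt"), "w") as f:
        f.write(result.stdout)
        f.write(result.stderr)

    # 1b) regular files matching the config glob (subdirectories are skipped)
    copied = []
    for path in sorted(glob.glob(conf_glob)):
        if os.path.isfile(path):
            shutil.copy(path, out_dir)
            copied.append(os.path.basename(path))
    return copied


if __name__ == "__main__":
    print(collect_ceph_info("ceph-diagnostics"))
```

/etc/ceph usually holds keyring files alongside ceph.conf, so the collected directory should be reviewed and any secrets removed before it is attached to a public bug.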

Changed in mos:
status: Confirmed → Incomplete
Bug Checker Bot (bug-checker) wrote : Autochecker

(This check was performed automatically.)
Please make sure that the bug description contains the following sections, filled in with the appropriate data for the bug you are describing:

actual result

expected result

steps to reproduce

For more detailed information on the contents of each of the listed sections, see https://wiki.openstack.org/wiki/Fuel/How_to_contribute#Here_is_how_you_file_a_bug

tags: added: need-info
Dina Belova (dbelova) wrote :

More than a month in the Incomplete state, so moving to Invalid. Please provide more info and move the bug to Confirmed if the issue still occurs.

Changed in mos:
status: Incomplete → Invalid