Gnocchi / Aodh fail to work when RBD backend is enabled

Bug #1646506 reported by Giulio Fidente
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Emilien Macchi

Bug Description

One of the scenario jobs fails when the RBD driver is enabled for Gnocchi.

See the logs for https://review.openstack.org/#/c/404462/3

Changed in tripleo:
status: Triaged → Confirmed
Pradeep Kilambi (pkilambi) wrote :

On the telemetry side, we dug through the logs a bit, and the issue seems to be on the Ceph side, not in Gnocchi. Ceilometer is trying to post resources to Gnocchi; Gnocchi tries to connect to Ceph but keeps waiting, so all the requests are stuck in write() to the end.

Also, ceph.conf has 'osd_pool_default_size = 3' while there is only one OSD running.

From what we can tell, this seems to be an issue with the Ceph configuration.

The error "resources.gnocchi_res_alarm: Resource 4f6a65d9-6659-4d76-b8c5-8d51132acea9 does not exist (HTTP 404)" therefore occurs because the post didn't go through.

Can we try running 3 OSDs, or switch the size to 1 if we indeed want 1 OSD, and see if that helps?
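A minimal sketch of the check described in this comment, assuming the relevant options live in the [global] section of ceph.conf and that the number of running OSDs is known from elsewhere. The function names and the sample configuration are hypothetical and not taken from the job logs; the code only compares the configured replica counts against the OSD count and does not talk to a cluster.

```python
import configparser


def replica_settings(conf_text):
    """Return (size, min_size) from [global]; None where an option is absent."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("global"):
        return None, None
    # Ceph accepts option names written with spaces or underscores,
    # so normalize them before looking anything up.
    opts = {key.replace(" ", "_"): value for key, value in parser["global"].items()}
    size = opts.get("osd_pool_default_size")
    min_size = opts.get("osd_pool_default_min_size")
    return (
        int(size) if size is not None else None,
        int(min_size) if min_size is not None else None,
    )


def check_replicas(conf_text, osd_count):
    """List mismatches between configured replica counts and running OSDs."""
    size, min_size = replica_settings(conf_text)
    findings = []
    if size is not None and size > osd_count:
        findings.append(
            f"osd_pool_default_size={size} exceeds the {osd_count} running OSD(s)"
        )
    if min_size is not None and min_size > osd_count:
        findings.append(
            f"osd_pool_default_min_size={min_size} exceeds the {osd_count} running OSD(s)"
        )
    return findings


if __name__ == "__main__":
    # Hypothetical sample mirroring the values mentioned in this thread.
    sample = "[global]\nosd_pool_default_size = 3\nosd_pool_default_min_size = 1\n"
    for finding in check_replicas(sample, osd_count=1):
        print(finding)
```

With the values discussed here (size 3, min size 1, one OSD), only the size mismatch is reported; whether that mismatch alone can block writes is exactly what the rest of the thread debates.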

Giulio Fidente (gfidente) wrote :

Hi Pradeep, thanks. Ceph has a min replica setting too, which is set to 1, so that shouldn't be an issue. Other services are using it; for example, Glance when uploading the image for the pingtest stack.

Is it possible we're configuring Gnocchi too early and trying to write to Ceph when it isn't ready to accept writes yet? Can you figure out, from one of the failed attempts, when the first write from Gnocchi gets stuck?
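If ordering were the cause, one common mitigation is to poll for readiness before issuing the first write. The sketch below shows that generic pattern only. It is not what TripleO or Gnocchi does, and `storage_is_writable` is a hypothetical placeholder for whatever readiness probe the deployment can offer.

```python
import time


def wait_until(check, timeout=60.0, interval=2.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns True or the timeout expires.

    Returns True if check() succeeded, False if the deadline passed first.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def storage_is_writable():
    # Hypothetical probe: replace with a real readiness test for the backend.
    return True


if __name__ == "__main__":
    if wait_until(storage_is_writable, timeout=30.0, interval=1.0):
        print("backend ready, safe to start writing")
    else:
        print("backend still not ready, giving up")
```

A bounded wait like this fails with a clear message instead of leaving requests stuck in write() indefinitely.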

Changed in tripleo:
milestone: ocata-2 → ocata-3
Pradeep Kilambi (pkilambi) wrote :

I was able to deploy Aodh/Gnocchi with Ceph and don't see this issue. Could it be that the scenario test is configuring Ceph a bit later than we want? Could we move the Ceph config earlier?

Changed in tripleo:
status: Confirmed → Triaged
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-heat-templates (master)

Change abandoned by Emilien Macchi (<email address hidden>) on branch: master
Review: https://review.openstack.org/413707

Changed in tripleo:
milestone: ocata-3 → ocata-rc1
Changed in tripleo:
milestone: ocata-rc1 → ocata-rc2
Changed in tripleo:
milestone: ocata-rc2 → pike-1
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.openstack.org/442795

Changed in tripleo:
assignee: nobody → Emilien Macchi (emilienm)
status: Triaged → In Progress
Changed in tripleo:
milestone: pike-1 → pike-2
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/442795
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=152df0164f67ea5ff4488185c4f72a9a8be07669
Submitter: Jenkins
Branch: master

commit 152df0164f67ea5ff4488185c4f72a9a8be07669
Author: Emilien Macchi <email address hidden>
Date: Tue Mar 7 16:38:36 2017 -0500

    scenario001/pingtest: enable Gnocchi resource again

    We disabled it because it stopped working. Let's see how it works now.

    Change-Id: If1efb86cb1d6ada357d4562408a566ac702fb6be
    Closes-Bug: #1646506

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 7.0.0.0b2

This issue was fixed in the openstack/tripleo-heat-templates 7.0.0.0b2 development milestone.
