integration tests fail randomly

Bug #1519477 reported by gordon chung
This bug affects 1 person
Affects: Ceilometer
Status: Fix Released
Importance: High
Assigned to: gordon chung
Milestone: (none listed)

Bug Description

see https://review.openstack.org/#/c/243032/.

every few runs, integration tests will fail to find one of the servers.

error output shows a gnocchi resource exists for both instances but one does not have metrics. the logs show a 404 when trying to create metrics for that instance.

gordon chung (chungg) wrote :

so based on logs, it seems like the auto scaled vm is not created in time... only after the check for gnocchi resources.

there appears to be a full minute between the polled data and the initial write to db

--creates vm--
2015-11-23 21:16:24.393 INFO nova.osapi_compute.wsgi.server [req-033f94d7-c800-4cb0-8f45-66c88f6672e8 admin admin] 127.0.0.1 "POST /v2.1/a0e8f5ffca4e41f9b2ca06c35ef65775/servers HTTP/1.1" status: 202 len: 826 time: 1.9208431

--vm initialised and polled--
2015-11-23 21:16:31.509 13218 DEBUG ceilometer.compute.pollsters.memory [-] Checking resident memory for instance 3ffca170-67f2-45a1-980d-b94d392d0f09 get_samples /opt/stack/new/ceilometer/ceilometer/compute/pollsters/memory.py:76

--first resource check--
127.0.0.1 - - [23/Nov/2015:21:16:36 +0000] "GET //v1/resource/instance HTTP/1.1" 200 2154 "-" "gabbi/1.10.0 (Python httplib2)"

--first create resource [collector]--
127.0.0.1 - - [23/Nov/2015:21:17:23 +0000] "POST /v1/resource/instance HTTP/1.1" 400 459 "-" "python-requests/2.8.1"

--last resource check--
127.0.0.1 - - [23/Nov/2015:21:17:31 +0000] "GET //v1/resource/instance HTTP/1.1" 200 2154 "-" "gabbi/1.10.0 (Python httplib2)"

--first successful resource create--
127.0.0.1 - - [23/Nov/2015:21:17:31 +0000] "POST /v1/resource/instance HTTP/1.1" 201 2399 "-" "python-requests/2.8.1"

gordon chung (chungg) wrote :

strange, the first metric posting is:

127.0.0.1 - - [23/Nov/2015:21:16:35 +0000] "POST /v1/resource/instance/3ffca170-67f2-45a1-980d-b94d392d0f09/metric/disk.ephemeral.size/measures HTTP/1.1" 404 424 "-" "python-requests/2.8.1"

but the first failed resource creation is almost a full minute later

127.0.0.1 - - [23/Nov/2015:21:17:23 +0000] "POST /v1/resource/instance HTTP/1.1" 400 459 "-" "python-requests/2.8.1"
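to put a number on "almost a full minute", the gap between the two access-log lines above can be computed directly. this is only a small sketch: `gap_seconds` and the variable names are made up for illustration, and the timestamps are copied from the log lines quoted in this comment.

```python
from datetime import datetime

# Timestamp layout used in the access-log lines quoted above,
# e.g. "23/Nov/2015:21:16:35 +0000".
ACCESS_LOG_FMT = "%d/%b/%Y:%H:%M:%S %z"


def gap_seconds(earlier, later):
    """Return the number of seconds between two access-log timestamps."""
    start = datetime.strptime(earlier, ACCESS_LOG_FMT)
    end = datetime.strptime(later, ACCESS_LOG_FMT)
    return (end - start).total_seconds()


# Timestamps copied from the two log lines in this comment.
first_measure_404 = "23/Nov/2015:21:16:35 +0000"
first_resource_400 = "23/Nov/2015:21:17:23 +0000"

print(gap_seconds(first_measure_404, first_resource_400))
```

so the first attempt to create the resource shows up 48 seconds after the first rejected measure post.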

gordon chung (chungg) wrote :

i'm pretty sure this relates to cache and the threading.Lock we use. i wonder if we need it.

Changed in ceilometer:
assignee: nobody → gordon chung (chungg)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Related fix merged to ceilometer (master)

Reviewed: https://review.openstack.org/253181
Committed: https://git.openstack.org/cgit/openstack/ceilometer/commit/?id=f15e952895baae019191cf7c874425b75096b88c
Submitter: Jenkins
Branch: master

commit f15e952895baae019191cf7c874425b75096b88c
Author: gordon chung <email address hidden>
Date: Thu Dec 3 15:03:46 2015 -0500

    add per resource lock

    this creates a lock per resource so every sample isn't blocked.
    it also adds ability to delete resource locks if not used.

    Related-Bug: #1519477
    Change-Id: I0933a8aa287a6cc314b1d4e74ad2ec4adf6655b4
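
The idea described in this commit message (one lock per resource instead of a single shared `threading.Lock`, with unused locks removed) can be sketched roughly as follows. This is not the code from the commit; the class name `ResourceLocks`, the method `hold`, and the reference-counting scheme are assumptions made purely for illustration.

```python
import contextlib
import threading


class ResourceLocks:
    """Hypothetical per-resource lock registry (illustrative only)."""

    def __init__(self):
        # Short-lived guard that protects the registry itself.
        self._guard = threading.Lock()
        # resource_id -> [lock, number of threads using or waiting on it]
        self._locks = {}

    @contextlib.contextmanager
    def hold(self, resource_id):
        # Register interest first so the entry cannot be deleted
        # while this thread is still waiting for the lock.
        with self._guard:
            entry = self._locks.setdefault(resource_id, [threading.Lock(), 0])
            entry[1] += 1
        try:
            # Only work on the same resource is serialised here;
            # samples for other resources are not blocked.
            with entry[0]:
                yield
        finally:
            with self._guard:
                entry[1] -= 1
                if entry[1] == 0:
                    # Nobody is using this lock any more, so drop it.
                    del self._locks[resource_id]


if __name__ == "__main__":
    locks = ResourceLocks()
    with locks.hold("3ffca170-67f2-45a1-980d-b94d392d0f09"):
        pass  # create the resource / post measures for this instance
```

Under these assumptions, two samples for different instances never wait on each other, and the registry does not grow without bound because an entry is deleted once its last user releases it.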

Changed in ceilometer:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to ceilometer (master)

Reviewed: https://review.openstack.org/255569
Committed: https://git.openstack.org/cgit/openstack/ceilometer/commit/?id=c410ff9e821499cd6a38a45eefe5be6d2253ca26
Submitter: Jenkins
Branch: master

commit c410ff9e821499cd6a38a45eefe5be6d2253ca26
Author: gordon chung <email address hidden>
Date: Wed Dec 9 22:07:47 2015 +0000

    Revert "Revert "devstack config for dogpile cache""

    This reverts commit a0bb0f16b1fd9785ceb5a8dc933ba2bcae54412a.

    Closes-Bug: #1519477
    Co-Authored-By: Chris Dent <email address hidden>
    Change-Id: I34aa352c39d80692d7c3aa71eb108f0257d36484

Thierry Carrez (ttx) wrote : Fix included in openstack/ceilometer 6.0.0.0b2

This issue was fixed in the openstack/ceilometer 6.0.0.0b2 development milestone.
