Traces from benchmark context should be gracefull handled

Bug #1375802 reported by Artem Panchenko
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Rally
Fix Released
High
Boris Pavlovic

Bug Description

Recently I ran rally test (nova boot-and-delete) on OpenStack cloud deployed by Fuel that was running few days, here is the task config:

http://paste.openstack.org/show/117109/

During the test I performed different actions with OpenStack cluster (like adding/removing nodes, simulation of network/hardware failures) which would help to measure how quickly and efficiently HA recovers cloud services under real load. Some of actions caused services downtime and Rally reported, that API returns error, for example:

http://paste.openstack.org/show/117112/

but after API services were recovered it continued to handle requests correctly. In the end of testing (Rally) I turned down some cluster nodes and cloud became unhealthy: HAProxy returned 502/504 errors on all API requests to Nova/Keystone. It caused failure of benchmark tests due to authorization error:

http://paste.openstack.org/show/117108/

and I wasn't able to get tests results (e.g. percent of failed iterations). I think Rally should print error if cluster services become unreachable, but still return tests result even if all requests failed.

I attached full log of tests to the bug (it lacks of beginning of the test, because I enabled logging is screen a little bit later)

Revision history for this message
Artem Panchenko (apanchenko-8) wrote :
summary: - Test fails and doesn't print results: 'Authorization Failed: Bad Gateway
- (HTTP 502)'
+ Traces from bechmark context should be handled
summary: - Traces from bechmark context should be handled
+ Traces from benchmark context should be handled
Changed in rally:
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Boris Pavlovic (boris-42)
summary: - Traces from benchmark context should be handled
+ Traces from benchmark context should be gracefull handled
Revision history for this message
Boris Pavlovic (boris-42) wrote :

During the bunch of various patches this was more or less fixed.

The changes include next:

  - Rally won't fail in case if we get random exceptions from rally benchmark context and runners

  - Rally will LOG these exceptions

  - Rally will put SLA (unexpected_failure) in such case, so you will see that test failed

Changed in rally:
milestone: none → 0.0.3
status: Confirmed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Bug attachments

Remote bug watches

Bug watches keep track of this bug in other bug trackers.