benchmark scenarios with high "times" can result in 'MySQL server has gone away'

Bug #1370166 reported by Chris Dent
This bug affects 1 person
Affects  Status   Importance  Assigned to  Milestone
Rally    Invalid  Undecided   Unassigned

Bug Description

Using the following scenario as a task:

{
    "CeilometerQueries.create_and_query_samples": [
        {
            "args": {
                "filter": {"=": {"counter_unit": "instance"}},
                "orderby": null,
                "limit": 10,
                "counter_name": "cpu_util",
                "counter_type": "gauge",
                "counter_unit": "instance",
                "counter_volume": 1.0,
                "resource_id": "resource_id"
            },
            "runner": {
                "type": "constant",
                "times": 5000,
                "concurrency": 20
            },
            "context": {
                "users": {
                    "tenants": 5,
                    "users_per_tenant": 5
                }
            }
        }
    ]
}

The traceback below happens. The error happens after all ceilometer api requests have been made, so presumably during the results processing phase.

Exception in thread Thread-10:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.7/threading.py", line 764, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/opt/stack/rally/rally/benchmark/engine.py", line 266, in consume_results
    "scenario_duration": self.duration})
  File "/opt/stack/rally/rally/objects/task.py", line 57, in append_results
    db.task_result_create(self.task['uuid'], key, value)
  File "/opt/stack/rally/rally/db/api.py", line 159, in task_result_create
    return IMPL.task_result_create(task_uuid, key, data)
  File "/opt/stack/rally/rally/db/sqlalchemy/api.py", line 160, in task_result_create
    result.save()
  File "/opt/stack/rally/rally/db/sqlalchemy/models.py", line 48, in save
    super(RallyBase, self).save(session=session)
  File "/usr/lib/python2.7/site-packages/oslo/db/sqlalchemy/models.py", line 48, in save
    session.flush()
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 1919, in flush
    self._flush(objects)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2037, in _flush
    transaction.rollback(_capture_exception=True)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 63, in __exit__
    compat.reraise(type_, value, traceback)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2037, in _flush
    transaction.rollback(_capture_exception=True)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 403, in rollback
    transaction._rollback_impl()
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 431, in _rollback_impl
    t[1].rollback()
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1323, in rollback
    self._do_rollback()
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1361, in _do_rollback
    self.connection._rollback_impl()
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 509, in _rollback_impl
    self._handle_dbapi_exception(e, None, None, None, None)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1156, in _handle_dbapi_exception
    util.raise_from_cause(newraise, exc_info)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 199, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 507, in _rollback_impl
    self.engine.dialect.do_rollback(self.connection)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/dialects/mysql/base.py", line 2263, in do_rollback
    dbapi_connection.rollback()
DBConnectionError: (OperationalError) (2006, 'MySQL server has gone away') None None

Chris Dent (cdent) wrote :

From:

[5:59pm] msdubov_: cdent Hi, thanks for the report. Has this bug appeared in your machine right after Rally installation or after some time you used it?
[6:00pm] cdent: I updated my rally code to master earlier today and have run several tests throughout the day
[6:00pm] cdent: when the "times" is less than 2000 or so, it doesn't happen
[6:00pm] cdent: up around 5000 it does
[6:01pm] cdent: I initially assumed it was a concurrency problem, but it appears to have more to do with volume of data.
[6:01pm] msdubov_: cdent Quite interesting, could you please add that as a comment to that bug?

Is there a huge query happening within mysql that is perhaps allowing the connection to time out?
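
A minimal sketch for narrowing down the "times" threshold mentioned in the log above: it copies the task from the bug description and writes one task file per "times" value, so each can be run as a separate task. The helper names task_with_times and write_task_files and the file-name prefix are assumptions made for illustration; they are not part of Rally.

```python
import copy
import json

# Scenario definition copied from the task in the bug description.
BASE_TASK = {
    "CeilometerQueries.create_and_query_samples": [
        {
            "args": {
                "filter": {"=": {"counter_unit": "instance"}},
                "orderby": None,
                "limit": 10,
                "counter_name": "cpu_util",
                "counter_type": "gauge",
                "counter_unit": "instance",
                "counter_volume": 1.0,
                "resource_id": "resource_id",
            },
            "runner": {"type": "constant", "times": 5000, "concurrency": 20},
            "context": {"users": {"tenants": 5, "users_per_tenant": 5}},
        }
    ]
}


def task_with_times(times, base=BASE_TASK):
    """Return a deep copy of the task with runner.times replaced."""
    task = copy.deepcopy(base)
    for scenario_runs in task.values():
        for run in scenario_runs:
            run["runner"]["times"] = times
    return task


def write_task_files(times_values, prefix="ceilometer_times_"):
    """Write one JSON task file per value and return the file names.

    The prefix is a hypothetical naming choice, not a Rally convention.
    """
    names = []
    for times in times_values:
        name = "%s%d.json" % (prefix, times)
        with open(name, "w") as handle:
            json.dump(task_with_times(times), handle, indent=4)
        names.append(name)
    return names


if __name__ == "__main__":
    # Values chosen around the range reported in the log (fine near 2000,
    # failing near 5000); adjust as needed.
    print(write_task_files([1000, 2000, 3000, 4000, 5000]))
```

Running the generated files in increasing order of "times" would show roughly where the failure starts in a given environment.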

Boris Pavlovic (boris-42) wrote :

I'll try to reproduce..

Chris Dent (cdent) wrote :

Apologies, I should have provided this information before, in case it is relevant.

* 5.5.39-MariaDB
* Fedora release 20 (Heisenbug)

OpenStack Infra (hudson-openstack) wrote : Fix proposed to rally (master)

Fix proposed to branch: master
Review: https://review.openstack.org/123290

Boris Pavlovic (boris-42) wrote :
Changed in rally:
status: New → Incomplete

Chris Dent (cdent) wrote :

I'll update to the latest code and see if it is still happening.

Chris Dent (cdent) wrote :

I pulled from rally master today and tried again. Same problem. This could very well be the result of something about my environment, but in this case it is just a default mariadb install on fedora20, so I'm not sure what's up.

21-1503-4d54-902e-2adf951d9513 | Starting: Exit context: `users`
2014-10-02 12:27:57.254 14985 INFO rally.benchmark.context.users [-] Task 26d26321-1503-4d54-902e-2adf951d9513 | Completed: Exit context: `users`
Exception in thread Thread-10:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.7/threading.py", line 764, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/opt/stack/rally/rally/benchmark/engine.py", line 266, in consume_results
    "scenario_duration": self.duration})
  File "/home/opt/stack/rally/rally/objects/task.py", line 57, in append_results
    db.task_result_create(self.task['uuid'], key, value)
  File "/home/opt/stack/rally/rally/db/api.py", line 159, in task_result_create
    return IMPL.task_result_create(task_uuid, key, data)
  File "/home/opt/stack/rally/rally/db/sqlalchemy/api.py", line 160, in task_result_create
    result.save()
  File "/home/opt/stack/rally/rally/db/sqlalchemy/models.py", line 48, in save
    super(RallyBase, self).save(session=session)
  File "/usr/lib/python2.7/site-packages/oslo/db/sqlalchemy/models.py", line 48, in save
    session.flush()
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 1919, in flush
    self._flush(objects)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2037, in _flush
    transaction.rollback(_capture_exception=True)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 63, in __exit__
    compat.reraise(type_, value, traceback)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2037, in _flush
    transaction.rollback(_capture_exception=True)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 403, in rollback
    transaction._rollback_impl()
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 431, in _rollback_impl
    t[1].rollback()
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1323, in rollback
    self._do_rollback()
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1361, in _do_rollback
    self.connection._rollback_impl()
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 509, in _rollback_impl
    self._handle_dbapi_exception(e, None, None, None, None)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1156, in _handle_dbapi_exception
    util.raise_from_cause(newraise, exc_info)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 199, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 507, in _rollback_impl
    self.engine.dialect.do_rollback(self.connection)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/dialects/mysql/base.py", line 2263...


Boris Pavlovic (boris-42) wrote :

heh me either.. we are using mysql (not mariadb) in our integration gate...

Changed in rally:
status: Incomplete → Confirmed

Chris Dent (cdent) wrote :

Sorry, it turns out that I was using a relatively old oslo.db. Somewhere between the version I was using and the current master, the reconnection logic was made to work. I now get good results on the task described above.

I _thought_ I was up to date, but was clearly wrong.
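
For illustration only, the general idea behind reconnection logic can be sketched as a retry wrapper around a database call. This is not oslo.db's implementation: ConnectionLost, retry_on_disconnect and the reconnect callback are hypothetical names, and ConnectionLost merely stands in for a driver error such as MySQL error 2006.

```python
import functools
import time


class ConnectionLost(Exception):
    """Hypothetical stand-in for a driver error such as MySQL error 2006."""


def retry_on_disconnect(max_retries=3, delay=0.0, reconnect=None):
    """Retry the wrapped call when the connection is reported lost.

    reconnect is an optional callable invoked before each retry; in real
    code it would open a fresh connection.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            attempt = 0
            while True:
                try:
                    return func(*args, **kwargs)
                except ConnectionLost:
                    attempt += 1
                    if attempt > max_retries:
                        # Retries exhausted: let the caller see the error.
                        raise
                    if reconnect is not None:
                        reconnect()
                    time.sleep(delay)
        return wrapper
    return decorator


if __name__ == "__main__":
    state = {"calls": 0}

    @retry_on_disconnect(max_retries=3)
    def save_result():
        # Fail twice to simulate a dropped connection, then succeed.
        state["calls"] += 1
        if state["calls"] < 3:
            raise ConnectionLost("server has gone away")
        return "saved"

    print(save_result(), "after", state["calls"], "calls")
```

Without a wrapper of this kind, a single dropped connection during the result-saving step surfaces directly as the exception shown in the traceback.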

Changed in rally:
status: Confirmed → Invalid

OpenStack Infra (hudson-openstack) wrote : Change abandoned on rally (master)

Change abandoned by Boris Pavlovic (<email address hidden>) on branch: master
Review: https://review.openstack.org/123290
