[Packaging][CI] Deployment often exceeds 7800 seconds and times out

Bug #1626657 reported by Roman Podoliaka
This bug affects 1 person
Affects: Fuel for OpenStack
Status: Won't Fix
Importance: Medium
Assigned to: Pawel Brzozowski

Bug Description

A few recent builds of 9.0-pkg-systest-ubuntu failed with:

======================================================================
ERROR: Deploy ceph HA with RadosGW for objects
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jenkins/venv-nailgun-tests-2.9/local/lib/python2.7/site-packages/proboscis/case.py", line 296, in testng_method_mistake_capture_func
    compatability.capture_type_error(s_func)
  File "/home/jenkins/venv-nailgun-tests-2.9/local/lib/python2.7/site-packages/proboscis/compatability/exceptions_2_6.py", line 27, in capture_type_error
    func()
  File "/home/jenkins/venv-nailgun-tests-2.9/local/lib/python2.7/site-packages/proboscis/case.py", line 350, in func
    func(test_case.state.get_state())
  File "/home/jenkins/workspace/9.0-pkg-systest-ubuntu/fuel-qa/fuelweb_test/helpers/decorators.py", line 120, in wrapper
    result = func(*args, **kwargs)
  File "/home/jenkins/workspace/9.0-pkg-systest-ubuntu/fuel-qa/fuelweb_test/tests/test_ceph.py", line 503, in ceph_rados_gw
    self.fuel_web.deploy_cluster_wait(cluster_id)
  File "/home/jenkins/workspace/9.0-pkg-systest-ubuntu/fuel-qa/fuelweb_test/helpers/decorators.py", line 455, in wrapper
    result = func(*args, **kwargs)
  File "/home/jenkins/workspace/9.0-pkg-systest-ubuntu/fuel-qa/fuelweb_test/helpers/decorators.py", line 440, in wrapper
    result = func(*args, **kwargs)
  File "/home/jenkins/workspace/9.0-pkg-systest-ubuntu/fuel-qa/fuelweb_test/helpers/decorators.py", line 491, in wrapper
    return func(*args, **kwargs)
  File "/home/jenkins/workspace/9.0-pkg-systest-ubuntu/fuel-qa/fuelweb_test/helpers/decorators.py", line 498, in wrapper
    result = func(*args, **kwargs)
  File "/home/jenkins/workspace/9.0-pkg-systest-ubuntu/fuel-qa/fuelweb_test/helpers/decorators.py", line 382, in wrapper
    return func(*args, **kwargs)
  File "/home/jenkins/workspace/9.0-pkg-systest-ubuntu/fuel-qa/fuelweb_test/models/fuel_web_client.py", line 896, in deploy_cluster_wait
    self.assert_task_success(task, interval=interval, timeout=timeout)
  File "/home/jenkins/workspace/9.0-pkg-systest-ubuntu/fuel-qa/fuelweb_test/__init__.py", line 59, in wrapped
    result = func(*args, **kwargs)
  File "/home/jenkins/workspace/9.0-pkg-systest-ubuntu/fuel-qa/fuelweb_test/models/fuel_web_client.py", line 322, in assert_task_success
    task = self.task_wait(task, timeout, interval)
  File "/home/jenkins/workspace/9.0-pkg-systest-ubuntu/fuel-qa/fuelweb_test/__init__.py", line 59, in wrapped
    result = func(*args, **kwargs)
  File "/home/jenkins/workspace/9.0-pkg-systest-ubuntu/fuel-qa/fuelweb_test/models/fuel_web_client.py", line 1335, in task_wait
    "was exceeded: ".format(task=task["name"], timeout=timeout))
TimeoutError: Waiting task "deploy" timeout 7800 sec was exceeded:

astute.log ( http://paste.openstack.org/show/582622/ ) looks interesting:

2016-09-19 14:23:39 DEBUG [30081] Task time summary: primary-database with status successful on node 1 took 00:36:03
2016-09-19 14:57:13 DEBUG [30081] Task time summary: primary-rabbitmq with status successful on node 1 took 00:33:34
2016-09-19 14:57:22 DEBUG [30081] Task time summary: memcached with status successful on node 1 took 00:00:09
2016-09-19 14:57:41 DEBUG [30081] Task time summary: apache with status successful on node 1 took 00:00:19
2016-09-19 14:57:51 DEBUG [30081] Task time summary: openrc-delete with status successful on node 1 took 00:00:10
2016-09-19 14:58:03 DEBUG [30081] Task time summary: primary-dns-server with status successful on node 1 took 00:00:12
2016-09-19 14:58:03 DEBUG [30081] Task time summary: murano-rabbitmq with status skipped on node 1 took 00:00:00
2016-09-19 14:58:45 DEBUG [30081] Task time summary: api-proxy with status successful on node 1 took 00:00:42
2016-09-19 14:58:55 DEBUG [30081] Task time summary: cluster_health with status successful on node 1 took 00:00:10
2016-09-19 14:59:17 DEBUG [30081] Task time summary: umm with status successful on node 1 took 00:00:22
2016-09-19 14:59:35 DEBUG [30081] Task time summary: conntrackd with status successful on node 1 took 00:00:18
2016-09-19 15:05:14 DEBUG [30081] Task time summary: database with status successful on node 2 took 00:41:34
2016-09-19 15:06:44 DEBUG [30081] Task time summary: database with status successful on node 5 took 00:43:04

^ Installation and configuration of MySQL/RabbitMQ takes longer than usual. In fact, the puppet logs show that most of the time is spent installing packages:

http://paste.openstack.org/show/582629/

Unfortunately, the atop logs in the diagnostic snapshot do not capture this period of time.
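
To make the slow tasks easier to spot in a long astute.log, the "Task time summary" lines can be ranked by duration. The following is only a minimal sketch: the regular expression is inferred from the lines quoted above rather than from the astute source, and the names `parse_summaries`, `slowest`, and the 5-minute threshold are hypothetical choices for illustration.

```python
import re
from datetime import timedelta

# Assumption: summary lines look like the ones quoted in this report.
SUMMARY_RE = re.compile(
    r"Task time summary: (?P<task>\S+) with status (?P<status>\S+) "
    r"on node (?P<node>\S+) took (?P<h>\d+):(?P<m>\d+):(?P<s>\d+)"
)


def parse_summaries(lines):
    """Yield (task, node, status, duration) for each matching log line."""
    for line in lines:
        match = SUMMARY_RE.search(line)
        if match:
            duration = timedelta(
                hours=int(match.group("h")),
                minutes=int(match.group("m")),
                seconds=int(match.group("s")),
            )
            yield (match.group("task"), match.group("node"),
                   match.group("status"), duration)


def slowest(lines, threshold=timedelta(minutes=5)):
    """Return tasks at or above the threshold, longest first."""
    rows = [row for row in parse_summaries(lines) if row[3] >= threshold]
    return sorted(rows, key=lambda row: row[3], reverse=True)


# A few lines copied from the astute.log excerpt in this report.
SAMPLE = """\
2016-09-19 14:23:39 DEBUG [30081] Task time summary: primary-database with status successful on node 1 took 00:36:03
2016-09-19 14:57:13 DEBUG [30081] Task time summary: primary-rabbitmq with status successful on node 1 took 00:33:34
2016-09-19 14:57:22 DEBUG [30081] Task time summary: memcached with status successful on node 1 took 00:00:09
2016-09-19 15:05:14 DEBUG [30081] Task time summary: database with status successful on node 2 took 00:41:34
2016-09-19 15:06:44 DEBUG [30081] Task time summary: database with status successful on node 5 took 00:43:04
"""

if __name__ == "__main__":
    for task, node, status, duration in slowest(SAMPLE.splitlines()):
        print(f"{duration}  node {node}  {task} ({status})")
```

For scale: on node 1, primary-database (00:36:03) and primary-rabbitmq (00:33:34) ran back to back and together took 01:09:37, which is more than half of the 7800-second (02:10:00) deployment timeout.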

Roman Podoliaka (rpodolyaka) wrote :

The diagnostic snapshot is attached.

Roman Podoliaka (rpodolyaka) wrote :

CI team, could you please check the logs of the corresponding Jenkins slaves to make sure we had enough resources available (disk / cpu / ram)?

Roman Vyalov (r0mikiam)
Changed in fuel:
milestone: none → 9.2
Changed in fuel:
status: New → Confirmed
Roman Vyalov (r0mikiam)
Changed in fuel:
assignee: Fuel CI (fuel-ci) → Pawel Brzozowski (pbrzozowski)
Pawel Brzozowski (pbrzozowski) wrote :

I am sorry, Roman, but because of this bug's low importance I only checked it just now, and the job results are gone.

Changed in fuel:
status: Confirmed → Won't Fix