fuel-agent deployment tests are failing

Bug #1571997 reported by Dmitry Kaigarodеsev
This bug affects 3 people
Affects: Fuel for OpenStack
Status: Confirmed
Importance: Critical
Assigned to: Kyrylo Romanenko

Bug Description

Random failures occur during the master.fuel-agent.pkgs.ubuntu.review_fuel_agent_ironic_deploy canary run:

<<<<<****************************************************************************************************>>>>>
2016-04-19 06:50:33,332 - ERROR decorators.py:126 -- Traceback (most recent call last):
  File "/home/jenkins/workspace/systest/master/fuelweb_test/helpers/decorators.py", line 120, in wrapper
    result = func(*args, **kwargs)
  File "/home/jenkins/workspace/systest/master/gates_tests/tests/test_review_in_fuel_agent.py", line 156, in gate_patch_fuel_agent
    ironic_conn.wait_for_vms(ironic_conn)
  File "/home/jenkins/workspace/systest/master/fuelweb_test/helpers/ironic_actions.py", line 99, in wait_for_vms
    timeout=60 * 15, timeout_msg='Server didn\'t became active')
  File "/home/jenkins/venv-nailgun-tests-2.9/local/lib/python2.7/site-packages/devops/helpers/helpers.py", line 100, in wait
    raise TimeoutError(timeout_msg)
TimeoutError: Server didn't became active

related run:
https://ci.fuel-infra.org/view/deployment%20tests/job/master.fuel-agent.pkgs.ubuntu.review_fuel_agent_ironic_deploy/4/console

2016-04-19 01:43:48,630 - ERROR decorators.py:126 -- Traceback (most recent call last):
  File "/home/jenkins/workspace/systest/master/fuelweb_test/helpers/decorators.py", line 120, in wrapper
    result = func(*args, **kwargs)
  File "/home/jenkins/workspace/systest/master/gates_tests/tests/test_review_in_fuel_agent.py", line 150, in gate_patch_fuel_agent
    self._create_os_resources(ironic_conn)
  File "/home/jenkins/workspace/systest/master/fuelweb_test/tests/test_ironic_base.py", line 136, in _create_os_resources
    ironic_conn.wait_for_ironic_hypervisors(ironic_conn, ironic_slaves)
  File "/home/jenkins/workspace/systest/master/fuelweb_test/helpers/ironic_actions.py", line 93, in wait_for_ironic_hypervisors
    timeout=60 * 5, timeout_msg='Failed to update hypervisor details')
  File "/home/jenkins/venv-nailgun-tests-2.9/local/lib/python2.7/site-packages/devops/helpers/helpers.py", line 100, in wait
    raise TimeoutError(timeout_msg)
TimeoutError: Failed to update hypervisor details

related runs:
https://ci.fuel-infra.org/view/deployment%20tests/job/master.fuel-agent.pkgs.ubuntu.review_fuel_agent_ironic_deploy/3/console
https://ci.fuel-infra.org/view/deployment%20tests/job/master.fuel-agent.pkgs.ubuntu.review_fuel_agent_ironic_deploy/2/console

On the mitaka branch there is one more failure:

2016-04-18 22:52:46,384 - ERROR decorators.py:126 -- Traceback (most recent call last):
  File "/home/jenkins/workspace/systest/master/fuelweb_test/helpers/decorators.py", line 120, in wrapper
    result = func(*args, **kwargs)
  File "/home/jenkins/workspace/systest/master/gates_tests/tests/test_review_in_fuel_agent.py", line 150, in gate_patch_fuel_agent
    self._create_os_resources(ironic_conn)
  File "/home/jenkins/workspace/systest/master/fuelweb_test/tests/test_ironic_base.py", line 136, in _create_os_resources
    ironic_conn.wait_for_ironic_hypervisors(ironic_conn, ironic_slaves)
  File "/home/jenkins/workspace/systest/master/fuelweb_test/helpers/ironic_actions.py", line 93, in wait_for_ironic_hypervisors
    timeout=60 * 5, timeout_msg='Failed to update hypervisor details')
  File "/home/jenkins/venv-nailgun-tests-2.9/local/lib/python2.7/site-packages/devops/helpers/helpers.py", line 100, in wait
    raise TimeoutError(timeout_msg)
TimeoutError: Failed to update hypervisor details

related run:
https://ci.fuel-infra.org/job/mitaka.fuel-agent.pkgs.ubuntu.review_fuel_agent_ironic_deploy/2/console

Marking this bug as critical since we have switched to this job and it is the only deployment test for the fuel-agent package.

description: updated
Changed in fuel:
assignee: Fuel QA Team (fuel-qa) → Artem Grechanichenko (agrechanichenko)
summary: - fuel-agent deplyment tests are failing
+ fuel-agent deployment tests are failing
Artem Hrechanychenko (agrechanichenko) wrote :

After investigating this traceback:
  File "/home/jenkins/workspace/systest/master/fuelweb_test/helpers/ironic_actions.py", line 93, in wait_for_ironic_hypervisors
    timeout=60 * 5, timeout_msg='Failed to update hypervisor details')
  File "/home/jenkins/venv-nailgun-tests-2.9/local/lib/python2.7/site-packages/devops/helpers/helpers.py", line 100, in wait
    raise TimeoutError(timeout_msg)

and the method that checks the ironic hypervisors on the nodes, I concluded that this method fails on heavily loaded CI slaves.

A custom run for stable/mitaka passed: https://custom-ci.infra.mirantis.net/view/9.0/job/9.0.custom.packages_test.ubuntu/34/consoleFull

I think we need to increase the timeout for waiting on the ironic hypervisor from 5 minutes to 15 minutes.
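
For readers unfamiliar with this kind of helper, here is a minimal, self-contained sketch of a polling wait with a configurable timeout. It is not the `devops.helpers.helpers.wait` implementation seen in the traceback; `wait_until`, `WaitTimeoutError`, and `hypervisors_ready` are hypothetical names used only for illustration, and the sketch targets Python 3 rather than the Python 2.7 environment of the CI job.

```python
import time


class WaitTimeoutError(Exception):
    """Raised when the predicate does not become true in time (hypothetical)."""


def wait_until(predicate, timeout, interval=1.0, timeout_msg="Timed out"):
    """Poll predicate() until it is truthy or `timeout` seconds have passed.

    Returns the number of seconds spent waiting.
    """
    start = time.monotonic()
    while True:
        if predicate():
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            raise WaitTimeoutError(timeout_msg)
        time.sleep(interval)


# Illustrative usage: a fake check that succeeds on the third poll.
polls = {"count": 0}


def hypervisors_ready():
    polls["count"] += 1
    return polls["count"] >= 3


waited = wait_until(
    hypervisors_ready,
    timeout=60 * 15,  # the 15-minute value proposed in the comment above
    interval=0.01,
    timeout_msg="Failed to update hypervisor details",
)
```

Raising the timeout only changes how long the loop keeps polling; it does not make the underlying operation faster, so it helps only if the hypervisors do eventually report in on a loaded slave.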

Changed in fuel:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-qa (master)

Change abandoned by Artem Grechanichenko (<email address hidden>) on branch: master
Review: https://review.openstack.org/307620

Artem Hrechanychenko (agrechanichenko) wrote :

Re-assigning to K. Romanenko.
Reason:

The test fails by timeout at random, and always on the "Enroll Ironic nodes" step.

If the load on the slave is not critical, the test passes.

If the load is critical, the test fails by timeout. So we need to investigate which timeout value to set in the ironic actions helper.
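
One possible way to approach that investigation, sketched under assumptions: record how long the wait takes on runs that eventually succeed, then derive a timeout from the worst case plus a safety margin. The function name `suggest_timeout`, the safety factor, and the sample durations below are all made up for illustration; they are not measurements from this CI.

```python
import math


def suggest_timeout(observed_seconds, safety_factor=1.5, round_to=60):
    """Suggest a timeout (in seconds) from durations of successful waits.

    Takes the slowest observed duration, multiplies it by `safety_factor`,
    and rounds the result up to a multiple of `round_to` seconds.
    """
    if not observed_seconds:
        raise ValueError("need at least one observed duration")
    worst = max(observed_seconds)
    return int(math.ceil(worst * safety_factor / round_to) * round_to)


# Invented sample durations (seconds), not real CI data.
sample_durations = [120, 310, 420]
suggested = suggest_timeout(sample_durations)
```

With these invented samples the slowest wait is 420 seconds, so the suggestion is 420 * 1.5 = 630 seconds rounded up to 660.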

Changed in fuel:
assignee: Artem Grechanichenko (agrechanichenko) → Kyrylo Romanenko (kromanenko)
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-qa (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/307817

Vasyl Saienko (vsaienko) wrote :

Let's create a simple environment configuration:

1 Controller with ceph
1 Ironic node.

Also, the job should be set to non-voting mode first, until it is stabilized and tested.
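
As a rough illustration of the proposed layout, the two nodes could be described as a node-to-roles mapping. This is only a sketch: the node names and the `roles_summary` helper are hypothetical, and the exact structure the test uses is not shown in this report.

```python
from collections import Counter

# Hypothetical node names; the roles mirror the proposal above.
SIMPLE_ENV_NODES = {
    "slave-01": ["controller", "ceph-osd"],  # 1 controller with ceph
    "slave-02": ["ironic"],                  # 1 ironic node
}


def roles_summary(nodes):
    """Count how many nodes carry each role."""
    return Counter(role for roles in nodes.values() for role in roles)


summary = roles_summary(SIMPLE_ENV_NODES)
```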

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-qa (stable/mitaka)

Related fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/308253

OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-qa (master)

Reviewed: https://review.openstack.org/307817
Committed: https://git.openstack.org/cgit/openstack/fuel-qa/commit/?id=08afbdab3e1d2e685493e2586e57c4d25324edb8
Submitter: Jenkins
Branch: master

commit 08afbdab3e1d2e685493e2586e57c4d25324edb8
Author: Artem Grechanichenko <email address hidden>
Date: Tue Apr 19 16:28:10 2016 +0300

    Temporary disable ironic actions steps in test_review_in_fuel_agent

    Due to random failures on Ci slaves with high load
    temporary disable ironic actions
    After resolving failures on CI need to revert changes

    Change-Id: Ib56e27f016dec8776e8b39fd668571ac87693139
    Related-Bug: #1571997

OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-qa (stable/mitaka)

Reviewed: https://review.openstack.org/308253
Committed: https://git.openstack.org/cgit/openstack/fuel-qa/commit/?id=1adc1e757a54790dfc81a39fa848e7399cbc152a
Submitter: Jenkins
Branch: stable/mitaka

commit 1adc1e757a54790dfc81a39fa848e7399cbc152a
Author: Artem Grechanichenko <email address hidden>
Date: Tue Apr 19 16:28:10 2016 +0300

    Temporary disable ironic actions steps in test_review_in_fuel_agent

    Due to random failures on Ci slaves with high load
    temporary disable ironic actions
    After resolving failures on CI need to revert changes

    Change-Id: Ib56e27f016dec8776e8b39fd668571ac87693139
    Related-Bug: #1571997

tags: added: in-stable-mitaka
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-qa (master)

Fix proposed to branch: master
Review: https://review.openstack.org/308319

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-qa (master)

Reviewed: https://review.openstack.org/308319
Committed: https://git.openstack.org/cgit/openstack/fuel-qa/commit/?id=ca9b08cb7d39a79201a982a3dfba405382ff27d0
Submitter: Jenkins
Branch: master

commit ca9b08cb7d39a79201a982a3dfba405382ff27d0
Author: Kyrylo Romanenko <email address hidden>
Date: Wed Apr 20 16:11:35 2016 +0300

    Simplify environment configuration for review_fuel_agent_ironic_deploy test

    Use less resource-consuming cluster configuration.
    Fix _wait_for_ironic_hypervisor to work without computes.

    Change-Id: I2abb01faca3e5fbfa1ef5d548432a7be65ecaae1
    Closes-Bug: #1571997

Changed in fuel:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-qa (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/309460

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-qa (stable/mitaka)

Reviewed: https://review.openstack.org/309460
Committed: https://git.openstack.org/cgit/openstack/fuel-qa/commit/?id=0cbd590e78a3fb62146d29eddef6a15f844060ba
Submitter: Jenkins
Branch: stable/mitaka

commit 0cbd590e78a3fb62146d29eddef6a15f844060ba
Author: Kyrylo Romanenko <email address hidden>
Date: Wed Apr 20 16:11:35 2016 +0300

    Simplify environment configuration for review_fuel_agent_ironic_deploy test

    Use less resource-consuming cluster configuration.
    Fix _wait_for_ironic_hypervisor to work without computes.

    Change-Id: I2abb01faca3e5fbfa1ef5d548432a7be65ecaae1
    Closes-Bug: #1571997
    (cherry picked from commit ca9b08cb7d39a79201a982a3dfba405382ff27d0)

Artem Hrechanychenko (agrechanichenko) wrote :
Changed in fuel:
status: Fix Committed → Confirmed
Changed in fuel:
status: Confirmed → In Progress
Alexander Gordeev (a-gordeev) wrote :

folks,

the reason for "TimeoutError: Failed to update hypervisor details" is directly related to https://bugs.launchpad.net/fuel/+bug/1529240

2016-04-27T15:46:38.142502+00:00 err: 2016-04-27 15:46:38.140 29802 ERROR ironic_fa_deploy.modules.lib_virt [req-5a7fa851-8e17-4d62-9410-dabe68ef8be4 admin - - - -] Failed to get libvirt connection node 0588b413-e177-4b68-acca-28c1f1cf7a6c

The libvirt port wasn't opened for some reason.
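
A quick way to confirm this kind of problem from a test host is to try a plain TCP connection to the libvirt port. The sketch below assumes libvirt is expected to listen on its default unencrypted TCP port, 16509; whether the CI slaves use that port is an assumption, and `port_is_open` and the host name in the usage comment are hypothetical.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


# Example call (hypothetical host; 16509 is assumed to be the libvirt TCP port):
#     port_is_open("ci-slave.example", 16509)
```

A False result only shows that nothing accepted the connection from where the check ran; it does not distinguish between a daemon that is not listening and a firewall that drops the traffic.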

Kyrylo Romanenko (kromanenko) wrote :

Filed an issue to open the libvirt socket on the CI slaves where the current job runs:
https://bugs.launchpad.net/fuel/+bug/1576243

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-qa (master)

Reviewed: https://review.openstack.org/310532
Committed: https://git.openstack.org/cgit/openstack/fuel-qa/commit/?id=c61f4efff6984feac0a560c48d0ca8d9456d9142
Submitter: Jenkins
Branch: master

commit c61f4efff6984feac0a560c48d0ca8d9456d9142
Author: Kyrylo Romanenko <email address hidden>
Date: Wed Apr 27 19:10:48 2016 +0300

    Increase _wait_for_ironic_hypervisor timeout

    Fix test for job review_fuel_agent_ironic_deploy by
    increasing timeout for ironic hypervisors.
    Increase timeout for vm boot.

    Change-Id: I58fe6964e111f920a2d52d19074c23b8c777b63b
    Closes-Bug: #1571997

Changed in fuel:
status: In Progress → Fix Committed
Alexander Gordeev (a-gordeev) wrote :

it's still throwing 'TimeoutError: Server didn't became active'

e.g.

https://ci.fuel-infra.org/job/master.fuel-agent.pkgs.ubuntu.review_fuel_agent_ironic_deploy/92/console

Changed in fuel:
status: Fix Committed → Confirmed
Tatyanka (tatyana-leontovich) wrote :

Marked as a duplicate: the latest failure reason was investigated under LP 1576881, so we will track further updates in that LP report.
