Deployment fails with timeout on Cluster::Vrouter_ocf task

Bug #1582599 reported by Volodymyr Shypyguzov
This bug affects 2 people
Affects: Fuel for OpenStack
Status: Confirmed
Importance: High
Assigned to: Vladimir Sharshov
Milestone: 9.0

Bug Description

Steps to reproduce:
        1. Create cluster in HA mode with 1 controller
        2. Add 1 node with controller role
        3. Add 1 node with compute role
        4. Add 1 node with cinder role
        5. Verify network
        6. Provision nodes
        7. Make a test file on every node
        8. Deploy nodes
        9. Stop deployment
        10. Verify nodes are not reset to bootstrap image
        11. Re-deploy cluster << Fail
        12. Verify network
        13. Run OSTF
Expected result:
Cluster successfully redeployed
Actual result:
Deployment fails with the following error: Deployment has failed. All nodes are finished. Failed tasks: Task[primary-rabbitmq/3], Task[cluster-vrouter/3]

In puppet logs:
node-3.test.domain.local 2016-05-17T00:58:09.110032 err: (/Stage[main]/Cluster::Vrouter_ocf/Service[p_vrouter]/ensure) change from stopped to running failed: Execution timeout after 1800 seconds!
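To find similar failures in a larger puppet log, a small filter like the one below may help. It is only a sketch: the regular expression is an assumption derived from the single log line above, and `find_timeouts` is a hypothetical helper name, not part of Fuel or Puppet.

```python
import re

# Assumed line layout: "<node> <timestamp> err: (<resource>) ... Execution timeout after N seconds"
TIMEOUT_RE = re.compile(
    r"^(?P<node>\S+) (?P<ts>\S+) err: \((?P<resource>[^)]*)\) "
    r".*Execution timeout after (?P<seconds>\d+) seconds"
)


def find_timeouts(lines):
    """Return (node, resource, seconds) for every puppet timeout error line."""
    hits = []
    for line in lines:
        m = TIMEOUT_RE.match(line)
        if m:
            hits.append((m.group("node"), m.group("resource"), int(m.group("seconds"))))
    return hits


# The log line from this report, split only for readability.
sample = (
    "node-3.test.domain.local 2016-05-17T00:58:09.110032 err: "
    "(/Stage[main]/Cluster::Vrouter_ocf/Service[p_vrouter]/ensure) "
    "change from stopped to running failed: Execution timeout after 1800 seconds!"
)

for node, resource, seconds in find_timeouts([sample]):
    print(node, resource, seconds)
```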

Volodymyr Shypyguzov (vshypyguzov) wrote :
Volodymyr Shypyguzov (vshypyguzov) wrote :
tags: added: swarm-blocker
Vladimir Sharshov (vsharshov) wrote :

After investigation I do not see any problem that could have been caused by stopping the deployment. It was stopped right after it started, the stop completed without any problem, and the deployment then ran again and proceeded without any problem from 00:24:42 to 01:27:51 (when cluster-vrouter failed).

This test did not fail on ISO #368: https://product-ci.infra.mirantis.net/job/9.0.system_test.ubuntu.bvt_ubuntu_bootstrap/111/testReport/(root)/deploy_stop_on_deploying_ubuntu_bootstrap/

So I marked this bug as Incomplete.

Just in case, I am trying to reproduce this problem locally (the first run of such a cluster without the stop succeeded; I am now running it as described in the test).

Changed in fuel:
status: New → Incomplete
importance: Undecided → High
assignee: nobody → Vladimir Sharshov (vsharshov)
milestone: none → 9.0
Changed in fuel:
assignee: Vladimir Sharshov (vsharshov) → Fuel QA telco (fuel-qa-telco)
Nastya Urlapova (aurlapova) wrote :
Dmitry Kalashnik (dkalashnik) wrote :

The next run is failing with the same error. The deploy-stop case is actually valid, so there is nothing to do on the QA side.

Changed in fuel:
status: Incomplete → Confirmed
Changed in fuel:
assignee: Fuel QA telco (fuel-qa-telco) → nobody
Artem Hrechanychenko (agrechanichenko) wrote :
Changed in fuel:
assignee: nobody → Vladimir Sharshov (vsharshov)
Aleksandr Didenko (adidenko) wrote :

The root cause is a missing fuel_pkgs/fuel_pkgs.pp task.

On the primary controller the test stopped the deployment at fuel_pkgs/setup_repositories.pp and then continued the deployment from the roles/allocate_hugepages.pp task, so the fuel_pkgs/fuel_pkgs.pp task was never run. This is why the corosync/pacemaker resources were not able to start:

2016-05-17T00:27:54.732628+00:00 warning: warning: Cannot execute '/usr/lib/ocf/resource.d/fuel/ns_vrouter': No such file or directory (2)
2016-05-17T00:27:54.732628+00:00 err: error: Failed to retrieve meta-data for ocf:fuel:ns_vrouter
2016-05-17T00:27:54.732628+00:00 warning: warning: No metadata found for ns_vrouter::ocf:fuel: Input/output error (-5)
2016-05-17T00:27:54.732628+00:00 err: error: No metadata for fuel::ocf:ns_vrouter
2016-05-17T00:27:54.732628+00:00 err: error: Operation p_vrouter_monitor_0 (node=node-3.test.domain.local, call=6, status=7, cib-update=39, confirmed=true) Not installed
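The log above points at a missing resource agent script. A quick way to confirm this on an affected node could look like the sketch below. The path is taken from the log; `agent_status` is a hypothetical helper, and the three result strings are labels chosen for this example only.

```python
import os

# Path copied from the pacemaker warning above.
AGENT_PATH = "/usr/lib/ocf/resource.d/fuel/ns_vrouter"


def agent_status(path=AGENT_PATH):
    """Classify a resource agent script as 'missing', 'not executable' or 'ok'."""
    if not os.path.isfile(path):
        return "missing"
    if not os.access(path, os.X_OK):
        return "not executable"
    return "ok"


print(f"{AGENT_PATH}: {agent_status()}")
```

On a node where the fuel_pkgs/fuel_pkgs.pp task was skipped, this would be expected to report the script as missing, which matches the "No such file or directory" warning.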

Vladimir Sharshov (vsharshov) wrote :

After investigation, it looks like the core reason for the missing fuel_pkgs/fuel_pkgs.pp task is the "last successful transaction per task" lookup, which does not evaluate the status of deployment tasks per node.

So I am marking it as a duplicate of https://bugs.launchpad.net/fuel/+bug/1581015.
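The difference between a per-task and a per-node status lookup can be illustrated with a toy model. This is not Fuel code: the history records, the node name node-1, the status strings and both helper functions are hypothetical, and the sketch only shows one way a per-task lookup could cause a task to be skipped on a node where it never ran.

```python
# Hypothetical history of (task, node, status) records from earlier transactions.
HISTORY = [
    ("fuel_pkgs/fuel_pkgs.pp", "node-1", "successful"),  # assumed: succeeded elsewhere
    ("fuel_pkgs/fuel_pkgs.pp", "node-3", "stopped"),     # assumed: never finished here
]


def done_per_task(history, task):
    """Per-task view: the task counts as done if it succeeded on any node."""
    return any(t == task and s == "successful" for t, _node, s in history)


def done_per_node(history, task, node):
    """Per-node view: the task counts as done only if it succeeded on this node."""
    return any(
        t == task and n == node and s == "successful" for t, n, s in history
    )


task = "fuel_pkgs/fuel_pkgs.pp"
print(done_per_task(HISTORY, task))            # per-task view would skip the task
print(done_per_node(HISTORY, task, "node-3"))  # per-node view would rerun it on node-3
```

In this model the per-task view reports the task as done and would skip it on redeploy, while the per-node view reports that it still has to run on node-3.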
