[9.0] Deployment failed on a heat-primary task

Bug #1588329 reported by Ilya Tyaptin
This bug affects 2 people
Affects             Status         Importance  Assigned to      Milestone
Fuel for OpenStack  Invalid        High        Alexey Deryugin
Mitaka              Fix Released   High        Alexey Deryugin
Newton              Invalid        High        Alexey Deryugin

Bug Description

Detailed bug description:
Job https://packaging-ci.infra.mirantis.net/job/9.0-pkg-systest-ubuntu/1045/

This job failed on task primary-heat/1:
Deployment has failed. All nodes are finished. Failed tasks: Task[primary-heat/1] Stopping the deployment process!

It looks like the heat-engine package was not installed:

heat-all.log:
<27>Jun 2 06:26:20 node-1 ocf-heat-engine: ERROR: Setup problem: couldn't find command: /usr/bin/heat-engine

pacemaker.log shows many errors while the heat-engine resource was being restarted:

https://paste.mirantis.net/show/0nvUHjhkEZQoEtHWbch6/

puppet-apply.log contains the following keystone errors:
2016-06-02T06:28:47.558938+00:00 info: (/Stage[main]/Heat::Keystone::Domain/Keystone_user[heat_admin::heat]) Starting to evaluate the resource
2016-06-02T06:28:47.558938+00:00 debug: Executing '/usr/bin/openstack domain list --quiet --format csv'
2016-06-02T06:28:49.683138+00:00 debug: Executing '/usr/bin/openstack user show --format shell heat_admin --domain 64a90f9e69fb423da2e7d32bb489bee5'
2016-06-02T06:28:52.181347+00:00 err: (/Stage[main]/Heat::Keystone::Domain/Keystone_user[heat_admin::heat]) Could not evaluate: Execution of '/usr/bin/openstack user show --format shell heat_admin --domain 64a90f9e69fb423da2e7d32bb489bee5' returned 1: Discovering versions from the identity service failed when creating the password plugin. Attempting to determine version from URL.
2016-06-02T06:28:52.181347+00:00 err: (/Stage[main]/Heat::Keystone::Domain/Keystone_user[heat_admin::heat]) Could not determine a suitable URL for the plugin

2016-06-02T06:28:52.188096+00:00 notice: (/Stage[main]/Heat::Keystone::Domain/Keystone_user_role[heat_admin::heat@::heat]) Dependency Keystone_user[heat_admin::heat] has failures: true
2016-06-02T06:28:52.189789+00:00 warning: (/Stage[main]/Heat::Keystone::Domain/Keystone_user_role[heat_admin::heat@::heat]) Skipping because of failed dependencies
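
To make the failure chain in the excerpt above easier to see, here is a minimal sketch that pulls the failed and skipped resources out of puppet-apply.log lines. The line shape (`<timestamp> <level>: (<resource path>) <message>`) is assumed from the excerpt only, and the names `LINE_RE` and `summarize` are hypothetical helpers, not part of Fuel or Puppet.

```python
import re

# Assumed line shape, inferred from the excerpt above:
#   <timestamp> <level>: (<resource path>) <message>
# Lines without a parenthesised resource (e.g. "debug: Executing ...") are ignored.
LINE_RE = re.compile(r"^(\S+) (\w+): \((.+?)\) (.*)$")

SKIP_TEXT = "Skipping because of failed dependencies"


def summarize(lines):
    """Return (failed, skipped) resource names, in order of first appearance."""
    failed, skipped = [], []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        _timestamp, level, path, message = m.groups()
        # Keep only the last path segment, e.g. "Keystone_user[heat_admin::heat]".
        name = path.rsplit("/", 1)[-1]
        if level == "err":
            if name not in failed:
                failed.append(name)
        elif level == "warning" and message.startswith(SKIP_TEXT):
            if name not in skipped:
                skipped.append(name)
    return failed, skipped


# Lines copied from the excerpt above.
SAMPLE = [
    "2016-06-02T06:28:47.558938+00:00 info: (/Stage[main]/Heat::Keystone::Domain/Keystone_user[heat_admin::heat]) Starting to evaluate the resource",
    "2016-06-02T06:28:47.558938+00:00 debug: Executing '/usr/bin/openstack domain list --quiet --format csv'",
    "2016-06-02T06:28:52.181347+00:00 err: (/Stage[main]/Heat::Keystone::Domain/Keystone_user[heat_admin::heat]) Could not determine a suitable URL for the plugin",
    "2016-06-02T06:28:52.188096+00:00 notice: (/Stage[main]/Heat::Keystone::Domain/Keystone_user_role[heat_admin::heat@::heat]) Dependency Keystone_user[heat_admin::heat] has failures: true",
    "2016-06-02T06:28:52.189789+00:00 warning: (/Stage[main]/Heat::Keystone::Domain/Keystone_user_role[heat_admin::heat@::heat]) Skipping because of failed dependencies",
]

failed, skipped = summarize(SAMPLE)
print("failed:", failed)
print("skipped:", skipped)
```

On these lines the sketch reports the `Keystone_user` resource as failed and the dependent `Keystone_user_role` resource as skipped, which matches the order of events in the log.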

Steps to reproduce:
N/A

Expected results:
Job passes

Actual results:
Job fails

Reproducibility:
N/A

Ilya Tyaptin (ityaptin)
tags: added: gating-failure
Ilya Tyaptin (ityaptin)
description: updated
Ilya Kutukov (ikutukov)
Changed in fuel:
milestone: none → 9.0
assignee: nobody → MOS Packaging Team (mos-packaging)
importance: Undecided → High
status: New → Confirmed
tags: added: area-ci
Ilya Kutukov (ikutukov) wrote :
Changed in fuel:
status: Confirmed → Incomplete
assignee: MOS Packaging Team (mos-packaging) → nobody
Ilya Kutukov (ikutukov)
Changed in fuel:
assignee: nobody → Fuel Toolbox (fuel-toolbox)
Changed in fuel:
assignee: Fuel Toolbox (fuel-toolbox) → Fuel Sustaining (fuel-sustaining-team)
status: Incomplete → Confirmed
tags: added: area-library
removed: area-ci
Dmitry Pyzhov (dpyzhov)
tags: added: tech-debt
Maksim Malchuk (mmalchuk) wrote :
Changed in fuel:
status: Confirmed → Incomplete
Dmitry Pyzhov (dpyzhov)
Changed in fuel:
milestone: 9.0 → 10.0
Serg Melikyan (smelikyan) wrote :
Alex Schultz (alex-schultz) wrote :
Changed in fuel:
status: Incomplete → Confirmed
Changed in fuel:
status: Confirmed → Triaged
tags: added: tricky
Alex Schultz (alex-schultz) wrote :

Actually, the latest error for this points to timeouts from keystone while trying to query/add users. So it is not exactly the same error as before, but the heat task did fail while trying to add the user.

Alex Schultz (alex-schultz) wrote :

To follow up: the haproxy logs in the last snapshot show that the keystone token requests took excessively long, which caused the puppet task to time out because we have a 20-second timeout.

This call took ~18 seconds.

2016-06-16T01:00:08.810967+00:00 info: 10.109.1.2:60911 [16/Jun/2016:00:59:49.987] keystone-1 keystone-1/node-3 0/0/3/18867/18870 201 8194 - - ---- 57/0/0/0/0 0/0 "POST /v3/auth/tokens HTTP/1.1"

This call took ~36 seconds.
2016-06-16T01:00:49.798832+00:00 info: 10.109.1.2:32949 [16/Jun/2016:01:00:13.134] keystone-1 keystone-1/node-2 4/0/6/36694/36704 201 8194 - - ---- 55/1/1/1/0 0/0 "POST /v3/auth/tokens HTTP/1.1"

This call took ~13 seconds.
2016-06-16T01:00:49.908022+00:00 info: 10.109.1.2:33215 [16/Jun/2016:01:00:36.158] keystone-1 keystone-1/node-2 0/0/1/13789/13793 201 8194 - - ---- 55/0/0/0/0 0/0 "POST /v3/auth/tokens HTTP/1.1"

Because all the calls took excessively long at that time, the user creation failed. This probably points to the keystone (or CI) servers being overloaded during the heat task. As a possible workaround, we could increase the user-creation timeout from 20 seconds to 60 seconds.
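
For anyone repeating this analysis on another snapshot, the following is a rough sketch of how the durations can be read from the haproxy lines. It assumes the standard HAProxy 1.x HTTP log format, where the five slash-separated timers before the status code are Tq/Tw/Tc/Tr/Tt in milliseconds and the last one (Tt) is the total session time. The 20-second and 60-second values come from this thread; the names `TIMERS_RE`, `total_time_ms`, and `exceeds` are hypothetical.

```python
import re

# Five slash-separated timers followed by a 3-digit HTTP status code,
# e.g. " 0/0/3/18867/18870 201 ". Assumed to be Tq/Tw/Tc/Tr/Tt in milliseconds.
# Requiring the status code avoids matching the later connection counters
# such as "57/0/0/0/0 0/0".
TIMERS_RE = re.compile(
    r" ([+-]?\d+)/([+-]?\d+)/([+-]?\d+)/([+-]?\d+)/([+-]?\d+) (\d{3}) "
)


def total_time_ms(line):
    """Return the total session time (last timer) in ms, or None if not found."""
    m = TIMERS_RE.search(line)
    if m is None:
        return None
    return int(m.group(5))


def exceeds(line, timeout_s):
    """True if the total time logged by haproxy is above timeout_s seconds."""
    total = total_time_ms(line)
    return total is not None and total > timeout_s * 1000


# The three lines quoted above.
LINES = [
    '2016-06-16T01:00:08.810967+00:00 info: 10.109.1.2:60911 [16/Jun/2016:00:59:49.987] keystone-1 keystone-1/node-3 0/0/3/18867/18870 201 8194 - - ---- 57/0/0/0/0 0/0 "POST /v3/auth/tokens HTTP/1.1"',
    '2016-06-16T01:00:49.798832+00:00 info: 10.109.1.2:32949 [16/Jun/2016:01:00:13.134] keystone-1 keystone-1/node-2 4/0/6/36694/36704 201 8194 - - ---- 55/1/1/1/0 0/0 "POST /v3/auth/tokens HTTP/1.1"',
    '2016-06-16T01:00:49.908022+00:00 info: 10.109.1.2:33215 [16/Jun/2016:01:00:36.158] keystone-1 keystone-1/node-2 0/0/1/13789/13793 201 8194 - - ---- 55/0/0/0/0 0/0 "POST /v3/auth/tokens HTTP/1.1"',
]

for line in LINES:
    print(total_time_ms(line), exceeds(line, 20), exceeds(line, 60))
```

The timers only cover what haproxy itself measured, so the time seen by the puppet provider can be longer than the logged total. None of the three logged totals is above 60000 ms, which is consistent with the proposed 60-second workaround.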

Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → MOS Keystone (mos-keystone)
tags: removed: area-library
tags: added: area-mos
tags: added: area-keystone
summary: - [9.0] Packaging CI job failed on a heat-primary task
+ [9.0] Deployment failed on a heat-primary task
Alexander Makarov (amakarov) wrote :

Please increase the timeout of the user creation from 20 seconds to 60 seconds.

Changed in fuel:
assignee: MOS Keystone (mos-keystone) → MOS Puppet Team (mos-puppet)
Changed in fuel:
assignee: MOS Puppet Team (mos-puppet) → Max Yatsenko (myatsenko)
Changed in fuel:
assignee: Max Yatsenko (myatsenko) → Alexey Deryugin (velovec)
tags: added: 10.0-reviewed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/344166

Alexey Deryugin (velovec) wrote :

Doesn't affect 10.0

OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-library (stable/mitaka)

Change abandoned by Alexey Deryugin (<email address hidden>) on branch: stable/mitaka
Review: https://review.openstack.org/344166
Reason: Abandoned in prior to: https://review.openstack.org/#/c/345365/

Alexey Deryugin (velovec) wrote :

stable/mitaka: fix on review: https://review.openstack.org/#/c/345365/

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/mitaka)

Reviewed: https://review.openstack.org/345365
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=92dc5a6a5f7a22f4bf15de1ecd0987012563bdad
Submitter: Jenkins
Branch: stable/mitaka

commit 92dc5a6a5f7a22f4bf15de1ecd0987012563bdad
Author: Alexey Deryugin <email address hidden>
Date: Thu Jul 21 11:33:25 2016 +0300

    Bump puppet modules to 8.2.0 release

    Change-Id: I2ecdf99b94231a4e639588e79ae683f1ed8dc893
    Closes-Bug: #1588329

tags: added: on-verification
Dmitry Belyaninov (dbelyaninov) wrote :

The job https://packaging-ci.infra.mirantis.net/job/9.0-pkg-systest-ubuntu/ is green in runs 1405-1409.
Moving to Fix Released.
