primary-keystone task failed

Bug #1577839 reported by Vladyslav Drok on 2016-05-03
This bug affects 2 people
Affects              Importance  Assigned to
Fuel for OpenStack   Critical    Alexander Tsamutali
Mitaka               Critical    Alexander Tsamutali

Bug Description

The root cause is incorrect versions of the dependencies of the MOS Keystone package (please see the comments).

Master systest run https://packaging-ci.infra.mirantis.net/job/master-pkg-systest-ubuntu/691/console failed with the following traceback:

FAIL: Deploy ceph HA with RadosGW for objects
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jenkins/venv-nailgun-tests-2.9/local/lib/python2.7/site-packages/proboscis/case.py", line 296, in testng_method_mistake_capture_func
    compatability.capture_type_error(s_func)
  File "/home/jenkins/venv-nailgun-tests-2.9/local/lib/python2.7/site-packages/proboscis/compatability/exceptions_2_6.py", line 27, in capture_type_error
    func()
  File "/home/jenkins/venv-nailgun-tests-2.9/local/lib/python2.7/site-packages/proboscis/case.py", line 350, in func
    func(test_case.state.get_state())
  File "/home/jenkins/workspace/master-pkg-systest-ubuntu/fuel-qa/fuelweb_test/helpers/decorators.py", line 120, in wrapper
    result = func(*args, **kwargs)
  File "/home/jenkins/workspace/master-pkg-systest-ubuntu/fuel-qa/fuelweb_test/tests/test_ceph.py", line 502, in ceph_rados_gw
    self.fuel_web.deploy_cluster_wait(cluster_id)
  File "/home/jenkins/workspace/master-pkg-systest-ubuntu/fuel-qa/fuelweb_test/helpers/decorators.py", line 455, in wrapper
    result = func(*args, **kwargs)
  File "/home/jenkins/workspace/master-pkg-systest-ubuntu/fuel-qa/fuelweb_test/helpers/decorators.py", line 440, in wrapper
    result = func(*args, **kwargs)
  File "/home/jenkins/workspace/master-pkg-systest-ubuntu/fuel-qa/fuelweb_test/helpers/decorators.py", line 491, in wrapper
    return func(*args, **kwargs)
  File "/home/jenkins/workspace/master-pkg-systest-ubuntu/fuel-qa/fuelweb_test/helpers/decorators.py", line 498, in wrapper
    result = func(*args, **kwargs)
  File "/home/jenkins/workspace/master-pkg-systest-ubuntu/fuel-qa/fuelweb_test/helpers/decorators.py", line 382, in wrapper
    return func(*args, **kwargs)
  File "/home/jenkins/workspace/master-pkg-systest-ubuntu/fuel-qa/fuelweb_test/models/fuel_web_client.py", line 827, in deploy_cluster_wait
    self.assert_task_success(task, interval=interval, timeout=timeout)
  File "/home/jenkins/workspace/master-pkg-systest-ubuntu/fuel-qa/fuelweb_test/__init__.py", line 59, in wrapped
    result = func(*args, **kwargs)
  File "/home/jenkins/workspace/master-pkg-systest-ubuntu/fuel-qa/fuelweb_test/models/fuel_web_client.py", line 321, in assert_task_success
    task["name"], task['status'], 'ready', _message(task)
AssertionError: Task 'deploy' has incorrect status. error != ready, 'Deployment has failed. All nodes are finished. Failed tasks: Task[primary-keystone/1] Stopping the deployment process!'

From the astute.log:

2016-04-29 02:09:45 DEBUG [30241] Node[1]: Node 1: task primary-keystone, task status running
2016-04-29 02:09:45 WARNING [30241] Puppet agent 1 didn't respond within the allotted time
2016-04-29 02:09:45 DEBUG [30241] Task time summary: primary-keystone with status failed on node 1 took 01:00:00
2016-04-29 02:09:45 DEBUG [30241] Node[1]: Decreasing node concurrency to: 0
2016-04-29 02:09:45 DEBUG [30241] Graph[1]: All tasks are finished
2016-04-29 02:09:45 DEBUG [30241] Cluster[]: Process node: Node[3]
2016-04-29 02:09:45 INFO [30241] Casting message to Nailgun:
{"method"=>"deploy_resp",
 "args"=>
  {"task_uuid"=>"47d2f738-0240-4f9e-8cde-a4c1fa7605cf",
   "nodes"=>
    [{"uid"=>"1",
      "status"=>"error",
      "progress"=>100,
      "deployment_graph_task_name"=>"primary-keystone",
      "task_status"=>"error",
      "custom"=>
       ...
           "failed_resources"=>
           "Haproxy_backend_status[keystone-admin],Haproxy_backend_status[keystone-public]",
          "failed"=>2,
       ...

node-1 haproxy.log:

2016-04-29T00:57:08.281784+00:00 alert: Server keystone-1/node-1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
2016-04-29T00:57:08.281825+00:00 alert: Server keystone-1/node-1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
2016-04-29T00:57:08.323573+00:00 alert: Server keystone-1/node-2 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
2016-04-29T00:57:08.323755+00:00 alert: Server keystone-1/node-2 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
2016-04-29T00:57:08.367596+00:00 alert: Server keystone-1/node-5 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 1ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
2016-04-29T00:57:08.367596+00:00 emerg: proxy keystone-1 has no server available!
2016-04-29T00:57:08.367909+00:00 alert: Server keystone-1/node-5 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 1ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
2016-04-29T00:57:08.367909+00:00 emerg: proxy keystone-1 has no server available!
2016-04-29T00:57:08.407086+00:00 alert: Server keystone-2/node-1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
2016-04-29T00:57:08.407354+00:00 alert: Server keystone-2/node-1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
2016-04-29T00:57:08.451365+00:00 alert: Server keystone-2/node-2 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 4ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
2016-04-29T00:57:08.451466+00:00 alert: Server keystone-2/node-2 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 4ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
2016-04-29T00:57:08.492305+00:00 alert: Server keystone-2/node-5 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 1ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
2016-04-29T00:57:08.492305+00:00 emerg: proxy keystone-2 has no server available!
2016-04-29T00:57:08.492429+00:00 alert: Server keystone-2/node-5 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 1ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
2016-04-29T00:57:08.492429+00:00 emerg: proxy keystone-2 has no server available!

node-1 puppet-apply.log:

2016-04-29T01:31:30.352294+00:00 err: Timeout waiting for HAProxy backend: 'keystone-2' status to become: 'up' after 1200 seconds!
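
For triage, the haproxy.log excerpt above can be condensed into "which backends lost which servers". The following is a minimal sketch, not part of Fuel or fuel-qa: the function name and the regular expressions are assumptions derived only from the log lines quoted in this description.

```python
import re
from collections import defaultdict

# Patterns modelled on the alert/emerg lines quoted in this bug description.
DOWN_RE = re.compile(
    r'alert: Server (?P<backend>[\w.-]+)/(?P<server>[\w.-]+) is DOWN, '
    r'reason: (?P<reason>[^,]+), info: "(?P<info>[^"]*)"'
)
NO_SERVER_RE = re.compile(
    r'emerg: proxy (?P<backend>[\w.-]+) has no server available!'
)


def summarize_haproxy_log(lines):
    """Return (down, dead).

    down maps each backend to a sorted list of (server, info) pairs seen DOWN;
    dead is a sorted list of backends reported with no server available.
    Duplicate log lines are collapsed.
    """
    down = defaultdict(set)
    dead = set()
    for line in lines:
        match = DOWN_RE.search(line)
        if match:
            down[match.group("backend")].add(
                (match.group("server"), match.group("info"))
            )
            continue
        match = NO_SERVER_RE.search(line)
        if match:
            dead.add(match.group("backend"))
    return {name: sorted(servers) for name, servers in down.items()}, sorted(dead)
```

Run against the excerpt above, this should list both keystone-1 and keystone-2 as having no server available, with every node refusing connections, which points at Keystone itself rather than at HAProxy.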

Vladyslav Drok (vdrok) wrote :
Changed in fuel:
milestone: none → 9.0
assignee: nobody → Fuel Sustaining (fuel-sustaining-team)
importance: Undecided → High
status: New → Confirmed

(This check performed automatically)
Please, make sure that bug description contains the following sections filled in with the appropriate data related to the bug you are describing:

actual result

expected result

steps to reproduce

For more detailed information on the contents of each of the listed sections see https://wiki.openstack.org/wiki/Fuel/How_to_contribute#Here_is_how_you_file_a_bug

tags: added: need-info
Dmitry Pyzhov (dpyzhov) on 2016-05-04
tags: added: area-library
Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → Kyrylo Galanov (kgalanov)
Kyrylo Galanov (kgalanov) wrote :

That's weird:
according to the logs apache-keystone was running, but haproxy could not connect to it.

Changed in fuel:
status: Confirmed → Incomplete
Changed in fuel:
status: Incomplete → In Progress
Kyrylo Galanov (kgalanov) wrote :

Vladyslav,

Thank you. Getting access to the failed environment was very helpful.

Kyrylo Galanov (kgalanov) wrote :

Keystone team,

Please take a look. Keystone fails with a traceback: http://paste.openstack.org/show/496293/

Changed in fuel:
status: In Progress → Confirmed
assignee: Kyrylo Galanov (kgalanov) → MOS Keystone (mos-keystone)
tags: added: area-mos
removed: area-library need-info
Kyrylo Galanov (kgalanov) wrote :

Specifically this deployment: https://product-ci.infra.mirantis.net/view/10.0/job/10.0.main.ubuntu.bvt_2/158/

ii keystone 2:9.0.0~b3-2~u14.04+mos59
ii python-keystone 2:9.0.0~b3-2~u14.04+mos59
ii python-keystoneauth1 2.3.0-1~u14.04+mos16
ii python-keystoneclient 1:2.3.1-1~u14.04+mos9
ii python-keystonemiddleware 4.2.0-1~u14.04+mos5

/etc/fuel_release 10.0
/etc/fuel_build_id 191

Kyrylo Galanov (kgalanov) wrote :

http://paste.openstack.org/show/496295/ : ContextualVersionConflict: (alembic 0.8.2.dev0 (/usr/lib/python2.7/dist-packages), Requirement.parse('alembic>=0.8.4'), set(['oslo.db']))
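
The conflict message names three things: the installed package and version, the requirement it fails, and the package that declared the requirement. A rough sketch for pulling these out of a log line follows. The `parse_conflict` helper is hypothetical, and the regular expression assumes the message layout seen in the pastes quoted in this thread, nothing more.

```python
import re

# Layout assumed from the message quoted in this thread:
# ContextualVersionConflict: (<name> <version> (<path>),
#     Requirement.parse('<requirement>'), set(['<pkg>', ...]))
CONFLICT_RE = re.compile(
    r"ContextualVersionConflict: "
    r"\((?P<name>\S+) (?P<installed>\S+) \((?P<path>[^)]*)\), "
    r"Requirement\.parse\('(?P<req>[^']+)'\), "
    r"set\(\[(?P<by>[^\]]*)\]\)\)"
)


def parse_conflict(message):
    """Extract the conflict details from a log line, or return None."""
    match = CONFLICT_RE.search(message)
    if match is None:
        return None
    return {
        "package": match.group("name"),
        "installed": match.group("installed"),
        "requirement": match.group("req"),
        # Packages whose declared requirements could not be satisfied.
        "required_by": re.findall(r"'([^']+)'", match.group("by")),
    }
```

For the message above, this yields alembic 0.8.2.dev0 as the installed version, `alembic>=0.8.4` as the unmet requirement, and oslo.db as the package that requires it.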

Bug Checker Bot (bug-checker) wrote :

(This check performed automatically)
Please, make sure that bug description contains the following sections filled in with the appropriate data related to the bug you are describing:

actual result

expected result

steps to reproduce

For more detailed information on the contents of each of the listed sections see https://wiki.openstack.org/wiki/Fuel/How_to_contribute#Here_is_how_you_file_a_bug

tags: added: need-info
Changed in fuel:
milestone: 9.0 → 10.0
Nastya Urlapova (aurlapova) wrote :

Added the "swarm-blocker" tag because BVT was affected.

tags: added: swarm-blocker
Dmitry Pyzhov (dpyzhov) wrote :

Raising to Critical because basic deployment is broken and all developers are blocked.

Changed in fuel:
importance: High → Critical
Dmitry Pyzhov (dpyzhov) wrote :

BVT for 9.0 is green. Looks like 9.0 is not affected by the bug.

Not sure what the keystone team can do here - "ContextualVersionConflict: (alembic 0.8.2.dev0 (/usr/lib/python2.7/dist-packages), Requirement.parse('alembic>=0.8.4'), set(['oslo.db']))"

In general we need to be in sync with the upper-constraints - http://git.openstack.org/cgit/openstack/requirements/tree/upper-constraints.txt?h=stable/mitaka#n62
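
Staying in sync with upper-constraints can be checked mechanically. Below is a minimal sketch, assuming the `name===version` line format used by upper-constraints.txt. The helper names are hypothetical, the installed versions would have to be collected separately (for example from the package mirror), and any pins used to try it out are illustrative rather than the real file contents.

```python
def parse_constraints(text):
    """Parse 'name===version' pins, ignoring comments and environment markers."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()    # drop comments
        line = line.split(";", 1)[0].strip()   # drop markers such as ;python_version=='2.7'
        if "===" not in line:
            continue
        name, version = line.split("===", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_drift(installed, pins):
    """Return {name: (installed_version, pinned_version)} for every mismatch.

    Packages without a pin are skipped. Names are compared case-insensitively,
    but '-' versus '.' spelling differences are NOT normalized here.
    """
    drift = {}
    for name, version in installed.items():
        pinned = pins.get(name.lower())
        if pinned is not None and pinned != version:
            drift[name] = (version, pinned)
    return drift
```

A check like this only reports exact-version drift. It does not evaluate `>=` ranges the way the runtime requirements check does, so it is stricter than what actually broke Keystone here.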

description: updated
Changed in fuel:
assignee: MOS Keystone (mos-keystone) → MOS Packaging Team (mos-packaging)
Changed in fuel:
assignee: MOS Packaging Team (mos-packaging) → Alexander Tsamutali (astsmtl)

Reviewed: https://review.fuel-infra.org/20398
Submitter: Pkgs Jenkins <email address hidden>
Branch: 9.0

Commit: 0923e2e6a5627fd85abb31d027cd0737d8187780
Author: Mikhail Ivanov <email address hidden>
Date: Fri May 6 13:20:30 2016

Update python-alembic to v0.8.4

Related-Bug: #1577839

Change-Id: Ia65983caf798c791de78c81de94fc3d66d3e561f
(cherry picked from commit 5d52cffa1deeac855185e0f7d2c21d560e92fd07)

Reviewed: https://review.fuel-infra.org/20402
Submitter: Pkgs Jenkins <email address hidden>
Branch: 9.0

Commit: f74aa64219dbd427a655da5cc7dc919196388de7
Author: Alexander Tsamutali <email address hidden>
Date: Fri May 6 13:18:46 2016

Update to 0.8.6

Change-Id: I8d819f442d9dbf20c5a8edd4fefff719978f84c2
Related-Bug: #1577839

Changed in fuel:
status: Confirmed → Fix Committed
Changed in fuel:
status: Fix Committed → Fix Released
Changed in fuel:
status: Fix Released → Fix Committed
Matthew Mosesohn (raytrac3r) wrote :

This needs to be reopened. Keystone deployment still fails (identical to original bug description) with:
ContextualVersionConflict: (iso8601 0.1.10 (/usr/lib/python2.7/dist-packages), Requirement.parse('iso8601>=0.1.11'), set(['oslo.concurrency']))

mos-packaging team, please take necessary care to ensure all keystone dependencies are corrected, rather than hurrying to mark this bug fixed.

Changed in fuel:
status: Fix Committed → Confirmed
assignee: Alexander Tsamutali (astsmtl) → MOS Packaging Team (mos-packaging)

Related fix proposed to branch: master
Change author: Alexander Tsamutali <email address hidden>
Review: https://review.fuel-infra.org/20464

Roman Podoliaka (rpodolyaka) wrote :

This is interesting: we regularly (every ~15 minutes) sync code from the upstream master branches to the corresponding master branches on review.fuel-infra.org, but we *do not* rebuild the packages. The sync is done with "git reset --hard" directly in the git repository, rather than by proposing a change request via Gerrit (which is what we do for stable branches downstream, e.g. for 9.0).

I quickly checked Keystone packages versions on our mirror:

keystone-doc_9.0.0~b3-2~u14.04+mos59_all.deb 14-Mar-2016 16:10 211428
keystone_9.0.0~b3-2~u14.04+mos59.debian.tar.gz 14-Mar-2016 16:10 40837
keystone_9.0.0~b3-2~u14.04+mos59.dsc 14-Mar-2016 16:10 2712
keystone_9.0.0~b3-2~u14.04+mos59_all.deb 14-Mar-2016 16:10 84790
keystone_9.0.0~b3.orig.tar.gz 14-Mar-2016 16:10 1159991
python-keystone_9.0.0~b3-2~u14.04+mos59_all.deb 14-Mar-2016 16:10 626170

^ and they haven't been updated since March, thus, it's not Keystone which triggered this problem.

As was pointed out in https://bugs.launchpad.net/fuel/+bug/1577839/comments/14, it is the oslo.db requirements that can't be satisfied:

ContextualVersionConflict: (alembic 0.8.2.dev0 (/usr/lib/python2.7/dist-packages), Requirement.parse('alembic>=0.8.4'), set(['oslo.db']))

Keystone just happens to trigger this requirements check on start.

As it turns out, oslo.db was actually rebuilt recently:

python-oslo-db-doc_4.6.0-3~u14.04+mos20_all.deb 05-May-2016 14:44 3536
python-oslo-db_4.6.0-3~u14.04+mos20_all.deb 05-May-2016 14:44 3530
python-oslo.db-doc_4.6.0-3~u14.04+mos20_all.deb 05-May-2016 14:44 42910
python-oslo.db_4.6.0-3~u14.04+mos20.debian.tar.gz 05-May-2016 14:44 4821
python-oslo.db_4.6.0-3~u14.04+mos20.dsc 05-May-2016 14:44 2851
python-oslo.db_4.6.0-3~u14.04+mos20_all.deb 05-May-2016 14:44 95778
python-oslo.db_4.6.0.orig.tar.gz 05-May-2016 14:44 138961
python3-oslo.db_4.6.0-3~u14.04+mos20_all.deb 05-May-2016 14:44 95832

https://review.fuel-infra.org/#/c/19713/ is the corresponding change request. As you can see `master-pkg-systest-ubuntu` actually detected the problem and *failed*, but it's non-voting, so it was ignored.

We desperately need to make the job stable and voting again, otherwise people will keep ignoring these failures.

I suggest we downgrade the oslo.db package by reverting the commit in question, if possible. Rebuilding the dependencies (alembic, oslo.i18n, etc.) one by one is not a viable solution, IMO.

Reviewed: https://review.fuel-infra.org/20464
Submitter: Ivan Remizov <email address hidden>
Branch: master

Commit: ac1b64c1175434b96cef6461f63c9168d22d6b41
Author: Alexander Tsamutali <email address hidden>
Date: Tue May 10 14:15:19 2016

Add packages/trusty/python-iso8601 to packaging-ci-gate.

Change-Id: I93a91ee62e9ac7c076ffcf676561f3df68ddb13f
Related-Bug: #1577839

Reviewed: https://review.fuel-infra.org/20461
Submitter: Andrey Nikitin <email address hidden>
Branch: master

Commit: 83ed35b46510ad64d5ae26ef8ed602b57ebecd8c
Author: Alexander Tsamutali <email address hidden>
Date: Tue May 10 14:15:45 2016

Add packages/trusty/python-iso8601

Required by oslo.concurrency.

Related-Bug: #1577839
Change-Id: Ia8e2aace6a1e23a2074df782d0b161c0917050c5

Changed in fuel:
assignee: MOS Packaging Team (mos-packaging) → Alexander Tsamutali (astsmtl)
status: Confirmed → In Progress
Roman Podoliaka (rpodolyaka) wrote :

9.0 is not affected: requirements.txt of oslo.db in 9.0 code is aligned with the dependencies we have in 9.0 mirrors.

Igor Yozhikov (iyozhikov) wrote :

Mitaka/9.0 is not affected by this bug.
All dependency versions are satisfied.

Changed in fuel:
status: In Progress → Fix Released
Changed in fuel:
status: Fix Released → Fix Committed
Alexander Tsamutali (astsmtl) wrote :

Nastya, we already verified that this fix is working in the latest build. Do you expect someone else will verify it too?

Ilya Bumarskov (ibumarskov) wrote :
Changed in fuel:
status: Fix Committed → Fix Released