fuel creates two Swift services in keystone

Bug #1496036 reported by Leontii Istomin
This bug affects 1 person
Affects             Status        Importance  Assigned to       Milestone
Fuel for OpenStack  Fix Released  Critical    Matthew Mosesohn
7.0.x               Fix Released  Critical    Matthew Mosesohn

Bug Description

Deployment failed with the following error:
Execution of '/usr/bin/openstack endpoint create --format shell swift --region RegionOne --publicurl http://172.16.44.17:8080/swift/v1 --internalurl http://192.168.0.2:8080/swift/v1 --adminurl http://192.168.0.2:8080/swift/v1' returned 1: ERROR: openstack No service with a type, name or ID of 'swift' exists.

There are two Swift services in keystone:
root@node-246:~# keystone service-list 2>/dev/null | grep -i swift
| 0b8541e60d7d4467aff50e8138ce2f5b | swift | object-store | Openstack Object-Store Service |
| c608540392ad40e9a64b9f9ff3ff8b1d | swift | object-store | Openstack Object-Store Service |
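A quick way to spot this condition is to count service names in the listing. The snippet below is only a sketch: the function name `duplicate_service_names` is hypothetical, and it assumes the four-column pipe-separated layout shown above (id, name, type, description).

```python
from collections import Counter


def duplicate_service_names(table_text):
    """Return {name: count} for service names that appear more than once.

    Assumes rows shaped like the listing above:
    | <id> | <name> | <type> | <description> |
    """
    names = []
    for line in table_text.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Skip border lines, blank lines, and the header row.
        if len(cells) != 4 or cells[0].lower() == "id":
            continue
        names.append(cells[1])
    return {name: count for name, count in Counter(names).items() if count > 1}


listing = """
| 0b8541e60d7d4467aff50e8138ce2f5b | swift | object-store | Openstack Object-Store Service |
| c608540392ad40e9a64b9f9ff3ff8b1d | swift | object-store | Openstack Object-Store Service |
"""
print(duplicate_service_names(listing))  # {'swift': 2}
```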

Cluster configuration:
Baremetal,Ubuntu,IBP,HA,Neutron-vxlan,DVR,Ceph-all,Nova-debug,Nova-quotas,7.0-293
Controllers: 3, Computes: 180, Computes+Ceph: 20

api: '1.0'
astute_sha: a717657232721a7fafc67ff5e1c696c9dbeb0b95
auth_required: true
build_id: '293'
build_number: '293'
feature_groups:
- mirantis
fuel-agent_sha: 082a47bf014002e515001be05f99040437281a2d
fuel-library_sha: 0c5a39a43bffaddc91b46179d16e1bb2dd85bc0c
fuel-nailgun-agent_sha: d7027952870a35db8dc52f185bb1158cdd3d1ebd
fuel-ostf_sha: 1f08e6e71021179b9881a824d9c999957fcc7045
fuelmain_sha: 6b83d6a6a75bf7bca3177fcf63b2eebbf1ad0a85
nailgun_sha: 16a39d40120dd4257698795f12de4ae8200b1778
openstack_version: 2015.1.0-7.0
production: docker
python-fuelclient_sha: 2864459e27b0510a0f7aedac6cdf27901ef5c481
release: '7.0'

Diagnostic Snapshot: http://mos-scale-share.mirantis.com/fuel-snapshot-2015-09-15_15-38-24.tar.xz

Changed in fuel:
milestone: none → 7.0
assignee: nobody → Matthew Mosesohn (raytrac3r)
importance: Undecided → Critical
status: New → In Progress
Revision history for this message
Matthew Mosesohn (raytrac3r) wrote :

This looks related to the recent patch by Dmitry Ilyin that retries list/create/modify commands after 10s (for up to 60s total). Retrying create commands leads to failures.

Related patch
https://review.openstack.org/#/c/219668/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/223676

Revision history for this message
Leontii Istomin (listomin) wrote :

The following behavior was also observed on this environment: https://bugs.launchpad.net/fuel/+bug/1496046

description: updated
Revision history for this message
Matthew Mosesohn (raytrac3r) wrote :

1496046 is not related to this.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/7.0)

Fix proposed to branch: stable/7.0
Review: https://review.openstack.org/223696

Revision history for this message
Mike Scherbakov (mihgen) wrote :

Colleagues,
can you please provide more data on this bug? Namely, does it only appear at a scale of 200 nodes? Would it work fine at a scale of 100 nodes, for instance?
Is there an easy workaround?

Revision history for this message
Leontii Istomin (listomin) wrote :

We hit this issue only once, on a 200-node environment. Deployment continued successfully with the patch from Matt applied.
We will try to reproduce the issue on a 50-node environment.

Revision history for this message
Matthew Mosesohn (raytrac3r) wrote :

This bug is not affected by scale. It only happens on the primary controller, and it is a race condition related to the performance of keystone. If the keystone API fails to fulfill a create request within 10s, the command is retried, so a second create request is sent and we end up with duplicate resources that share the same name (and can no longer be referenced by name). It can happen on any hardware and in a deployment of any size. It is a regression introduced by https://review.openstack.org/#/c/219668/, as I wrote earlier.
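The race described in this comment can be reproduced in a toy model. The sketch below is an assumption-laden illustration, not Keystone or fuel-library code: `FakeKeystone`, `create_with_retry`, and the latency values are all hypothetical. The one behavior it models is that the server completes a create even after the client has given up waiting.

```python
import uuid


class FakeKeystone:
    """Toy stand-in for the service catalog (not the real Keystone API)."""

    def __init__(self, latencies):
        self.latencies = list(latencies)  # simulated seconds per request
        self.services = []

    def create_service(self, name, client_timeout_s):
        latency = self.latencies.pop(0) if self.latencies else 0
        # The server finishes the request even if the client stopped waiting.
        service = {"id": uuid.uuid4().hex, "name": name}
        self.services.append(service)
        if latency > client_timeout_s:
            raise TimeoutError(f"no response within {client_timeout_s}s")
        return service

    def find_by_name(self, name):
        matches = [s for s in self.services if s["name"] == name]
        if len(matches) != 1:
            raise LookupError(f"{len(matches)} services named {name!r}")
        return matches[0]


def create_with_retry(keystone, name, tries, timeout_s):
    """Retry a create on client-side timeout (the problematic behavior)."""
    last_error = None
    for _ in range(tries):
        try:
            return keystone.create_service(name, timeout_s)
        except TimeoutError as exc:
            last_error = exc
    raise last_error


# First request takes 12s (longer than the 10s timeout), the retry takes 3s.
slow = FakeKeystone(latencies=[12, 3])
create_with_retry(slow, "swift", tries=6, timeout_s=10)
print(len(slow.services))  # 2: both requests created a service

# One attempt with a 60s timeout: the same slow request succeeds once.
fixed = FakeKeystone(latencies=[12])
create_with_retry(fixed, "swift", tries=1, timeout_s=60)
print(len(fixed.services))  # 1
```

In the first case a later lookup by name becomes ambiguous, which matches the symptom in the report where the endpoint could not be created for the service named swift.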

Revision history for this message
Leontii Istomin (listomin) wrote :

https://review.openstack.org/223696 has been tested on the 200-node scale lab. It works fine.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/223676
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=d905ac6405a0d844e648f52963633f8e1ef988d5
Submitter: Jenkins
Branch: master

commit d905ac6405a0d844e648f52963633f8e1ef988d5
Author: Matthew Mosesohn <email address hidden>
Date: Tue Sep 15 19:20:15 2015 +0300

    Raise timeout and disable retry for openstack create

    Create commands run a risk of creating duplicate resources,
    breaking deployment. They should only be run once and with a
    single attempt.

    Changes for openstack command behavior:
    * list/show - unchanged
    * create - 1 try (was 6), 60s timeout (was 10 per try)

    Change-Id: I33da8be45a70c0e1a2848d258d0e46f822c4e5d9
    Closes-Bug: #1496036
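The policy in this commit message can be written down as a small lookup table. This is a sketch only: the names `CommandPolicy` and `policy_for` are hypothetical, the 6 tries and 10s values for list/show are inferred from the "was 6" and "was 10 per try" notes, and falling back to a single attempt for unlisted actions is an assumption made here, not something the commit states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommandPolicy:
    tries: int
    timeout_s: int

    @property
    def worst_case_wait_s(self):
        # Upper bound on time spent waiting if every attempt times out.
        return self.tries * self.timeout_s


RETRYING = CommandPolicy(tries=6, timeout_s=10)      # list/show (unchanged)
SINGLE_SHOT = CommandPolicy(tries=1, timeout_s=60)   # create (after the fix)

POLICIES = {"list": RETRYING, "show": RETRYING, "create": SINGLE_SHOT}


def policy_for(action):
    # Assumption: unknown actions get the conservative single-attempt policy.
    return POLICIES.get(action, SINGLE_SHOT)


for action in ("list", "show", "create"):
    policy = policy_for(action)
    print(action, policy.tries, policy.timeout_s, policy.worst_case_wait_s)
```

Under these values both policies wait at most 60s in total, so the fix trades retries for a longer single timeout without extending the overall deadline.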

Changed in fuel:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/7.0)

Reviewed: https://review.openstack.org/223696
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=0623b4daad438ceeb5dc41b10cdd3011795fff7e
Submitter: Jenkins
Branch: stable/7.0

commit 0623b4daad438ceeb5dc41b10cdd3011795fff7e
Author: Matthew Mosesohn <email address hidden>
Date: Tue Sep 15 19:20:15 2015 +0300

    Raise timeout and disable retry for openstack create

    Create commands run a risk of creating duplicate resources,
    breaking deployment. They should only be run once and with a
    single attempt.

    Changes for openstack command behavior:
    * list/show - unchanged
    * create - 1 try (was 6), 60s timeout (was 10 per try)

    Change-Id: I33da8be45a70c0e1a2848d258d0e46f822c4e5d9
    Closes-Bug: #1496036

Revision history for this message
Matthew Mosesohn (raytrac3r) wrote :

This issue only occurs when provisioning the primary controller. It can affect any deployment scheme and the keystone component of any OpenStack service.

Revision history for this message
Tatyanka (tatyana-leontovich) wrote :

verified {"build_id": "298", "build_number": "298", "release_versions": {"2015.1.0-7.0": {"VERSION": {"build_id": "298", "build_number": "298", "api": "1.0", "fuel-library_sha": "0623b4daad438ceeb5dc41b10cdd3011795fff7e", "nailgun_sha": "d590b26dbb09785b8a8b3651b0ef69746fcf9991", "feature_groups": ["mirantis"], "fuel-nailgun-agent_sha": "d7027952870a35db8dc52f185bb1158cdd3d1ebd", "openstack_version": "2015.1.0-7.0", "fuel-agent_sha": "082a47bf014002e515001be05f99040437281a2d", "production": "docker", "python-fuelclient_sha": "486bde57cda1badb68f915f66c61b544108606f3", "astute_sha": "6c5b73f93e24cc781c809db9159927655ced5012", "fuel-ostf_sha": "1f08e6e71021179b9881a824d9c999957fcc7045", "release": "7.0", "fuelmain_sha": "6b83d6a6a75bf7bca3177fcf63b2eebbf1ad0a85"}}}, "auth_required": true, "api": "1.0", "fuel-library_sha": "0623b4daad438ceeb5dc41b10cdd3011795fff7e", "nailgun_sha": "d590b26dbb09785b8a8b3651b0ef69746fcf9991", "feature_groups": ["mirantis"], "fuel-nailgun-agent_sha": "d7027952870a35db8dc52f185bb1158cdd3d1ebd", "openstack_version": "2015.1.0-7.0", "fuel-agent_sha": "082a47bf014002e515001be05f99040437281a2d", "production": "docker", "python-fuelclient_sha": "486bde57cda1badb68f915f66c61b544108606f3", "astute_sha": "6c5b73f93e24cc781c809db9159927655ced5012", "fuel-ostf_sha": "1f08e6e71021179b9881a824d9c999957fcc7045", "release": "7.0", "fuelmain_sha": "6b83d6a6a75bf7bca3177fcf63b2eebbf1ad0a85"}

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to puppet-modules/puppet-openstacklib (mos-8.0)

Fix proposed to branch: mos-8.0
Change author: Sergey Kolekonov <email address hidden>
Review: https://review.fuel-infra.org/12080

tags: added: on-verification
Dmitry Pyzhov (dpyzhov)
no longer affects: fuel/8.0.x
Revision history for this message
Dmitriy Kruglov (dkruglov) wrote :

Verified on MOS 8.0 Kilo. Not reproduced.

The ISO info:
VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "8.0"
  openstack_version: "2015.1.0-8.0"
  api: "1.0"
  build_number: "128"
  build_id: "128"
  fuel-nailgun_sha: "70d8b7e80573728e04ac5478c112850afcfa9802"
  python-fuelclient_sha: "56fbd6bad7f60f0944b3845c2db14d0b8cabd4d3"
  fuel-agent_sha: "e881f0dabd09af4be4f3e22768b02fe76278e20e"
  fuel-nailgun-agent_sha: "d66f188a1832a9c23b04884a14ef00fc5605ec6d"
  astute_sha: "0f753467a3f16e4d46e7e9f1979905fb178e4d5b"
  fuel-library_sha: "e3d2905b9dd2cc7b4d46201ca9816dd320868917"
  fuel-ostf_sha: "41aa5059243cbb25d7a80b97f8e1060a502b99dd"
  fuelmain_sha: "51614465980e5f62a5796779d3f6c3305c1d5739"

Changed in fuel:
status: Fix Committed → Fix Released
tags: removed: on-verification
Dmitry Pyzhov (dpyzhov)
tags: added: area-library
Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote :

Fix proposed to branch: mos-8.0
Change author: Dmitry Ilyin <email address hidden>
Review: https://review.fuel-infra.org/13137

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Change abandoned on puppet-modules/puppet-openstacklib (mos-8.0)

Change abandoned by Sergey Kolekonov <email address hidden> on branch: mos-8.0
Review: https://review.fuel-infra.org/12080
Reason: Abandoned in favor of https://review.fuel-infra.org/#/c/13137

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to puppet-modules/puppet-openstacklib (mos-8.0)

Reviewed: https://review.fuel-infra.org/13137
Submitter: Ivan Berezovskiy <email address hidden>
Branch: mos-8.0

Commit: 7c032c436e48abb0f73ab5b380d363f8d0296f09
Author: Dmitry Ilyin <email address hidden>
Date: Fri Oct 23 18:14:34 2015

Add retries to the openstack command

Sometimes openstackclient can hang if Keystone
API fails to respond to requests.
This patch adds retries to work around these
situations.

* Retry and timeout openstack comand
* Does not retry non-idempotent actions
* Improve specs

Closes-Bug: #1496036
Change-Id: Ifd8ae1b00321366e3a54fd6fe4a68db46bb743c7
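The "retry and timeout, but never retry non-idempotent actions" rule from this commit message could look roughly like the following. The merged change lives in puppet-openstacklib; the Python below is not that code, only a minimal sketch. The function name `run_with_policy`, the set of idempotent actions, and the default values are assumptions chosen for illustration.

```python
import subprocess

# Assumption: only read-only actions are treated as safe to repeat.
IDEMPOTENT_ACTIONS = frozenset({"list", "show"})


def run_with_policy(argv, action, tries=6, timeout_s=10.0):
    """Run a command, retrying on timeout only for idempotent actions.

    A non-zero exit status raises CalledProcessError and is never retried.
    """
    attempts = tries if action in IDEMPOTENT_ACTIONS else 1
    last_error = None
    for _ in range(attempts):
        try:
            done = subprocess.run(
                argv,
                capture_output=True,
                text=True,
                timeout=timeout_s,
                check=True,
            )
            return done.stdout
        except subprocess.TimeoutExpired as exc:
            last_error = exc
    raise last_error
```

A caller would pass the full command line as `argv` and the action keyword separately, for example `run_with_policy(cmd, action="create", timeout_s=60)` for a create, so that a hung create is attempted once and then reported as a failure instead of being repeated.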
