[Fuel Upgrade] 5.1 upgrade script failed with UpgradeVerificationError: Failed to run services ['integration_postgres_nailgun_nginx', 'integration_nginx_nailgun']

Bug #1349287 reported by Andrey Sledzinskiy
This bug affects 3 people
Affects              Status         Importance   Assigned to                Milestone
Fuel for OpenStack   Fix Released   High         Matthew Mosesohn
5.0.x                Won't Fix      Medium       Fuel Python (Deprecated)

Bug Description

Reproduced on fuel-5.1-upgrade-76-2014-07-25_18-23-03.tar

Steps:
1. Install fuel-5.0 (build 26)
2. Create a simple cluster with default values - 1 controller, 1 compute, 1 cinder
3. Deploy the cluster
4. After successful deployment, upgrade fuel to 5.0.1 with fuel-5.0.1-upgrade-75-2014-07-24_21-23-41.tar
5. After successful upgrade, deploy a simple cluster with the new CentOS release version - 1 controller, 1 compute, 1 cinder
6. After deployment, run the upgrade to 5.1 with fuel-5.1-upgrade-76-2014-07-25_18-23-03.tar

Expected - fuel is upgraded to 5.1
Actual - upgrade failed with UpgradeVerificationError: Failed to run services ['integration_postgres_nailgun_nginx', 'integration_nginx_nailgun']

Logs are attached

Tags: upgrade
Revision history for this message
Andrey Sledzinskiy (asledzinskiy) wrote :
Evgeniy L (rustyrobot)
Changed in fuel:
status: New → Confirmed
tags: added: upgrade
removed: fuel-upgrade
Evgeniy L (rustyrobot)
Changed in fuel:
importance: High → Medium
Evgeniy L (rustyrobot) wrote :

This issue does not happen often, which is why I reduced the priority.
We are trying to reproduce it to figure out what happens.
It looks like iptables rule duplication.
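
One way to check for the suspected duplication is to look for several DNAT rules that forward the same port to different targets. The following is only a minimal sketch, assuming the rules are available as text in iptables-save format; the function name, the regular expression, and the sample addresses and ports are made up for illustration and are not taken from fuel_upgrade.

```python
import re
from collections import defaultdict

# Matches appended rules that carry both a destination port and a DNAT target.
DNAT_RULE = re.compile(
    r"^-A (?P<chain>\S+) .*--dport (?P<port>\d+) .*--to-destination (?P<target>\S+)"
)


def find_conflicting_dnat_rules(iptables_save_output):
    """Return {(chain, port): [targets]} for ports forwarded by more than one rule."""
    by_port = defaultdict(list)
    for line in iptables_save_output.splitlines():
        match = DNAT_RULE.match(line.strip())
        if match:
            key = (match.group("chain"), int(match.group("port")))
            by_port[key].append(match.group("target"))
    return {key: targets for key, targets in by_port.items() if len(targets) > 1}


# Hypothetical sample: port 8000 has a stale rule left by an old container.
SAMPLE = """\
*nat
:DOCKER - [0:0]
-A DOCKER -d 10.20.0.2/32 -p tcp -m tcp --dport 8000 -j DNAT --to-destination 172.17.0.5:8000
-A DOCKER -d 10.20.0.2/32 -p tcp -m tcp --dport 8000 -j DNAT --to-destination 172.17.0.9:8000
-A DOCKER -d 10.20.0.2/32 -p tcp -m tcp --dport 8443 -j DNAT --to-destination 172.17.0.9:8443
COMMIT
"""

if __name__ == "__main__":
    for (chain, port), targets in find_conflicting_dnat_rules(SAMPLE).items():
        print(chain, port, targets)
```

If a port shows up with more than one target, traffic may still be sent to a container that no longer exists, which would be consistent with the integration checks failing.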

Evgeniy L (rustyrobot) wrote :

We cannot reproduce it. If we manage to reproduce it, let's reopen the bug and increase the priority.

Changed in fuel:
status: Confirmed → Incomplete
assignee: Evgeniy L (rustyrobot) → Fuel Python Team (fuel-python)
Aleksey Kasatkin (alekseyk-ru) wrote :

Upgraded with fuel-master-upgrade-376-2014-07-30_17-59-31.tar.
Same result.

Changed in fuel:
status: Incomplete → Confirmed
importance: Medium → High
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-web (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/111305

Evgeniy L (rustyrobot) wrote :

It's still not clear why it happens, so I created a patch with additional logging.

Changed in fuel:
assignee: Fuel Python Team (fuel-python) → Evgeniy L (rustyrobot)
OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-web (master)

Reviewed: https://review.openstack.org/111305
Committed: https://git.openstack.org/cgit/stackforge/fuel-web/commit/?id=3b5a6a333ddaab7f9888e210073d817560c4bd15
Submitter: Jenkins
Branch: master

commit 3b5a6a333ddaab7f9888e210073d817560c4bd15
Author: Evgeniy L <email address hidden>
Date: Fri Aug 1 18:00:15 2014 +0400

    Upgrades, add logging to debug the problem with inter-container communication

    Change-Id: I4bc91b9c35cc915de435ab2d9a86f46ba5761192
    Related-bug: #1349287

Evgeniy L (rustyrobot)
Changed in fuel:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-web (master)

Fix proposed to branch: master
Review: https://review.openstack.org/112552

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-web (master)

Reviewed: https://review.openstack.org/112552
Committed: https://git.openstack.org/cgit/stackforge/fuel-web/commit/?id=e1e1a5e23fd02279ab9ea2f6a936ad2b90e72840
Submitter: Jenkins
Branch: master

commit e1e1a5e23fd02279ab9ea2f6a936ad2b90e72840
Author: Evgeniy L <email address hidden>
Date: Wed Aug 6 20:56:38 2014 +0400

    Upgrades, add new pattern for iptables cleaner

    * add new pattern for iptables cleaner
      after containers stop, because after
      new port binding there are rules with
      new format
    * save iptables rules after cleaning
    * don't raise error if deletion of
      iptables rules failed, just skip it

    Change-Id: I9077f9988ca75c2d498515a688828cfe99dbe1c2
    Closes-bug: #1349287
    Closes-bug: #1353503
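
The three points of this commit message (match rules by pattern, skip deletions that fail, save the rules afterwards) can be illustrated as follows. This is only a sketch under stated assumptions, not the code from the commit: `RuleDeletionError`, `clean_rules`, and the `delete_rule` and `save_rules` callbacks are hypothetical names, and the real cleaner may work differently.

```python
import re


class RuleDeletionError(Exception):
    """Hypothetical error raised when a single rule cannot be deleted."""


def clean_rules(rules, patterns, delete_rule, save_rules):
    """Delete rules matching any pattern, skip failures, then save once.

    Returns (deleted, skipped) so the caller can log what happened.
    """
    compiled = [re.compile(pattern) for pattern in patterns]
    deleted, skipped = [], []
    for rule in rules:
        if not any(regex.search(rule) for regex in compiled):
            continue  # rule matches none of the patterns, leave it alone
        try:
            delete_rule(rule)
        except RuleDeletionError:
            skipped.append(rule)  # do not abort the upgrade, just move on
        else:
            deleted.append(rule)
    save_rules()  # persist the cleaned rule set
    return deleted, skipped
```

Passing the delete and save steps in as callables keeps the matching logic testable without touching the real firewall.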

Changed in fuel:
status: In Progress → Fix Committed
Tatyanka (tatyana-leontovich) wrote :

reproduced:
http://paste.openstack.org/show/93297/
VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "5.1"
  api: "1.0"
  build_number: "414"
  build_id: "2014-08-08_11-23-51"
  astute_sha: "b52910642d6de941444901b0f20e95ebbcb2b2e9"
  fuellib_sha: "d699fc178559e98cfd7d53b58478b46553ffe39e"
  ostf_sha: "e33390c275e225d648b36997460dc29b1a3c20ae"
  nailgun_sha: "5bc33457e5a1f108b071ed0ef2a771ea0b610b22"
  fuelmain_sha: "16c54168143061724e635a20a99545a756725b49"

Changed in fuel:
status: Fix Committed → Confirmed
Alexander Kislitsky (akislitsky) wrote :

Could you provide a logs snapshot, please?

Dima Shulyak (dshulyak)
Changed in fuel:
assignee: Evgeniy L (rustyrobot) → Fuel Python Team (fuel-python)
Dima Shulyak (dshulyak) wrote :

Maybe we should flush the whole DOCKER chain? For example:

iptables -t nat -F DOCKER

Stanislaw Bogatkin (sbogatkin) wrote :

Reproduced.
VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "5.1"
  api: "1.0"
  build_number: "425"
  build_id: "2014-08-12_10-52-56"
  astute_sha: "b52910642d6de941444901b0f20e95ebbcb2b2e9"
  fuellib_sha: "54d834afccf1f8a97e4a82d7d081e7d1a1e068a1"
  ostf_sha: "d2a894d228c1f3c22595a77f04b1e00d09d8e463"
  nailgun_sha: "c7e00e5a00499d9f8dec608541dad1d745a8dd2e"
  fuelmain_sha: "9d4463400b4924159c978af43855e48bcf2a84b2"

Log:
2014-08-12 10:22:04 DEBUG 24682 (upgrade) HostSystemUpgrader: rollbacking...
2014-08-12 10:22:04 INFO 24682 (version_file) Switch current version file to previous version
2014-08-12 10:22:04 DEBUG 24682 (utils) Symlinking "/etc/fuel/5.0.1/version.yaml" -> "/etc/fuel/version.yaml" [overwrite=1]
2014-08-12 10:22:04 DEBUG 24682 (utils) Removing "/etc/fuel/version.yaml"
2014-08-12 10:22:04 DEBUG 24682 (utils) Remove file "/etc/yum.repos.d/5.1_nailgun.repo"
2014-08-12 10:22:04 ERROR 24682 (upgrade) *** UPGRADE FAILED
2014-08-12 10:22:04 ERROR 24682 (cli) Failed to run services ['integration_postgres_nailgun_nginx']
Traceback (most recent call last):
  File "/var/upgrade/site-packages/fuel_upgrade/cli.py", line 142, in main
    run_upgrade(parse_args())
  File "/var/upgrade/site-packages/fuel_upgrade/cli.py", line 135, in run_upgrade
    upgrade_manager.run()
  File "/var/upgrade/site-packages/fuel_upgrade/upgrade.py", line 51, in run
    upgrader.upgrade()
  File "/var/upgrade/site-packages/fuel_upgrade/engines/docker_engine.py", line 93, in upgrade
    self.upgrade_verifier.verify()
  File "/var/upgrade/site-packages/fuel_upgrade/health_checker.py", line 385, in verify
    self._get_non_running_services()))
UpgradeVerificationError: Failed to run services ['integration_postgres_nailgun_nginx']
Upgrade failed
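
The traceback shows verify() raising UpgradeVerificationError with the list of services that are not running. A minimal sketch of that pattern is given below. The classes are stand-ins written for illustration, not the actual fuel_upgrade health_checker code; the checker callables and the `Verifier` name are assumptions.

```python
class UpgradeVerificationError(Exception):
    """Stand-in for the exception named in the traceback."""


class Verifier(object):
    def __init__(self, checkers):
        # checkers: mapping of service name -> zero-argument callable
        # that returns True when the service is up.
        self.checkers = checkers

    def non_running_services(self):
        """Return the names of services whose check currently fails."""
        return [name for name, check in self.checkers.items() if not check()]

    def verify(self):
        """Raise if at least one service is not running."""
        failed = self.non_running_services()
        if failed:
            raise UpgradeVerificationError(
                "Failed to run services {0}".format(failed))


if __name__ == "__main__":
    verifier = Verifier({
        "nginx": lambda: True,
        "integration_postgres_nailgun_nginx": lambda: False,
    })
    try:
        verifier.verify()
    except UpgradeVerificationError as exc:
        print(exc)
```

In this sketch a single failing integration check is enough to fail the whole upgrade, which matches the rollback seen in the log above.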

Stanislaw Bogatkin (sbogatkin) wrote :

Diagnostic snapshot here.

Andrey Sledzinskiy (asledzinskiy) wrote :

5.0.1 upgrade script is affected too
{

    "build_id": "2014-08-11_12-45-06",
    "mirantis": "yes",
    "build_number": "169",
    "ostf_sha": "09b6bccf7d476771ac859bb3c76c9ebec9da9e1f",
    "nailgun_sha": "04ada3cd7ef14f6741a05fd5d6690260f9198095",
    "production": "docker",
    "api": "1.0",
    "fuelmain_sha": "43374c706b4fdce28aeb4ef11e69a53f41646740",
    "astute_sha": "6db5f5031b74e67b92fcac1f7998eaa296d68025",
    "release": "5.0.1",
    "fuellib_sha": "a31dbac8fff9cf6bc4cd0d23459670e34b27a9ab"

}
Logs are attached

Andrey Sledzinskiy (asledzinskiy) wrote :
Changed in fuel:
assignee: Fuel Python Team (fuel-python) → Vladimir Kuklin (vkuklin)
Matthew Mosesohn (raytrac3r) wrote :

The problem is that there is a command to purge stale iptables rules after a container starts, but the fuel_upgrade tool does not call dockerctl start to launch containers, so the purge never runs. I am assigning this to Dmitry Shulyak, who will add a call to dockerctl post_start_hooks $container to ensure this purge command gets run.
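
The proposed change can be sketched as a loop that runs the hook once per started container. This is only an illustration, assuming the upgrade tool shells out to dockerctl; the function name, the injectable `run` parameter, and the container names in the usage example are hypothetical.

```python
import subprocess


def run_post_start_hooks(containers, run=subprocess.check_call):
    """Run `dockerctl post_start_hooks <container>` for each container.

    `run` is injectable so the function can be exercised without dockerctl;
    with the default, a non-zero exit status raises CalledProcessError.
    """
    for container in containers:
        run(["dockerctl", "post_start_hooks", container])


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    run_post_start_hooks(["postgres", "nailgun", "nginx"], run=print)
```

The hook has to run after the containers are up, since the purge needs the data of the running containers.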

Changed in fuel:
assignee: Vladimir Kuklin (vkuklin) → Dima Shulyak (dshulyak)
Dima Shulyak (dshulyak)
Changed in fuel:
assignee: Dima Shulyak (dshulyak) → Igor Kalnitsky (ikalnitsky)
Dima Shulyak (dshulyak)
Changed in fuel:
assignee: Igor Kalnitsky (ikalnitsky) → Dima Shulyak (dshulyak)
Mike Scherbakov (mihgen) wrote :

Raised to Critical according to the message from QA (Tatyana):
> it is very important to fix this floating issue, since it very often becomes a blocker for patching

Changed in fuel:
importance: High → Critical
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-web (master)

Fix proposed to branch: master
Review: https://review.openstack.org/113903

Changed in fuel:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-web (master)

Reviewed: https://review.openstack.org/113903
Committed: https://git.openstack.org/cgit/stackforge/fuel-web/commit/?id=58b2ce09edf0ac1af5015630653ed532cd19012d
Submitter: Jenkins
Branch: master

commit 58b2ce09edf0ac1af5015630653ed532cd19012d
Author: Dima Shulyak <email address hidden>
Date: Wed Aug 13 17:08:42 2014 +0300

    Use dockerctl post_start_hooks to clean iptables

    iptables -D itself is buggy and does not always perform cleanup correctly
    also we need to cleanup not only DOCKER chain, but also FORWARD
    this is already done in dockerctl post_start_hooks as solution
    for removing stale iptables rule after upgrade

    We need to perform invocation of this script after
    container is started because dockerctl uses docker inspect to
    fetch relevant data for containers

    Change-Id: Ib568fb14bc78d40c214d922907d7bb3adb213eee
    Closes-Bug: 1349287

Changed in fuel:
status: In Progress → Fix Committed
Andrey Sledzinskiy (asledzinskiy) wrote :

verified on {

    "build_id": "2014-08-21_02-01-17",
    "ostf_sha": "c6ecd0137b5d7c1576fa65baef0fc70f9a150daa",
    "build_number": "464",
    "auth_required": true,
    "api": "1.0",
    "nailgun_sha": "25eba6fbb2047f26d9da4d27ffdb742c9c27832a",
    "production": "docker",
    "fuelmain_sha": "25a0c228d998707f90e90877559f17817a749d2f",
    "astute_sha": "efe3cb3668b9079e68fb1534fd4649ac45a344e1",
    "feature_groups": [
        "mirantis"
    ],
    "release": "5.1",
    "fuellib_sha": "52f3ebfa968f0338e0584edf47cff10911109de5"

}

Changed in fuel:
status: Fix Committed → Fix Released
Tatyanka (tatyana-leontovich) wrote :

reproduced with tarball VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "5.1"
  api: "1.0"
  build_number: "466"
  build_id: "2014-08-21_16-33-53"
  astute_sha: "ac520b09525af4551e730b1c1f78170fefaf3cb8"
  fuellib_sha: "bddba1e854a6b0350e844a0baad50816d3cc8e28"
  ostf_sha: "907f25f8fad39b177bf6a66fba9785afa7dd8008"
  nailgun_sha: "25eba6fbb2047f26d9da4d27ffdb742c9c27832a"
  fuelmain_sha: "2c63c5024f3a97c873628e1c3a6a30861c6086aa"

Changed in fuel:
status: Fix Released → Confirmed
Tatyanka (tatyana-leontovich) wrote :
Ihor Kalnytskyi (ikalnytskyi) wrote :

I think the issue is that we temporarily removed the patched docker (a docker with a fix) from our mirrors, since it led to other bugs. We need to check this assumption.

Dima Shulyak (dshulyak) wrote :

Reassigned to fuel-python.

I won't be able to work on it tomorrow.

Changed in fuel:
assignee: Dima Shulyak (dshulyak) → Fuel Python Team (fuel-python)
Evgeniy L (rustyrobot) wrote :

Asked Tatyana to attach logs.

Evgeniy L (rustyrobot) wrote :

I set Won't Fix for 5.0.2, since that release won't have a new upgrade system. To upgrade stable branches we will use the newer upgrade system from 5.1 or, in the future, from 6.0.

Tatyanka (tatyana-leontovich) wrote :

Downgraded to Medium for 5.1, since it does not happen as often as before.

Changed in fuel:
importance: Critical → Medium
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/116840

Changed in fuel:
assignee: Fuel Python Team (fuel-python) → Matthew Mosesohn (raytrac3r)
status: Confirmed → In Progress
Evgeniy L (rustyrobot) wrote :

I was able to reproduce it several times, with roughly a 35% probability of failure.
I will increase the priority.
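
With an intermittent failure like this, a single clean upgrade says little about whether a fix works. As a rough sketch, assuming each upgrade run fails independently with the ~35% rate quoted above (an assumption, the runs may well be correlated), the number of consecutive clean runs needed before a lucky pass becomes unlikely can be estimated like this. The function name and the thresholds are illustrative.

```python
import math


def clean_runs_needed(failure_rate, max_false_pass):
    """Smallest n such that (1 - failure_rate) ** n <= max_false_pass.

    failure_rate: per-run probability that the bug shows up (0 < p < 1).
    max_false_pass: accepted chance that n clean runs happen by luck
    while the bug is still present (0 < value < 1).
    """
    if not 0 < failure_rate < 1:
        raise ValueError("failure_rate must be between 0 and 1")
    if not 0 < max_false_pass < 1:
        raise ValueError("max_false_pass must be between 0 and 1")
    return int(math.ceil(math.log(max_false_pass) / math.log(1 - failure_rate)))


if __name__ == "__main__":
    print(clean_runs_needed(0.35, 0.05))  # 7, since 0.65 ** 7 is about 0.049
    print(clean_runs_needed(0.35, 0.01))  # 11, since 0.65 ** 11 is about 0.009
```

Under these assumptions, about seven clean upgrades in a row are needed before the chance of a false pass drops below 5%.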

Changed in fuel:
importance: Medium → High
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-web (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/116905

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/116840
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=d9070e21b20b9917efe64f5379300a551e33a1a1
Submitter: Jenkins
Branch: master

commit d9070e21b20b9917efe64f5379300a551e33a1a1
Author: Matthew Mosesohn <email address hidden>
Date: Tue Aug 26 13:48:28 2014 +0400

    Fix post_start_hooks entry point, add purge to setup hook

    post_start_hooks was calling post_setup_hooks,
    which is a misnomer. post_setup_hooks should also
    call purge because it is necessary during upgrades.

    Change-Id: Ic8731f185338b6febd9c543f837ffed2abec80a5
    Partial-Bug: #1349287

OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-web (master)

Reviewed: https://review.openstack.org/116905
Committed: https://git.openstack.org/cgit/stackforge/fuel-web/commit/?id=4e328ea30f12066d63fc2aca44dbe70b14062871
Submitter: Jenkins
Branch: master

commit 4e328ea30f12066d63fc2aca44dbe70b14062871
Author: Evgeniy L <email address hidden>
Date: Tue Aug 26 18:28:19 2014 +0400

    Upgrades, fix iptables cleaning in docker engine

    Rules should be cleaned after we start
    new container via supervisor because
    after containers creation we stop all
    of the containers and start them again
    under supervisor.

    Change-Id: I6272b2fd93f1180cddb15f7a4ceca4d1c55d02b7
    Related-bug: #1349287

Evgeniy L (rustyrobot)
Changed in fuel:
status: In Progress → Fix Committed
Andrey Sledzinskiy (asledzinskiy) wrote :

verified on {

    "build_id": "2014-09-13_14-34-19",
    "ostf_sha": "64cb59c681658a7a55cc2c09d079072a41beb346",
    "build_number": "6",
    "auth_required": true,
    "api": "1.0",
    "nailgun_sha": "b8d8189cc37d6d1b26f4479be6be7313beefb1c8",
    "production": "docker",
    "fuelmain_sha": "d7ed7973034bde73d3f42c000984423b59b2312b",
    "astute_sha": "f5fbd89d1e0e1f22ef9ab2af26da5ffbfbf24b13",
    "feature_groups": [
        "experimental"
    ],
    "release": "5.1",
    "release_versions": {
        "2014.1.1-5.0.1": {
            "VERSION": {
                "build_id": "2014-08-14_19-52-36",
                "mirantis": "yes",
                "build_number": "170",
                "ostf_sha": "09b6bccf7d476771ac859bb3c76c9ebec9da9e1f",
                "nailgun_sha": "af3d1922bfc21345f81be3454115ab6139675c35",
                "production": "docker",
                "api": "1.0",
                "fuelmain_sha": "fd58828f404e4298ed338e8f44c6a326cebd31de",
                "astute_sha": "6db5f5031b74e67b92fcac1f7998eaa296d68025",
                "release": "5.0.1",
                "fuellib_sha": "a31dbac8fff9cf6bc4cd0d23459670e34b27a9ab"
            }
        },
        "2014.1-5.0": {
            "VERSION": {
                "build_id": "2014-05-27_05-51-41",
                "mirantis": "yes",
                "build_number": "26",
                "ostf_sha": "a8b7660082a6f152794c610d6abe30d360fd577d",
                "nailgun_sha": "bd09f89ef56176f64ad5decd4128933c96cb20f4",
                "production": "docker",
                "api": "1.0",
                "fuelmain_sha": "505741e4f431f85a8d0252fc42754d10c0326c1a",
                "astute_sha": "a7eac46348dc77fc2723c6fcc3dbc66cc1a83152",
                "release": "5.0",
                "fuellib_sha": "2f79c0415159651fc1978d99bd791079d1ae4a06"
            }
        },
        "2014.1.1-5.0.2": {
            "VERSION": {
                "build_id": "2014-09-13_13-34-31",
                "ostf_sha": "2969c1ad443b632e815bb1f01149c3800cd7aa3a",
                "build_number": "77",
                "api": "1.0",
                "nailgun_sha": "c18a21381843dffe807b254a4ff96eec259953cb",
                "production": "docker",
                "fuelmain_sha": "23b2e8e3a649392f780ca136201ca6f280cdb600",
                "astute_sha": "6db5f5031b74e67b92fcac1f7998eaa296d68025",
                "feature_groups": [
                    "mirantis"
                ],
                "release": "5.0.2",
                "fuellib_sha": "d3ce50bea9ec91ac932238e676decd4d24bc28e8"
            }
        },
        "2014.1.1-5.1": {
            "VERSION": {
                "build_id": "2014-09-13_14-34-19",
                "ostf_sha": "64cb59c681658a7a55cc2c09d079072a41beb346",
                "build_number": "6",
                "api": "1.0",
                "nailgun_sha": "b8d8189cc37d6d1b26f4479be6be7313beefb1c8",
                "production": "docker",
                "fuelmain_sha": "d7ed7973034bde73d3f42c000984423b59b2312b",
                "astute_sha": "f5fbd89d1e0e1f22ef9ab2af26da5ffbfbf24b13",
                "feature_groups": [
                    "experimental"
                ],
                "r...


Changed in fuel:
status: Fix Committed → Fix Released