Deployment of Ha nova-flat cluster failed with (/Stage[main]/Osnailyfacter::Cluster_ha/Nova_floating_range[10.108.78.128-10.108.78.254]) Could not evaluate: Oops - not sure what happened: 757: unexpected token at '<html><body><h1>504 Gateway Time-out</h1>

Bug #1391180 reported by Andrey Sledzinskiy
42
This bug affects 7 people
Affects Status Importance Assigned to Milestone
Fuel for OpenStack
Fix Released
Critical
Unassigned
5.1.x
Fix Released
Critical
Aleksandr Didenko

Bug Description

{

    "build_id": "2014-11-08_21-30-24",
    "ostf_sha": "9c6fadca272427bb933bc459e14bb1bad7f614aa",
    "build_number": "85",
    "auth_required": true,
    "api": "1.0",
    "nailgun_sha": "8330f6221e190db87fc5baa735fa719c85a2e02d",
    "production": "docker",
    "fuelmain_sha": "f536d11fb40fed78c26f4d7326c45b0c48f157d8",
    "astute_sha": "3c374c9f7bfbdbcd7ce2f716cd704e3044e6fb41",
    "feature_groups": [
        "mirantis"
    ],
    "release": "6.0",
    "release_versions": {
        "2014.2-6.0": {
            "VERSION": {
                "build_id": "2014-11-08_21-30-24",
                "ostf_sha": "9c6fadca272427bb933bc459e14bb1bad7f614aa",
                "build_number": "85",
                "api": "1.0",
                "nailgun_sha": "8330f6221e190db87fc5baa735fa719c85a2e02d",
                "production": "docker",
                "fuelmain_sha": "f536d11fb40fed78c26f4d7326c45b0c48f157d8",
                "astute_sha": "3c374c9f7bfbdbcd7ce2f716cd704e3044e6fb41",
                "feature_groups": [
                    "mirantis"
                ],
                "release": "6.0",
                "fuellib_sha": "c7b71bd1ee939b5a634715ac7b13c2936ad93a5e"
            }
        }
    },
    "fuellib_sha": "c7b71bd1ee939b5a634715ac7b13c2936ad93a5e"

}

Steps:
1. Create next cluster - CentOS, HA, flat nova-network, 3 controller, 2 compute nodes
2. Deploy cluster

Expected - cluster is deployed successfully
Actual - deployment failed with errors in puppet log on node-2
2014-11-10 07:22:03 ERR

 (/Stage[main]/Osnailyfacter::Cluster_ha/Nova_floating_range[10.108.78.128-10.108.78.254]) /etc/puppet/modules/nova/lib/puppet/provider/nova_floating_range/nova_manage.rb:15:in `exists?'

2014-11-10 07:22:03 ERR

 (/Stage[main]/Osnailyfacter::Cluster_ha/Nova_floating_range[10.108.78.128-10.108.78.254]) /etc/puppet/modules/nova/lib/puppet/provider/nova_floating_range/nova_manage.rb:49:in `operate_range'

2014-11-10 07:22:03 ERR

 (/Stage[main]/Osnailyfacter::Cluster_ha/Nova_floating_range[10.108.78.128-10.108.78.254]) /usr/lib/ruby/gems/1.8/gems/openstack-1.1.2/lib/openstack/compute/connection.rb:475:in `get_floating_ips_bulk'

2014-11-10 07:22:03 ERR

 (/Stage[main]/Osnailyfacter::Cluster_ha/Nova_floating_range[10.108.78.128-10.108.78.254]) /usr/lib/ruby/gems/1.8/gems/openstack-1.1.2/lib/openstack/compute/connection.rb:502:in `check_extension'

2014-11-10 07:22:03 ERR

 (/Stage[main]/Osnailyfacter::Cluster_ha/Nova_floating_range[10.108.78.128-10.108.78.254]) /usr/lib/ruby/gems/1.8/gems/openstack-1.1.2/lib/openstack/compute/connection.rb:242:in `api_extensions'

2014-11-10 07:22:03 ERR

 (/Stage[main]/Osnailyfacter::Cluster_ha/Nova_floating_range[10.108.78.128-10.108.78.254]) /usr/lib/ruby/gems/1.8/gems/openstack-1.1.2/lib/openstack/connection.rb:207:in `req'

2014-11-10 07:22:03 ERR

 (/Stage[main]/Osnailyfacter::Cluster_ha/Nova_floating_range[10.108.78.128-10.108.78.254]) /usr/lib/ruby/gems/1.8/gems/openstack-1.1.2/lib/openstack/connection.rb:510:in `raise_exception'

2014-11-10 07:22:03 ERR

 (/Stage[main]/Osnailyfacter::Cluster_ha/Nova_floating_range[10.108.78.128-10.108.78.254]) /usr/lib/ruby/gems/1.8/gems/openstack-1.1.2/lib/openstack/connection.rb:533:in `deal_with_faulty_error'

2014-11-10 07:22:03 ERR

 (/Stage[main]/Osnailyfacter::Cluster_ha/Nova_floating_range[10.108.78.128-10.108.78.254]) '

2014-11-10 07:22:03 ERR

 (/Stage[main]/Osnailyfacter::Cluster_ha/Nova_floating_range[10.108.78.128-10.108.78.254]) </body></html>

2014-11-10 07:22:03 ERR

 (/Stage[main]/Osnailyfacter::Cluster_ha/Nova_floating_range[10.108.78.128-10.108.78.254]) The server didn't respond in time.

2014-11-10 07:22:03 ERR

 (/Stage[main]/Osnailyfacter::Cluster_ha/Nova_floating_range[10.108.78.128-10.108.78.254]) Could not evaluate: Oops - not sure what happened: 757: unexpected token at '<html><body><h1>504 Gateway Time-out</h1>

Logs are attached

Revision history for this message
Andrey Sledzinskiy (asledzinskiy) wrote :
Changed in fuel:
milestone: none → 6.0
Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Aleksandr Didenko (adidenko)
Revision history for this message
Aleksandr Didenko (adidenko) wrote :
Download full text (3.9 KiB)

It looks like the problem is caused by Haproxy. When it starts, it shows all backends as "UP", then it runs health-checks and marks problem ones as "DOWN".

Here is an easy experiment on how to reproduce - go to any controller in your HA env, edit /etc/haproxy/conf.d/170-nova-novncproxy.cfg file and change node-* IPs to some non existent ones. Then kill haproxy (it will be started automatically by pacemaker) and run this command:

while : ; do date ; echo show stat | socat unix-connect:///var/lib/haproxy/stats stdio | grep 'nova-novncproxy,BACKEND'; sleep 1 ; don

Tue Nov 11 11:09:35 UTC 2014
2014/11/11 11:09:35 socat[25769] E connect(3, AF=1 "///var/lib/haproxy/stats", 26): Connection refused
Tue Nov 11 11:09:36 UTC 2014
nova-novncproxy,BACKEND,0,0,0,0,800,0,0,0,0,0,,0,0,0,0,UP,3,3,0,,0,0,0,,1,18,0,,0,,1,0,,0,,,,0,0,0,0,0,0,,,,,0,0,0,0,0,0,-1,,,0,0,0,0,
Tue Nov 11 11:09:37 UTC 2014
nova-novncproxy,BACKEND,0,0,0,0,800,0,0,0,0,0,,0,0,0,0,UP,3,3,0,,0,1,0,,1,18,0,,0,,1,0,,0,,,,0,0,0,0,0,0,,,,,0,0,0,0,0,0,-1,,,0,0,0,0,
Tue Nov 11 11:09:38 UTC 2014
nova-novncproxy,BACKEND,0,0,0,0,800,0,0,0,0,0,,0,0,0,0,UP,3,3,0,,0,2,0,,1,18,0,,0,,1,0,,0,,,,0,0,0,0,0,0,,,,,0,0,0,0,0,0,-1,,,0,0,0,0,
Tue Nov 11 11:09:39 UTC 2014
nova-novncproxy,BACKEND,0,0,0,0,800,0,0,0,0,0,,0,0,0,0,UP,3,3,0,,0,3,0,,1,18,0,,0,,1,0,,0,,,,0,0,0,0,0,0,,,,,0,0,0,0,0,0,-1,,,0,0,0,0,
Tue Nov 11 11:09:40 UTC 2014
nova-novncproxy,BACKEND,0,0,0,0,800,0,0,0,0,0,,0,0,0,0,DOWN,0,0,0,,1,0,0,,1,18,0,,0,,1,0,,0,,,,0,0,0,0,0,0,,,,,0,0,0,0,0,0,-1,,,0,0,0,0,
Tue Nov 11 11:09:41 UTC 2014
nova-novncproxy,BACKEND,0,0,0,0,800,0,0,0,0,0,,0,0,0,0,DOWN,0,0,0,,1,1,1,,1,18,0,,0,,1,0,,0,,,,0,0,0,0,0,0,,,,,0,0,0,0,0,0,-1,,,0,0,0,0,
Tue Nov 11 11:09:42 UTC 2014
nova-novncproxy,BACKEND,0,0,0,0,800,0,0,0,0,0,,0,0,0,0,DOWN,0,0,0,,1,2,2,,1,18,0,,0,,1,0,,0,,,,0,0,0,0,0,0,,,,,0,0,0,0,0,0,-1,,,0,0,0,0,
Tue Nov 11 11:09:43 UTC 2014
nova-novncproxy,BACKEND,0,0,0,0,800,0,0,0,0,0,,0,0,0,0,DOWN,0,0,0,,1,3,3,,1,18,0,,0,,1,0,,0,,,,0,0,0,0,0,0,,,,,0,0,0,0,0,0,-1,,,0,0,0,0,

As you can see haproxy marks backend as UP at first and then marks it DOWN.

The same thing happens with our "wait-for-haproxy-keystone-backend" exec. In this particular case the hronology was:

2014-11-10T07:20:53.314581+00:00 notice: Proxy keystone-1 started.
2014-11-10T07:20:53.314581+00:00 notice: Proxy keystone-2 started.

*** keystone backend is UP according to haproxy ^^

2014-11-10T07:20:53.597929+00:00 notice: (/Stage[main]/Osnailyfacter::Cluster_ha/Exec[wait-for-haproxy-keystone-backend]/returns) executed successfully

*** puppet thinks keystone backend is ready ^^

2014-11-10T07:20:53.601276+00:00 info: (/Stage[main]/Osnailyfacter::Cluster_ha/Nova_floating_range[10.108.78.128-10.108.78.254]) Starting to evaluate the resource

*** puppet tries to evaluate Nova_floating_range but keystone is down ^^

2014-11-10T07:20:55.481416+00:00 alert: Server keystone-1/node-4 is DOWN, reason: Layer4 timeout, check duration: 2001ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
2014-11-10T07:20:55.522536+00:00 alert: Server keystone-1/node-5 is DOWN, reason: Layer4 timeout, check duration: 2000ms. 1 active and 0 b...

Read more...

Changed in fuel:
status: New → Triaged
importance: High → Critical
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-library (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/133686

Revision history for this message
Aleksandr Didenko (adidenko) wrote :

Here is how it looks right after haproxy start:

nova-novncproxy,FRONTEND,,,0,0,8000,0,0,0,0,0,0,,,,,OPEN,,,,,,,,,1,18,0,,,,0,0,0,0,,,,0,0,0,0,0,0,,0,0,0,,,0,0,0,0,,,,,,,,
nova-novncproxy,node-1,0,0,0,0,,0,0,0,,0,,0,0,0,0,UP 1/3,1,1,0,0,0,4,0,,1,18,1,,0,,2,0,,0,INI,,,0,0,0,0,0,0,0,,,,0,0,,,,,-1,,,0,0,0,0,
nova-novncproxy,node-3,0,0,0,0,,0,0,0,,0,,0,0,0,0,UP 1/3,1,1,0,0,0,4,0,,1,18,2,,0,,2,0,,0,INI,,,0,0,0,0,0,0,0,,,,0,0,,,,,-1,,,0,0,0,0,
nova-novncproxy,node-5,0,0,0,0,,0,0,0,,0,,0,0,0,0,UP 1/3,1,1,0,0,0,4,0,,1,18,3,,0,,2,0,,0,INI,,,0,0,0,0,0,0,0,,,,0,0,,,,,-1,,,0,0,0,0,
nova-novncproxy,BACKEND,0,0,0,0,800,0,0,0,0,0,,0,0,0,0,UP,3,3,0,,0,4,0,,1,18,0,,0,,1,0,,0,,,,0,0,0,0,0,0,,,,,0,0,0,0,0,0,-1,,,0,0,0,0,

Then haproxy marks backend as DOWN:

nova-novncproxy,FRONTEND,,,0,0,8000,0,0,0,0,0,0,,,,,OPEN,,,,,,,,,1,18,0,,,,0,0,0,0,,,,0,0,0,0,0,0,,0,0,0,,,0,0,0,0,,,,,,,,
nova-novncproxy,node-1,0,0,0,0,,0,0,0,,0,,0,0,0,0,DOWN,1,1,0,1,1,2,2,,1,18,1,,0,,2,0,,0,L4TOUT,,2000,0,0,0,0,0,0,0,,,,0,0,,,,,-1,,,0,0,0,0,
nova-novncproxy,node-3,0,0,0,0,,0,0,0,,0,,0,0,0,0,DOWN,1,1,0,1,1,2,2,,1,18,2,,0,,2,0,,0,L4TOUT,,2001,0,0,0,0,0,0,0,,,,0,0,,,,,-1,,,0,0,0,0,
nova-novncproxy,node-5,0,0,0,0,,0,0,0,,0,,0,0,0,0,DOWN,1,1,0,1,1,2,2,,1,18,3,,0,,2,0,,0,L4TOUT,,2001,0,0,0,0,0,0,0,,,,0,0,,,,,-1,,,0,0,0,0,
nova-novncproxy,BACKEND,0,0,0,0,800,0,0,0,0,0,,0,0,0,0,DOWN,0,0,0,,1,2,2,,1,18,0,,0,,1,0,,0,,,,0,0,0,0,0,0,,,,,0,0,0,0,0,0,-1,,,0,0,0,0,

Changed in fuel:
status: Triaged → In Progress
Changed in fuel:
importance: Critical → High
Revision history for this message
Dmitry Borodaenko (angdraug) wrote :

Priority upgraded back to Critical since it affects Fuel CI in 5.1.1:
https://review.openstack.org/#/c/134677/

Changed in fuel:
importance: High → Critical
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/133686
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=360ef59e8f828243304a669328257be8504b8df6
Submitter: Jenkins
Branch: master

commit 360ef59e8f828243304a669328257be8504b8df6
Author: Aleksandr Didenko <email address hidden>
Date: Tue Nov 11 14:53:35 2014 +0200

    Improve wait-for-* execs

    Right after start haproxy marks all backends as UP. It marks
    problem ones as DOWN only after health-checks. So we should not
    rely on BACKEND status while INI is in process.

    Change-Id: Ia3743d446826ee1a9d42799a83e14a911a577966
    Related-bug: #1391180

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-library (stable/5.1)

Related fix proposed to branch: stable/5.1
Review: https://review.openstack.org/134920

Changed in fuel:
status: In Progress → Fix Committed
Revision history for this message
Vladimir Kuklin (vkuklin) wrote :

Fix for 6.0 is not yet committed as this error happens due to memcached issue, for which we occasionally did not merge the patch into our 6.0 packages.

Changed in fuel:
status: Fix Committed → In Progress
assignee: Aleksandr Didenko (adidenko) → Fuel OSCI Team (fuel-osci)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-library (stable/5.1)

Reviewed: https://review.openstack.org/134920
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=add3fdd3e2af57b20dbb73a6bc53a9ccc4701c9a
Submitter: Jenkins
Branch: stable/5.1

commit add3fdd3e2af57b20dbb73a6bc53a9ccc4701c9a
Author: Aleksandr Didenko <email address hidden>
Date: Tue Nov 11 14:53:35 2014 +0200

    Improve wait-for-* execs

    Right after start haproxy marks all backends as UP. It marks
    problem ones as DOWN only after health-checks. So we should not
    rely on BACKEND status while INI is in process.

    Change-Id: Ia3743d446826ee1a9d42799a83e14a911a577966
    Related-bug: #1391180
    (cherry picked from commit 360ef59e8f828243304a669328257be8504b8df6)

Changed in fuel:
status: In Progress → Fix Committed
Revision history for this message
Mykhaylo Slobodyan (mykhaylo-slobodyan) wrote :

Verified on 5.1.1 ISO #18

{"build_id": "2014-11-17_12-14-23", "ostf_sha": "64cb59c681658a7a55cc2c09d079072a41beb346", "build_number": "18", "auth_required": true, "api": "1.0", "nailgun_sha": "2fcab95dc43a248ba867065e96ab764ee73882d1", "production": "docker", "fuelmain_sha": "ff22ca819e6eb7c63b6d7978fdd80ef9b84457d9", "astute_sha": "702af3db6f5bca92525bc8322d7d5d7675ec857e", "feature_groups": ["mirantis"], "release": "5.1.1", "release_versions": {"2014.1.3-5.1.1": {"VERSION": {"build_id": "2014-11-17_12-14-23", "ostf_sha": "64cb59c681658a7a55cc2c09d079072a41beb346", "build_number": "18", "api": "1.0", "nailgun_sha": "2fcab95dc43a248ba867065e96ab764ee73882d1", "production": "docker", "fuelmain_sha": "ff22ca819e6eb7c63b6d7978fdd80ef9b84457d9", "astute_sha": "702af3db6f5bca92525bc8322d7d5d7675ec857e", "feature_groups": ["mirantis"], "release": "5.1.1", "fuellib_sha": "0d3909b9a291880af28dbe48b9c7d25215aa98ea"}}}, "fuellib_sha": "0d3909b9a291880af28dbe48b9c7d25215aa98ea"}

Revision history for this message
Anastasia Palkina (apalkina) wrote :

Verified on ISO #114 for 6.0

"build_id": "2014-11-18_22-00-23", "ostf_sha": "82465a94eed4eff1fc8d8e1f2fb7e9993c22f068", "build_number": "114", "auth_required": true, "api": "1.0", "nailgun_sha": "b0add09c4361fee8fc70637c9a6ef42fbe738abe", "production": "docker", "fuelmain_sha": "e556f0e1b00c30ec5c4b374ca2878c047c8686c2", "astute_sha": "65eb911c38afc0e23d187772f9a05f703c685896", "feature_groups": ["mirantis"], "release": "6.0", "release_versions": {"2014.2-6.0": {"VERSION": {"build_id": "2014-11-18_22-00-23", "ostf_sha": "82465a94eed4eff1fc8d8e1f2fb7e9993c22f068", "build_number": "114", "api": "1.0", "nailgun_sha": "b0add09c4361fee8fc70637c9a6ef42fbe738abe", "production": "docker", "fuelmain_sha": "e556f0e1b00c30ec5c4b374ca2878c047c8686c2", "astute_sha": "65eb911c38afc0e23d187772f9a05f703c685896", "feature_groups": ["mirantis"], "release": "6.0", "fuellib_sha": "5a5275370b33ab3b9a403728a1c7ad173289e4a0"}}}, "fuellib_sha": "5a5275370b33ab3b9a403728a1c7ad173289e4a0"

Changed in fuel:
status: Fix Committed → Fix Released
Curtis Hovey (sinzui)
Changed in fuel:
assignee: Registry Administrators (registry) → nobody
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.