deploying single-controller-ha doesn't generate primary-controller in hiera

Bug #1419201 reported by Andrew Woodward
This bug affects 1 person
Affects             Status     Importance  Assigned to               Milestone
Fuel for OpenStack  Triaged    Medium      Fuel Sustaining
Mitaka              Won't Fix  Medium      Fuel Python (Deprecated)
Newton              Triaged    Medium      Fuel Sustaining

Bug Description

{"build_id": "2015-02-05_20-00-43", "ostf_sha": "6c046b69d29021524906109f18092363505ee222", "build_number": "100", "release_versions": {"2014.2-6.1": {"VERSION": {"build_id": "2015-02-05_20-00-43", "ostf_sha": "6c046b69d29021524906109f18092363505ee222", "build_number": "100", "api": "1.0", "nailgun_sha": "d6c6d63600e3b36606be332b23a0c5490a00fbcf", "production": "docker", "python-fuelclient_sha": "521c2491f7f04f31d8c85db68499cd193d4904e3", "astute_sha": "cf25925680814745facc7ffaf1e0b08eed6f9cb5", "feature_groups": ["mirantis"], "release": "6.1", "fuelmain_sha": "", "fuellib_sha": "bc99ea769cd67121f91f49c48dffca58e3f53fdf"}}}, "auth_required": true, "api": "1.0", "nailgun_sha": "d6c6d63600e3b36606be332b23a0c5490a00fbcf", "production": "docker", "python-fuelclient_sha": "521c2491f7f04f31d8c85db68499cd193d4904e3", "astute_sha": "cf25925680814745facc7ffaf1e0b08eed6f9cb5", "feature_groups": ["mirantis"], "release": "6.1", "fuelmain_sha": "", "fuellib_sha": "bc99ea769cd67121f91f49c48dffca58e3f53fdf"}

When deploying with a single controller, compute roles fail while attempting to resolve the primary-controller:

https://github.com/stackforge/fuel-library/blob/master/deployment/puppet/osnailyfacter/modular/controller.pp#L184-187
https://github.com/stackforge/fuel-library/blob/master/deployment/puppet/osnailyfacter/modular/compute.pp#L184-187

While diagnosing, I found that the nodes filter looks at the nodes hash:

https://github.com/stackforge/fuel-library/blob/master/deployment/puppet/osnailyfacter/modular/controller.pp#L26

Checking the output from hiera:

root@node-2:~# hiera nodes
[{"swift_zone"=>"2",
  "storage_netmask"=>"255.255.255.0",
  "internal_address"=>"192.168.0.3",
  "fqdn"=>"node-2.test.domain.local",
  "name"=>"node-2",
  "role"=>"compute",
  "storage_address"=>"192.168.1.2",
  "uid"=>"2",
  "internal_netmask"=>"255.255.255.0",
  "user_node_name"=>"Untitled (fd:6c)"},
 {"swift_zone"=>"3",
  "public_netmask"=>"255.255.255.0",
  "storage_netmask"=>"255.255.255.0",
  "public_address"=>"172.16.0.4",
  "internal_address"=>"192.168.0.4",
  "fqdn"=>"node-3.test.domain.local",
  "name"=>"node-3",
  "role"=>"controller",
  "storage_address"=>"192.168.1.3",
  "uid"=>"3",
  "internal_netmask"=>"255.255.255.0",
  "user_node_name"=>"Untitled (f0:f8)"}]

There is only one controller and no primary-controller, so the filters that look for primary-controller fail.
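
The failing lookup can be illustrated with a minimal Python sketch. This is not the Puppet code from fuel-library: `filter_nodes` is a hypothetical helper, and the node entries are trimmed copies of the hiera output above.

```python
# Trimmed copy of the `hiera nodes` output above (only the keys used here).
nodes = [
    {"uid": "2", "name": "node-2", "role": "compute",
     "internal_address": "192.168.0.3"},
    {"uid": "3", "name": "node-3", "role": "controller",
     "internal_address": "192.168.0.4"},
]


def filter_nodes(nodes, key, value):
    """Hypothetical helper: return the nodes whose `key` equals `value`."""
    return [node for node in nodes if node.get(key) == value]


primary = filter_nodes(nodes, "role", "primary-controller")
controllers = filter_nodes(nodes, "role", "controller")

# No node carries the primary-controller role, so `primary` is empty and
# anything that indexes into it (for example primary[0]) would fail.
print(len(primary), len(controllers))
```

With the node list from this report, the primary-controller filter returns nothing while the plain controller filter returns node-3.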

adding another controller, and re-running hiera task shows there is now a controller, and primary controller

root@node-2:~# hiera nodes
[{"role"=>"controller",
  "storage_netmask"=>"255.255.255.0",
  "user_node_name"=>"Untitled (ff:32)",
  "public_address"=>"172.16.0.3",
  "uid"=>"1",
  "internal_netmask"=>"255.255.255.0",
  "public_netmask"=>"255.255.255.0",
  "internal_address"=>"192.168.0.2",
  "name"=>"node-1",
  "fqdn"=>"node-1.test.domain.local",
  "storage_address"=>"192.168.1.1",
  "swift_zone"=>"1"},
 {"role"=>"compute",
  "storage_netmask"=>"255.255.255.0",
  "user_node_name"=>"Untitled (fd:6c)",
  "uid"=>"2",
  "internal_netmask"=>"255.255.255.0",
  "internal_address"=>"192.168.0.3",
  "name"=>"node-2",
  "fqdn"=>"node-2.test.domain.local",
  "storage_address"=>"192.168.1.2",
  "swift_zone"=>"2"},
 {"role"=>"controller",
  "storage_netmask"=>"255.255.255.0",
  "user_node_name"=>"Untitled (f0:f8)",
  "public_address"=>"172.16.0.4",
  "uid"=>"3",
  "internal_netmask"=>"255.255.255.0",
  "public_netmask"=>"255.255.255.0",
  "internal_address"=>"192.168.0.4",
  "name"=>"node-3",
  "fqdn"=>"node-3.test.domain.local",
  "storage_address"=>"192.168.1.3",
  "swift_zone"=>"3"}]

Looking at /etc/astute.yaml, we see that the nodes array doesn't have any node set to primary-controller, as it did before, so some magic must be happening with hiera.

Steps to quickly reproduce:

fuel env --create --name test --rel 2 --net neutron --nst vlan --mode ha
fuel --env 1 node set --node 1 --role controller
fuel --env 1 node set --node 2 --role compute
fuel --env 1 node --node 2 --provision
fuel --env 1 node --node 2 --tasks hiera
ssh node-2 -C 'hiera nodes'
fuel --env 1 node set --node 3 --role controller
fuel --env 1 node --node 2 --tasks hiera
ssh node-2 -C 'hiera nodes'

Changed in fuel:
assignee: nobody → Aleksandr Didenko (adidenko)
Aleksandr Didenko (adidenko) wrote :

> adding another controller, and re-running hiera task shows there is now a controller, and primary controller

Andrew, I don't see any 'primary-controller' role in the node list you provided. I see only 2 'controller' and 1 'compute' roles.

I've tried to reproduce this issue on the latest ISO #105

    "api": "1.0",
    "astute_sha": "7e6e6f9188bd69c603853b10d4a55149363323cc",
    "auth_required": true,
    "build_id": "2015-02-07_22-55-01",
    "build_number": "105",
    "feature_groups": [
        "mirantis"
    ],
    "fuellib_sha": "769af7fe30225cd15638ea2e6dffaa286bc06da1",
    "fuelmain_sha": "",
    "nailgun_sha": "6d1769b21819f8fb4195f1bd9c44c038721ae3d4",
    "ostf_sha": "6c046b69d29021524906109f18092363505ee222",
    "production": "docker",
    "python-fuelclient_sha": "521c2491f7f04f31d8c85db68499cd193d4904e3",
    "release": "6.1",

Commands:
# fuel env --create --name test --rel 2 --net neutron --nst vlan --mode ha
# fuel --env 2 node set --node 1 --role controller
# fuel --env 2 node set --node 3 --role compute
# fuel deployment --default --env 2
# grep ^role deployment_2/*yaml
deployment_2/compute_3.yaml:role: compute
deployment_2/primary-controller_1.yaml:role: primary-controller

Roles look good. Let's check hiera.

# fuel node --provision --env 2 --node-id 3
# fuel --env 2 node --node 3 --tasks hiera
# ssh node-3 -C 'hiera nodes'
Warning: Permanently added 'node-3' (RSA) to the list of known hosts.
[{"user_node_name"=>"Untitled (73:94)",
  "fqdn"=>"node-1.test.domain.local",
  "swift_zone"=>"1",
  "uid"=>"1",
  "storage_netmask"=>"255.255.255.0",
  "name"=>"node-1",
  "storage_address"=>"192.168.1.1",
  "public_address"=>"172.16.0.3",
  "internal_netmask"=>"255.255.255.0",
  "public_netmask"=>"255.255.255.0",
  "role"=>"primary-controller",
  "internal_address"=>"192.168.0.2"},
 {"user_node_name"=>"Untitled (3a:81)",
  "fqdn"=>"node-3.test.domain.local",
  "swift_zone"=>"3",
  "uid"=>"3",
  "storage_netmask"=>"255.255.255.0",
  "name"=>"node-3",
  "storage_address"=>"192.168.1.2",
  "internal_netmask"=>"255.255.255.0",
  "role"=>"compute",
  "internal_address"=>"192.168.0.3"}]

# ssh node-3 -C 'hiera nodes | grep role'
Warning: Permanently added 'node-3' (RSA) to the list of known hosts.
  "role"=>"primary-controller",
  "role"=>"compute",

So everything looks good.
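
The same role check can be scripted rather than read off the grep output. Below is a minimal sketch, assuming the `hiera nodes` output has already been captured as text; `extract_roles` is a hypothetical helper, and the sample is a trimmed copy of the output above.

```python
import re

# Trimmed copy of the `hiera nodes` output above.
hiera_output = '''[{"user_node_name"=>"Untitled (73:94)",
  "uid"=>"1",
  "role"=>"primary-controller",
  "internal_address"=>"192.168.0.2"},
 {"user_node_name"=>"Untitled (3a:81)",
  "uid"=>"3",
  "role"=>"compute",
  "internal_address"=>"192.168.0.3"}]'''


def extract_roles(text):
    """Hypothetical helper: collect every "role" value, in order."""
    return re.findall(r'"role"=>"([^"]+)"', text)


roles = extract_roles(hiera_output)
print(roles)
print("primary-controller" in roles)
```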

Changed in fuel:
status: Confirmed → Incomplete
Aleksandr Didenko (adidenko) wrote :

Tested the same on ISO #100 and got the same results:

# grep ^role deployment_1/*yaml
deployment_1/compute_2.yaml:role: compute
deployment_1/primary-controller_1.yaml:role: primary-controller

# ssh node-2 -C 'hiera nodes'
Warning: Permanently added 'node-2' (RSA) to the list of known hosts.
[{"fqdn"=>"node-1.test.domain.local",
  "storage_netmask"=>"255.255.255.0",
  "public_address"=>"172.16.0.2",
  "internal_address"=>"192.168.0.1",
  "swift_zone"=>"1",
  "user_node_name"=>"Untitled (8c:5c)",
  "role"=>"primary-controller",
  "public_netmask"=>"255.255.255.0",
  "internal_netmask"=>"255.255.255.0",
  "uid"=>"1",
  "storage_address"=>"192.168.1.1",
  "name"=>"node-1"},
 {"fqdn"=>"node-2.test.domain.local",
  "storage_netmask"=>"255.255.255.0",
  "internal_address"=>"192.168.0.2",
  "swift_zone"=>"2",
  "user_node_name"=>"Untitled (d6:45)",
  "role"=>"compute",
  "internal_netmask"=>"255.255.255.0",
  "uid"=>"2",
  "storage_address"=>"192.168.1.2",
  "name"=>"node-2"}]

# ssh node-2 -C 'hiera nodes | grep role'
Warning: Permanently added 'node-2' (RSA) to the list of known hosts.
  "role"=>"primary-controller",
  "role"=>"compute",

So I'm marking this bug as invalid since I was not able to reproduce it on ISOs #100 and #105.

Changed in fuel:
status: Incomplete → Invalid
Andrew Woodward (xarses) wrote :

reproduced on:

{"build_id": "2015-02-05_22-55-01", "ostf_sha": "6c046b69d29021524906109f18092363505ee222", "build_number": "101", "release_versions": {"2014.2-6.1": {"VERSION": {"build_id": "2015-02-05_22-55-01", "ostf_sha": "6c046b69d29021524906109f18092363505ee222", "build_number": "101", "api": "1.0", "nailgun_sha": "d6c6d63600e3b36606be332b23a0c5490a00fbcf", "production": "docker", "python-fuelclient_sha": "521c2491f7f04f31d8c85db68499cd193d4904e3", "astute_sha": "cf25925680814745facc7ffaf1e0b08eed6f9cb5", "feature_groups": ["mirantis"], "release": "6.1", "fuelmain_sha": "", "fuellib_sha": "bc99ea769cd67121f91f49c48dffca58e3f53fdf"}}}, "auth_required": true, "api": "1.0", "nailgun_sha": "d6c6d63600e3b36606be332b23a0c5490a00fbcf", "production": "docker", "python-fuelclient_sha": "521c2491f7f04f31d8c85db68499cd193d4904e3", "astute_sha": "cf25925680814745facc7ffaf1e0b08eed6f9cb5", "feature_groups": ["mirantis"], "release": "6.1", "fuelmain_sha": "", "fuellib_sha": "bc99ea769cd67121f91f49c48dffca58e3f53fdf"}

[root@nailgun ~]# fuel node
id | status | name | cluster | ip | mac | roles | pending_roles | online | group_id
---|----------|------------------|---------|------------|-------------------|-------|---------------|--------|---------
2 | discover | Untitled (63:77) | None | 10.110.0.5 | 64:00:61:db:63:77 | | | True | None
1 | discover | Untitled (4a:ff) | None | 10.110.0.3 | 64:ae:59:a6:4a:ff | | | True | None
3 | discover | Untitled (32:a9) | None | 10.110.0.4 | 64:d6:8a:19:32:a9 | | | True | None
[root@nailgun ~]# fuel env
id | status | name | mode | release_id | changes | pending_release_id
---|--------|------|------|------------|---------|-------------------

[root@nailgun ~]# fuel env --create --name test --rel 2 --net neutron --nst vlan --mode ha
Environment 'test' with id=1, mode=ha_compact and network-mode=neutron was created!
[root@nailgun ~]# fuel --env 1 set node 1 --role controller
id | status | name | cluster | ip | mac | roles | pending_roles | online | group_id
---|--------|------|---------|----|-----|-------|---------------|--------|---------

[root@nailgun ~]# fuel --env 1 node set --node 1 --role controller
Nodes [1] with roles ['controller'] were added to environment 1
[root@nailgun ~]# fuel --env 1 node set --node 2 --role compute
Nodes [2] with roles ['compute'] were added to environment 1
[root@nailgun ~]# fuel --env 2 node --provision --node 2
Started provisioning nodes [2].
[root@nailgun ~]#
[root@nailgun ~]# fuel --env 2 node --node 2 --tasks hiera
Started tasks ['hiera'] for nodes nodes [2].
[root@nailgun ~]# ssh node-2 -C 'hiera nodes | grep role'
Warning: Permanently added 'node-2' (RSA) to the list of known hosts.
  "role"=>"controller",
  "role"=>"compute",
[root@nailgun ~]# fuel --env 1 deployment --default
Default deployment info was downloaded to /root/deployment_1
[root@nailgun ~]# grep role /root/deployment_1/
compute_2.yaml primary-controller_1.yaml
[root@nailgun ~]# grep role /root/deployment_1/compute_2.yaml
  roles:
  role: primary-controller
  role: compute
role: compute
[r...

Changed in fuel:
status: Invalid → Confirmed
Andrew Woodward (xarses) wrote :

Changing to High: this only occurs when the controller isn't included in the first deployment task.

Changed in fuel:
importance: Critical → High
Aleksandr Didenko (adidenko) wrote :

Confirm. Steps to reproduce: http://paste.openstack.org/show/196601/

There are two roles in astute.yaml on compute node:

  role: controller
  role: compute

Btw, if you do the same steps for node-1 (primary-controller), it has accurate roles in astute.yaml:

# ssh node-1 -C 'grep role: /etc/astute.yaml'
Warning: Permanently added 'node-1' (RSA) to the list of known hosts.
role: primary-controller
  role: primary-controller
  role: compute

And then just re-run hiera task on compute again:

fuel --env 1 node --node 2 --end hiera

And check roles:

# ssh node-2 -C 'grep role: /etc/astute.yaml'
Warning: Permanently added 'node-2' (RSA) to the list of known hosts.
role: compute
  role: primary-controller
  role: compute

# ssh node-2 -C 'hiera nodes | grep role'
Warning: Permanently added 'node-2' (RSA) to the list of known hosts.
  "role"=>"primary-controller",
  "role"=>"compute",

Roles are fixed :)

Forwarding it to the python team. Lowering to medium since it does not affect the real deployment, because we always deploy controllers first.

Changed in fuel:
importance: High → Medium
assignee: Aleksandr Didenko (adidenko) → Fuel Python Team (fuel-python)
Dima Shulyak (dshulyak) wrote :

We need to take into account all nodes that are pending when performing the set_primary_roles procedure:

  https://github.com/stackforge/fuel-web/blob/master/nailgun/nailgun/orchestrator/deployment_serializers.py#L1841
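
The idea can be sketched in Python as follows. This is not the nailgun implementation: `serialized_roles`, the dict layout, and the choice of the lowest uid as primary are all assumptions made for illustration. The point is only that pending roles are considered alongside deployed ones.

```python
def serialized_roles(nodes):
    """Hypothetical sketch: map each node uid to its serialized roles.

    Each node is a dict with 'uid', 'roles' (already deployed) and
    'pending_roles' (assigned but not yet deployed). Both lists are
    inspected, so a controller that is still pending can be primary.
    Assumption: the controller with the lowest uid becomes primary.
    """
    result = {}
    primary_chosen = False
    for node in sorted(nodes, key=lambda n: int(n["uid"])):
        for role in node.get("roles", []) + node.get("pending_roles", []):
            if role == "controller" and not primary_chosen:
                role = "primary-controller"
                primary_chosen = True
            result.setdefault(node["uid"], []).append(role)
    return result


# Scenario from this report: the controller is assigned but not deployed yet.
cluster = [
    {"uid": "1", "roles": [], "pending_roles": ["controller"]},
    {"uid": "2", "roles": [], "pending_roles": ["compute"]},
]
print(serialized_roles(cluster))
```

In this sketch the pending controller is still promoted to primary-controller, which is the role the compute node expects to find.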

Changed in fuel:
status: Confirmed → Triaged
Dmitry Pyzhov (dpyzhov)
tags: added: module-serializer
Dmitry Pyzhov (dpyzhov)
tags: added: module-serialization
removed: module-serializer
Dmitry Pyzhov (dpyzhov)
Changed in fuel:
milestone: 6.1 → 7.0
Changed in fuel:
milestone: 7.0 → 6.1
Aleksandr Didenko (adidenko) wrote :

I agree that we can safely lower the priority to medium, since this bug does not occur on real deployments, where all tasks run in the correct order. It can be reproduced only manually, by executing tasks in the wrong order.

Dmitry Pyzhov (dpyzhov)
Changed in fuel:
milestone: 6.1 → 7.0
tags: added: release-notes
Vladimir Sharshov (vsharshov) wrote :

Does not occur on real deployments, where all tasks run in the correct order.

Moving to 8.0. Added release-notes.

Changed in fuel:
status: Triaged → Won't Fix
Dmitry Pyzhov (dpyzhov)
Changed in fuel:
milestone: 7.0 → 8.0
status: Won't Fix → Triaged
no longer affects: fuel/8.0.x
Dmitry Pyzhov (dpyzhov)
tags: added: area-python
Dmitry Pyzhov (dpyzhov)
Changed in fuel:
milestone: 8.0 → 9.0
Fuel Devops McRobotson (fuel-devops-robot) wrote : Related fix proposed to mos/mos-docs (master)

Related fix proposed to branch: master
Change author: Evgeny Konstantinov <email address hidden>
Review: https://review.fuel-infra.org/22317

Fuel Devops McRobotson (fuel-devops-robot) wrote : Related fix merged to mos/mos-docs (master)

Reviewed: https://review.fuel-infra.org/22317
Submitter: Evgeny Konstantinov <email address hidden>
Branch: master

Commit: 3efec3275f772c0c41c01d2a5013c596463a24fa
Author: Evgeny Konstantinov <email address hidden>
Date: Wed Jun 22 12:58:01 2016

Add Fuel known issues to relnotes 9.0

Change-Id: I9130ecc87d013db29e8a170911e59de0631a0222
Related-Bug: #1587897
Related-Bug: #1450100
Related-Bug: #1490597
Related-Bug: #1466431
Related-Bug: #1419201

tags: added: release-notes-done
removed: release-notes