It's not possible to move controllers into different racks

Bug #1524320 reported by Aleksandr Didenko
This bug affects 1 person
Affects             Status        Importance  Assigned to          Milestone
Fuel for OpenStack  Fix Released  High        Roman Prykhodchenko
8.0.x               Fix Released  High        Roman Prykhodchenko

Bug Description

Out of the box, Fuel does not support moving controllers into different network node groups (racks). As a plugin developer, however, I want that option because I can handle it within my plugin.

Right now, if controllers are in different nodegroups (for example, when an external LB is used), it's not possible to download/get the network configuration, and thus not possible to deploy such a configuration, because Nailgun throws an error.

Steps to reproduce:
1) Install https://github.com/adidenko/fuel-plugin-external-lb/blob/master/rpms/external_loadbalancer-0.1-0.1.3-1.noarch.rpm?raw=true plugin
2) Create env with 2+ nodegroups and enable plugin
3) Add 2+ nodes from different nodegroups as controllers to the cluster
4) Try to get network configuration via CLI (fuel network --env 1 -d) or via UI

Expected result:
Steps 1-4 work fine, no errors

Actual result:
500 Server Error: Internal Server Error (Node roles [controller] has more than one common node group)

Trace from nailgun app.log:

2015-12-09 11:35:20.635 ERROR [7fbcc775e740] (network_configuration) Serialization failed
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/nailgun/api/v1/handlers/network_configuration.py", line 101, in GET
    return self.serializer.serialize_for_cluster(cluster)
  File "/usr/lib/python2.6/site-packages/nailgun/objects/serializers/network_configuration.py", line 108, in serialize_for_cluster
    result = cls.serialize_net_groups_and_vips(cluster)
  File "/usr/lib/python2.6/site-packages/nailgun/objects/serializers/network_configuration.py", line 50, in serialize_net_groups_and_vips
    net_manager.assign_vips_for_net_groups_for_api(cluster))
  File "/usr/lib/python2.6/site-packages/nailgun/network/manager.py", line 1783, in assign_vips_for_net_groups_for_api
    cluster):
  File "/usr/lib/python2.6/site-packages/nailgun/network/manager.py", line 1899, in _assign_vips_for_net_groups
    objects.Cluster.get_node_group(cluster, noderoles)
  File "/usr/lib/python2.6/site-packages/nailgun/objects/cluster.py", line 898, in get_node_group
    ', '.join(noderoles)))
CanNotFindCommonNodeGroup: Node roles [controller] has more than one common node group

Fuel version info:
  release: "8.0"
  openstack_version: "2015.1.0-8.0"
  api: "1.0"
  build_number: "243"
  build_id: "243"

description: updated
Revision history for this message
Aleksandr Didenko (adidenko) wrote :

It looks like we need to update these lines:

https://github.com/openstack/fuel-web/blob/master/nailgun/nailgun/objects/cluster.py#L895-L898

and return cls.get_default_group(instance) if len(nodegroups) > 1. Otherwise it won't be possible to deploy controllers in different racks.
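
A minimal sketch of that suggestion, not the actual Nailgun code. The function signature, the `(group, roles)` node shape, and the `fallback_to_default` flag are all hypothetical and exist only to show the behaviour difference; only the exception name and message come from the traceback above.

```python
class CanNotFindCommonNodeGroup(Exception):
    """Stand-in for the Nailgun error seen in the traceback."""


def get_common_node_group(nodes, noderoles, default_group,
                          fallback_to_default=False):
    """Pick the node group shared by all nodes holding any of `noderoles`.

    `nodes` is an iterable of (group_name, roles) pairs. This shape is an
    assumption made for the sketch, not Nailgun's data model.
    """
    wanted = set(noderoles)
    groups = {group for group, roles in nodes if wanted & set(roles)}

    if not groups:
        # No node carries these roles yet (assumed behaviour).
        return default_group
    if len(groups) > 1:
        if fallback_to_default:
            # Proposed behaviour: fall back instead of failing.
            return default_group
        raise CanNotFindCommonNodeGroup(
            "Node roles [{0}] has more than one common node group".format(
                ", ".join(noderoles)))
    return groups.pop()


nodes = [
    ("rack1", {"controller"}),
    ("rack2", {"controller"}),
    ("rack1", {"compute"}),
]
print(get_common_node_group(nodes, ["controller"], "default",
                            fallback_to_default=True))
```

With the flag off, the same call raises CanNotFindCommonNodeGroup, which is what currently surfaces as the 500 error.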

Changed in fuel:
status: New → Triaged
tags: added: module-nailgun
Revision history for this message
Ihor Kalnytskyi (ikalnytskyi) wrote :

Alex,

Taking into account our current architecture there's no way to update them. They are by design due to the following reasons:

* It's not obvious in which nodegroup a VIP should be allocated if controller (or other node role) nodes belong to several nodegroups.
* AFAIK, we use OCF scripts for moving VIPs, and they use ARP requests to update VIP-MAC records. That means all nodes must be in the same L2 domain; otherwise it will simply fail.

However, AFAIU, the case you are trying to cover is about an external load balancer and manually specified VIPs. Since that's not implemented in Nailgun yet, we can't fix this issue in the first place.

I propose to move it to 9.0.

Revision history for this message
Aleksandr Didenko (adidenko) wrote :

Igor, we still have to do something about it. If a user accidentally assigns controller roles to nodes from different racks, the UI for that env becomes completely locked, because the backend returns 500 Internal Server Error in response to the network configuration GET request. I think that's wrong and should be fixed.
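
One possible way to avoid the 500, sketched under loud assumptions: the handler catches the known error and returns a client error with a readable message, so the UI can show it instead of locking up. The `handle_get` and `serialize_network_configuration` functions, the dict-based cluster shape, and the choice of status 400 are hypothetical; none of this is taken from Nailgun.

```python
import json


class CanNotFindCommonNodeGroup(Exception):
    """Stand-in for the Nailgun error seen in the traceback."""


def serialize_network_configuration(cluster):
    # Hypothetical serializer: fails when controllers span several groups.
    groups = {node["group"] for node in cluster["nodes"]
              if "controller" in node["roles"]}
    if len(groups) > 1:
        raise CanNotFindCommonNodeGroup(
            "Node roles [controller] has more than one common node group")
    return {"vip_group": next(iter(groups), "default")}


def handle_get(cluster):
    """Return (status, body); a known error maps to 400, not 500."""
    try:
        body = serialize_network_configuration(cluster)
    except CanNotFindCommonNodeGroup as exc:
        return 400, json.dumps({"message": str(exc)})
    return 200, json.dumps(body)


split_cluster = {"nodes": [
    {"group": "rack1", "roles": ["controller"]},
    {"group": "rack2", "roles": ["controller"]},
]}
status, body = handle_get(split_cluster)
print(status, body)
```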

Changed in fuel:
assignee: Fuel Python Team (fuel-python) → Alexander Saprykin (cutwater)
assignee: Alexander Saprykin (cutwater) → Fuel Python Team (fuel-python)
Revision history for this message
Roman Prykhodchenko (romcheg) wrote :

https://review.openstack.org/#/c/257953/ may help to fix getting 500 on GET requests. Could someone please check on that?

Changed in fuel:
assignee: Fuel Python Team (fuel-python) → Krzysztof Szukiełojć (krzysztof+launchpad)
Revision history for this message
Krzysztof Szukiełojć (kszukielojc) wrote :

I'm still getting stacktrace:

2016-01-13 17:03:09.179 ERROR [7f335bb7c880] (network_configuration) Serialization failed
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/nailgun/api/v1/handlers/network_configuration.py", line 101, in GET
    return self.serializer.serialize_for_cluster(cluster)
  File "/usr/lib/python2.7/site-packages/nailgun/objects/serializers/network_configuration.py", line 111, in serialize_for_cluster
    result = cls.serialize_net_groups_and_vips(cluster, allocate_vips)
  File "/usr/lib/python2.7/site-packages/nailgun/objects/serializers/network_configuration.py", line 52, in serialize_net_groups_and_vips
    allocate))
  File "/usr/lib/python2.7/site-packages/nailgun/network/manager.py", line 1846, in assign_vips_for_net_groups_for_api
    for role, vip_info, vip_addr in allocated_vips_data:
  File "/usr/lib/python2.7/site-packages/nailgun/network/manager.py", line 1932, in _get_vips_for_net_groups
    in cls.get_node_groups_info(cluster):
  File "/usr/lib/python2.7/site-packages/nailgun/network/manager.py", line 1965, in get_node_groups_info
    noderoles)
  File "/usr/lib/python2.7/site-packages/nailgun/objects/cluster.py", line 892, in get_common_node_group
    ', '.join(noderoles)))
CanNotFindCommonNodeGroup: Node roles [controller] has more than one common node group

So it still has the same problem.

Changed in fuel:
assignee: Krzysztof Szukiełojć (krzysztof+launchpad) → nobody
Changed in fuel:
assignee: nobody → Fuel Python Team (fuel-python)
Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

The Triaged status assumes that a workaround or a patch has been suggested, but according to the comments above, there are no working solutions.

Changed in fuel:
status: Triaged → Confirmed
Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

Cannot confirm for 9.0, according to Alex Didenko's comment: "Please also note that this solution is needed for 8.0 only. In 9.0 we have a new feature for manual VIP allocation [1]. So in 9.0, if we can't auto-allocate VIPs for some cluster configuration, we can simply ask the user to manually set those problem VIPs or move the roles to the same network node group (rack)."

Revision history for this message
Aleksandr Didenko (adidenko) wrote :

For 9.0 we still need to somehow handle the situation where Nailgun is asked to auto-allocate a VIP from an IP pool, but the roles for that VIP are in different network node groups (so we have different IP pools). Also, 9.0 currently gives a 500 error as well. So we need to fix this bug in both 8.0 and 9.0, but the fix will most likely be different.

Dmitry Pyzhov (dpyzhov)
no longer affects: fuel/mitaka
Changed in fuel:
assignee: Fuel Python Team (fuel-python) → Roman Prykhodchenko (romcheg)
status: Confirmed → In Progress
Revision history for this message
Roman Prykhodchenko (romcheg) wrote :

This patch https://review.openstack.org/#/c/257953/ partially fixes the issue by removing IP allocation from the handler of GET requests.

Revision history for this message
Aleksandr Didenko (adidenko) wrote :

Yes, but we'll still get 500 Internal Server Error on PUT requests, and thus on Deploy Changes, since we still have the same algorithm in the get_common_node_group method in objects/cluster.py, right? So it just moves the problem to another place :)

Revision history for this message
Aleksandr Didenko (adidenko) wrote :

Btw, this method should be reworked/removed/replaced anyway:
https://github.com/openstack/fuel-web/blob/816f237c76ddab9b376d47537dfbc84fb223782a/nailgun/nailgun/objects/cluster.py#L867-L895

We can configure a cluster to use shared public and management networks. In such a case it's absolutely fine to move controllers to different network node groups (racks), and everything will work out of the box in Fuel 8.0+.

Dmitry Pyzhov (dpyzhov)
Changed in fuel:
assignee: Roman Prykhodchenko (romcheg) → Fuel Python Team (fuel-python)
status: In Progress → Confirmed
Changed in fuel:
assignee: Fuel Python Team (fuel-python) → Roman Prykhodchenko (romcheg)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-web (master)

Fix proposed to branch: master
Review: https://review.openstack.org/273169

Changed in fuel:
status: Confirmed → In Progress
summary: - Nailgun throws Internal Server Error if controllers are in different
- nodegroups
+ It's not possible to move controllers into different racks
description: updated
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-web (stable/8.0)

Fix proposed to branch: stable/8.0
Review: https://review.openstack.org/274791

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-web (master)

Fix proposed to branch: master
Review: https://review.openstack.org/274798

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-web (master)

Reviewed: https://review.openstack.org/273169
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=338d62069995275e8dc0cc41187c700a47d03d8e
Submitter: Jenkins
Branch: master

commit 338d62069995275e8dc0cc41187c700a47d03d8e
Author: Roman Prykhodchenko <email address hidden>
Date: Wed Jan 27 17:21:13 2016 +0100

    Allow to clean VIPs from plugins

    This patch modifies merge policy in the way that
    specifying an empty list of VIPs in a plugin will
    remove all VIPs from the network role.

    Closes-bug: #1524320
    Change-Id: I57ba18feb1ee4a4fdb762f8018131c3b08e55fec
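
The merge rule described in the commit message could look roughly like the sketch below. It is an illustration only: the `merge_network_role` helper, the by-name merge for non-empty lists, and the `properties.vip` layout are assumptions, not the code from the patch.

```python
import copy


def merge_network_role(core_role, plugin_role):
    """Merge a plugin's network role definition over the core one.

    Assumed rule set for this sketch:
      * plugin omits `vip`       -> core VIPs are kept
      * plugin gives `vip: []`   -> all VIPs are removed from the role
      * plugin gives VIP entries -> entries are merged by `name`
    """
    merged = copy.deepcopy(core_role)
    props = merged.setdefault("properties", {})
    plugin_props = plugin_role.get("properties", {})

    if "vip" not in plugin_props:
        return merged

    plugin_vips = plugin_props["vip"]
    if not plugin_vips:
        # An explicitly empty list means "clean all VIPs".
        props["vip"] = []
        return merged

    by_name = {vip["name"]: vip for vip in props.get("vip", [])}
    for vip in plugin_vips:
        by_name[vip["name"]] = copy.deepcopy(vip)
    props["vip"] = list(by_name.values())
    return merged


core = {"id": "public/vip",
        "properties": {"vip": [{"name": "public"}, {"name": "vrouter_pub"}]}}
plugin = {"id": "public/vip", "properties": {"vip": []}}
print(merge_network_role(core, plugin)["properties"]["vip"])
```

With no VIPs left on the role, there is nothing for Nailgun to allocate, so the common node group lookup for that role is no longer needed.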

Changed in fuel:
status: In Progress → Fix Committed
Revision history for this message
Roman Prykhodchenko (romcheg) wrote :

The back-port to stable/8.0 is here https://review.openstack.org/#/c/274791/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-web (stable/8.0)

Reviewed: https://review.openstack.org/274791
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=f179f072bd50c2facf62ae842bd035b24deea308
Submitter: Jenkins
Branch: stable/8.0

commit f179f072bd50c2facf62ae842bd035b24deea308
Author: Roman Prykhodchenko <email address hidden>
Date: Wed Jan 27 17:21:13 2016 +0100

    Allow to clean VIPs from plugins

    This patch modifies merge policy in the way that
    specifying an empty list of VIPs in a plugin will
    remove all VIPs from the network role.

    Closes-bug: #1524320
    Change-Id: I57ba18feb1ee4a4fdb762f8018131c3b08e55fec

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-web (master)

Reviewed: https://review.openstack.org/274798
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=77f5eaf896e8ddcef3fb4c4efc855896ea947255
Submitter: Jenkins
Branch: master

commit 77f5eaf896e8ddcef3fb4c4efc855896ea947255
Author: Roman Prykhodchenko <email address hidden>
Date: Mon Feb 1 18:13:17 2016 +0100

    Don't assign VIPs for GET requests

    Since VIPs make no sense before a cluster is deployed
    there's no need to assign them when network configuration
    is generated for GET requests.

    Co-authored by: Maciej Kwiek <email address hidden>
    Co-authored by: Roman Prykhodchenko<email address hidden>

    Change-Id: I382066cc62a9d98f728f5cd5edf771a5a980922f
    Closes-bug: #1504572
    Closes-bug: #1499291
    Partial-bug: #1524320
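
A rough sketch of the idea behind this patch, under stated assumptions: serialization takes a flag and only runs VIP allocation when asked, so a plain GET never reaches the failing code path. The `allocate_vips` flag name appears in the later traceback; everything else here (`assign_vips`, the dict-based cluster, the returned structure) is hypothetical.

```python
class CanNotFindCommonNodeGroup(Exception):
    """Stand-in for the Nailgun error seen in the traceback."""


def assign_vips(cluster):
    # Hypothetical allocator: needs one common group for controllers.
    groups = {node["group"] for node in cluster["nodes"]
              if "controller" in node["roles"]}
    if len(groups) > 1:
        raise CanNotFindCommonNodeGroup(
            "Node roles [controller] has more than one common node group")
    group = next(iter(groups), "default")
    return {"public": {"node_group": group}}


def serialize_for_cluster(cluster, allocate_vips=False):
    """Serialize network config; allocate VIPs only on explicit request."""
    result = {"networks": list(cluster["networks"])}
    if allocate_vips:
        result["vips"] = assign_vips(cluster)
    else:
        # GET path: report only VIPs that were stored earlier, if any.
        result["vips"] = dict(cluster.get("vips", {}))
    return result


cluster = {
    "networks": ["public", "management"],
    "nodes": [
        {"group": "rack1", "roles": ["controller"]},
        {"group": "rack2", "roles": ["controller"]},
    ],
}
print(serialize_for_cluster(cluster))  # GET succeeds without allocation
```

Calling `serialize_for_cluster(cluster, allocate_vips=True)` on the same cluster still raises, which matches the earlier remark that the problem moves to PUT and Deploy Changes rather than going away.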

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-web (stable/8.0)

Fix proposed to branch: stable/8.0
Review: https://review.openstack.org/275234

Andrey Maximov (maximov)
tags: added: hit-hcf
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-web (stable/8.0)

Reviewed: https://review.openstack.org/275234
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=25a85691b60f8ef0140e835926953afa664cf9ba
Submitter: Jenkins
Branch: stable/8.0

commit 25a85691b60f8ef0140e835926953afa664cf9ba
Author: Roman Prykhodchenko <email address hidden>
Date: Mon Feb 1 18:13:17 2016 +0100

    Don't assign VIPs for GET requests

    Since VIPs make no sense before a cluster is deployed
    there's no need to assign them when network configuration
    is generated for GET requests.

    Co-authored by: Maciej Kwiek <email address hidden>
    Co-authored by: Roman Prykhodchenko<email address hidden>

    Change-Id: I382066cc62a9d98f728f5cd5edf771a5a980922f
    Closes-bug: #1504572
    Closes-bug: #1499291
    Partial-bug: #1524320

Revision history for this message
Roman Prykhodchenko (romcheg) wrote :

The patch was actually merged to 8.0 yesterday.

Maksym Strukov (unbelll)
tags: added: on-verification
Maksym Strukov (unbelll)
description: updated
Revision history for this message
Maksym Strukov (unbelll) wrote :

No errors. Verified as fixed in 8.0-518

tags: removed: on-verification
tags: added: on-verification
Revision history for this message
Andrey Lavrentyev (alavrentyev) wrote :

Wasn't able to reproduce it on 9.0 #330

[root@nailgun ~]# shotgun2 short-report
cat /etc/fuel_build_id:
 330
cat /etc/fuel_build_number:
 330
cat /etc/fuel_release:
 9.0
cat /etc/fuel_openstack_version:
 mitaka-9.0
rpm -qa | egrep 'fuel|astute|network-checker|nailgun|packetary|shotgun':
 fuel-release-9.0.0-1.mos6344.noarch
 fuel-provisioning-scripts-9.0.0-1.mos8691.noarch
 network-checker-9.0.0-1.mos72.x86_64
 fuel-openstack-metadata-9.0.0-1.mos8691.noarch
 python-fuelclient-9.0.0-1.mos315.noarch
 fuel-9.0.0-1.mos6344.noarch
 fuel-nailgun-9.0.0-1.mos8691.noarch
 rubygem-astute-9.0.0-1.mos743.noarch
 fuel-library9.0-9.0.0-1.mos8366.noarch
 shotgun-9.0.0-1.mos88.noarch
 fuel-agent-9.0.0-1.mos278.noarch
 fuel-ui-9.0.0-1.mos2689.noarch
 fuel-setup-9.0.0-1.mos6344.noarch
 nailgun-mcagents-9.0.0-1.mos743.noarch
 fuel-misc-9.0.0-1.mos8366.noarch
 python-packetary-9.0.0-1.mos135.noarch
 fuel-bootstrap-cli-9.0.0-1.mos278.noarch
 fuel-migrate-9.0.0-1.mos8366.noarch
 fuel-mirror-9.0.0-1.mos135.noarch
 fuel-notify-9.0.0-1.mos8366.noarch
 fuel-ostf-9.0.0-1.mos934.noarch
 fuelmenu-9.0.0-1.mos270.noarch
 fuel-utils-9.0.0-1.mos8366.noarch

tags: removed: on-verification
Changed in fuel:
status: Fix Committed → Fix Released