default availability zone is not set based on JUJU_AVAILABILITY_ZONE

Bug #1796068 reported by Dmitrii Shcherbakov
Affects                               Status        Importance  Assigned to          Milestone
OpenStack Neutron API Charm           Fix Released  High        Unassigned
OpenStack Neutron Gateway Charm       Fix Released  Wishlist    Dmitrii Shcherbakov
OpenStack Neutron Open vSwitch Charm  Invalid       Wishlist    Unassigned
OpenStack Nova Compute Charm          Fix Released  Wishlist    Dmitrii Shcherbakov

Bug Description

Currently, default_availability_zone in nova is set to "nova" regardless of the default-availability-zone charm config.

For charm-neutron-openvswitch, the availability_zone config for DHCP agents is set to a value propagated from the nova-compute charm (which is always "nova"). This prevents AZ-aware scheduling of DHCP agents, which is an operator requirement.

Proposed solutions:

1) use JUJU_AVAILABILITY_ZONE for agents configured by neutron-openvswitch charm if set, otherwise use a value propagated from a nova charm config;
2) use JUJU_AVAILABILITY_ZONE for nova-compute charm unless it is overridden by the default-availability-zone charm config.
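
The two precedence rules proposed above could be sketched roughly as follows. This is only an illustration, not charm code: the function names are hypothetical, and treating "overridden" as "the charm option differs from its 'nova' default" is an assumption.

```python
def az_for_neutron_openvswitch(env, nova_relation_az):
    """Proposal 1: prefer JUJU_AVAILABILITY_ZONE, else the value from nova."""
    return env.get('JUJU_AVAILABILITY_ZONE') or nova_relation_az


def az_for_nova_compute(env, charm_config, charm_default='nova'):
    """Proposal 2: use the Juju AZ unless the charm option was overridden.

    Assumption: "overridden" means the option value differs from the
    charm default ('nova').
    """
    configured = charm_config.get('default-availability-zone', charm_default)
    if configured != charm_default:
        return configured  # explicit operator override wins
    return env.get('JUJU_AVAILABILITY_ZONE') or configured
```

In a hook, `env` would typically be `os.environ`; plain dicts are used here so the precedence can be checked in isolation.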

More info:
~~~

There are two AZ-related nova configs:

default_availability_zone (default AZ for a compute host if it has not been added to an aggregate with an assigned AZ)
https://github.com/openstack/nova/blob/stable/queens/nova/conf/availability_zone.py#L30-L39
https://github.com/openstack/nova/blob/stable/queens/nova/availability_zones.py#L71-L100

default_schedule_zone (a default zone used by Nova API if a user request does not contain an AZ)
https://github.com/openstack/nova/blob/stable/queens/nova/conf/availability_zone.py#L43-L59
https://github.com/openstack/nova/blob/stable/queens/nova/compute/api.py#L488-L489
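
To make the difference between the two options concrete, here is a simplified sketch of how each one is applied. It is not nova's code: the function names and the aggregate data shape are hypothetical, and only the fallback behaviour described above is modelled.

```python
def host_availability_zone(host, aggregates, default_availability_zone='nova'):
    """Return the AZ reported for a compute host.

    aggregates: list of dicts with 'hosts' (a set of host names) and
    'availability_zone' (a string, or None if the aggregate has no AZ).
    """
    for aggregate in aggregates:
        if host in aggregate['hosts'] and aggregate.get('availability_zone'):
            return aggregate['availability_zone']
    # The host is in no aggregate with an assigned AZ: use the default.
    return default_availability_zone


def requested_availability_zone(request_az, default_schedule_zone=None):
    """Return the AZ used for a boot request.

    default_schedule_zone only applies when the user did not name an AZ.
    """
    return request_az or default_schedule_zone
```

So default_availability_zone affects what a host reports, while default_schedule_zone affects where an unqualified request is sent.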

Neutron has its own concept of per-agent availability zones, which is important for cases like HA DHCP.

https://blueprints.launchpad.net/neutron/+spec/add-availability-zone (mitaka)
https://review.openstack.org/#/c/246107/

https://github.com/openstack/neutron/blob/stable/queens/neutron/conf/agent/common.py#L142-L147

~~

Charm behavior:

neutron-openvswitch simply receives the default config value from its nova-compute principal:
https://github.com/openstack/charm-nova-compute/blob/stable/18.08/hooks/nova_compute_hooks.py#L442-L445
        'default_availability_zone': config('default-availability-zone')

https://github.com/openstack/charm-neutron-openvswitch/blob/stable/18.08/hooks/neutron_ovs_context.py#L208-L214
                availability_zone = relation_get(
                    'default_availability_zone',

Reviews:

https://review.openstack.org/#/q/topic:bug/1796068+(status:open+OR+status:merged)

information type: Private → Public
James Page (james-page) wrote :

I'd suggest we maintain the existing behaviour as the default for now. Would a toggle on nova-compute and neutron-gateway to use the Juju-provided JUJU_AVAILABILITY_ZONE make sense? A deployment could then turn this feature on. I also think we should tie a unit of n-ovs to the same AZ as its parent n-compute, otherwise people will get in a twist. So something like

   juju config nova-compute use-juju-availability-zones=True

would reconfigure nova-compute with a preset AZ and propagate the same value down to the neutron-openvswitch agents.

neutron-gateway would need the same option.
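
The opt-in toggle described above might look something like the sketch below. The option name use-juju-availability-zones is the one floated in this comment and may not match what was eventually merged; the function names are hypothetical. The relation key default_availability_zone is the one already quoted in the bug description.

```python
def nova_compute_az(charm_config, env):
    """Return the AZ nova-compute should configure.

    Existing behaviour stays the default; the Juju-provided AZ is used
    only when the (proposed) toggle is switched on and Juju set a value.
    """
    if charm_config.get('use-juju-availability-zones'):
        juju_az = env.get('JUJU_AVAILABILITY_ZONE')
        if juju_az:
            return juju_az
    return charm_config.get('default-availability-zone', 'nova')


def subordinate_relation_settings(charm_config, env):
    """Data passed to neutron-openvswitch over the subordinate relation.

    The subordinate gets exactly the AZ its parent nova-compute unit uses.
    """
    return {'default_availability_zone': nova_compute_az(charm_config, env)}
```

Because the subordinate only ever sees the value computed by its parent, a unit of n-ovs cannot end up in a different AZ from its n-compute.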

What's the earliest release we can support this back to?

description: updated
James Page (james-page) wrote :

FWIW I'm not super keen on a charm-level config option to set the AZ - that means fragmentation of applications (i.e. one nova-compute application instance per AZ), which is fiddly and sub-optimal IMHO.

Dmitrii Shcherbakov (dmitriis) wrote :

I agree about adding the option.

The earliest release would be mitaka where those options were introduced to Neutron.

In most cases I expect Nova node and Neutron agent AZs to match, so passing the AZ down to agents seems reasonable.

There are probably some edge cases where we'd want to have different AZs for agents and nova nodes but that's a secondary concern.

James Page (james-page)
Changed in charm-neutron-gateway:
status: New → Triaged
Changed in charm-neutron-openvswitch:
status: New → Triaged
Changed in charm-nova-compute:
status: New → Triaged
Changed in charm-neutron-gateway:
importance: Undecided → Wishlist
Changed in charm-neutron-openvswitch:
importance: Undecided → Wishlist
Changed in charm-nova-compute:
importance: Undecided → Wishlist
Changed in charm-nova-compute:
assignee: nobody → Dmitrii Shcherbakov (dmitriis)
status: Triaged → In Progress
Changed in charm-neutron-gateway:
assignee: nobody → Dmitrii Shcherbakov (dmitriis)
status: Triaged → In Progress
description: updated
Dmitrii Shcherbakov (dmitriis) wrote :

Subscribed ~field-high as this affects an ongoing project with a requirement for HA DHCP with AZ-aware scheduling.

Reviews:
https://review.openstack.org/#/q/topic:bug/1796068+(status:open+OR+status:merged)

OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-neutron-gateway (master)

Reviewed: https://review.openstack.org/608109
Committed: https://git.openstack.org/cgit/openstack/charm-neutron-gateway/commit/?id=71c0120d213e2adbd42df847022d8e36b5df9417
Submitter: Zuul
Branch: master

commit 71c0120d213e2adbd42df847022d8e36b5df9417
Author: Dmitrii Shcherbakov <email address hidden>
Date: Wed Oct 3 14:36:52 2018 +0100

    Allow Juju AZ context information to be used

    The change adds an option to the charm to use JUJU_AVAILABILITY_ZONE
    environment variable set by Juju for the hook environment based on the
    underlying provider's availability zone information for a given machine.

    This information is used to configure the availability_zone setting for
    Neutron DHCP and L3 agents specifically because they support it
    and for other agents (because both neutron.conf and agent-specific
    configuration files are loaded) such as metadata agents and lbaas
    agents.

    Additionally, a setting is added to allow changing the default
    availability zone because 'nova' is a default value coming from the
    Neutron defaults for agents.

    Change-Id: I94303aa70ee3adc6ace0f9af1e7c4f5c0edbcdb5
    Closes-Bug: #1796068

Changed in charm-neutron-gateway:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-nova-compute (master)

Reviewed: https://review.openstack.org/608030
Committed: https://git.openstack.org/cgit/openstack/charm-nova-compute/commit/?id=4fb124310091008b3f650ac389d010d530b6831e
Submitter: Zuul
Branch: master

commit 4fb124310091008b3f650ac389d010d530b6831e
Author: Dmitrii Shcherbakov <email address hidden>
Date: Thu Oct 4 19:22:43 2018 +0300

    Allow Juju AZ context information to be used

    The change adds an option to the charm to use JUJU_AVAILABILITY_ZONE
    environment variable set by Juju for the hook environment based on the
    underlying provider's availability zone information for a given machine.

    This information is used to configure default_availability_zone for nova
    and availability_zone for subordinate networking charms.

    Change-Id: Idc7112e7fe7b76d15cf9c4896b702b8ffd8c0e8e
    Closes-Bug: #1796068

Changed in charm-nova-compute:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-neutron-gateway (stable/18.08)

Fix proposed to branch: stable/18.08
Review: https://review.openstack.org/609034

OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-nova-compute (stable/18.08)

Fix proposed to branch: stable/18.08
Review: https://review.openstack.org/609035

David Ames (thedac)
Changed in charm-nova-compute:
milestone: none → 19.04
Changed in charm-neutron-gateway:
milestone: none → 19.04
Changed in charm-nova-compute:
status: Fix Committed → Fix Released
milestone: 19.04 → 18.11
Changed in charm-neutron-gateway:
status: Fix Committed → Fix Released
milestone: 19.04 → 18.11
OpenStack Infra (hudson-openstack) wrote : Change abandoned on charm-nova-compute (stable/18.08)

Change abandoned by James Page (<email address hidden>) on branch: stable/18.08
Review: https://review.openstack.org/609035
Reason: Part of latest charm release.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on charm-neutron-gateway (stable/18.08)

Change abandoned by James Page (<email address hidden>) on branch: stable/18.08
Review: https://review.openstack.org/609034
Reason: Part of latest charm release.

James Page (james-page) wrote :

Marking neutron-openvswitch task as invalid as nova-compute provides the zone value over the subordinate relation.

Changed in charm-neutron-openvswitch:
status: Triaged → Invalid
Dmitrii Shcherbakov (dmitriis) wrote :

Added neutron-api charm.

network_scheduler_driver is set to WeightScheduler, not AZAwareWeightScheduler by default in neutron.

AGENTS_SCHEDULER_OPTS = [
    cfg.StrOpt('network_scheduler_driver',
               default='neutron.scheduler.'
                       'dhcp_agent_scheduler.WeightScheduler',
               help=_('Driver to use for scheduling network to DHCP agent')),

neutron.conf needs to contain the following for AZ-aware DHCP agent scheduling to work:

    network_scheduler_driver = neutron.scheduler.dhcp_agent_scheduler.AZAwareWeightScheduler
    dhcp_load_type = networks

Agent load calculation can also be done based on different criteria (networks, subnets, ports):

    cfg.StrOpt('dhcp_load_type', default='networks',
               choices=['networks', 'subnets', 'ports'],

Dmitrii Shcherbakov (dmitriis) wrote :

AZAwareWeightScheduler is based on WeightScheduler

https://github.com/openstack/neutron/blob/mitaka-eol/neutron/scheduler/dhcp_agent_scheduler.py#L92-L98
class AZAwareWeightScheduler(WeightScheduler):

The spec also mentions that by default all agents belong to the 'nova' AZ so the scheduler change should be backwards-compatible.
http://specs.openstack.org/openstack/neutron-specs/specs/mitaka/availability-zone.html#upgrade-impact
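
As a rough illustration of why the AZ-aware variant stays backwards-compatible, here is a simplified sketch of AZ-aware weighted selection. It is not Neutron's implementation: the function name and the agent dict shape ('host', 'availability_zone', 'load') are hypothetical, and 'load' stands in for whatever dhcp_load_type counts.

```python
def select_agents_az_aware(hostable_agents, hosted_agents, num_needed):
    """Pick agents for a network, spreading them across AZs.

    Each pick goes to the AZ that currently hosts the fewest agents for
    this network; within that AZ the least-loaded agent is taken.
    """
    candidates_by_az = {}
    scheduled_per_az = {}
    for agent in sorted(hostable_agents, key=lambda a: a['load']):
        az = agent['availability_zone']
        candidates_by_az.setdefault(az, []).append(agent)
        scheduled_per_az.setdefault(az, 0)
    # Count agents that already host this network, per AZ.
    for agent in hosted_agents:
        az = agent['availability_zone']
        if az in scheduled_per_az:
            scheduled_per_az[az] += 1

    chosen = []
    for _ in range(num_needed):
        if not scheduled_per_az:
            break  # no candidate agents left in any AZ
        az = min(scheduled_per_az, key=scheduled_per_az.get)
        chosen.append(candidates_by_az[az].pop(0))
        scheduled_per_az[az] += 1
        if not candidates_by_az[az]:
            del scheduled_per_az[az]  # AZ exhausted
    return chosen
```

When every agent is in the single 'nova' AZ, the AZ step has only one choice and the result is simply the least-loaded agents, which is the plain weight-based behaviour.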

Dmitrii Shcherbakov (dmitriis) wrote :
Changed in charm-neutron-api:
milestone: none → 19.04
status: New → Triaged
importance: Undecided → High
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-neutron-api (master)

Reviewed: https://review.openstack.org/632090
Committed: https://git.openstack.org/cgit/openstack/charm-neutron-api/commit/?id=1e6430f9c6158ca048764866b9e985b5dc92d685
Submitter: Zuul
Branch: master

commit 1e6430f9c6158ca048764866b9e985b5dc92d685
Author: Dmitrii Shcherbakov <email address hidden>
Date: Mon Jan 21 14:39:41 2019 +0200

    Switch to AZAwareWeightScheduler as of Mitaka

    AZAwareWeightScheduler is based on WeightScheduler and provides a way to make
    DHCP agent scheduling be AZ-aware. This is used in conjunction with
    dhcp-agents-per-network config option and per-network agents (such as dnsmasq)
    will be distributed across neutron-dhcp-agents that have availability_zone
    configuration (based on dhcp-load-type for placement calculation).

    bp: https://blueprints.launchpad.net/neutron/+spec/add-availability-zone

    Upgrade impact is mentioned here:
    specs.openstack.org/openstack/neutron-specs/specs/mitaka/availability-zone.html

    The spec mentions that by default all agents belong to 'nova' AZ so
    the scheduler change should be backwards-compatible.

    Change-Id: I4d948efa157573fdbc0fbfd3b1efb21b69a713ef
    Closes-Bug: #1796068

Changed in charm-neutron-api:
status: Triaged → Fix Committed
David Ames (thedac)
Changed in charm-neutron-api:
status: Fix Committed → Fix Released