neutron-server rolling upgrade fails

Bug #1821086 reported by Mikael Johansson
This bug affects 2 people
Affects        Status        Importance  Assigned to   Milestone
kolla-ansible  Fix Released  High        Mark Goddard
Stein          Fix Released  High        Mark Goddard

Bug Description

Please note that this bug spans the kolla and kolla-ansible projects, at least as far as we've been able to investigate.

When upgrading neutron-server from 7.0.0 to 8.0.0.0b1 with kolla-ansible the task "Running Neutron database contract container" in kolla-ansible/ansible/roles/neutron/tasks/rolling_upgrade.yml fails with "Invalid String" errors:

argument --subproject: Invalid String(choices=['vmware-nsx', 'networking-infoblox', 'neutron-lbaas', 'networking-sfc', 'neutron-vpnaas', 'networking-l2gw', 'neutron-fwaas', 'neutron', 'neutron-dynamic-routing']) value: [neutron,

The above task uses the bundled kolla_docker module, which passes a NEUTRON_ROLLING_UPGRADE_SERVICES environment variable with the value "{{ neutron_rolling_upgrade_services }}" when starting the neutron-server container.

The neutron_rolling_upgrade_services Ansible variable, configured in defaults/main.yml, looks like this:

neutron_rolling_upgrade_services: ["neutron", "neutron-fwaas", "neutron-vpnaas"]

The neutron-server container has an extend_start.sh script that is executed when the container starts. That script has the following logic to loop through the list above:

if [[ "${!KOLLA_UPGRADE[@]}" ]]; then
    if [[ "${!NEUTRON_DB_EXPAND[@]}" ]]; then
        DB_ACTION="--expand"
        echo "Expanding database"
    fi
    if [[ "${!NEUTRON_DB_CONTRACT[@]}" ]]; then
        DB_ACTION="--contract"
        echo "Contracting database"
    fi

    if [[ "${!NEUTRON_ROLLING_UPGRADE_SERVICES[@]}" ]]; then
        for service in ${NEUTRON_ROLLING_UPGRADE_SERVICES}; do
            neutron-db-manage --subproject $service upgrade $DB_ACTION
        done
    fi
    exit 0
fi

That for loop runs neutron-db-manage upgrade on each service using the provided environment variable, which contains extra square brackets (along with quotes and commas) that originate from the neutron_rolling_upgrade_services Ansible variable. If I echo each iteration of the for loop, the output looks like this:

neutron-db-manage --subproject ['neutron', upgrade --contract
neutron-db-manage --subproject 'neutron-fwaas', upgrade --contract
neutron-db-manage --subproject 'neutron-vpnaas'] upgrade --contract
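
The splitting can be reproduced outside the container. The following is a minimal sketch, assuming the list arrives in the environment as the literal string shown below (inferred from the echoed output above, not taken from the kolla code). The helper variables count, first and last are only there for illustration:

```shell
# Assumption: the list reaches the container as this literal string,
# matching the echoed output quoted in this report.
NEUTRON_ROLLING_UPGRADE_SERVICES="['neutron', 'neutron-fwaas', 'neutron-vpnaas']"

count=0
first=""
last=""
# Unquoted expansion: the shell splits on whitespace only, so brackets,
# quotes and commas stay attached to each word.
for service in ${NEUTRON_ROLLING_UPGRADE_SERVICES}; do
    count=$((count + 1))
    if [ "$count" -eq 1 ]; then
        first=$service
    fi
    last=$service
    echo "neutron-db-manage --subproject $service upgrade --contract"
done
```

The loop still runs three times, but none of the three words is a valid subproject name.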

Another important note is that the bug neither propagates any errors nor crashes the Docker container, either of which would have notified Ansible when executing the role. It fails silently within the container and exits with an exit code of 0.
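
One way the failure could be surfaced is to stop at the first failing command and return a non-zero status instead of always exiting 0. This is only a sketch of that idea, not the fix that was merged. The db_manage function is a hypothetical stand-in for neutron-db-manage so the example runs without Neutron installed, and run_upgrades is likewise a made-up name:

```shell
# Hypothetical stand-in for neutron-db-manage: accepts only the three
# subprojects named in this report and rejects anything else.
db_manage() {
    case "$1" in
        neutron|neutron-fwaas|neutron-vpnaas)
            return 0
            ;;
        *)
            echo "Invalid String value: $1" >&2
            return 2
            ;;
    esac
}

# Run the upgrade for each whitespace-separated service in $1 and
# return non-zero as soon as one of them fails.
run_upgrades() {
    for service in $1; do
        db_manage "$service" || return 1
    done
    return 0
}
```

With a check like this, a call such as run_upgrades "['neutron', 'neutron-fwaas']" returns 1, and a container entrypoint that exits with that status would let Ansible see the failure.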

Running the commands manually within the neutron_server container with the correctly formatted subproject names obviously worked.

Related commits that introduced the changes causing this bug (from GitHub):

https://github.com/openstack/kolla/commit/4f4de70594d3d9060c262438ddd87f768b4dda00

https://github.com/openstack/kolla-ansible/commit/ac5d5217fc7b40bdd2371f7f0f2caa4d734568bb

description: updated
tags: added: neutron-server
tags: added: neutron
Eduardo Gonzalez (egonzalez90) wrote :

Hi, what distro are you running? Might there be a difference between Ubuntu and CentOS bash? I haven't seen those errors with CentOS.

Regards

Mikael Johansson (mikjoh) wrote :

Hi!

We're running the following operating systems:

Deploy Machine: Ubuntu 16.04.5
Target hosts: Ubuntu 16.04.5
Containers: Ubuntu 18.04

Mikael Johansson (mikjoh) wrote :

I've isolated the specific parts that make up this bug and run them in CentOS and Ubuntu Docker containers, with the same result. The cause is still the variable in neutron/defaults/main.yml, which isn't interpreted (later on, when the container is started) as intended by the author of the commit(s) that introduced the rolling upgrade logic:

neutron_rolling_upgrade_services: ["neutron", "neutron-fwaas", "neutron-vpnaas"]

will be looped through in the container in the following way:

neutron-db-manage --subproject ['neutron', upgrade --contract
neutron-db-manage --subproject 'neutron-fwaas', upgrade --contract
neutron-db-manage --subproject 'neutron-vpnaas'] upgrade --contract

Workaround at the moment:

NEUTRON_ROLLING_UPGRADE_SERVICES=$(echo $NEUTRON_ROLLING_UPGRADE_SERVICES | tr -d '[],')
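
As a quick check of what the workaround produces, here is a minimal sketch. It assumes the variable holds the value as it appears in the "Invalid String" error message (brackets and commas, no inner quotes). If the value in a given environment also carries single quotes, they would have to be added to the set passed to tr:

```shell
# Assumption: value as reported in the "Invalid String" error message.
NEUTRON_ROLLING_UPGRADE_SERVICES="[neutron, neutron-fwaas, neutron-vpnaas]"

# Strip the list punctuation so that only whitespace-separated names remain.
NEUTRON_ROLLING_UPGRADE_SERVICES=$(echo "$NEUTRON_ROLLING_UPGRADE_SERVICES" | tr -d '[],')

for service in ${NEUTRON_ROLLING_UPGRADE_SERVICES}; do
    echo "neutron-db-manage --subproject $service upgrade --contract"
done
```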

I can't see how different bash versions would yield different results, at least not in this case. Please note that this fails silently: your kolla-ansible runs will not fail on this step but later on, when newer columns etc. are accessed that aren't there because the upgrade never went through.

Mikael Johansson (mikjoh) wrote :

If anyone wants to try this at home, I've created a repo containing the isolated parts of this bug. Hopefully this clears things up a bit:

https://github.com/mikejoh/kolla-ansible-neutron-bug

Mikael Johansson (mikjoh) wrote :

OK, so this was merged into master today: https://git.openstack.org/cgit/openstack/kolla-ansible/patch/?id=42d664c15618a67e899f77b622845dab145dc91a

Confirmed working! Thanks @Mark Goddard! :)

Mark Goddard (mgoddard)
no longer affects: kolla
Mark Goddard (mgoddard)
Changed in kolla-ansible:
status: Fix Committed → Fix Released