'openstack overcloud config download' fails; mistral.actions.action_factory.GetOvercloudConfig expected a character buffer object

Bug #1834094 reported by John Fulton
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: James Slagle
Milestone: none

Bug Description

The deployment is unable to proceed because the config-download export fails.

The same error can be reproduced by following the manual config-download steps:

https://docs.openstack.org/tripleo-docs/latest/install/advanced_deployment/ansible_config_download.html#manual-config-download

Failure on this step:

(undercloud) [stack@undercloud ~]$ openstack overcloud config download --config-dir config-download
Starting config-download export...
The action raised an exception [action_ex_id=5ac200f3-ff56-4998-a99a-8857a9414a76, action_cls='<class 'mistral.actions.action_factory.GetOvercloudConfig'>', attributes='{}', params='{u'container_config': u'overcloud-config', u'container': u'overcloud', u'config_type': None}']
 expected a character buffer object
Exception exporting config-download: The action raised an exception [action_ex_id=5ac200f3-ff56-4998-a99a-8857a9414a76, action_cls='<class 'mistral.actions.action_factory.GetOvercloudConfig'>', attributes='{}', params='{u'container_config': u'overcloud-config', u'container': u'overcloud', u'config_type': None}']
 expected a character buffer object
(undercloud) [stack@undercloud ~]$

Using the latest master that passed CI on June 24, 2019, on CentOS 7, with an undercloud and overcloud (not standalone).

John Fulton (jfulton-org) wrote :

- tripleo client call: https://github.com/openstack/python-tripleoclient/blob/master/tripleoclient/v1/overcloud_config.py#L98

- mistral task: https://github.com/openstack/tripleo-common/blob/master/workbooks/deployment.yaml#L615

- failed action's code: https://github.com/openstack/tripleo-common/blob/master/tripleo_common/actions/config.py#L31

- mistral engine log:

[root@undercloud mistral]# tail -1000 engine.log | curl -F 'f:1=<-' ix.io
http://ix.io/1MHH
[root@undercloud mistral]#

- swift list

(undercloud) [stack@undercloud train]$ swift list
__cache__
ov-lz54jdg42cb-0-rcgjn5q2ppko-Controller-jhhn7x55dxfs
ov-taezz52tcd-0-p4pmhsjkjqr7-NovaCompute-4bof3akxu77c
ov-trhpanqjze-0-2q65y6kp4ha3-CephStorage-ta7rcfgvcijz
overcloud
overcloud-messages
overcloud-swift-rings
overcloud_ceph_ansible_fetch_dir
tripleo-validations
(undercloud) [stack@undercloud train]$

Cédric Jeanneret (cjeanner) wrote :

Hitting the same issue as of today (1st of July) with the latest packages on CentOS 7.

I can add the following:
Right before the failure, the mistral-executor log shows the trace below.
It looks like either a path is wrong, or a file doesn't exist in the swift container.
This can also be traced back to the following code:
https://github.com/openstack/tripleo-common/blob/master/tripleo_common/utils/config.py#L517

Packages:
on the host:
python2-tripleo-common-11.0.1-0.20190622091455.a20d66f.el7.noarch

In mistral container:
puppet-mistral-15.1.0-0.20190610144005.030a4fe.el7.noarch
python2-mistral-8.1.0-0.20190620140934.f75e719.el7.noarch
openstack-mistral-executor-8.1.0-0.20190620140934.f75e719.el7.noarch
python2-mistral-lib-1.1.0-0.20190306210850.bac92db.el7.noarch
python2-mistralclient-3.9.0-0.20190517090825.de9d2de.el7.noarch
openstack-mistral-common-8.1.0-0.20190620140934.f75e719.el7.noarch
python2-tripleo-common-11.0.1-0.20190622091455.a20d66f.el7.noarch

Log of interest (from /var/log/containers/mistral/executor.log):

2019-07-01 14:21:17.995 8 INFO tripleo_common.actions.ansible [req-34985d26-9879-4842-8aff-01fe77df8081 988f449cc23b4114ae260bc8c022f5ac d1ff8565af7f4b0ca24df34237761a23 - default default] Running ansible-playbook command: ['ansible-playbook-2', '-vvvvv', '/tmp/ansible-mistral-action784EVw/playbook.yaml', '--become', '--become-user', 'root', '--inventory-file', '/tmp/ansible-mistral-action784EVw/inventory.yaml']
2019-07-01 14:21:34.127 8 INFO swiftclient [req-34985d26-9879-4842-8aff-01fe77df8081 988f449cc23b4114ae260bc8c022f5ac d1ff8565af7f4b0ca24df34237761a23 - default default] REQ: curl -i https://192.168.24.2:13808/v1/AUTH_d1ff8565af7f4b0ca24df34237761a23/overcloud-config?format=json -X GET -H "Accept-Encoding: gzip" -H "X-Auth-Token: gAAAAABdGhJeSqV5..."
2019-07-01 14:21:34.129 8 INFO swiftclient [req-34985d26-9879-4842-8aff-01fe77df8081 988f449cc23b4114ae260bc8c022f5ac d1ff8565af7f4b0ca24df34237761a23 - default default] RESP STATUS: 404 Not Found
2019-07-01 14:21:34.129 8 INFO swiftclient [req-34985d26-9879-4842-8aff-01fe77df8081 988f449cc23b4114ae260bc8c022f5ac d1ff8565af7f4b0ca24df34237761a23 - default default] RESP HEADERS: {u'Date': u'Mon, 01 Jul 2019 14:21:34 GMT', u'Content-Length': u'70', u'Content-Type': u'text/html; charset=UTF-8', u'X-Openstack-Request-Id': u'tx5c4f2548a9f240e49071c-005d1a16ee', u'X-Trans-Id': u'tx5c4f2548a9f240e49071c-005d1a16ee'}
2019-07-01 14:21:34.129 8 INFO swiftclient [req-34985d26-9879-4842-8aff-01fe77df8081 988f449cc23b4114ae260bc8c022f5ac d1ff8565af7f4b0ca24df34237761a23 - default default] RESP BODY: <html><h1>Not Found</h1><p>The resource could not be found.</p></html>
2019-07-01 14:21:36.442 8 INFO tripleo_common.utils.config.Config [req-34985d26-9879-4842-8aff-01fe77df8081 988f449cc23b4114ae260bc8c022f5ac d1ff8565af7f4b0ca24df34237761a23 - default default] Generating configuration under the directory: /tmp/tripleo-JSp_J9-config
2019-07-01 14:21:39.656 8 INFO tripleo_common.utils.config.Config [req-34985d26-9879-4842-8aff-01fe77df8081 988f449cc23b4114ae260bc8c022f5ac d1ff8565af7f4b0ca24df34237761a23 - default default] Getting deployment data from Heat...
2...
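
The 404 in the log above is for the `overcloud-config` container, which is the `container_config` value in the failed action's params. As a rough cross-check, the sketch below compares the two container names from those params against the `swift list` output quoted earlier in this report. The helper name `missing_containers` is made up for illustration, and the listing came from the original reporter's environment rather than the one that produced this log, so the result is only suggestive.

```python
# Container names copied from the `swift list` output earlier in this report.
SWIFT_LIST_OUTPUT = """\
__cache__
ov-lz54jdg42cb-0-rcgjn5q2ppko-Controller-jhhn7x55dxfs
ov-taezz52tcd-0-p4pmhsjkjqr7-NovaCompute-4bof3akxu77c
ov-trhpanqjze-0-2q65y6kp4ha3-CephStorage-ta7rcfgvcijz
overcloud
overcloud-messages
overcloud-swift-rings
overcloud_ceph_ansible_fetch_dir
tripleo-validations
"""

# Container names taken from the params of the failed GetOvercloudConfig action.
ACTION_PARAMS = {
    "container": "overcloud",
    "container_config": "overcloud-config",
}


def missing_containers(swift_list_output, wanted):
    """Return the names in `wanted` that do not appear in the listing (hypothetical helper)."""
    present = {line.strip() for line in swift_list_output.splitlines() if line.strip()}
    return [name for name in wanted if name not in present]


missing = missing_containers(SWIFT_LIST_OUTPUT, ACTION_PARAMS.values())
print(missing)  # ['overcloud-config']
```

A missing container does not by itself explain the `expected a character buffer object` error; it only shows that the 404 is consistent with the listing.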


Changed in tripleo:
assignee: nobody → James Slagle (james-slagle)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-common (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/668560

Cédric Jeanneret (cjeanner) wrote :

So James' patch does solve this first issue. Once we get past that empty set (or whatever it is), we hit another issue:

TASK [Copy NetworkConfig script] ***********************************************
Tuesday 02 July 2019 07:34:54 +0000 (0:00:00.459) 0:02:32.173 **********
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: If you are using a module and expect the file to exist on the remote, see the remote_src option
fatal: [overcloud-novacompute-0]: FAILED! => {"changed": false, "msg": "Could not find or access 'Compute/overcloud-novacompute-0/NetworkConfig'\nSearched in:\n\t/var/lib/mistral/overcloud/files/Compute/overcloud-novacompute-0/NetworkConfig\n\t/var/lib/mistral/overcloud/Compute/overcloud-novacompute-0/NetworkConfig\n\t/var/lib/mistral/overcloud/files/Compute/overcloud-novacompute-0/NetworkConfig\n\t/var/lib/mistral/overcloud/Compute/overcloud-novacompute-0/NetworkConfig on the Ansible Controller.\nIf you are using a module and expect the file to exist on the remote, see the remote_src option"}

So maybe a follow-up to James' patch is to create an empty file?

Cédric Jeanneret (cjeanner) wrote :

Meh... If we create an empty file, we then hit a third issue:

TASK [NetworkConfig stdout] ****************************************************
Tuesday 02 July 2019 08:12:24 +0000 (0:00:01.092) 0:02:33.170 **********
ok: [overcloud-controller-0] => {
    "NetworkConfig_result.stderr_lines": [
        "+ '[' -n '{\"network_config\": [{\"members\": [{\"name\": \"interface_name\", \"primary\": true, \"type\": \"interface\"}], \"name\": \"bridge_name\", \"type\": \"ovs_bridge\", \"use_dhcp\": true}]}' ']'",
        "+ '[' -z '' ']'",
        "+ trap configure_safe_defaults EXIT",
        "++ date +%Y-%m-%dT%H:%M:%S",
        "+ DATETIME=2019-07-02T08:12:18",
        "+ '[' -f /etc/os-net-config/config.json ']'",
        "+ mkdir -p /etc/os-net-config",
        "+ echo '{\"network_config\": [{\"members\": [{\"name\": \"interface_name\", \"primary\": true, \"type\": \"interface\"}], \"name\": \"bridge_name\", \"type\": \"ovs_bridge\", \"use_dhcp\": true}]}'",
        "++ type -t network_config_hook",
        "+ '[' '' = function ']'",
        "+ sed -i s/bridge_name/br-ex/ /etc/os-net-config/config.json",
        "+ sed -i s/interface_name/nic1/ /etc/os-net-config/config.json",
        "+ set +e",
        "+ os-net-config -c /etc/os-net-config/config.json -v --detailed-exit-codes",
        "[2019/07/02 08:12:18 AM] [INFO] Using config file at: /etc/os-net-config/config.json",
        "[2019/07/02 08:12:18 AM] [INFO] Ifcfg net config provider created.",
        "[2019/07/02 08:12:18 AM] [INFO] Not using any mapping file.",
        "[2019/07/02 08:12:19 AM] [INFO] Finding active nics",
        "[2019/07/02 08:12:19 AM] [INFO] eth1 is an embedded active nic",
        "[2019/07/02 08:12:19 AM] [INFO] eth0 is an embedded active nic",
        "[2019/07/02 08:12:19 AM] [INFO] lo is not an active nic",
        "[2019/07/02 08:12:19 AM] [INFO] No DPDK mapping available in path (/var/lib/os-net-config/dpdk_mapping.yaml)",
        "[2019/07/02 08:12:19 AM] [INFO] Active nics are ['eth0', 'eth1']",
        "[2019/07/02 08:12:19 AM] [INFO] nic2 mapped to: eth1",
        "[2019/07/02 08:12:19 AM] [INFO] nic1 mapped to: eth0",
        "[2019/07/02 08:12:19 AM] [INFO] adding bridge: br-ex",
        "[2019/07/02 08:12:19 AM] [INFO] adding interface: eth0",
        "[2019/07/02 08:12:19 AM] [INFO] applying network configs...",
        "[2019/07/02 08:12:19 AM] [INFO] running ifdown on interface: eth0",
        "[2019/07/02 08:12:19 AM] [INFO] running ifdown on bridge: br-ex",
        "[2019/07/02 08:12:19 AM] [INFO] Writing config /etc/sysconfig/network-scripts/ifcfg-br-ex",
        "[2019/07/02 08:12:19 AM] [INFO] Writing config /etc/sysconfig/network-scripts/ifcfg-eth0",
        "[2019/07/02 08:12:19 AM] [INFO] running ifup on bridge: br-ex",
        "[2019/07/02 08:12:19 AM] [INFO] running ifup on interface: eth0",
        "+ RETVAL=2",
        "+ set -e",
        "+ [[ 2 == 2 ]]",
        "+ '[' -f /etc/udev/rules.d/99-dhcp-all-interfaces.rules ']'",
        "+ rm /etc/udev/rules.d/99-dhcp-all-interfaces.rules",
        "+ '[' -f /usr/libexec/os-apply-config/templates/etc/os-net-config/config.json ']'",
        "+ ...


Cédric Jeanneret (cjeanner) wrote :

To get past that last issue, I had to:
- run config download
- edit deploy_steps_playbook.yaml
- add a new condition to the following tasks:
  - NetworkConfig stdout
  - Write rc of NetworkConfig script
The condition was: NetworkConfig_result is changed

Doing so, in addition to James' patch and my proposal (create an empty file regardless), allowed me to get a successful overcloud deploy.
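
The playbook edit described above can be sketched as a small helper that appends a `when` condition to tasks selected by name. This is an illustration only: `add_when` is a hypothetical helper, the sample task list is a stand-in rather than the real contents of deploy_steps_playbook.yaml, and loading or saving the actual YAML file (for example with PyYAML) is left out.

```python
def add_when(tasks, task_names, condition):
    """Append `condition` to the `when` clause of each task named in task_names.

    Also descends into `block`, `rescue`, and `always` sections.
    Returns the number of tasks that were modified.
    """
    changed = 0
    for task in tasks:
        if task.get("name") in task_names:
            existing = task.get("when", [])
            # Ansible accepts a single expression or a list; list items are AND-ed.
            conditions = existing if isinstance(existing, list) else [existing]
            if condition not in conditions:
                task["when"] = conditions + [condition]
                changed += 1
        for section in ("block", "rescue", "always"):
            nested = task.get(section)
            if isinstance(nested, list):
                changed += add_when(nested, task_names, condition)
    return changed


# Stand-in task list: only the two task names and the condition come from
# the comment above; the task bodies are placeholders.
tasks = [
    {"name": "Some unrelated task", "debug": {"msg": "untouched"}},
    {"block": [
        {"name": "NetworkConfig stdout",
         "debug": {"var": "NetworkConfig_result.stderr_lines"}},
        {"name": "Write rc of NetworkConfig script",
         "debug": {"msg": "placeholder for the real task body"},
         "when": "some_existing_condition"},
    ]},
]

count = add_when(
    tasks,
    {"NetworkConfig stdout", "Write rc of NetworkConfig script"},
    "NetworkConfig_result is changed",
)
print(count)  # 2
```

Appending to a list rather than overwriting `when` keeps any condition a task already carries, since Ansible requires every list entry to be true.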

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/668652

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-common (master)

Reviewed: https://review.opendev.org/668560
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=ecd3738077f54ce5440a66bbb616b1e32295e896
Submitter: Zuul
Branch: master

commit ecd3738077f54ce5440a66bbb616b1e32295e896
Author: James Slagle <email address hidden>
Date: Mon Jul 1 16:55:20 2019 -0400

    Handle empty NetworkConfig

    In some cases the NetworkConfig config resource may come back as an
    empty dict from Heat, such as when using net-config-noop.yaml. In that
    case, we should not attempt to write any config.

    Change-Id: I02129adf193f7a4a8b38411f3a20079d10cfa872
    Related-bug: #1834094
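
One plausible reading of the commit message above is that it explains the original error: on Python 2, passing a non-string such as an empty dict to a file's `write()` raises `TypeError: expected a character buffer object`. A minimal sketch of the kind of guard the commit describes follows. It is not the actual tripleo-common change; `write_network_config` and its parameters are hypothetical names.

```python
import os
import tempfile


def write_network_config(path, network_config):
    """Write the NetworkConfig script to `path`, skipping empty values.

    Returns True if a file was written, False if there was nothing to write.
    (Hypothetical helper, not the function used in tripleo-common.)
    """
    if not network_config:
        # Covers {}, "" and None, e.g. the empty dict described in the commit message.
        return False
    with open(path, "w") as f:
        f.write(network_config)
    return True


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "NetworkConfig")

skipped = write_network_config(target, {})       # empty dict: nothing written
skipped_exists = os.path.exists(target)
written = write_network_config(target, "#!/bin/bash\necho configured\n")
print(skipped, skipped_exists, written)  # False False True
```

Skipping the write means no NetworkConfig file exists for that node, which is consistent with the "Could not find or access" failure reported earlier in this thread.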

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/668652
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=c36433e34e4e613eae59ec254553d3621e9c90f2
Submitter: Zuul
Branch: master

commit c36433e34e4e613eae59ec254553d3621e9c90f2
Author: Cédric Jeanneret <email address hidden>
Date: Tue Jul 2 13:05:31 2019 +0200

    Run NetworkConfig only if configuration script exists

    Script is first located/generated on the undercloud node (aka ansible
    host), meaning we have to use local_action.
    We also deactivate the "become" in order to avoid useless privilege
    escalation.

    Change-Id: I8c1ed334dc5b578a87307a47656ee2d87f1e3688
    Depends-On: https://review.opendev.org/668560
    Related-Bug: #1834094

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-common (stable/stein)

Related fix proposed to branch: stable/stein
Review: https://review.opendev.org/670237

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-common (stable/stein)

Change abandoned by Cédric Jeanneret (Tengu) (<email address hidden>) on branch: stable/stein
Review: https://review.opendev.org/670237
Reason: not needed. Thanks James for confirmation!

Changed in tripleo:
milestone: train-2 → train-3
Changed in tripleo:
milestone: train-3 → ussuri-1
Changed in tripleo:
milestone: ussuri-1 → ussuri-2
wes hayutin (weshayutin)
Changed in tripleo:
milestone: ussuri-2 → ussuri-3
John Fulton (jfulton-org) wrote :
Changed in tripleo:
status: In Progress → Fix Released
milestone: ussuri-3 → none