Ocata -> Pike overcloud upgrade: major-upgrade-composable-steps-docker fails with The Resource Type (OS::TripleO::Network::Management) could not be found.

Bug #1729039 reported by Marius Cornea
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Steven Hardy

Bug Description

Ocata deployment:

openstack overcloud deploy \
--templates \
-e /usr/share/openstack-tripleo-heat-templates/environments/neutron-ovs-dpdk.yaml \
-e /home/stack/ospd-11-vlan-dpdk-single-port-ctlplane-bonding/network-environment.yaml \

When running major-upgrade-composable-steps-docker to upgrade to Pike it fails with:

openstack overcloud deploy \
--templates \
-e /usr/share/openstack-tripleo-heat-templates/environments/neutron-ovs-dpdk.yaml \
-e /home/stack/ospd-11-vlan-dpdk-single-port-ctlplane-bonding/network-environment.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-composable-steps-docker.yaml \

Started Mistral Workflow tripleo.deployment.v1.deploy_plan. Execution ID: beff9250-d546-4239-b119-f70bd0b07adc
{u'execution': {u'created_at': u'2017-10-31 16:16:48',
                u'id': u'beff9250-d546-4239-b119-f70bd0b07adc',
                u'input': {u'container': u'overcloud',
                           u'queue_name': u'fa7fa51f-3a5e-4504-a9a3-b2fbba433708',
                           u'run_validations': False,
                           u'skip_deploy_identifier': False,
                           u'timeout': 240},
                u'name': u'tripleo.deployment.v1.deploy_plan',
                u'params': {u'namespace': u''},
                u'spec': {u'description': u'Deploy the overcloud for a plan.\n',
                          u'input': [u'container',
                                     {u'run_validations': False},
                                     {u'timeout': 240},
                                     {u'skip_deploy_identifier': False},
                                     {u'queue_name': u'tripleo'}],
                          u'name': u'deploy_plan',
                          u'tags': [u'tripleo-common-managed'],
                          u'tasks': {u'add_validation_ssh_key': {u'input': {u'container': u'<% $.container %>',
                                                                            u'queue_name': u'<% $.queue_name %>'},
                                                                 u'name': u'add_validation_ssh_key',
                                                                 u'on-complete': [{u'run_validations': u'<% $.run_validations %>'},
                                                                                  {u'create_swift_rings_backup_plan': u'<% not $.run_validations %>'}],
                                                                 u'type': u'direct',
                                                                 u'version': u'2.0',
                                                                 u'workflow': u'tripleo.validations.v1.add_validation_ssh_key_parameter'},
                                     u'create_swift_rings_backup_plan': {u'input': {u'container': u'<% $.container %>',
                                                                                    u'queue_name': u'<% $.queue_name %>',
                                                                                    u'use_default_templates': True},
                                                                         u'name': u'create_swift_rings_backup_plan',
                                                                         u'on-error': u'create_swift_rings_backup_plan_set_status_failed',
                                                                         u'on-success': u'get_heat_stack',
                                                                         u'type': u'direct',
                                                                         u'version': u'2.0',
                                                                         u'workflow': u'tripleo.swift_rings_backup.v1.create_swift_rings_backup_container_plan'},
                                     u'create_swift_rings_backup_plan_set_status_failed': {u'name': u'create_swift_rings_backup_plan_set_status_failed',
                                                                                           u'on-success': u'send_message',
                                                                                           u'publish': {u'message': u'<% task(create_swift_rings_backup_plan).result %>',
                                                                                                        u'status': u'FAILED'},
                                                                                           u'type': u'direct',
                                                                                           u'version': u'2.0'},
                                     u'deploy': {u'action': u'tripleo.deployment.deploy',
                                                 u'input': {u'container': u'<% $.container %>',
                                                            u'skip_deploy_identifier': u'<% $.skip_deploy_identifier %>',
                                                            u'timeout': u'<% $.timeout %>'},
                                                 u'name': u'deploy',
                                                 u'on-error': u'set_deployment_failed',
                                                 u'on-success': u'send_message',
                                                 u'type': u'direct',
                                                 u'version': u'2.0'},
                                     u'get_heat_stack': {u'action': u'heat.stacks_get stack_id=<% $.container %>',
                                                         u'name': u'get_heat_stack',
                                                         u'on-error': u'deploy',
                                                         u'on-success': [{u'set_stack_in_progress': u'<% "_IN_PROGRESS" in task(get_heat_stack).result.stack_status %>'},
                                                                         {u'deploy': u'<% not "_IN_PROGRESS" in task(get_heat_stack).result.stack_status %>'}],
                                                         u'type': u'direct',
                                                         u'version': u'2.0'},
                                     u'run_validations': {u'input': {u'group_names': [u'pre-deployment'],
                                                                     u'plan': u'<% $.container %>',
                                                                     u'queue_name': u'<% $.queue_name %>'},
                                                          u'name': u'run_validations',
                                                          u'on-error': u'set_validations_failed',
                                                          u'on-success': u'create_swift_rings_backup_plan',
                                                          u'type': u'direct',
                                                          u'version': u'2.0',
                                                          u'workflow': u'tripleo.validations.v1.run_groups'},
                                     u'send_message': {u'action': u'zaqar.queue_post',
                                                       u'input': {u'messages': {u'body': {u'payload': {u'execution': u'<% execution() %>',
                                                                                                       u'message': u"<% $.get('message', '') %>",
                                                                                                       u'status': u"<% $.get('status', 'SUCCESS') %>"},
                                                                                          u'type': u'tripleo.deployment.v1.deploy_plan'}},
                                                                  u'queue_name': u'<% $.queue_name %>'},
                                                       u'name': u'send_message',
                                                       u'on-success': [{u'fail': u'<% $.get(\'status\') = "FAILED" %>'}],
                                                       u'retry': u'count=5 delay=1',
                                                       u'type': u'direct',
                                                       u'version': u'2.0'},
                                     u'set_deployment_failed': {u'name': u'set_deployment_failed',
                                                                u'on-success': u'send_message',
                                                                u'publish': {u'message': u'<% task(deploy).result %>',
                                                                             u'status': u'FAILED'},
                                                                u'type': u'direct',
                                                                u'version': u'2.0'},
                                     u'set_stack_in_progress': {u'name': u'set_stack_in_progress',
                                                                u'on-success': u'send_message',
                                                                u'publish': {u'message': u'The Heat stack is busy.',
                                                                             u'status': u'FAILED'},
                                                                u'type': u'direct',
                                                                u'version': u'2.0'},
                                     u'set_validations_failed': {u'name': u'set_validations_failed',
                                                                 u'on-success': u'send_message',
                                                                 u'publish': {u'message': u'<% task(run_validations).result %>',
                                                                              u'status': u'FAILED'},
                                                                 u'type': u'direct',
                                                                 u'version': u'2.0'}},
                          u'version': u'2.0'}},
 u'message': u"Failed to run action [action_ex_id=46c2a132-902c-4dd2-93e3-608e15353376, action_cls='<class 'mistral.actions.action_factory.DeployStackAction'>', attributes='{}', params='{u'skip_deploy_identifier': False, u'container': u'overcloud', u'timeout': 240}']\n ERROR: resources.Networks<http://192.0.20.1:8080/v1/AUTH_ecb57f4c91ff488ab8677af7358a18c0/overcloud/network/networks.yaml>: : The Resource Type (OS::TripleO::Network::Management) could not be found.",
 u'status': u'FAILED'}

Revision history for this message
Marius Cornea (mcornea) wrote :

(undercloud) [stack@undercloud-0 ~]$ cat /usr/share/openstack-tripleo-heat-templates/environments/neutron-ovs-dpdk.yaml
# A Heat environment that can be used to deploy DPDK with OVS
# Deploying DPDK requires enabling hugepages for the overcloud nodes
resource_registry:
  OS::TripleO::Services::ComputeNeutronOvsDpdk: ../puppet/services/neutron-ovs-dpdk-agent.yaml

parameter_defaults:
  NeutronDatapathType: "netdev"
  NeutronVhostuserSocketDir: "/var/lib/vhost_sockets"
  NovaSchedulerDefaultFilters: "RamFilter,ComputeFilter,AvailabilityZoneFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,NUMATopologyFilter"
  OvsDpdkDriverType: "vfio-pci"

  #ComputeOvsDpdkParameters:
    ## Host configuration Parameters
    #TunedProfileName: "cpu-partitioning"
    #IsolCpusList: "" # Logical CPUs list to be isolated from the host process (applied via cpu-partitioning tuned).
                                    # It is mandatory to provide isolated cpus for tuned to achieve optimal performance.
                                    # Example: "3-8,12-15,18"
    #KernelArgs: "" # Space separated kernel args to configure hugepage and IOMMU.
                                    # Deploying DPDK requires enabling hugepages for the overcloud compute nodes.
                                    # It also requires enabling IOMMU when using the VFIO (vfio-pci) OvsDpdkDriverType.
                                    # This should be done by configuring parameters via host-config-and-reboot.yaml environment file.

    ## Attempting to deploy DPDK without appropriate values for the below parameters may lead to unstable deployments
    ## due to CPU contention of DPDK PMD threads.
    ## It is highly recommended to enable isolcpus (via KernelArgs) on compute overcloud nodes and set the following parameters:
    #OvsDpdkSocketMemory: "" # Sets the amount of hugepage memory to assign per NUMA node.
                                   # It is recommended to use the socket closest to the PCIe slot used for the
                                   # desired DPDK NIC. Format should be comma separated per socket string such as:
                                   # "<socket 0 mem MB>,<socket 1 mem MB>", for example: "1024,0".
    #OvsPmdCoreList: "" # List or range of CPU cores for PMD threads to be pinned to. Note, NIC
                                   # location to cores on socket, number of hyper-threaded logical cores, and
                                   # desired number of PMD threads can all play a role in configuring this setting.
                                   # These cores should be on the same socket where OvsDpdkSocketMemory is assigned.
                                   # If using hyperthreading then specify both logical cores that would equal the
                                   # physical core. Also, specifying more than one core will trigger multiple PMD
                                   # threads to be spawned, which may improve dataplane performance.
    #NovaVcpuPinSet: "" # Cores to pin Nova instances to. For maximum performance, select cores
                                   # on the sam...

Marius Cornea (mcornea) wrote :

(undercloud) [stack@undercloud-0 ~]$ cat /home/stack/ospd-11-vlan-dpdk-single-port-ctlplane-bonding/network-environment.yaml
resource_registry:
  # Specify the relative/absolute path to the config files you want to use to override the defaults.
  OS::TripleO::Compute::Net::SoftwareConfig: nic-configs/compute.yaml
  OS::TripleO::Controller::Net::SoftwareConfig: nic-configs/controller.yaml
  OS::TripleO::NodeUserData: first-boot.yaml
  OS::TripleO::NodeExtraConfigPost: post-install.yaml

  # Network isolation configuration
  # Service section
  OS::TripleO::Network::External: /usr/share/openstack-tripleo-heat-templates/network/external.yaml
  OS::TripleO::Network::InternalApi: /usr/share/openstack-tripleo-heat-templates/network/internal_api.yaml
  OS::TripleO::Network::Tenant: /usr/share/openstack-tripleo-heat-templates/network/tenant.yaml
  OS::TripleO::Network::Management: OS::Heat::None
  OS::TripleO::Network::StorageMgmt: OS::Heat::None
  OS::TripleO::Network::Storage: OS::Heat::None

  # Port assignments for the VIPs
  OS::TripleO::Network::Ports::ExternalVipPort: /usr/share/openstack-tripleo-heat-templates/network/ports/external.yaml
  OS::TripleO::Network::Ports::InternalApiVipPort: /usr/share/openstack-tripleo-heat-templates/network/ports/internal_api.yaml
  OS::TripleO::Network::Ports::RedisVipPort: /usr/share/openstack-tripleo-heat-templates/network/ports/vip.yaml
  OS::TripleO::Network::Ports::StorageVipPort: /usr/share/openstack-tripleo-heat-templates/network/ports/noop.yaml
  OS::TripleO::Network::Ports::StorageMgmtVipPort: /usr/share/openstack-tripleo-heat-templates/network/ports/noop.yaml

  # Port assignments for the controller role
  OS::TripleO::Controller::Ports::ExternalPort: /usr/share/openstack-tripleo-heat-templates/network/ports/external.yaml
  OS::TripleO::Controller::Ports::InternalApiPort: /usr/share/openstack-tripleo-heat-templates/network/ports/internal_api.yaml
  OS::TripleO::Controller::Ports::TenantPort: /usr/share/openstack-tripleo-heat-templates/network/ports/tenant.yaml
  OS::TripleO::Controller::Ports::ManagementPort: /usr/share/openstack-tripleo-heat-templates/network/ports/noop.yaml
  OS::TripleO::Controller::Ports::StoragePort: /usr/share/openstack-tripleo-heat-templates/network/ports/noop.yaml
  OS::TripleO::Controller::Ports::StorageMgmtPort: /usr/share/openstack-tripleo-heat-templates/network/ports/noop.yaml

  # Port assignments for the compute role
  OS::TripleO::Compute::Ports::ExternalPort: /usr/share/openstack-tripleo-heat-templates/network/ports/external.yaml
  OS::TripleO::Compute::Ports::InternalApiPort: /usr/share/openstack-tripleo-heat-templates/network/ports/internal_api.yaml
  OS::TripleO::Compute::Ports::TenantPort: /usr/share/openstack-tripleo-heat-templates/network/ports/tenant.yaml
  OS::TripleO::Compute::Ports::ManagementPort: /usr/share/openstack-tripleo-heat-templates/network/ports/noop.yaml
  OS::TripleO::Compute::Ports::StoragePort: /usr/share/openstack-tripleo-heat-templates/network/ports/noop.yaml
  OS::TripleO::Compute::Ports::StorageMgmtPort: /usr/share/openstack-tripleo-heat-templates/network/ports/noop.yaml

  # Port assignments for service virtual IPs for the controller role
...


Changed in tripleo:
milestone: none → queens-2
importance: Undecided → Critical
status: New → Triaged
tags: added: pike-backport-potential upgrade
Steven Hardy (shardy) wrote :

Hmm, so it's a bit confusing, since /home/stack/ospd-11-vlan-dpdk-single-port-ctlplane-bonding/network-environment.yaml does appear to contain OS::TripleO::Network::Management: OS::Heat::None

That said, I would expect this mapping to come from the rendered default environment (e.g. overcloud-resource-registry-puppet.j2.yaml), so in theory those mappings should not be needed:

https://github.com/openstack/tripleo-heat-templates/blob/master/overcloud-resource-registry-puppet.j2.yaml#L81

If it's possible to reproduce this, some more information would be helpful, specifically:

1. The rendered templates (the tarball from openstack overcloud plan export would provide this)

2. The output from openstack stack environment show overcloud

3. The Heat/Mistral logs from the undercloud

(Note: the data above will include sensitive data such as passwords, so it should be sanitized if this is a non-test environment)

Marius Cornea (mcornea) wrote :

Attaching requested info.

Steven Hardy (shardy) wrote :

OK, so I debugged and reproduced this issue. The problem occurs when a resource_registry mapping to OS::Heat::None is mixed with mappings that reference paths, e.g.:

(undercloud) [stack@undercloud ~]$ cat shtest/shtest.yaml
resource_registry:
  OS::TripleO::Network::Management: OS::Heat::None
  OS::TripleO::NodeUserData: first-boot.yaml
  OS::TripleO::Network::External: /usr/share/openstack-tripleo-heat-templates/network/external.yaml

Then deploying like this reproduces the issue:

openstack overcloud deploy --templates tripleo-heat-templates -e kubernetes.yaml -r roles_data_k8s.yaml -e tripleo-heat-templates/environments/config-download-environment.yaml -e shtest/shtest.yaml

The client incorrectly prepends a path to OS::Heat::None, so the mapping ends up like this:

OS::TripleO::Network::Management: /home/stack/shtest/OS::Heat::None

So in tripleoclient we need to detect that a resource_registry value isn't a path, and in that case leave it alone rather than converting it to an absolute path by prepending tht_root.
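The needed check can be sketched roughly as follows. This is a hypothetical illustration with made-up helper names, not the actual tripleoclient code: it treats any value containing '::' as a resource type (e.g. OS::Heat::None) and only prepends tht_root to relative file paths.

```python
import os

# Hypothetical sketch of the check: only rewrite a resource_registry value
# to an absolute path when it is a relative template path, never when it is
# a resource type like OS::Heat::None.

def needs_rewrite(value):
    """Return True if a resource_registry value is a relative template path."""
    # Resource types such as OS::Heat::None contain '::' and are not files.
    if '::' in value:
        return False
    # Absolute paths are already fully qualified.
    return not os.path.isabs(value)

def rewrite_registry(registry, tht_root):
    """Prepend tht_root only to values that are relative template paths."""
    return {
        key: os.path.join(tht_root, value) if needs_rewrite(value) else value
        for key, value in registry.items()
    }

registry = {
    'OS::TripleO::Network::Management': 'OS::Heat::None',
    'OS::TripleO::NodeUserData': 'first-boot.yaml',
    'OS::TripleO::Network::External':
        '/usr/share/openstack-tripleo-heat-templates/network/external.yaml',
}
rewritten = rewrite_registry(registry, '/home/stack/shtest')
# Only first-boot.yaml gains the tht_root prefix; the resource type and the
# absolute path are left untouched.
```

With the buggy behavior, every value got the prefix, which is exactly how OS::TripleO::Network::Management ended up mapped to /home/stack/shtest/OS::Heat::None above.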

Changed in tripleo:
assignee: nobody → Steven Hardy (shardy)
status: Triaged → Confirmed
importance: Critical → High
OpenStack Infra (hudson-openstack) wrote : Fix proposed to python-tripleoclient (master)

Fix proposed to branch: master
Review: https://review.openstack.org/517040

Changed in tripleo:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to python-tripleoclient (master)

Reviewed: https://review.openstack.org/517040
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=c5adb1c6551a609d6786853747407f991c2fb274
Submitter: Zuul
Branch: master

commit c5adb1c6551a609d6786853747407f991c2fb274
Author: Steven Hardy <email address hidden>
Date: Wed Nov 1 16:34:42 2017 +0000

    Don't rewrite resource_registry values that aren't paths

    If you map to e.g OS::Heat::None, the current code adds an unwanted
    path prefix to these entries if the environment triggers this legacy
    fallback path (e.g it references files outside of tht_root).

    Change-Id: Id591c1a119c3471b599dcaddb363f3d353d25fff
    Closes-Bug: #1729039

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to python-tripleoclient (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/519352

OpenStack Infra (hudson-openstack) wrote : Fix merged to python-tripleoclient (stable/pike)

Reviewed: https://review.openstack.org/519352
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=fc03b76a310a1faee170567ffd690892ec8f1047
Submitter: Zuul
Branch: stable/pike

commit fc03b76a310a1faee170567ffd690892ec8f1047
Author: Steven Hardy <email address hidden>
Date: Wed Nov 1 16:34:42 2017 +0000

    Don't rewrite resource_registry values that aren't paths

    If you map to e.g OS::Heat::None, the current code adds an unwanted
    path prefix to these entries if the environment triggers this legacy
    fallback path (e.g it references files outside of tht_root).

    Change-Id: Id591c1a119c3471b599dcaddb363f3d353d25fff
    Closes-Bug: #1729039
    (cherry picked from commit c5adb1c6551a609d6786853747407f991c2fb274)

tags: added: in-stable-pike
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/python-tripleoclient 7.3.5

This issue was fixed in the openstack/python-tripleoclient 7.3.5 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/python-tripleoclient 8.1.0

This issue was fixed in the openstack/python-tripleoclient 8.1.0 release.
