RHOSP-Undercloud upgrade failed

Bug #1792036 reported by shajuvk
This bug affects 1 person
Affects            Status         Importance  Assigned to  Milestone
Juniper Openstack  Won't Fix      High        shajuvk
R3.2               Fix Committed  High        shajuvk

Bug Description

Red Hat undercloud upgrade failed while upgrading from 7.3 to 7.5.

support case: https://access.redhat.com/support/cases/#/case/02120426

Error:
====

2018-06-13 20:17:03,175 INFO: 2018-06-13 20:17:03 - Could not retrieve fact='current_nova_host', resolution='<anonymous>': uninitialized constant Tempfile
2018-06-13 20:17:04,404 INFO: 2018-06-13 20:17:04 - Error: Puppet::Parser::AST::Resource failed with error ArgumentError: Invalid resource type sysctl::value at /etc/puppet/manifests/puppet-stack-config.pp:24 on node undercloud.example.com

Solution provided by RH:
=============
Reinstall the packages whose files are missing. See below for details.

    # for rpm in $(rpm -Va 2>/dev/null| awk '/missing/ {print $NF}'); do rpm -qf $rpm 2>/dev/null; done;

This will take a few minutes to complete and should output a list of rpm names (with a lot of duplicates). These are the RPMs that need to be reinstalled.

=====

Shaju,

  I found an explanation for this neutron issue. It appears that, in Ocata (11), the CIDR default was changed.

undercloud.conf
------------------------
# Network CIDR for the Neutron-managed network for Overcloud
# instances. This should be the subnet used for PXE booting. The
# current default for this value is 192.0.2.0/24, but this is
# deprecated due to it being a non-routable CIDR under RFC 5737. The
# default value for this option will be changed in the Ocata release.
# A different, valid CIDR should be selected to avoid problems. If an
# overcloud has already been deployed with the 192.0.2.0/24 CIDR and
# therefore the CIDR cannot be changed, you must set this option to
# 192.0.2.0/24 explicitly to avoid it changing in future releases, and
# all other network options related to the CIDR (e.g. local_ip) must
# also be set to maintain a valid configuration. (string value)
#network_cidr = 192.0.2.0/24
------------------------

As per this comment in the undercloud.conf, if you have already deployed your environment, you will need to revert the network back to the original 192.0.2.0/24 subnet.

Can you modify your undercloud.conf file, and uncomment the following parameters:

  'local_ip'
  'network_gateway'
  'undercloud_public_vip'
  'undercloud_admin_vip'
  'network_cidr'
  'masquerade_network'
  'dhcp_start'
  'dhcp_end'
  'inspection_iprange'

Make sure they are pointing to the correct IP address in that subnet. Once complete, you can retry the deploy.
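
Before retrying, it may help to confirm which of these options are still commented out. A minimal sketch, assuming undercloud.conf sits in the stack user's home directory (adjust the path if yours differs); the function name is made up for illustration and is not part of any Red Hat tooling:

```shell
# Print each of the nine network options that has no uncommented
# "name = value" line in the given undercloud.conf.
# (Hypothetical helper; the option names are the ones listed above.)
unset_network_options() {
    conf=$1
    for opt in local_ip network_gateway undercloud_public_vip \
               undercloud_admin_vip network_cidr masquerade_network \
               dhcp_start dhcp_end inspection_iprange; do
        # A line starting with '#' does not match, so commented-out
        # defaults are reported as unset.
        grep -Eq "^[[:space:]]*${opt}[[:space:]]*=" "$conf" || echo "$opt"
    done
}

# Example: unset_network_options ~/undercloud.conf
```

An empty result means all nine options are set explicitly. It does not check that the values actually fall inside 192.0.2.0/24, so that still needs a manual look.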

Thanks,
  Jack Waterworth, Red Hat Certified Architect
  Principal OpenStack Technical Support Engineer
  Red Hat Enterprise Cloud Support North America
Waterworth, Jack on Jul 20 2018 at 02:37 PM -06:00
Thanks, Shaju.

How did you manage to get past the original puppet error? All the logs pointed to the same issue as far as I can tell. Did you make any adjustments to the files I mentioned?

The new error indicates you're trying to change a subnet to a new address. The new one is 192.168.24.0/24; the old one is 192.0.2.0/24. Is this still during the undercloud upgrade? Did you make any modifications to your templates at all that would explain this change?

I'm digging into this new data now.

-- Jack

VK, Shaju on Jul 20 2018 at 12:38 PM -06:00
Thank you Jack. Appreciate your help.
The upgrade didn't hit the previously mentioned puppet issue. It failed because of a neutron network IP mismatch.

I've attached the output of the for loop, the filtered output with duplicates removed, the instack file, the sosreport -a output, and undercloud.conf. I hope this helps.
I am available for online debugging. Please let me know if you would like to take a look.

[stack@undercloud ~]$ hiera sysctl_settings
nil
[stack@undercloud ~]$ ls -l /usr/share/openstack-puppet/
total 4
drwxr-xr-x. 2 root root 6 May 1 2017 cd
drwxr-xr-x. 52 root root 4096 Jun 28 23:02 modules
[stack@undercloud ~]$

Thanks,
Shaju
Waterworth, Jack on Jul 20 2018 at 08:55 AM -06:00

======
Shaju,

  Here is the error that is preventing you from upgrading your environment:

-------------------------
2018-07-18 18:30:54,069 INFO: 2018-07-18 18:30:54 - Error: Puppet::Parser::AST::Resource failed with error ArgumentError: Invalid resource type sysctl::value at /etc/puppet/manifests/puppet-stack-config.pp:24 on node undercloud.example.com
-------------------------

Drilling down into the puppet files, I can see that this is the line that is failing:

-------------------------
 24 ensure_resource('sysctl::value', 'net.ipv4.ip_forward', { 'value' => 1 })
-------------------------

The error seems to indicate that puppet does not understand the resource type "sysctl::value". This should be understood by puppet on the undercloud.

Looking into your sosreport, I noticed that you have a lot of missing puppet files on the system, including the files related to sysctl.

-------------------------
missing /usr/share/openstack-puppet/modules/sysctl
missing /usr/share/openstack-puppet/modules/sysctl/Gemfile
missing /usr/share/openstack-puppet/modules/sysctl/README.md
missing /usr/share/openstack-puppet/modules/sysctl/Rakefile
missing /usr/share/openstack-puppet/modules/sysctl/lib
missing /usr/share/openstack-puppet/modules/sysctl/lib/puppet
missing /usr/share/openstack-puppet/modules/sysctl/lib/puppet/provider
missing /usr/share/openstack-puppet/modules/sysctl/lib/puppet/provider/sysctl
missing /usr/share/openstack-puppet/modules/sysctl/lib/puppet/provider/sysctl/parsed.rb
missing /usr/share/openstack-puppet/modules/sysctl/lib/puppet/provider/sysctl_runtime
missing /usr/share/openstack-puppet/modules/sysctl/lib/puppet/provider/sysctl_runtime/sysctl_runtime.rb
missing /usr/share/openstack-puppet/modules/sysctl/lib/puppet/type
missing /usr/share/openstack-puppet/modules/sysctl/lib/puppet/type/sysctl.rb
missing /usr/share/openstack-puppet/modules/sysctl/lib/puppet/type/sysctl_runtime.rb
missing /usr/share/openstack-puppet/modules/sysctl/manifests
missing /usr/share/openstack-puppet/modules/sysctl/manifests/base.pp
missing /usr/share/openstack-puppet/modules/sysctl/manifests/value.pp
missing /usr/share/openstack-puppet/modules/sysctl/manifests/values.pp
missing /usr/share/openstack-puppet/modules/sysctl/metadata.json
-------------------------

In fact, there are 601 missing files on your system that the rpm database believes should exist. I checked your RPMs and the correct packages are installed:

-------------------------
puppet-sysctl-0.0.11-1.el7ost.noarch Fri Apr 28 18:33:05 2017
-------------------------

I suspect reinstalling these packages, and restoring the missing files, will help to resolve your issue.

Could you run the following command as the root user:

    # for rpm in $(rpm -Va 2>/dev/null| awk '/missing/ {print $NF}'); do rpm -qf $rpm 2>/dev/null; done;

This will take a few minutes to complete and should output a list of rpm names (with a lot of duplicates). These are the RPMs that need to be reinstalled. Could you upload this list to the case?
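
The duplicates come from the loop querying rpm once per missing file. A sketch of an equivalent pipeline that prints each owning package only once; the function names are illustrative, `xargs -r` assumes GNU xargs as shipped with RHEL, and paths containing spaces are not handled:

```shell
# Read `rpm -Va` output on stdin and print the path (last field)
# of every line whose first field is "missing".
missing_paths() {
    awk '$1 == "missing" { print $NF }'
}

# List each package that owns at least one missing file, once.
# Run as root so that rpm can verify every file.
packages_to_reinstall() {
    rpm -Va 2>/dev/null | missing_paths | xargs -r rpm -qf 2>/dev/null | sort -u
}

# Example (as root): packages_to_reinstall > /tmp/reinstall.txt
```

The output file name is a placeholder; any location that can be attached to the case works.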

I would also like to see the output of the following command:

    # hiera sysctl_settings

As we attempt RCA, it may also be good to determine exactly how deep the missing files go. Could you run the following command:

    # ls -l /usr/share/openstack-puppet/

Once we have this data, we can work on reinstalling the rpms and retrying the upgrade.

The command we will use to reinstall the rpms is:

    # yum reinstall <packagename>
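
With the package names collected in a file, the reinstall can be assembled as one command and reviewed before running it. A sketch only; the helper name and the file name are placeholders:

```shell
# Read package names on stdin, drop blank lines and duplicates, and
# print the yum command instead of running it, so it can be checked first.
reinstall_command() {
    sed '/^[[:space:]]*$/d' | sort -u | xargs -r echo yum reinstall
}

# Example: reinstall_command < /tmp/reinstall.txt
```

For a very long list, xargs may split the output into more than one `yum reinstall` line; each line can be run on its own as root.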

If you need a remote session to help you through these few steps, let me know and I'll be happy to join.

Thanks,
  Jack Waterworth, Red Hat Certified Architect
  Principal OpenStack Technical Support Engineer
  Red Hat Enterprise Cloud Support North America

Tags: upgrade rhosp
shajuvk (shajuvk)
Changed in juniperopenstack:
importance: Undecided → High
information type: Proprietary → Public
Jeba Paulaiyan (jebap)
tags: added: upgrade
Changed in juniperopenstack:
status: New → Won't Fix
shajuvk (shajuvk) wrote :

If the system is on the 192.0.2.x network, please use IP addresses from that range in undercloud.conf.

As per this comment in the undercloud.conf, if you have already deployed your environment, you will need to revert the network back to the original 192.0.2.0/24 subnet.

Can you modify your undercloud.conf file, and uncomment the following parameters:

  'local_ip'
  'network_gateway'
  'undercloud_public_vip'
  'undercloud_admin_vip'
  'network_cidr'
  'masquerade_network'
  'dhcp_start'
  'dhcp_end'
  'inspection_iprange'

Thanks,
Shaju
