Comment 0 for bug 1791238

Harald Jensås (harald-jensas) wrote :

In the containerized undercloud, re-running the install removes the undercloud_admin_host and undercloud_public_host IP addresses if the os-net-config configuration has changed.

os-net-config restarts the br-ctlplane interface, which removes the undercloud_admin_host and undercloud_public_host IP addresses set up by keepalived. The install/update operation then fails later on because services cannot connect to an IP address that is no longer there.

Reproduce:

1. Deploy undercloud with the following configuration

[DEFAULT]

enable_routed_networks = false
enable_tempest = false
enable_ui = false
inspection_interface = br-ctlplane
ipxe_enabled = true
local_interface = eth1
local_ip = 172.20.0.200/26
local_mtu = 1500
local_subnet = ctlplane-subnet
overcloud_domain_name = localdomain
scheduler_max_attempts = 3
subnets = ctlplane-subnet
undercloud_admin_host = 172.20.0.201
undercloud_debug = true
undercloud_hostname = container-undercloud.lab.example.com
undercloud_nameservers = 172.20.0.254
undercloud_ntp_servers = 0.se.pool.ntp.org
undercloud_public_host = 172.20.0.203

[ctlplane-subnet]
cidr = 172.20.0.192/26
dhcp_start = 172.20.0.210
dhcp_end = 172.20.0.219
inspection_iprange = 172.20.0.220,172.20.0.229
gateway = 172.20.0.254
masquerade = true

2. Change the undercloud_nameservers address in [DEFAULT]

sed -i 's/undercloud_nameservers = 172.20.0.254/undercloud_nameservers = 192.168.122.1/g' /home/stack/undercloud.conf

3. Re-run undercloud install

openstack undercloud install
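
Step 2 above depends on knowing the old nameserver value. As a minimal sketch, the same edit can be made by matching the whole option line instead. The helper name set_undercloud_nameservers is hypothetical and not part of any TripleO tooling; it assumes the option is written as "undercloud_nameservers = <value>" at the start of a line, as in the configuration shown in step 1.

```shell
# Hypothetical helper (not part of TripleO): set undercloud_nameservers in a config file.
# $1: path to undercloud.conf   $2: new nameserver address
set_undercloud_nameservers() {
    conf=$1
    new=$2
    tmp=$(mktemp) || return 1
    # Match the whole option line so the old value does not need to be known.
    if sed "s/^undercloud_nameservers = .*/undercloud_nameservers = ${new}/" "$conf" > "$tmp"; then
        cat "$tmp" > "$conf"   # write back in place, keeping the file's owner and mode
        rc=$?
    else
        rc=1
    fi
    rm -f "$tmp"
    return "$rc"
}

# Example usage (path taken from the report):
#   set_undercloud_nameservers /home/stack/undercloud.conf 192.168.122.1
#   grep '^undercloud_nameservers' /home/stack/undercloud.conf
```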

RESULTS:

1. The os-net-config config.json is updated with the new DNS server.

Every 5.0s: diff -aur /etc/os-net-config/config.json /tmp/os-net-config.json.orig Fri Sep 7 08:51:26 2018

--- /etc/os-net-config/config.json 2018-09-07 08:45:39.054174371 +0200
+++ /tmp/os-net-config.json.orig 2018-09-07 08:17:38.597808977 +0200
@@ -1 +1 @@
-{"network_config": [{"addresses": [{"ip_netmask": "172.20.0.200/26"}], "dns_servers": ["192.168.122.1"], "members": [{"mtu": 1500, "name": "eth1", "primary": true, "type": "interface"}], "name": "br-ctlplane", "ovs_extra": ["br-set-external-id br-ctlplane bridge-id br-ctlplane"], "routes": [], "type": "ovs_bridge", "use_dhcp": false}]}
+{"network_config": [{"addresses": [{"ip_netmask": "172.20.0.200/26"}], "dns_servers": ["172.20.0.254"], "members": [{"mtu": 1500, "name": "eth1", "primary": true, "type": "interface"}], "name": "br-ctlplane", "ovs_extra": ["br-set-external-id br-ctlplane bridge-id br-ctlplane"], "routes": [], "type": "ovs_bridge", "use_dhcp": false}]}

2. After os-net-config applies the config, the keepalived VIPs are gone:

Every 2.0s: ip addr show br-ctlplane Fri Sep 7 08:51:08 2018

47: br-ctlplane: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 52:54:00:7a:f6:c5 brd ff:ff:ff:ff:ff:ff
    inet 172.20.0.200/26 brd 172.20.0.255 scope global br-ctlplane
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe7a:f6c5/64 scope link
       valid_lft forever preferred_lft forever

3. The upgrade is stuck on starting the containers:

TASK [Start containers for step 3] **********************************************

4. Logs show that services are failing to connect to the database via the keepalived VIPs:
/var/log/containers/nova/nova-compute.log:2018-09-07 08:52:47.462 6 ERROR oslo_service.periodic_task RemoteError: Remote error: DBConnectionError (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.20.0.201' ([Errno 113] EHOSTUNREACH)") (Background on this error at: http://sqlalche.me/e/e3q8)
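
The checks behind results 2 and 4 can be sketched as two small shell helpers: one that tests whether a given address is still configured on an interface, and one that pulls the unreachable database host out of the service logs. The function names vip_present and unreachable_db_hosts are hypothetical and only for illustration. Both read from stdin, so they can be tried against saved output as well as on a live undercloud.

```shell
# Hypothetical helpers for illustration; the names are not from any TripleO tool.

# vip_present ADDR: succeed if ADDR is configured as an IPv4 address.
# Reads the output of `ip addr show <dev>` on stdin.
vip_present() {
    awk -v want="$1" '
        $1 == "inet" { split($2, parts, "/"); if (parts[1] == want) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# unreachable_db_hosts: print each distinct host named in
# "Can't connect to MySQL server on '<host>'" log lines read on stdin.
unreachable_db_hosts() {
    sed -n "s/.*Can't connect to MySQL server on '\([^']*\)'.*/\1/p" | sort -u
}

# Example usage on the undercloud (addresses and paths taken from the report):
#   for vip in 172.20.0.201 172.20.0.203; do
#       ip addr show br-ctlplane | vip_present "$vip" || echo "VIP $vip missing"
#   done
#   unreachable_db_hosts < /var/log/containers/nova/nova-compute.log
```

If a VIP is reported missing and the same address shows up in the list of unreachable database hosts, that matches the failure described in this report.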