Control plane gets modified by overcloud deployment

Bug #1852509 reported by Adam Ratcliff
This bug affects 1 person
Affects   Status    Importance   Assigned to    Milestone
tripleo   Triaged   Medium       Kevin Carter

Bug Description

During a TripleO Rocky deployment with network isolation, after the heat stack is created, the Ansible configuration step modifies the networking on the nodes. In my latest deployment this caused loops and broke the control plane. My previous deployments had various untraceable failures, such as docker pull failures and missing or unauthorized endpoints, which I am recklessly attributing to the same cause.

In my cloud the control plane is always supposed to be on nic1, which goes to an access port tagged on the switch to vlan130. nic2 carries the other traffic via ovs-bridge vlan tagged into a trunk port on the switch.

In my latest deployment, the Ansible configuration on some nodes added the control plane IP address to the ovs-bridge attached to nic2 (there are 2 NICs), creating a mess of loops and causing problems communicating with the director. That was lucky, because it clearly indicated a problem with the control plane, so I went to investigate. Previous failed deployments had left me completely confused because their symptoms did not point to the real cause.

I remember seeing it written somewhere that the control plane should be on nic1, and I didn't expect overcloud deployment to make changes to that.

Expected behaviour: Overcloud deployment does not change the control plane network configuration.
Actual result: The control plane IP address was added to the OVS bridge on a controller node (and possibly others), creating loops.

Steps to reproduce:
It might be hard to force a clear case of this: I ran over 20 deployments, all failing in different ways. To try it, configure single-nic-vlans and leave the control plane under ovs_bridge like this:
         params:
            $network_config:
              network_config:
              - type: ovs_bridge
                name: bridge_name
                use_dhcp: false
                dns_servers:
                  get_param: DnsServers
                domain:
                  get_param: DnsSearchDomains
                addresses:
                - ip_netmask:
                    list_join:
                    - /
                    - - get_param: ControlPlaneIp
                      - get_param: ControlPlaneSubnetCidr
                routes:
                - ip_netmask: 169.254.169.254/32
                  next_hop:
                    get_param: EC2MetadataIp
                members:
                - type: interface
                  name: nic2...
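
For contrast, the layout described earlier in this report (control plane directly on nic1, with only the tagged traffic on the bridge over nic2) might be expressed roughly as follows. This is a minimal sketch and not a tested template: the InternalApi VLAN member is included only as an example of a tagged network, and its parameter names are assumed from the stock TripleO network templates rather than taken from this deployment.

```yaml
# Hypothetical sketch, not a tested template.
network_config:
# Control plane address lives on nic1 only (access port on the switch).
- type: interface
  name: nic1
  use_dhcp: false
  dns_servers:
    get_param: DnsServers
  addresses:
  - ip_netmask:
      list_join:
      - /
      - - get_param: ControlPlaneIp
        - get_param: ControlPlaneSubnetCidr
  routes:
  - ip_netmask: 169.254.169.254/32
    next_hop:
      get_param: EC2MetadataIp
# The bridge on nic2 carries no control plane address, only tagged VLANs.
- type: ovs_bridge
  name: bridge_name
  use_dhcp: false
  members:
  - type: interface
    name: nic2
    primary: true
  # Example tagged network (assumed); other isolated networks would follow the same pattern.
  - type: vlan
    vlan_id:
      get_param: InternalApiNetworkVlanID
    addresses:
    - ip_netmask:
        get_param: InternalApiIpSubnet
```

The point of the sketch is only that ControlPlaneIp appears under nic1 and nowhere under the bridge, which is the separation the expected behaviour above relies on.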

Environment:
Rocky, TripleO Ceph, Neutron with Open vSwitch, network isolation.

http://paste.openstack.org/show/786082/

Adam Ratcliff (adamjr) wrote :

cleaned ip and mac addresses from config

wes hayutin (weshayutin) wrote :

Kevin, can you please help me get this into the right hands?
Thanks

Changed in tripleo:
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → Kevin Carter (kevin-carter)
milestone: none → ussuri-3

wes hayutin (weshayutin)
Changed in tripleo:
milestone: ussuri-3 → ussuri-rc3

wes hayutin (weshayutin)
Changed in tripleo:
milestone: ussuri-rc3 → victoria-1

Changed in tripleo:
milestone: victoria-1 → victoria-3