Adding new infra host is failing

Bug #1874067 reported by YG Kumar
This bug affects 1 person
Affects: OpenStack-Ansible
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Hi,

I have a Train OpenStack-Ansible setup deployed from git branch 20.0.2. This setup has two infra hosts and is working fine. When I try to add a third infra host and run the following playbook command:
--------------

$:openstack-ansible setup-hosts.yml --limit "c3v*"
-------------

It adds the new host to the inventory successfully, but when it creates the containers on this host, it fails to attach the containers' "eth0" interface to the lxcbr0 bridge on the host. As a result, the default route is missing inside the containers, and the playbook fails when it tries to download and update apt packages, because there is no internet access via the host. I have pasted the route information of one of these containers below:
-----------

root@c3v-glance-container-95beb229:~# ip route show
172.29.236.0/22 dev eth1 proto kernel scope link src 172.29.239.119
172.29.244.0/22 dev eth2 proto kernel scope link src 172.29.244.220
--------------

As you can see, the default route is missing, and so is the "eth0" interface, as shown below:

---------
root@c3v-glance-container-95beb229:~# ip -4 a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
185: eth1@if186: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link-netnsid 0
    inet 172.29.239.119/22 brd 172.29.239.255 scope global eth1
       valid_lft forever preferred_lft forever
187: eth2@if188: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link-netnsid 0
    inet 172.29.244.220/22 brd 172.29.247.255 scope global eth2
       valid_lft forever preferred_lft forever
----------------

Below is the route info from a container on one of the working hosts:
----------
root@c1v-glance-container-24baf013:~# ip route show
default via 10.0.3.1 dev eth0 proto dhcp src 10.0.3.141 metric 20
10.0.3.0/24 dev eth0 proto kernel scope link src 10.0.3.141
10.0.3.1 dev eth0 proto dhcp scope link src 10.0.3.141 metric 20
172.29.236.0/22 dev eth1 proto kernel scope link src 172.29.237.115
172.29.244.0/22 dev eth2 proto kernel scope link src 172.29.246.95
----------
root@c1v-glance-container-24baf013:~# ip -4 a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
96: eth0@if97: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link-netnsid 0
    inet 10.0.3.141/24 brd 10.0.3.255 scope global dynamic eth0
       valid_lft 2133sec preferred_lft 2133sec
98: eth1@if99: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link-netnsid 0
    inet 172.29.237.115/22 brd 172.29.239.255 scope global eth1
       valid_lft forever preferred_lft forever
100: eth2@if101: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link-netnsid 0
    inet 172.29.246.95/22 brd 172.29.247.255 scope global eth2
       valid_lft forever preferred_lft forever
------------------
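
For reference, a quick way to check whether a container's eth0 veth actually got attached to lxcbr0 is to compare the bridge membership on the host with the container's LXC config. This is only a rough sketch; the container name is the failing one from above, and the config path assumes the default /var/lib/lxc location:
----------
# on the affected infra host (c3v)
brctl show lxcbr0                          # veth interfaces currently attached to the bridge
ip link show master lxcbr0                 # same information via iproute2
grep lxc.net /var/lib/lxc/c3v-glance-container-95beb229/config   # eth0 should be linked to lxcbr0
lxc-attach -n c3v-glance-container-95beb229 -- ip -4 a           # re-check addresses from inside the container
----------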

I tried removing the Ansible facts and the inventory and rerunning the playbook, but the issue persists. The OS on both the host and the containers is Ubuntu 18.04.
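
In case it helps with reproducing, this is roughly what I did to reset the cached facts and the generated inventory before rerunning. Paths and the inventory-manage.py options assume a stock Train deployment; check --help before using -r:
----------
# on the deployment host
rm -f /etc/openstack_deploy/ansible_facts/*                      # drop the cached Ansible facts
/opt/openstack-ansible/scripts/inventory-manage.py -l            # list hosts in the generated inventory
/opt/openstack-ansible/scripts/inventory-manage.py -r <item>     # remove the stale entries for the new host
openstack-ansible setup-hosts.yml --limit "c3v*"                 # rerun against the new host only
----------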

Thanks
Kumar

Revision history for this message
YG Kumar (ygk-kmr) wrote :

Hi,

It seems that the "lxc-dnsmasq.slice" unit is in a faulty state:
-------------
root@c3v:~# systemctl status lxc-dnsmasq.slice
● lxc-dnsmasq.slice
   Loaded: loaded
   Active: active since Tue 2020-04-21 09:47:43 EDT; 4min 20s ago
    Tasks: 0
   CGroup: /lxc.slice/lxc-dnsmasq.slice

Apr 21 09:52:01 c3v lxc-system-manage[22528]: Removing LXC IPtables rules.
Apr 21 09:52:01 c3v lxc-system-manage[22528]: LXC IPtables rules removed.
Apr 21 09:52:03 c3v lxc-system-manage[22595]: Creating LXC IPtables rules.
Apr 21 09:52:03 c3v lxc-system-manage[22595]: LXC IPtables rules created.
Apr 21 09:52:03 c3v lxc-system-manage[22620]: Starting LXC dnsmasq.
Apr 21 09:52:03 c3v lxc-system-manage[22620]: dnsmasq: failed to create listening socket for 10.0.3.1: Address already in use
Apr 21 09:52:03 c3v dnsmasq[22624]: failed to create listening socket for 10.0.3.1: Address already in use
Apr 21 09:52:03 c3v dnsmasq[22624]: FAILED to start up
-----------------------
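
Note that the unit queried above is the slice, which still reports "active" even though Tasks is 0; the daemon itself runs as lxc-dnsmasq.service inside that slice (it shows up in the CGroup tree once things are healthy, see below). A more direct way to see the failure, assuming that service name:
----------
systemctl status lxc-dnsmasq.service       # the service itself, not the enclosing slice
journalctl -u lxc-dnsmasq.service -n 50    # recent log lines, including the bind failure
----------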

I killed the conflicting process as shown below and then restarted the service, and now it works fine.

-------------
root@c3v:~# netstat -alpn | grep 10.0.3.1
tcp 0 0 10.0.3.1:53 0.0.0.0:* LISTEN 5547/named
udp 0 0 10.0.3.1:53 0.0.0.0:* 5547/named
root@c3v:~# kill -9 5547
root@c3v:~# systemctl restart lxc-dnsmasq.slice
root@c3v:~# systemctl status lxc-dnsmasq.slice
● lxc-dnsmasq.slice
   Loaded: loaded
   Active: active since Tue 2020-04-21 09:52:36 EDT; 5s ago
    Tasks: 1
   CGroup: /lxc.slice/lxc-dnsmasq.slice
           └─lxc-dnsmasq.service
             └─24212 dnsmasq --user=lxc-dnsmasq --pid-file=/run/lxc/dnsmasq.pid --conf-file= --listen-address=10.0.3.1 --dhcp-range=10.0.3.2,10.0.3.253 --dhcp-option=6,10.0.3.1 --dhcp-

Apr 21 09:52:36 c3v lxc-system-manage[24197]: LXC IPtables rules created.
Apr 21 09:52:36 c3v lxc-system-manage[24208]: Starting LXC dnsmasq.
Apr 21 09:52:36 c3v dnsmasq[24212]: started, version 2.79 cachesize 150
Apr 21 09:52:36 c3v dnsmasq[24212]: compile time options: IPv6 GNU-getopt DBus i18n IDN DHCP DHCPv6 no-Lua TFTP conntrack ipset auth DNSSEC loop-detect inotify
Apr 21 09:52:36 c3v dnsmasq-dhcp[24212]: DHCP, IP range 10.0.3.2 -- 10.0.3.253, lease time 1h
Apr 21 09:52:36 c3v lxc-system-manage[24208]: dnsmasq started.
------------------------------------
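
Killing named is only a temporary workaround, since it will grab 10.0.3.1:53 again the next time it starts before lxc dnsmasq. If named is genuinely needed on this host, a more durable option is to restrict its listen addresses so it stays off the lxcbr0 address. This is a sketch assuming Ubuntu's /etc/bind/named.conf.options; the address list is an example and should match whatever named actually serves:
----------
# /etc/bind/named.conf.options (excerpt)
options {
    listen-on { 127.0.0.1; 172.29.236.0/22; };   # example: loopback plus the management network, not 10.0.3.1
};
# then restart named and confirm the bridge address is free for lxc dnsmasq
systemctl restart bind9
ss -lntup | grep '10.0.3.1:53'
----------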

Changed in openstack-ansible:
status: New → Invalid
Revision history for this message
YG Kumar (ygk-kmr) wrote :

You can close the issue
