On restoring vrouter service, haproxy sometimes fails to start for netns SI

Bug #1374395 reported by Vedamurthy Joshi
Affects: Juniper Openstack (status tracked in Trunk)

| Affects | Status  | Importance | Assigned to           | Milestone |
| R1.1    | Invalid | High       | Divakar Dharanalakota |           |
| Trunk   | Invalid | High       | Divakar Dharanalakota |           |

Bug Description

R1.10 Build 41 (3 config nodes, 3 compute nodes), Ubuntu Havana setup

The 3 compute nodes are nodeh4, nodeh5, nodeg29

200.1.1.2 and 200.1.1.3 are two backend VMs, on nodeh4 and nodeh5, serving the pool sshpool1.
Initially, the active/passive haproxy instances for the pool were on nodeh5 and nodeg29.

I stopped the vrouter agent service on nodeh5 and then on nodeg29.

I then brought the agent back up on nodeh5 and nodeg29, and restarted the agent on nodeh4.

After this, haproxy for sshpool1 was not running on nodeh5 (nor on nodeh4); it was running only on nodeg29.
Divakar is aware of this.

Logs will be in http://10.204.216.50/Docs/bugs/#

In the agent introspect on nodeh5, we see this:

Traceback (most recent call last):
  File "/usr/bin/opencontrail-vrouter-netns", line 9, in <module>
    load_entry_point('opencontrail-vrouter-netns==0.1', 'console_scripts', 'opencontrail-vrouter-netns')()
  File "/usr/lib/python2.7/dist-packages/opencontrail_vrouter_netns/vrouter_netns.py", line 436, in main
    vrouter_netns.args.func()
  File "/usr/lib/python2.7/dist-packages/opencontrail_vrouter_netns/vrouter_netns.py", line 395, in create
    netns_mgr.destroy()
  File "/usr/lib/python2.7/dist-packages/opencontrail_vrouter_netns/vrouter_netns.py", line 176, in destroy
    self.ip_ns.netns.delete(self.namespace)
  File "/usr/lib/python2.7/dist-packages/opencontrail_vrouter_netns/linux/ip_lib.py", line 480, in delete
    self._as_root('delete', name, use_root_namespace=True)
  File "/usr/lib/python2.7/dist-packages/opencontrail_vrouter_netns/linux/ip_lib.py", line 207, in _as_root
    kwargs.get('use_root_namespace', False))
  File "/usr/lib/python2.7/dist-packages/opencontrail_vrouter_netns/linux/ip_lib.py", line 58, in _as_root
    namespace)
  File "/usr/lib/python2.7/dist-packages/opencontrail_vrouter_netns/linux/ip_lib.py", line 69, in _execute
    root_helper=root_helper)
  File "/usr/lib/python2.7/dist-packages/opencontrail_vrouter_netns/linux/utils.py", line 82, in execute
    raise RuntimeError(m)
RuntimeError:
Command: ['sudo', 'ip', 'netns', 'delete', 'vrouter-ffef091d-efd1-41d6-92a1-75eaed4891ab']
Exit code: 1
Stdout: ''
Stderr: 'Cannot remove /var/run/netns/vrouter-ffef091d-efd1-41d6-92a1-75eaed4891ab: Device or resource busy\n'

root@nodec22:~# nova list
+--------------------------------------+--------------+--------+------------+-------------+---------------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+--------------+--------+------------+-------------+---------------------+
| 95019dc8-6cdc-42a6-b58e-c32b4c6aa100 | backend_vm1 | ACTIVE | None | Running | backend1=200.1.1.2 |
| 9d99dd9b-dfd1-470b-abf6-af1175310395 | backend_vm2 | ACTIVE | None | Running | backend1=200.1.1.3 |
| 6394e69d-6871-455d-9c9b-1a748a415ac7 | frontend_vm1 | ACTIVE | None | Running | frontend1=201.1.1.2 |
| 1605b011-99a1-4507-a7e3-14f93ec578de | frontend_vm2 | ACTIVE | None | Running | frontend1=201.1.1.5 |
| 2ed3df86-e796-4d59-89ed-4bd7df6cf124 | frontend_vm3 | ACTIVE | None | Running | frontend1=201.1.1.6 |
+--------------------------------------+--------------+--------+------------+-------------+---------------------+
root@nodec22:~# neutron lb-vip-list
+--------------------------------------+--------+-----------+----------+----------------+--------+
| id | name | address | protocol | admin_state_up | status |
+--------------------------------------+--------+-----------+----------+----------------+--------+
| f30477bb-8458-44c8-9823-ba3c8f407b64 | sshvip | 201.1.1.4 | TCP | True | ACTIVE |
| ea60c9ca-77e8-4b21-af89-ed0fb0050b91 | myvip | 201.1.1.3 | HTTPS | True | ACTIVE |
+--------------------------------------+--------+-----------+----------+----------------+--------+
root@nodec22:~# neutron lb-pool-list
+--------------------------------------+----------+-------------+----------+----------------+--------+
| id | name | lb_method | protocol | admin_state_up | status |
+--------------------------------------+----------+-------------+----------+----------------+--------+
| 4e3a7ff5-a144-4b2c-9a1f-8c29e5947dc1 | mypool1 | ROUND_ROBIN | HTTP | True | ACTIVE |
| 1f678989-f8ca-4929-811e-a4a2e8786cf5 | sshpool1 | ROUND_ROBIN | TCP | True | ACTIVE |
+--------------------------------------+----------+-------------+----------+----------------+--------+
root@nodec22:~#

root@nodec22:~# neutron lb-pool-show sshpool1
+----------------+--------------------------------------+
| Field | Value |
+----------------+--------------------------------------+
| admin_state_up | True |
| description | |
| id | 1f678989-f8ca-4929-811e-a4a2e8786cf5 |
| lb_method | ROUND_ROBIN |
| members | ea4dc42f-4463-45ab-a602-d43485c3a2c6 |
| | f708ff9e-6d81-45bc-9aea-16a01b2f9f26 |
| name | sshpool1 |
| protocol | TCP |
| provider | opencontrail |
| status | ACTIVE |
| subnet_id | 4f0ff0bf-4263-4281-bc78-2b20c497317d |
| tenant_id | a6345dd0a98c4cc58a440b49c9c2d5f5 |
| vip_id | f30477bb-8458-44c8-9823-ba3c8f407b64 |
+----------------+--------------------------------------+
root@nodec22:~#

No haproxy on nodeh4 for sshpool1:

root@nodeh4:~# ps aux |grep haproxy
haproxy 1645 0.0 0.0 29668 2156 ? Ss Sep25 0:18 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -D -p /var/run/haproxy.pid
root 13689 0.0 0.0 8112 920 pts/3 S+ 04:01 0:00 grep --color=auto haproxy
nobody 25857 0.0 0.0 33484 6752 ? Ss 02:32 0:00 haproxy -f /var/lib/contrail/loadbalancer/4e3a7ff5-a144-4b2c-9a1f-8c29e5947dc1/etc/haproxy/haproxy.cfg -D -p /var/lib/contrail/loadbalancer/4e3a7ff5-a144-4b2c-9a1f-8c29e5947dc1/etc/haproxy/haproxy.cfg.pid -sf 25840
root@nodeh4:~#

No haproxy on nodeh5 for sshpool1:

root@nodeh5:/var/log/contrail# ps aux |grep haproxy
haproxy 1778 0.0 0.0 28940 960 ? Ss Sep25 0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -D -p /var/run/haproxy.pid
nobody 18936 0.0 0.0 33472 6820 ? Ss 02:47 0:00 haproxy -f /var/lib/contrail/loadbalancer/4e3a7ff5-a144-4b2c-9a1f-8c29e5947dc1/etc/haproxy/haproxy.cfg -D -p /var/lib/contrail/loadbalancer/4e3a7ff5-a144-4b2c-9a1f-8c29e5947dc1/etc/haproxy/haproxy.cfg.pid
root 21967 0.0 0.0 8108 924 pts/2 S+ 04:02 0:00 grep --color=auto haproxy
root@nodeh5:/var/log/contrail#

haproxy for sshpool1 (pool ID 1f678989-f8ca-4929-811e-a4a2e8786cf5) is running only on nodeg29:

root@nodeg29:~# ps aux |grep hapro
haproxy 1592 0.0 0.0 29668 2148 ? Ss Sep25 0:11 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -D -p /var/run/haproxy.pid
root 10954 0.0 0.0 8112 924 pts/4 S+ 03:27 0:00 grep --color=auto hapro
nobody 29364 0.0 0.0 33468 6812 ? Ss 03:03 0:00 haproxy -f /var/lib/contrail/loadbalancer/4e3a7ff5-a144-4b2c-9a1f-8c29e5947dc1/etc/haproxy/haproxy.cfg -D -p /var/lib/contrail/loadbalancer/4e3a7ff5-a144-4b2c-9a1f-8c29e5947dc1/etc/haproxy/haproxy.cfg.pid -sf 29347
nobody 31900 0.0 0.0 28980 964 ? Ss 03:08 0:00 haproxy -f /var/lib/contrail/loadbalancer/1f678989-f8ca-4929-811e-a4a2e8786cf5/etc/haproxy/haproxy.cfg -D -p /var/lib/contrail/loadbalancer/1f678989-f8ca-4929-811e-a4a2e8786cf5/etc/haproxy/haproxy.cfg.pid -sf 31885
root@nodeg29:~#
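
The per-node check above (reading the `ps aux` output and matching the pool ID in the haproxy config path) can be automated. The following is a minimal sketch, not part of Contrail: the function name `pools_with_haproxy` is hypothetical, and it assumes the per-pool config path layout `/var/lib/contrail/loadbalancer/<pool-id>/...` and the default 11-column `ps aux` format seen in the output above.

```python
import re
import sys

# Assumption: per-pool haproxy instances are started with a config path of the
# form /var/lib/contrail/loadbalancer/<pool-uuid>/..., as in the ps output above.
POOL_PATH = re.compile(r"/var/lib/contrail/loadbalancer/([0-9a-f-]{36})/")


def pools_with_haproxy(ps_output):
    """Return the set of pool UUIDs that have a haproxy process in `ps aux` output."""
    pools = set()
    for line in ps_output.splitlines():
        # ps aux prints 10 fixed columns followed by the full command line.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        command = fields[10]
        # Skip the system haproxy (/usr/sbin/haproxy) and the grep process itself.
        if not command.startswith("haproxy "):
            continue
        match = POOL_PATH.search(command)
        if match:
            pools.add(match.group(1))
    return pools


if __name__ == "__main__":
    # Example: ps aux | python check_pools.py
    for pool_id in sorted(pools_with_haproxy(sys.stdin.read())):
        print(pool_id)
```

Run on each compute node, this lists the pool IDs that have a haproxy instance there; for the state described in this bug, the sshpool1 ID would be expected only on nodeg29.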

Divakar Dharanalakota (ddivakar) wrote :

We need to update the "ip" utility to the latest version. The existing version of the "ip" utility has a namespace deletion bug. Once it is updated, this issue will not be seen.
-Divakar
