Can't SSH into new VM, if DHCP agent's iptables-restore call fails

Bug #1669722 reported by Nell Jerram
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
networking-calico
New
Undecided
Unassigned

Bug Description

Failed to SSH into 10.28.0.8:
10.28.0.8 is the IP just allocated to a newly created VM.

Observed during a test last night, with
- networking-calico 1.4.1
- Liberty OpenStack on CentOS 7.3.

VM was 'drb9q9', created at 01:03:10.

DHCP agent log for the relevant compute node shows an iptables-restore failure at 01:03:13:

2017-03-03 01:03:13,324 [23437552] (ERROR) utils:
Command: ['sudo', 'iptables-restore', '-n']
Exit code: 1
Stdin: # Generated by iptables_manager
*filter
-D FORWARD 1
-I FORWARD 3 -j felix-FORWARD
-I INPUT 1 -j calico-dhcp-agen-INPUT
-D INPUT 3
-D OUTPUT 1
-I OUTPUT 3 -j felix-OUTPUT
COMMIT
# Completed by iptables_manager
# Generated by iptables_manager
*nat
-I OUTPUT 1 -j calico-dhcp-agen-OUTPUT
-D OUTPUT 3
-D POSTROUTING 1
-I POSTROUTING 3 -j felix-POSTROUTING
-I PREROUTING 1 -j calico-dhcp-agen-PREROUTING
-D PREROUTING 3
COMMIT
# Completed by iptables_manager
# Generated by iptables_manager
*raw
-I OUTPUT 1 -j calico-dhcp-agen-OUTPUT
-D OUTPUT 3
-I PREROUTING 1 -j calico-dhcp-agen-PREROUTING
-D PREROUTING 3
COMMIT
# Completed by iptables_manager

Stdout:
Stderr: iptables-restore: line 9 failed

2017-03-03 01:03:13,325 [23437552] (ERROR) iptables_manager: IPTablesManager.apply failed to apply the following set of iptables rules:
      5. -I INPUT 1 -j calico-dhcp-agen-INPUT
      6. -D INPUT 3
      7. -D OUTPUT 1
      8. -I OUTPUT 3 -j felix-OUTPUT
      9. COMMIT
     10. # Completed by iptables_manager
     11. # Generated by iptables_manager
     12. *nat
     13. -I OUTPUT 1 -j calico-dhcp-agen-OUTPUT
     14. -D OUTPUT 3
2017-03-03 01:03:13,325 [23437552] (ERROR) agent: Unable to restart dhcp for 4e0cb6bb-bbf1-48e9-90d1-a4afa34e2fc6.
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent Traceback (most recent call last):
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/dhcp/agent.py", line 115, in call_driver
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent getattr(driver, action)(**action_kwargs)
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py", line 138, in restart
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent self.enable()
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py", line 206, in enable
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent interface_name = self.device_manager.setup(self.network)
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py", line 1233, in setup
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent self.fill_dhcp_udp_checksums(namespace=network.namespace)
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py", line 1299, in fill_dhcp_udp_checksums
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent iptables_mgr.apply()
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/iptables_manager.py", line 423, in apply
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent return self._apply()
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/iptables_manager.py", line 431, in _apply
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent return self._apply_synchronized()
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/iptables_manager.py", line 508, in _apply_synchronized
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent '\n'.join(log_lines))
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 195, in __exit__
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent six.reraise(self.type_, self.value, self.tb)
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/iptables_manager.py", line 487, in _apply_synchronized
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent run_as_root=True)
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 159, in execute
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent raise RuntimeError(m)
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent RuntimeError:
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent Command: ['sudo', 'iptables-restore', '-n']
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent Exit code: 1
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent Stdin: # Generated by iptables_manager
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent *filter
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -D FORWARD 1
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -I FORWARD 3 -j felix-FORWARD
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -I INPUT 1 -j calico-dhcp-agen-INPUT
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -D INPUT 3
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -D OUTPUT 1
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -I OUTPUT 3 -j felix-OUTPUT
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent COMMIT
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent # Completed by iptables_manager
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent # Generated by iptables_manager
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent *nat
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -I OUTPUT 1 -j calico-dhcp-agen-OUTPUT
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -D OUTPUT 3
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -D POSTROUTING 1
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -I POSTROUTING 3 -j felix-POSTROUTING
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -I PREROUTING 1 -j calico-dhcp-agen-PREROUTING
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -D PREROUTING 3
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent COMMIT
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent # Completed by iptables_manager
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent # Generated by iptables_manager
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent *raw
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -I OUTPUT 1 -j calico-dhcp-agen-OUTPUT
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -D OUTPUT 3
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -I PREROUTING 1 -j calico-dhcp-agen-PREROUTING
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -D PREROUTING 3
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent COMMIT
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent # Completed by iptables_manager
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent Stdout:
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent Stderr: iptables-restore: line 9 failed
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent
2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent

syslog shows that there was no dnsmasq restart between 00:56:38 and 01:27:09. So it looks like the iptables-restore failure caused the DHCP agent to bail completely on the setup that is needed to support a new VM IP.

Revision history for this message
Nell Jerram (neil-jerram) wrote :

Actually I believe this issue has already been addressed in networking-calico master by https://review.openstack.org/#/c/433702/.

description: updated
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.