libvirt error when creating VM with 2 NICs under DevStack

Bug #1220856 reported by Paul Michali
This bug affects 2 people
Affects                    Status        Importance  Assigned to  Milestone
OpenStack Compute (nova)   Fix Released  Medium      Ryota Mibu
  Icehouse                 Fix Released  Undecided   Unassigned

Bug Description

Under a stock DevStack setup on bare metal, I ran stack.sh, which creates a public network (without DHCP), a private network (with DHCP), and a router. On the host I created a br-ex bridge and added eth3 to it. The public network is connected via a physical switch to another host (also running DevStack).

I am able to create VMs and ping between them, and (with the new VPNaaS feature under development) I can also ping VMs over the public (provider?) network.

If I create a VM on just the private network, or a VM on just the public network, it launches fine (I have seen this with CirrOS and other images). However, if I try to boot an instance with two NICs, the launch fails:

localnet=`neutron net-list | grep private | cut -f 2 -d'|' | cut -f 2 -d' '`
nova boot --image cirros-0.3.1-x86_64-uec --flavor 1 mary --nic net-id=$localnet

pubnet=`neutron net-list | grep public | cut -f 2 -d'|' | cut -f 2 -d' '`
nova boot --image cirros-0.3.1-x86_64-uec --flavor 1 peter --nic net-id=$pubnet

nova boot --image cirros-0.3.1-x86_64-uec --flavor 1 paul --nic net-id=$localnet --nic net-id=$pubnet

nova list
+--------------------------------------+-------+--------+------------+-------------+---------------------+
| ID                                   | Name  | Status | Task State | Power State | Networks            |
+--------------------------------------+-------+--------+------------+-------------+---------------------+
| 0fc9c41c-fcd7-45a8-ae10-568b506a331f | mary  | ACTIVE | None       | Running     | private=10.2.0.4    |
| 76925e14-a0b7-4285-9197-5ff0e92f5bb4 | paul  | ERROR  | None       | NOSTATE     |                     |
| c8cd45a4-10d9-4d8c-9a5a-f806e63683c0 | peter | ACTIVE | None       | Running     | public=172.24.4.235 |
+--------------------------------------+-------+--------+------------+-------------+---------------------+

Looking at the logs, I see that the screen-n-cpu.log reports an error and has a traceback:

2013-09-04 18:01:47.963 ERROR nova.compute.manager [req-187bbd89-f40d-4637-b549-0edfc9b66419 admin admin] [instance: 76925e14-a0b7-4285-9197-5ff0e92f5bb4] Instance failed to spawn
2013-09-04 18:01:47.963 9855 TRACE nova.compute.manager [instance: 76925e14-a0b7-4285-9197-5ff0e92f5bb4] Traceback (most recent call last):
2013-09-04 18:01:47.963 9855 TRACE nova.compute.manager [instance: 76925e14-a0b7-4285-9197-5ff0e92f5bb4] File "/opt/stack/nova/nova/compute/manager.py", line 1293, in _spawn
2013-09-04 18:01:47.963 9855 TRACE nova.compute.manager [instance: 76925e14-a0b7-4285-9197-5ff0e92f5bb4] block_device_info)
2013-09-04 18:01:47.963 9855 TRACE nova.compute.manager [instance: 76925e14-a0b7-4285-9197-5ff0e92f5bb4] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 1699, in spawn
2013-09-04 18:01:47.963 9855 TRACE nova.compute.manager [instance: 76925e14-a0b7-4285-9197-5ff0e92f5bb4] block_device_info)
2013-09-04 18:01:47.963 9855 TRACE nova.compute.manager [instance: 76925e14-a0b7-4285-9197-5ff0e92f5bb4] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2697, in _create_domain_and_network
2013-09-04 18:01:47.963 9855 TRACE nova.compute.manager [instance: 76925e14-a0b7-4285-9197-5ff0e92f5bb4] domain = self._create_domain(xml, instance=instance, power_on=power_on)
2013-09-04 18:01:47.963 9855 TRACE nova.compute.manager [instance: 76925e14-a0b7-4285-9197-5ff0e92f5bb4] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2652, in _create_domain
2013-09-04 18:01:47.963 9855 TRACE nova.compute.manager [instance: 76925e14-a0b7-4285-9197-5ff0e92f5bb4] domain.XMLDesc(0))
2013-09-04 18:01:47.963 9855 TRACE nova.compute.manager [instance: 76925e14-a0b7-4285-9197-5ff0e92f5bb4] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2647, in _create_domain
2013-09-04 18:01:47.963 9855 TRACE nova.compute.manager [instance: 76925e14-a0b7-4285-9197-5ff0e92f5bb4] domain.createWithFlags(launch_flags)
2013-09-04 18:01:47.963 9855 TRACE nova.compute.manager [instance: 76925e14-a0b7-4285-9197-5ff0e92f5bb4] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 179, in doit
2013-09-04 18:01:47.963 9855 TRACE nova.compute.manager [instance: 76925e14-a0b7-4285-9197-5ff0e92f5bb4] result = proxy_call(self._autowrap, f, *args, **kwargs)
2013-09-04 18:01:47.963 9855 TRACE nova.compute.manager [instance: 76925e14-a0b7-4285-9197-5ff0e92f5bb4] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 139, in proxy_call
2013-09-04 18:01:47.963 9855 TRACE nova.compute.manager [instance: 76925e14-a0b7-4285-9197-5ff0e92f5bb4] rv = execute(f,*args,**kwargs)
2013-09-04 18:01:47.963 9855 TRACE nova.compute.manager [instance: 76925e14-a0b7-4285-9197-5ff0e92f5bb4] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 77, in tworker
2013-09-04 18:01:47.963 9855 TRACE nova.compute.manager [instance: 76925e14-a0b7-4285-9197-5ff0e92f5bb4] rv = meth(*args,**kwargs)
2013-09-04 18:01:47.963 9855 TRACE nova.compute.manager [instance: 76925e14-a0b7-4285-9197-5ff0e92f5bb4] File "/usr/lib/python2.7/dist-packages/libvirt.py", line 581, in createWithFlags
2013-09-04 18:01:47.963 9855 TRACE nova.compute.manager [instance: 76925e14-a0b7-4285-9197-5ff0e92f5bb4] if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
2013-09-04 18:01:47.963 9855 TRACE nova.compute.manager [instance: 76925e14-a0b7-4285-9197-5ff0e92f5bb4] libvirtError: internal error Cannot instantiate filter due to unresolvable variables: DHCPSERVER

There is no DHCP server on the public network, but there is one on the local network.
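
To make the failure mode concrete: the nova-base filter chain includes an allow-DHCP rule that references a $DHCPSERVER variable, and libvirt can only instantiate a NIC's filter if every variable the filter references is supplied as a parameter. A NIC on a network without DHCP has no value to supply, hence the error above. A toy Python model of that resolution step (illustrative only; the variable names mirror the log, everything else is hypothetical):

    def instantiate_filter(required_vars, nic_params):
        """Fail the way libvirt does when a filter variable is unresolvable."""
        missing = [v for v in required_vars if v not in nic_params]
        if missing:
            raise RuntimeError("internal error Cannot instantiate filter due "
                               "to unresolvable variables: " + " ".join(missing))

    NOVA_BASE = ["IP", "MAC", "DHCPSERVER"]  # base filter with the allow-DHCP rule
    NOVA_NODHCP = ["IP", "MAC"]              # no DHCP rule, so no DHCPSERVER needed

    # Private NIC: Neutron runs a DHCP server, so the parameter can be supplied.
    instantiate_filter(NOVA_BASE, {"IP": "10.2.0.4", "MAC": "fa:16:3e:...",
                                   "DHCPSERVER": "10.2.0.2"})

    # Public NIC: no DHCP server, yet nova still chose the DHCP-aware filter.
    try:
        instantiate_filter(NOVA_BASE, {"IP": "172.24.4.235", "MAC": "fa:16:3e:..."})
    except RuntimeError as exc:
        print(exc)  # ... unresolvable variables: DHCPSERVER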

This is consistently reproducible and occurs with other image types.

The nova code is from the master branch (Havana) at SHA-1 86c97ff, from 6 days ago.

tags: added: libvirt
Revision history for this message
Craig Anderson (canderso) wrote :

I found the problem in another bug report. See https://bugs.launchpad.net/neutron/+bug/1186557

Revision history for this message
Craig Anderson (canderso) wrote :

Actually, since this is a nova problem and not a neutron issue, I'll elaborate on the problem here.

The problem is in nova/virt/libvirt/firewall.py (Grizzly):

allow_dhcp = False
for (network, mapping) in network_info:
    if mapping['dhcp_server']:
        allow_dhcp = True
        break

Looping through each network, if any of them happens to have a dhcp_server, the code enables DHCP for ALL of them. Hence you get nova-base for all of your VIF filters, instead of nova-base for some and nova-nodhcp for others.
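
To see the all-or-nothing behavior in isolation, here is a small standalone sketch (hypothetical data, not the nova code) contrasting the buggy single flag with a per-network choice:

    # One DHCP-serving network and one without, as in this bug report.
    networks = [
        {"name": "private", "dhcp_server": "10.2.0.2"},
        {"name": "public",  "dhcp_server": None},
    ]

    # Buggy pattern: a single flag decides the base filter for every NIC.
    allow_dhcp = any(n["dhcp_server"] for n in networks)
    print({n["name"]: "nova-base" if allow_dhcp else "nova-nodhcp"
           for n in networks})
    # {'private': 'nova-base', 'public': 'nova-base'}   <-- public is wrong

    # Correct pattern: choose the base filter per NIC.
    print({n["name"]: "nova-base" if n["dhcp_server"] else "nova-nodhcp"
           for n in networks})
    # {'private': 'nova-base', 'public': 'nova-nodhcp'}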

Revision history for this message
Craig Anderson (canderso) wrote :

Master has the same problem: base_filter is set only once, and the loop then applies that same base_filter to every VIF, instead of a per-VIF base_filter.

base_filter = self.get_base_filter_list(instance, allow_dhcp)

for vif in network_info:
    self._define_filter(self._get_instance_filter_xml(instance,
                                                      base_filter,
                                                      vif))

Revision history for this message
Craig Anderson (canderso) wrote :

I revised the function as follows on my Grizzly install, which resolved the issue for me:

    def setup_basic_filtering(self, instance, network_info):
        """Set up basic filtering (MAC, IP, and ARP spoofing protection)."""
        LOG.info(_('Called setup_basic_filtering in nwfilter'),
                 instance=instance)

        if self.handle_security_groups:
            # No point in setting up a filter set that we'll be overriding
            # anyway.
            return

        LOG.info(_('Ensuring static filters'), instance=instance)
        self._ensure_static_filters()

        nodhcp_base_filter = self.get_base_filter_list(instance, False)
        dhcp_base_filter = self.get_base_filter_list(instance, True)

        for (network, mapping) in network_info:
            if mapping['dhcp_server']:
                self._define_filter(self._get_instance_filter_xml(
                    instance, dhcp_base_filter, network, mapping))
            else:
                self._define_filter(self._get_instance_filter_xml(
                    instance, nodhcp_base_filter, network, mapping))

Revision history for this message
Michael H Wilson (geekinutah) wrote :

Craig,

Can you propose a patch to nova? I don't see any patch context in your last comment, but it seems sane enough to me. Let me know if you are not able to propose the patch and we'll work something else out.

Revision history for this message
Ryota Mibu (r-mibu) wrote :

Hi Craig,

Thanks for pointing out the issue.
I also wrote a patch for master [1], because the network info model has changed [2]. It works for me.
I hope this fix gets merged into mainline and backported to Havana.

[1]
diff --git a/nova/virt/libvirt/firewall.py b/nova/virt/libvirt/firewall.py
index 1cbba78..58756db 100644
--- a/nova/virt/libvirt/firewall.py
+++ b/nova/virt/libvirt/firewall.py
@@ -117,20 +117,18 @@ class NWFilterFirewall(base_firewall.FirewallDriver):
         LOG.info(_('Ensuring static filters'), instance=instance)
         self._ensure_static_filters()

-        allow_dhcp = False
+        nodhcp_base_filter = self.get_base_filter_list(instance, False)
+        dhcp_base_filter = self.get_base_filter_list(instance, True)
+
         for vif in network_info:
-            if not vif['network'] or not vif['network']['subnets']:
-                continue
+            _base_filter = nodhcp_base_filter
             for subnet in vif['network']['subnets']:
                 if subnet.get_meta('dhcp_server'):
-                    allow_dhcp = True
+                    _base_filter = dhcp_base_filter
                     break

-        base_filter = self.get_base_filter_list(instance, allow_dhcp)
-
-        for vif in network_info:
             self._define_filter(self._get_instance_filter_xml(instance,
-                                                              base_filter,
+                                                              _base_filter,
                                                               vif))

     def _get_instance_filter_parameters(self, vif):

[2] https://review.openstack.org/#/c/38589/

Revision history for this message
Solly Ross (sross-7) wrote :

Ryota Mibu: We use Gerrit for submitting patches for code review. Can you please submit this through OpenStack's Gerrit? You can find more information here: https://wiki.openstack.org/wiki/GerritJenkinsGit

Solly Ross (sross-7)
Changed in nova:
status: New → Confirmed
Solly Ross (sross-7)
Changed in nova:
importance: Undecided → Medium
Solly Ross (sross-7)
Changed in nova:
status: Confirmed → Incomplete
assignee: nobody → Ryota Mibu (r-mibu)
Revision history for this message
Ryota Mibu (r-mibu) wrote :

OK. I will submit my patch (with the unit tests that haven't been written yet).

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/78999

Changed in nova:
status: Incomplete → In Progress
Dirk Mueller (dmllr)
tags: added: icehouse-rc-potential
Thierry Carrez (ttx)
tags: added: icehouse-backport-potential
removed: icehouse-rc-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/78999
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=d2874463010b129e62a568357cfa1df3ab4805e6
Submitter: Jenkins
Branch: master

commit d2874463010b129e62a568357cfa1df3ab4805e6
Author: Ryota MIBU <email address hidden>
Date: Fri Mar 7 17:08:11 2014 +0900

    libvirt: Make nwfilter driver use right filterref

    This fixes Bug #1220856, which occurs with the libvirt NWFilterFirewall
    driver. When creating a VM with multiple NICs, where one connects to a
    DHCP-serving network and another to a network without DHCP, the
    NWFilterFirewall driver uses the same DHCP-allowing base filter for
    all NICs. This leads to a libvirt launch error, because the
    'dhcp_server' parameter needed to define allow-DHCP filters is missing.

    This patch makes the NWFilterFirewall driver use the right base filter
    for each NIC, depending on its 'dhcp_server' config.

    Closes-bug: #1220856
    Change-Id: I7f9a7c281f152985478b2ec295f0644ba475fd76

Changed in nova:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/icehouse)

Fix proposed to branch: stable/icehouse
Review: https://review.openstack.org/102441

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/icehouse)

Reviewed: https://review.openstack.org/102441
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=f59886479f1014c225989e24c425106412b3f0f0
Submitter: Jenkins
Branch: stable/icehouse

commit f59886479f1014c225989e24c425106412b3f0f0
Author: Ryota MIBU <email address hidden>
Date: Fri Mar 7 17:08:11 2014 +0900

    libvirt: Make nwfilter driver use right filterref

    This fixes Bug #1220856, which occurs with the libvirt NWFilterFirewall
    driver. When creating a VM with multiple NICs, where one connects to a
    DHCP-serving network and another to a network without DHCP, the
    NWFilterFirewall driver uses the same DHCP-allowing base filter for
    all NICs. This leads to a libvirt launch error, because the
    'dhcp_server' parameter needed to define allow-DHCP filters is missing.

    This patch makes the NWFilterFirewall driver use the right base filter
    for each NIC, depending on its 'dhcp_server' config.

    Closes-bug: #1220856
    Change-Id: I7f9a7c281f152985478b2ec295f0644ba475fd76
    (cherry picked from commit d2874463010b129e62a568357cfa1df3ab4805e6)

tags: added: in-stable-icehouse
Changed in nova:
milestone: none → juno-2
status: Fix Committed → Fix Released
Chuck Short (zulcss)
tags: removed: icehouse-backport-potential
Thierry Carrez (ttx)
Changed in nova:
milestone: juno-2 → 2014.2