Devstack. Fail to boot an instance if more than 1 network is defined

Bug #1296808 reported by Avishay Balderman
This bug affects 5 people
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: High
Assigned to: Dan Smith

Bug Description

Using Horizon, I tried to launch an instance with two networks defined.

The operation fails with the following error:
ERROR nova.scheduler.filter_scheduler [req-cae61024-6723-4218-bd5e-71b42d181cea admin demo] [instance: 54c6f9ba-57e5-4680-bb6b-72eb2da484db] Error from last host: devstack-vmware1 (node devstack-vmware1): [u'Traceback (most recent call last):
  File "/opt/stack/nova/nova/compute/manager.py", line 1304, in _build_instance
    set_access_ip=set_access_ip)
  File "/opt/stack/nova/nova/compute/manager.py", line 394, in decorated_function
    return function(self, context, *args, **kwargs)
  File "/opt/stack/nova/nova/compute/manager.py", line 1716, in _spawn
    LOG.exception(_(\'Instance failed to spawn\'), instance=instance)
  File "/opt/stack/nova/nova/openstack/common/excutils.py", line 68, in __exit__
    six.reraise(self.type_, self.value, self.tb)
  File "/opt/stack/nova/nova/compute/manager.py", line 1713, in _spawn
    block_device_info)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2241, in spawn
    block_device_info)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 3628, in _create_domain_and_network
    network_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/opt/stack/nova/nova/compute/manager.py", line 556, in wait_for_instance_event
    actual_event = event.wait()
u"AttributeError: 'NoneType' object has no attribute 'wait'\n"]

Looks like there is a None event in the events map.

When I launch an instance with only one network defined, I see no issues.

Avishay Balderman (avishayb) wrote :

A quick bypass for this issue:
        yield  # line 553 in manager.py
        with eventlet.timeout.Timeout(deadline):
            for event_name, event in events.items():
                # Bypass: skip None entries instead of calling wait() on them
                if event:
                    actual_event = event.wait()
                    if actual_event.status == 'completed':
                        continue
                    decision = error_callback(event_name, instance)
                    if decision is False:
                        break
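
For anyone who wants to try the guard outside nova, here is a minimal standalone sketch of the same idea. It is not the nova code: `FakeEvent` and `wait_for_events` are hypothetical names made up for illustration, and the eventlet timeout is left out to keep it self-contained.

```python
class FakeEvent(object):
    """Hypothetical stand-in for the event objects nova waits on."""

    def __init__(self, status):
        self.status = status

    def wait(self):
        # A real event would block until signalled; this one returns at once.
        return self


def wait_for_events(events, instance, error_callback):
    """Wait on each event, skipping None placeholders.

    Returns the names of events that did not complete.
    """
    failed = []
    for event_name, event in events.items():
        if event is None:
            # Without this guard, event.wait() raises AttributeError.
            continue
        actual_event = event.wait()
        if actual_event.status == 'completed':
            continue
        failed.append(event_name)
        if error_callback(event_name, instance) is False:
            break
    return failed


# Hypothetical event names; the second entry mimics the None seen in the bug.
events = {'vif-a': FakeEvent('completed'), 'vif-b': None}
result = wait_for_events(events, 'instance-1', lambda name, inst: True)
print(result)
```

Note that the guard only hides the symptom: an event that is None is never waited on, so the boot proceeds without confirmation for that event.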

Tracy Jones (tjones-i)
tags: added: network
Dan Smith (danms) wrote :

Just confirming, this is with neutron, right?

Changed in nova:
assignee: nobody → Dan Smith (danms)
importance: Undecided → High
milestone: none → icehouse-rc1
Dan Smith (danms) wrote :

I wasn't able to reproduce this with horizon or the command line. The command I tried was:

nova boot --flavor 84 --image cirros-0.3.1-x86_64-uec --nic net-id=dd343d3b-0d9e-4df1-9e02-3be0efef647a --nic net-id=ed5811cc-6dc0-48a9-8894-3994b9cfdf78 foo

For the horizon test, I logged in as admin, created a new network and subnet, then launched an instance with a nic on private and a nic on my new test network. Launching that instance worked too and I don't see any failures in the compute log.

Can you help with more information about how to reproduce this? Also, can you provide details on the exact code levels you're running?

Thanks!

Dan Smith (danms) wrote :

Untargeting this from -rc1 until/unless we can reproduce.

Changed in nova:
milestone: icehouse-rc1 → none
status: New → Incomplete
tags: added: icehouse-rc-potential
Avishay Balderman (avishayb) wrote :

Yes, it's with Neutron.

izikpenso (izikp) wrote :

This is the localrc we were using when the bug occurred:

DATABASE_PASSWORD=os
RABBIT_PASSWORD=os
SERVICE_TOKEN=os
SERVICE_PASSWORD=os
ADMIN_PASSWORD=os

disable_service n-net

enable_service q-svc
enable_service q-agt
enable_service q-dhcp
enable_service q-l3
enable_service q-meta
enable_service quantum
enable_service q-lbaas

#For tempest

enable_service tempest
API_RATE_LIMIT=False

SWIFT_HASH=openstack

ACTIVE_TIMEOUT=12000
BOOT_TIMEOUT=12000

SCREEN_LOGDIR=~/devstack/logs
LOGDAYS=1
LOGFILE=stack.sh.log

HOST_IP=10.205.120.36
FLAT_INTERFACE=eth0

Samuel Bercovici (samuelb) wrote :

To recreate:
Using Horizon.
Login as demo.
On the demo project, in addition to the existing private network and subnet, create an additional network and subnet.
Create a new VM based on the cirros image and connect it to the two networks.

Dan Smith (danms) wrote :

I have done that (with the admin user, not demo) and I don't see the problem. Does the problem occur with the admin user as well? I will spin devstack up again and test with demo.

I still would like to see current code levels for nova and neutron...

izikpenso (izikp) wrote :

Dan, we are not sure what you mean by "code levels for nova and neutron"?

Dan Smith (danms) wrote :

I want to know what git trees you're currently on. So if you're using devstack:

  cd /opt/stack/nova && git show
  cd /opt/stack/neutron && git show
  cd /opt/stack/python-novaclient && git show

izikpenso (izikp) wrote :

radware@devstack-vmware1:~$ cd /opt/stack/nova && git show
commit 620be9a67a6e2393659242b33083a9ede1763af1
Merge: cb8de23 3da0d89
Author: Jenkins <email address hidden>
Date: Mon Mar 24 14:24:50 2014 +0000

    Merge "Do not add HPET timer config to non x86 targets"

radware@devstack-vmware1:/opt/stack/nova$ cd /opt/stack/neutron && git show
commit e8685dc37be52fe415fc1b8a2c119cb3b14b71c6
Merge: ec30b11 98b3f4a
Author: Jenkins <email address hidden>
Date: Mon Mar 24 10:25:03 2014 +0000

    Merge "Return meaningful error message on pool creation error"

radware@devstack-vmware1:/opt/stack/neutron$ cd /opt/stack/python-novaclient && git show
commit 94a4c49de056f7fb768814c1fe819d3899082c27
Merge: 6e3b287 e43825b
Author: Jenkins <email address hidden>
Date: Sat Mar 22 09:20:20 2014 +0000

    Merge "Print a useful message for unknown server errors"

radware@devstack-vmware1:/opt/stack/python-novaclient$

Dan Smith (danms) wrote :

I still can't reproduce this in horizon logging in as demo.

In my opinion, there are a lot of variables in play when approaching this from horizon. So it would be extremely helpful if you could reproduce this with the CLI, where it is easy to make sure we're doing exactly the same thing.

In parallel, if you can come up with some more concrete instructions for doing it through horizon, I can continue to try to poke it from that perspective.

Dan Smith (danms) wrote :

Also, please try running devstack like this:

  RECLONE=yes ./stack.sh

to make sure all your trees get updated together.

izikpenso (izikp) wrote :

This issue isn't related to horizon. It happens via the CLI as well, and it doesn't matter which user you use.
Dan, the steps you described for reproduction should be enough.
I've now used 'RECLONE=yes' and it still happens.
See details below:

radware@devstack-vmware1:~/devstack$ source openrc demo demo
radware@devstack-vmware1:~/devstack$ neutron net-list
+--------------------------------------+---------+------------------------------------------------------+
| id                                   | name    | subnets                                              |
+--------------------------------------+---------+------------------------------------------------------+
| 01411139-9736-4575-b523-1787ec61b875 | private | 37f0be1c-1f97-40ae-bc68-6fab1f5e964d 10.0.0.0/24     |
| 6e1d4355-3238-4c7e-816f-9d56949c0a1f | test    | b1463fee-2829-4270-af7e-20bffab0e750 192.168.32.0/24 |
| d13ec16a-e088-490d-9201-93d28bb98c4e | public  | 49238aff-7152-45cf-92e9-20d8c3d75e16                 |
+--------------------------------------+---------+------------------------------------------------------+
radware@devstack-vmware1:~/devstack$ nova boot --flavor 84 --image cirros-0.3.1-x86_64-uec --nic net-id=01411139-9736-4575-b523-1787ec61b875 --nic net-id=6e1d4355-3238-4c7e-816f-9d56949c0a1f from_CLI
+--------------------------------------+----------------------------------------------------------------+
| Property                             | Value                                                          |
+--------------------------------------+----------------------------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                                         |
| OS-EXT-AZ:availability_zone          | nova                                                           |
| OS-EXT-STS:power_state               | 0                                                              |
| OS-EXT-STS:task_state                | scheduling                                                     |
| OS-EXT-STS:vm_state                  | building                                                       |
| OS-SRV-USG:launched_at               | -                                                              |
| OS-SRV-USG:terminated_at             | -                                                              |
| accessIPv4                           |                                                                |
| accessIPv6                           |                                                                |
| adminPass                            | bVQwprn8GHGV                                                   |
| config_drive                         |                                                                |
| created                              | 2014-03-30T16:04:02Z                                           |
| flavor                               | m1.micro (84)                                                  |
| hostId                               |                                                                |
| id                                   | a8720478-3005-4aaf-9f19-8234c477cf0d ...


Avishay Balderman (avishayb) wrote :

Dan,
Since you can't reproduce the issue on your side, would you like us to set up a WebEx session where you can access our environment?

Dan Smith (danms) wrote :

Can you attach a full n-cpu.log here that has the history leading up to the error, including the error?

Dan Smith (danms) wrote :

Scratch that, I was able to reproduce it. Looking now.

Changed in nova:
milestone: none → icehouse-rc1
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/84147

Changed in nova:
status: Incomplete → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/84147
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=bdd064f95f8c3896679e2b1f764a5ac489ebba06
Submitter: Jenkins
Branch: master

commit bdd064f95f8c3896679e2b1f764a5ac489ebba06
Author: Dan Smith <email address hidden>
Date: Mon Mar 31 07:37:22 2014 -0700

    Fix getting instance events on subsequent attempts

    When the code for getting the list of events for a given instance
    was moved into the InstanceEvents object, an indenting error was
    introduced, which causes the get method to return None instead of
    the list if the instance already has some events waiting.

    This corrects that issue.

    Closes-bug: #1296808
    Change-Id: I099a605980dd9c2ae0659b82c633caeb8a19bbe9
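
The kind of indentation error the commit message describes can be sketched as follows. This is an illustration only, not the nova source: `EventRegistry`, `prepare_buggy`, `prepare_fixed` and the event names are hypothetical, and `threading.Event` stands in for whatever event type the real code uses. The point is that a `return` nested under the "first time we see this instance" branch yields an event for the first request and None for every later one, which would fit the observation that one network works while a second one produces a None event.

```python
import threading


class EventRegistry(object):
    """Hypothetical per-instance event registry (illustration, not nova code)."""

    def __init__(self):
        self._events = {}  # instance uuid -> {event name: event object}

    def prepare_buggy(self, instance_uuid, event_name):
        if instance_uuid not in self._events:
            self._events[instance_uuid] = {}
            # BUG: the return is indented under the 'if', so it only runs
            # the first time an instance is seen.
            return self._events[instance_uuid].setdefault(
                event_name, threading.Event())
        # Later calls for the same instance fall through and return None.

    def prepare_fixed(self, instance_uuid, event_name):
        if instance_uuid not in self._events:
            self._events[instance_uuid] = {}
        # The return sits outside the 'if', so every call yields an event.
        return self._events[instance_uuid].setdefault(
            event_name, threading.Event())


registry = EventRegistry()
buggy = [registry.prepare_buggy('uuid-1', name) for name in ('vif-a', 'vif-b')]
fixed = [registry.prepare_fixed('uuid-2', name) for name in ('vif-a', 'vif-b')]
print([event is not None for event in buggy])  # [True, False]
print([event is not None for event in fixed])  # [True, True]
```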

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: icehouse-rc1 → 2014.1