Instance ends up with multiple IP addresses

Bug #1648851 reported by Michael Petersen
This bug affects 3 people
Affects             Status   Importance   Assigned to   Milestone
Mirantis OpenStack  New      Medium       Unassigned

Bug Description

MOS 9.0

When a nova-compute service was disabled, there were issues placing a few VMs. The node wasn't working properly, which resulted in the error below:

20161129/node-1/nova-conductor.log:2016-11-29T15:19:57.403069+02:00 err: 2016-11-29 15:19:57.402 15799 ERROR nova.scheduler.utils [req-3f545e8f-5a81-4826-93b7-62ed2cb5df5d a9d6111cdec443569e960ee6a25ad2da 0fdb73785f644987b7d76e3bf0952d47 - - -] [instance: f180c69e-49c7-4eda-9eab-81afe10160f7] Error from last host: node-4.data.lt (node node-4.data.lt): [u'Traceback (most recent call last):\n', u' File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1926, in _do_build_and_run_instance\n filter_properties)\n', u' File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2116, in _build_and_run_instance\n instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance f180c69e-49c7-4eda-9eab-81afe10160f7 was re-scheduled: internal error: process exited while connecting to monitor: Could not access KVM kernel module: Permission denied\nfailed to initialize KVM: Permission denied\n\n']
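The root cause on that host was libvirt being unable to open /dev/kvm. A minimal diagnostic sketch (hypothetical, not taken from this report) that could be run on the failing compute node to check the device's ownership and permissions:

# Hypothetical check for the "Could not access KVM kernel module: Permission
# denied" failure: confirm /dev/kvm exists and is accessible to the user that
# qemu/libvirt runs as (run this as that user).
import os

DEV = '/dev/kvm'

if not os.path.exists(DEV):
    print('%s is missing -- is the kvm kernel module loaded?' % DEV)
else:
    st = os.stat(DEV)
    print('mode: %o  uid: %d  gid: %d'
          % (st.st_mode & 0o777, st.st_uid, st.st_gid))
    print('readable and writable by current user: %s'
          % os.access(DEV, os.R_OK | os.W_OK))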

The instance was then rescheduled to another node and ended up with multiple IP addresses.

Below is the nova show output for the instance, with some data removed, such as the full IP addresses.

root@node-1:~# nova show f180c69e-49c7-4eda-9eab-81afe10160f7
+--------------------------------------+----------------------------------------------------------+
| Property                             | Value                                                    |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig                    | AUTO                                                     |
| OS-EXT-AZ:availability_zone          | nova                                                     |
| OS-EXT-SRV-ATTR:host                 | node-5                                                   |
| OS-EXT-SRV-ATTR:hostname             | testdoubleip                                             |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | node-5                                                   |
| OS-EXT-SRV-ATTR:instance_name        | instance-00000ace                                        |
| OS-EXT-SRV-ATTR:kernel_id            |                                                          |
| OS-EXT-SRV-ATTR:launch_index         | 0                                                        |
| OS-EXT-SRV-ATTR:ramdisk_id           |                                                          |
| OS-EXT-SRV-ATTR:root_device_name     | /dev/vda                                                 |
| OS-EXT-SRV-ATTR:user_data            | -                                                        |
| OS-EXT-STS:power_state               | 1                                                        |
| OS-EXT-STS:task_state                | -                                                        |
| OS-EXT-STS:vm_state                  | active                                                   |
| OS-SRV-USG:launched_at               | 2016-11-29T13:20:12.000000                               |
| OS-SRV-USG:terminated_at             | -                                                        |
| Public-Internet network              | *.*.150.175, *.*.150.179                                 |
| accessIPv4                           |                                                          |
| accessIPv6                           |                                                          |
| config_drive                         | True                                                     |
| created                              | 2016-11-29T13:19:48Z                                     |
| description                          | testdoubleip                                             |
| flavor                               | m1.small (2)                                             |
| hostId                               | 053c391872bca5448909951b1e8ec7d20106c0dde6c86727d3b0f557 |
| host_status                          | UP                                                       |
| id                                   | f180c69e-49c7-4eda-9eab-81afe10160f7                     |
| locked                               | False                                                    |
| metadata                             | {}                                                       |
| name                                 | testdoubleip                                             |
| os-extended-volumes:volumes_attached | []                                                       |
| progress                             | 0                                                        |
| security_groups                      | default                                                  |
| status                               | ACTIVE                                                   |
+--------------------------------------+----------------------------------------------------------+

After maintenance on the node was completed, it was put back into rotation and the issue could no longer be replicated.
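
One way to confirm where the extra address came from is to list the Neutron ports bound to the instance UUID; a stale port left over from the failed build attempt would show up as a second entry. A minimal sketch using python-neutronclient (the auth URL and credentials are placeholders, not values from this environment):

# List all Neutron ports whose device_id is the instance UUID. A healthy
# single-NIC instance should have exactly one port; two ports, each with its
# own fixed IP, would match the two addresses shown by nova show above.
from keystoneauth1 import loading, session
from neutronclient.v2_0 import client

loader = loading.get_plugin_loader('password')
auth = loader.load_from_options(
    auth_url='http://controller:5000/v3',   # placeholder
    username='admin', password='secret',    # placeholders
    project_name='admin',
    user_domain_name='Default',
    project_domain_name='Default')
neutron = client.Client(session=session.Session(auth=auth))

instance_uuid = 'f180c69e-49c7-4eda-9eab-81afe10160f7'
for port in neutron.list_ports(device_id=instance_uuid)['ports']:
    ips = [ip['ip_address'] for ip in port['fixed_ips']]
    print(port['id'], port['status'], ips)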

tags: added: customer-found
Michael Petersen (mpetason) wrote :

I do not believe it is a duplicate, as it happened on more than one instance. The issue could be replicated until the nova-compute service was brought back into service.

Roman Podoliaka (rpodolyaka) wrote :

I don't get your point. How does the number of affected instances make this a different problem? We see essentially the same issue: VM failed to spawn on one of the compute nodes, then it was (automatically) rescheduled to another node and we see that it now has two ports in Neutron, while we expect to see only one.

Michael Petersen (mpetason) wrote :

My mistake. The other bug was discussing race conditions. I didn't think this was a race condition, as you were able to replicate the issue every time you tried to spin up an instance. The instances appeared to be scheduled to the same node, since it had the fewest used resources; the build would fail there and then get rescheduled. If it's a duplicate, then we can mark it as a duplicate.

Eugene Nikanorov (enikanorov) wrote :

Since the instance was rescheduled, that would explain the second IP address.
But that's really a bug: on rescheduling, or on spawn teardown due to an exception, the first Neutron port should have been deleted.
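
For illustration only (this is not Nova's actual code), the ordering described above would look roughly like this; allocate_port, deallocate_port, spawn, and reschedule are hypothetical names standing in for the real build, network, and scheduler calls:

# Illustrative sketch of the expected teardown ordering on a failed spawn.
def build_instance(instance, network_api, scheduler, spawn):
    """Illustrative only -- not Nova's actual reschedule path."""
    port = network_api.allocate_port(instance)   # first attempt's port
    try:
        spawn(instance, port)                    # may raise, e.g. the KVM error
    except Exception:
        # Tear down the failed attempt's networking *before* rescheduling, so
        # the retry starts clean and allocates a single fresh port. Skipping
        # this step leaks the first port (and its IP), which is exactly the
        # two-address symptom reported here.
        network_api.deallocate_port(instance, port)
        scheduler.reschedule(instance)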

Also, if it's a duplicate bug, please add it to the comments; otherwise Launchpad doesn't store it in the bug's history.

Changed in mos:
importance: Undecided → Medium