Tempest test test_dualnet_multi_prefix_slaac fails

Bug #1579037 reported by Sofiia Andriichenko
This bug affects 4 people
Affects: Mirantis OpenStack (status tracked in 10.0.x)

  Series   Status      Importance   Assigned to
  10.0.x   Confirmed   Medium       MOS Nova
  9.x      Won't Fix   Medium       MOS Nova

Bug Description

Detailed bug description:
tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_multi_prefix_slaac

Configuration:
Storage Backends - Ceph RBD for volumes (Cinder), Ceph RBD for ephemeral volumes (Nova), Ceph RBD for images (Glance), Ceph RadosGW for objects (Swift API)
Additional services - Install Ironic, Install Sahara

In tab Settings->Compute check Nova quotas
In tab Settings->OpenStack Services check enable Install Ceilometer and Aodh
In tab Networks->Other check enable Neutron DVR

Nodes: controller, compute, Ceph, Telemetry - MongoDB

Steps to reproduce:
    1. Deploy the ISO in the configuration described above (see Detailed bug description)
    2. Navigate to the controller node
    3. Install git (apt-get install git)
    4. Clone the script to deploy rally + tempest:
       # git clone https://github.com/obutenko/mos-rally-verify.git
    5. Follow the deployment instructions at https://github.com/obutenko/mos-rally-verify
    6. Execute the necessary steps to deploy tempest
    7. Run the test in debug mode:
        # rally --debug verify start --regex tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_multi_prefix_slaac

Expected results:
The test passes

Actual result:
The test fails
(see comments)

Reproducibility:
See attachment

Workaround:
---

Impact:
---

Description of the environment:
See (Detailed bug description)

Additional information:
Error Message

test failed

Stacktrace

Traceback (most recent call last):
  File "/home/rally/.rally/tempest/for-deployment-777332f2-8189-46a5-a7e4-01f79ddb81b7/tempest/test.py", line 113, in wrapper
    return f(self, *func_args, **func_kwargs)
  File "/home/rally/.rally/tempest/for-deployment-777332f2-8189-46a5-a7e4-01f79ddb81b7/tempest/scenario/test_network_v6.py", line 252, in test_dualnet_multi_prefix_slaac
    dualnet=True)
  File "/home/rally/.rally/tempest/for-deployment-777332f2-8189-46a5-a7e4-01f79ddb81b7/tempest/scenario/test_network_v6.py", line 156, in _prepare_and_test
    sshv4_2, ips_from_api_2, sid2 = self.prepare_server(networks=net_list)
  File "/home/rally/.rally/tempest/for-deployment-777332f2-8189-46a5-a7e4-01f79ddb81b7/tempest/scenario/test_network_v6.py", line 124, in prepare_server
    fip = self.create_floating_ip(thing=srv)
  File "/home/rally/.rally/tempest/for-deployment-777332f2-8189-46a5-a7e4-01f79ddb81b7/tempest/scenario/manager.py", line 851, in create_floating_ip
    port_id, ip4 = self._get_server_port_id_and_ip4(thing)
  File "/home/rally/.rally/tempest/for-deployment-777332f2-8189-46a5-a7e4-01f79ddb81b7/tempest/scenario/manager.py", line 834, in _get_server_port_id_and_ip4
    % port_map)
  File "/usr/local/lib/python2.7/dist-packages/testtools/testcase.py", line 406, in assertEqual
    self.assertThat(observed, matcher, message)
  File "/usr/local/lib/python2.7/dist-packages/testtools/testcase.py", line 493, in assertThat
    raise mismatch_error
testtools.matchers._impl.MismatchError: 2 != 1: Found multiple IPv4 addresses: [(u'35959afa-a8be-4f25-815d-14ba14dbcfc1', u'10.100.0.7'), (u'3b14a80d-d2f0-4135-96d2-af9f045039b8', u'10.100.0.6')]. Unable to determine which port to target.
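
The assertion above comes from tempest's port-selection helper: before attaching a floating IP it expects the server to have exactly one port with an IPv4 address. The following is a simplified, self-contained sketch of that check (assumed data shapes; not tempest's actual code):

```python
def get_server_port_id_and_ip4(ports):
    """Simplified sketch of the check in tempest's
    _get_server_port_id_and_ip4 that failed above: pick the single
    IPv4 (port, address) pair of a server, or fail if ambiguous."""
    port_map = [(p["id"], ip["ip_address"])
                for p in ports
                for ip in p["fixed_ips"]
                if ":" not in ip["ip_address"]]  # keep IPv4, drop IPv6
    if len(port_map) != 1:
        raise AssertionError(
            "%d != 1: Found multiple IPv4 addresses: %s. "
            "Unable to determine which port to target."
            % (len(port_map), port_map))
    return port_map[0]

# Two active ports for the same instance, as in the failure above:
ports = [
    {"id": "35959afa-a8be-4f25-815d-14ba14dbcfc1",
     "fixed_ips": [{"ip_address": "10.100.0.7"}]},
    {"id": "3b14a80d-d2f0-4135-96d2-af9f045039b8",
     "fixed_ips": [{"ip_address": "10.100.0.6"}]},
]
try:
    get_server_port_id_and_ip4(ports)
except AssertionError as exc:
    print(exc)  # prints the "2 != 1: Found multiple IPv4 addresses" message
```

With a single IPv4 port the helper returns it; with the two ports seen in this failure it raises exactly the mismatch reported in the traceback.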

Revision history for this message
Sofiia Andriichenko (sandriichenko) wrote :
tags: added: tempest
Roman Podoliaka (rpodolyaka) wrote :

MOS Neutron, please take a look at this.

Changed in mos:
status: New → Confirmed
assignee: nobody → MOS Neutron (mos-neutron)
milestone: none → 9.0
importance: Undecided → High
tags: added: area-neutron
Oleg Bondarev (obondarev) wrote :

Ran the TestGettingAddress suite several times; can't reproduce the "Found multiple IPv4 addresses" failure. Need logs or an env where the problem is reproducible. Marking as Incomplete.

Changed in mos:
status: Confirmed → Incomplete
assignee: MOS Neutron (mos-neutron) → Sofiia Andriichenko (sandriichenko)
Oleksiy Butenko (obutenko) wrote :

This issue doesn't reproduce on CI anymore; moving to Invalid.
Please move back to Confirmed if the issue is reproduced again.

Changed in mos:
status: Incomplete → Invalid
Sofiia Andriichenko (sandriichenko) wrote :

Reproduced again. 9.0 mos iso #400

Configuration:
Storage Backends - Ceph RBD for volumes (Cinder), Ceph RBD for ephemeral volumes (Nova), Ceph RBD for images (Glance), Ceph RadosGW for objects (Swift API)
Additional services - Install Sahara

In tab Settings->Compute check Nova quotas
In tab Settings->OpenStack Services check enable Install Ceilometer and Aodh
In tab Networks->Other check enable Neutron DVR
In tab Settings->Security check enable TLS for OpenStack public endpoints, HTTPS for Horizon

Nodes: controller, compute, Ceph, Telemetry - MongoDB

Changed in mos:
status: Invalid → Confirmed
assignee: Sofiia Andriichenko (sandriichenko) → nobody
Sofiia Andriichenko (sandriichenko) wrote :
Changed in mos:
assignee: nobody → MOS Neutron (mos-neutron)
Oleg Bondarev (obondarev) wrote :

Instance 2b580249-9467-44d7-9e6b-73693534a5d4 was first scheduled to node-6:

2016-05-26 02:37:34.645 14354 DEBUG nova.compute.manager [req-81751add-78d4-4f53-8042-a04596d0a040 becfc7903e7e4917ab927eae3fb3bb3d 420cb14a42c84b3fbbb8891fc9329816 - - -] [instance: 2b580249-9467-44d7-9e6b-73693534a5d4] Starting instance... _do_build_and_run_instance /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1897

port fa843bf8-01b7-424e-a705-ff8453ec743e created:

2016-05-26 02:37:37.614 14354 DEBUG keystoneauth.session [req-81751add-78d4-4f53-8042-a04596d0a040 becfc7903e7e4917ab927eae3fb3bb3d 420cb14a42c84b3fbbb8891fc9329816 - - -] RESP: [201] Date: Thu, 26 May 2016 02:37:37 GMT Connection: close Content-Type: application/json; charset=UTF-8 Content-Length: 886 X-Openstack-Request-Id: req-c801de97-029d-4927-a983-3d3db3e6f443
RESP BODY: {"port": {"status": "DOWN", "binding:host_id": "node-6.test.domain.local", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-05-26T02:37:37", "device_owner": "compute:None", "port_security_enabled": true, "binding:profile": {}, "fixed_ips": [{"subnet_id": "b888bc63-4e17-43e1-885f-aa7ba3511d33", "ip_address": "10.100.0.6"}], "id": "fa843bf8-01b7-424e-a705-ff8453ec743e", "security_groups": ["f1cb0412-1db4-4332-911b-1f9a0362f3bc"], "device_id": "2b580249-9467-44d7-9e6b-73693534a5d4", "name": "", "admin_state_up": true, "network_id": "5db2908b-a315-45e2-92ac-37472d4bdf76", "dns_name": null, "binding:vif_details": {"port_filter": true, "ovs_hybrid_plug": true}, "binding:vnic_type": "normal", "binding:vif_type": "ovs", "tenant_id": "420cb14a42c84b3fbbb8891fc9329816", "mac_address": "fa:16:3e:2c:5a:c4", "created_at": "2016-05-26T02:37:36"}}
 _http_log_response /usr/lib/python2.7/dist-packages/keystoneauth1/session.py:277
2016-05-26 02:37:37.615 14354 DEBUG nova.network.neutronv2.api [req-81751add-78d4-4f53-8042-a04596d0a040 becfc7903e7e4917ab927eae3fb3bb3d 420cb14a42c84b3fbbb8891fc9329816 - - -] [instance: 2b580249-9467-44d7-9e6b-73693534a5d4] Successfully created port: fa843bf8-01b7-424e-a705-ff8453ec743e _create_port /usr/lib/python2.7/dist-packages/nova/network/neutronv2/api.py:261

But eventually the instance failed to spawn:

2016-05-26 02:37:47.591 14354 ERROR nova.compute.manager [req-81751add-78d4-4f53-8042-a04596d0a040 becfc7903e7e4917ab927eae3fb3bb3d 420cb14a42c84b3fbbb8891fc9329816 - - -] [instance: 2b580249-9467-44d7-9e6b-73693534a5d4] Instance failed to spawn
2016-05-26 02:37:47.591 14354 ERROR nova.compute.manager [instance: 2b580249-9467-44d7-9e6b-73693534a5d4] Traceback (most recent call last):
2016-05-26 02:37:47.591 14354 ERROR nova.compute.manager [instance: 2b580249-9467-44d7-9e6b-73693534a5d4] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2218, in _build_resources
2016-05-26 02:37:47.591 14354 ERROR nova.compute.manager [instance: 2b580249-9467-44d7-9e6b-73693534a5d4] yield resources
2016-05-26 02:37:47.591 14354 ERROR nova.compute.manager [instance: 2b580249-9467-44d7-9e6b-73693534a5d4] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2064, in _build_and_run_instance
2016-05-26 02:37:47.591 14354 ERROR nova.comput...


Changed in mos:
assignee: MOS Neutron (mos-neutron) → MOS Nova (mos-nova)
Changed in mos:
assignee: MOS Nova (mos-nova) → Roman Podoliaka (rpodolyaka)
Roman Podoliaka (rpodolyaka) wrote :

What Oleg described in https://bugs.launchpad.net/mos/+bug/1579037/comments/8 sounds like a race condition in Nova: I checked the code, and we intentionally do not deallocate the port when an instance is rescheduled; at the same time, no additional allocation must be performed on another compute node if the original allocation succeeded. In the logs we clearly see that both nova-computes allocated a port in Neutron, which eventually caused the test case to fail, as it asserts on the number of ports (1 != 2).

That being said, I think it's a valid bug, but the user impact is moderate: you might end up with allocated but unused ports in Neutron if an instance fails to boot on one compute node and is rescheduled.

Again, this seems to be a race condition, as the check we perform in the Nova code should have yielded false and skipped the additional allocation. I checked the pass rate of this test case both upstream and downstream: we only had 2 failures out of 92 runs, so this is clearly not a blocker for 9.0.

I suggest we downgrade the importance to Medium and continue to work on this in 10.0.
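
The leftover-port effect described here can be spotted by comparing a server's Neutron ports against the host the instance actually runs on. A minimal, illustrative sketch (plain dicts instead of real API calls; node-6 comes from the logs above, node-7 and the shortened port ids are hypothetical):

```python
def find_stale_ports(ports, instance_host):
    """Flag ports bound to a compute host other than the one the
    instance ended up on -- the orphaned allocation left behind when
    a boot fails on the first node and Nova reschedules. Illustrative
    sketch only, not Nova/Neutron code."""
    return [p["id"] for p in ports
            if p["binding:host_id"] != instance_host]

# Ports as they might look after the reschedule described above:
ports = [
    {"id": "fa843bf8", "binding:host_id": "node-6.test.domain.local"},
    {"id": "3b14a80d", "binding:host_id": "node-7.test.domain.local"},
]
print(find_stale_ports(ports, "node-7.test.domain.local"))  # ['fa843bf8']
```

The stale port is the one still bound to the node where the spawn failed; deleting it would also restore the single-IPv4-port invariant the tempest check relies on.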

tags: added: area-nova mova-to-10.0
removed: area-neutron
tags: added: move-to-10.0
removed: mova-to-10.0
Changed in mos:
status: Confirmed → Won't Fix
tags: added: 10.0-reviewed
removed: move-to-10.0
Sofiia Andriichenko (sandriichenko) wrote :

Reproduced on CI, 9.1 snapshot #51:
Tests:
test_dualnet_dhcp6_stateless_from_os
test_dualnet_multi_prefix_slaac
test_dualnet_slaac_from_os
test_slaac_from_os

Configuration:
Settings:
Compute - QEMU.
Network - Neutron with VLAN segmentation.
Storage Backends - LVM
Additional services - Install Ironic, Install Sahara

In tab Settings->Compute check Nova quotas
In tab Settings->OpenStack Services check enable Install Ceilometer and Aodh
In tab Networks->Other check enable Neutron DVR
In tab Settings->Security check enable TLS for OpenStack public endpoints, HTTPS for Horizon

Nodes: controller, compute, ironic, cinder, Telemetry - MongoDB

snapshot http://www.ex.ua/864242117910

Roman Podoliaka (rpodolyaka) wrote :

Too late to fix this in 9.1, moving to 9.2.
