OSTF test "Update stack actions: inplace, replace and update whole template" failed without description

Bug #1562070 reported by Artem Hrechanychenko
This bug affects 2 people
Affects             Status   Importance  Assigned to  Milestone
Fuel for OpenStack  Invalid  High        MOS Oslo

Bug Description

Detailed bug description:
OSTF test "Update stack actions: inplace, replace and update whole template" failed without description
In ostf logs:
(nose_storage_plugin) fuel_health.tests.tests_platform.test_heat.HeatSmokeTests.test_update
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/unittest2/case.py", line 67, in testPartExecutor
    yield
  File "/usr/lib/python2.7/site-packages/unittest2/case.py", line 642, in doCleanups
    function(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 97, in with_params
    ret = self.function(instance, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 691, in delete_network
    return self.delete(self.network_path % (network))
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 354, in delete
    headers=headers, params=params)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 335, in retry_request
    headers=headers, params=params)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 298, in do_request
    self._handle_fault_response(status_code, replybody, resp)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 273, in _handle_fault_response
    exception_handler_v20(status_code, error_body)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 84, in exception_handler_v20
    request_ids=request_ids)
NetworkInUseClient: Unable to complete operation on network f4e3b637-3836-4cd3-9bf2-bf055b997d1c. There are one or more ports still in use on the network.
Neutron server returns request_ids: ['req-97411a0b-620c-4491-94f7-149b9ebde395']

Steps to reproduce:
Scenario:
1. Revert snapshot with 5 slaves
2. Create cluster (HA) with Neutron VLAN/VXLAN/GRE
3. Add 3 controller + ceph nodes
4. Add 2 compute + ceph nodes
5. Upload 'ceph' network template
6. Create custom network groups based on template endpoint assignments
7. Run network verification
8. Deploy cluster
9. Run network verification
10. Run health checks (OSTF) <<<< failed here
11. Check L3 network configuration on slaves
12. Check that services are listening on their networks only

Expected results:
The test passes successfully.

Actual result:
OSTF test "Update stack actions: inplace, replace and update whole template" failed without description

Reproducibility:
https://product-ci.infra.mirantis.net/job/9.0.system_test.ubuntu.network_templates/57/console

Revision history for this message
Artem Hrechanychenko (agrechanichenko) wrote :
Dmitry Pyzhov (dpyzhov)
Changed in fuel:
assignee: nobody → MOS QA Team (mos-qa)
status: New → Confirmed
Changed in fuel:
assignee: MOS QA Team (mos-qa) → Alexander Nagovitsyn (gluk12189)
Revision history for this message
Bug Checker Bot (bug-checker) wrote : Autochecker

(This check was performed automatically.)
Please make sure that the bug description contains the following sections, filled in with the appropriate data related to the bug you are describing:

version

For more detailed information on the contents of each of the listed sections see https://wiki.openstack.org/wiki/Fuel/How_to_contribute#Here_is_how_you_file_a_bug

tags: added: need-info
Revision history for this message
Alexander Nagovitsyn (gluk12189) wrote :

I can't reproduce this bug; everything works fine (build 154 and older).

Changed in fuel:
status: Confirmed → Invalid
Revision history for this message
Tatyana Kuterina (tkuterina) wrote :

The issue is reproduced on CI 9.1 snapshot #140:
Test: Deploy HA environment with Cinder, Neutron and network template on two nodegroups.
Test Group: two_nodegroups_network_templates
Trace: https://product-ci.infra.mirantis.net/job/9.x.system_test.ubuntu.multiracks_2/29/console
Diagnostic snapshot: https://drive.google.com/a/mirantis.com/file/d/0Bz15vbpS5ZPNM1FsM3htZV9mRjg/view?usp=sharing

[root@nailgun ~]# shotgun2 short-report
cat /etc/fuel_build_id:
 495
cat /etc/fuel_build_number:
 495
cat /etc/fuel_release:
 9.0
cat /etc/fuel_openstack_version:
 mitaka-9.0
rpm -qa | egrep 'fuel|astute|network-checker|nailgun|packetary|shotgun':
 fuel-misc-9.0.0-1.mos8500.noarch
 fuel-mirror-9.0.0-1.mos142.noarch
 fuel-release-9.0.0-1.mos6349.noarch
 rubygem-astute-9.0.0-1.mos754.noarch
 fuel-bootstrap-cli-9.0.0-1.mos285.noarch
 python-packetary-9.0.0-1.mos142.noarch
 fuel-notify-9.0.0-1.mos8500.noarch
 shotgun-9.0.0-1.mos90.noarch
 fuel-9.0.0-1.mos6349.noarch
 python-fuelclient-9.0.0-1.mos337.noarch
 fuel-openstack-metadata-9.0.0-1.mos8750.noarch
 fuel-library9.0-9.0.0-1.mos8500.noarch
 fuel-utils-9.0.0-1.mos8500.noarch
 fuel-migrate-9.0.0-1.mos8500.noarch
 fuel-setup-9.0.0-1.mos6349.noarch
 network-checker-9.0.0-1.mos74.x86_64
 fuel-agent-9.0.0-1.mos285.noarch
 fuel-nailgun-9.0.0-1.mos8750.noarch
 fuelmenu-9.0.0-1.mos275.noarch
 nailgun-mcagents-9.0.0-1.mos754.noarch
 fuel-ostf-9.0.0-1.mos940.noarch
 fuel-ui-9.0.0-1.mos2725.noarch
 fuel-provisioning-scripts-9.0.0-1.mos8750.noarch

Changed in fuel:
status: Invalid → New
tags: added: swarm-fail
Revision history for this message
Dmitry Belyaninov (dbelyaninov) wrote :

Also failed:

https://product-ci.infra.mirantis.net/job/9.x.system_test.ubuntu.network_templates/50/testReport/(root)/add_nodes_net_tmpl/

neutronclient.client: DEBUG: RESP: 409 {'Date': 'Tue, 06 Sep 2016 09:54:57 GMT', 'Connection': 'close', 'Content-Type': 'application/json; charset=UTF-8', 'Content-Length': '204', 'X-Openstack-Request-Id': 'req-0fff4115-b038-4c00-be81-e37c9395ba3f'} {"NeutronError": {"message": "Unable to complete operation on subnet 6d8cc49e-0e58-43c7-9f24-56d5307e6bac: One or more ports have an IP allocation from this subnet.", "type": "SubnetInUse", "detail": ""}}
neutronclient.client: DEBUG: RESP: 409 {'Date': 'Tue, 06 Sep 2016 09:54:57 GMT', 'Connection': 'close', 'Content-Type': 'application/json; charset=UTF-8', 'Content-Length': '205', 'X-Openstack-Request-Id': 'req-1747914d-206d-415d-9c83-4bc47088ea6f'} {"NeutronError": {"message": "Unable to complete operation on network 766678b3-3ee5-42d1-8fec-c7f02c01da91. There are one or more ports still in use on the network.", "type": "NetworkInUse", "detail": ""}}

2016-09-06 09:54:57 ERROR Update stack actions: inplace, replace and update whole template (fuel_health.tests.tests_platform.test_heat.HeatSmokeTests.test_update)
  File "/usr/lib/python2.7/site-packages/unittest2/case.py", line 67, in testPartExecutor
    yield
  File "/usr/lib/python2.7/site-packages/unittest2/case.py", line 642, in doCleanups
    function(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/novaclient/v2/flavors.py", line 148, in delete
    return self._delete("/flavors/%s" % base.getid(flavor))
  File "/usr/lib/python2.7/site-packages/novaclient/base.py", line 354, in _delete
    resp, body = self.api.client.delete(url)
  File "/usr/lib/python2.7/site-packages/novaclient/client.py", line 461, in delete
    return self._cs_request(url, 'DELETE', **kwargs)
  File "/usr/lib/python2.7/site-packages/novaclient/client.py", line 430, in _cs_request
    resp, body = self._time_request(url, method, **kwargs)
  File "/usr/lib/python2.7/site-packages/novaclient/client.py", line 403, in _time_request
    resp, body = self.request(url, method, **kwargs)
  File "/usr/lib/python2.7/site-packages/novaclient/client.py", line 397, in request
    raise exceptions.from_response(resp, body, url, method)
Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
<class 'oslo_db.exception.DBConnectionError'> (HTTP 500) (Request-ID: req-28b95ff8-a45b-4df9-bca9-78f85f3def4c)
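
The two 409 responses quoted above carry a JSON body with a "NeutronError" object whose "type" field distinguishes SubnetInUse from NetworkInUse. As a rough triage aid, the body can be classified as sketched below. The helper name classify_neutron_error is hypothetical and is not part of neutronclient; the body string is copied from the second response in this comment.

```python
import json

# Body copied from the second 409 response quoted in this comment.
BODY = (
    '{"NeutronError": {"message": "Unable to complete operation on network '
    '766678b3-3ee5-42d1-8fec-c7f02c01da91. There are one or more ports still '
    'in use on the network.", "type": "NetworkInUse", "detail": ""}}'
)


def classify_neutron_error(status_code, body):
    """Return (error_type, message) for a Neutron 409 body, else None.

    Hypothetical helper for log triage; not part of neutronclient.
    """
    if status_code != 409:
        return None
    try:
        error = json.loads(body)["NeutronError"]
    except (ValueError, KeyError, TypeError):
        # Not JSON, or JSON without the expected "NeutronError" object.
        return None
    return error.get("type"), error.get("message")


error_type, message = classify_neutron_error(409, BODY)
print(error_type)
```

This only labels the conflict; it says nothing about why the ports were still attached when the cleanup ran.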

Changed in fuel:
status: New → Confirmed
Revision history for this message
Dmitry Belyaninov (dbelyaninov) wrote :
Changed in fuel:
assignee: Alexander Nagovitsyn (gluk12189) → MOS QA Team (mos-qa)
milestone: 9.0 → 9.2
Revision history for this message
ElenaRossokhina (esolomina) wrote :

New issue: https://product-ci.infra.mirantis.net/job/9.x.acceptance.ubuntu.mixed_os_components/23/testReport/(root)/mixed_components_murano_sahara_ceilometer/mixed_components_murano_sahara_ceilometer/

2016-11-18 12:24:19 ERROR (nose_storage_plugin) fuel_health.tests.tests_platform.test_heat.HeatSmokeTests.test_update
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/unittest2/case.py", line 67, in testPartExecutor
    yield
  File "/usr/lib/python2.7/site-packages/unittest2/case.py", line 642, in doCleanups
    function(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 97, in with_params
    ret = self.function(instance, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 691, in delete_network
    return self.delete(self.network_path % (network))
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 354, in delete
    headers=headers, params=params)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 335, in retry_request
    headers=headers, params=params)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 298, in do_request
    self._handle_fault_response(status_code, replybody, resp)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 273, in _handle_fault_response
    exception_handler_v20(status_code, error_body)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 84, in exception_handler_v20
    request_ids=request_ids)
NetworkInUseClient: Unable to complete operation on network e041b8c9-9b1a-4d36-a606-3d5efd172146. There are one or more ports still in use on the network.
Neutron server returns request_ids: ['req-4bbcb4a0-fc37-4e60-80fe-b0b891b737dd']

Revision history for this message
Nastya Urlapova (aurlapova) wrote :

Test "add_nodes_net_tmpl" works fine: https://product-ci.infra.mirantis.net/view/9.x_swarm/job/9.x.system_test.ubuntu.network_templates/167/testReport/(root)/add_nodes_net_tmpl/add_nodes_net_tmpl/history/

Case "mixed_components_murano_sahara_ceilometer" failed with a different error, so I moved the bug to Invalid.

Changed in fuel:
status: Confirmed → Invalid
Revision history for this message
Nastya Urlapova (aurlapova) wrote :
Changed in fuel:
status: Invalid → Confirmed
Revision history for this message
Nastya Urlapova (aurlapova) wrote :
Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :

Nastya,

The test failed because of a different issue: the compute node did not have the required resources (RAM).

Heat logs from node-1, from the snapshot that you uploaded:
http://paste.openstack.org/show/596858/
The Heat stack hung in the "In Progress" state because one of the VMs went into the ERROR state:
Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"

Nova scheduler logs from the same time (node-2):
http://paste.openstack.org/show/596863/
ram: 3313MB disk: 130048MB io_ops: 0 instances: 0 does not have 1048576 MB usable ram before overcommit, it only has 3825 MB

Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :

Nova team, could you please take a look at the error message in the Nova scheduler log (node-2), time: 19:09? The message looks strange to me:
ram: 3313MB disk: 130048MB io_ops: 0 instances: 0 does not have 1048576 MB usable ram before overcommit, it only has 3825 MB

We tried to boot a VM with 3825 MB of free RAM on the compute node, but the Nova scheduler returned 0 hosts because of the RAM filter. The Nova logs show the message "not have 1048576 MB usable ram before overcommit, it only has 3825 MB", which looks strange because of the "1048576 MB" figure.

Changed in fuel:
assignee: MOS QA Team (mos-qa) → MOS Nova (mos-nova)
Revision history for this message
Pavel Kholkin (pkholkin) wrote :

Timur, I think we should check the test once again. I found only one place with this number: https://github.com/openstack/fuel-ostf/blob/master/fuel_health/tests/tests_platform/test_heat.py#L750. Maybe this flavor was used for some boot command. It looks like the scheduler is just saying that we want to boot a very large instance (https://github.com/openstack/nova/blob/stable/mitaka/nova/scheduler/filters/ram_filter.py#L39).
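
The arithmetic behind that log line can be sketched as follows. This is a simplified paraphrase of a RAM filter check, not Nova's actual code: the function name passes_ram_check and the default allocation ratio of 1.0 are assumptions made for illustration. The numbers come from the scheduler log: 1048576 MB requested (1024 * 1024 MB, that is 1 TiB), 3825 MB total usable RAM, and 3313 MB free.

```python
def passes_ram_check(total_usable_ram_mb, free_ram_mb, requested_ram_mb,
                     ram_allocation_ratio=1.0):
    """Simplified RAM filter logic (illustrative paraphrase, not Nova code)."""
    # An instance may not overcommit against itself: the request must fit
    # into the host's total RAM before any allocation ratio is applied.
    if total_usable_ram_mb < requested_ram_mb:
        return False
    # Otherwise compare the request against what is left under the limit.
    memory_mb_limit = total_usable_ram_mb * ram_allocation_ratio
    used_ram_mb = total_usable_ram_mb - free_ram_mb
    return memory_mb_limit - used_ram_mb >= requested_ram_mb


requested_mb = 1024 * 1024  # 1048576 MB, the figure from the scheduler log
huge_flavor_fits = passes_ram_check(3825, 3313, requested_mb)
small_flavor_fits = passes_ram_check(3825, 3313, 512)
print(huge_flavor_fits, small_flavor_fits)
```

Under these assumptions the 1 TiB request is rejected on a 3825 MB host regardless of the allocation ratio, which matches the "before overcommit" wording of the log message.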

Changed in fuel:
assignee: MOS Nova (mos-nova) → Timur Nurlygayanov (tnurlygayanov)
Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :

OK, so it looks like these errors are not related to the failed OSTF test.

There are other errors in the OpenStack logs that may be related to the failed OSTF tests: the RabbitMQ cluster was down, and we can see tracebacks from all OpenStack services after 19:11.

Oslo team, could you please take a look?

Changed in fuel:
assignee: Timur Nurlygayanov (tnurlygayanov) → MOS Oslo (mos-oslo)
Revision history for this message
Dmitry Mescheryakov (dmitrymex) wrote :

Timur, Pavel, the stack creation failure you found is expected by the test_rollback test; see its description at https://github.com/openstack/fuel-ostf/blob/master/fuel_health/tests/tests_platform/test_heat.py#L727

and that test passed. A different test, test_update, actually failed later; see http://paste.openstack.org/show/597010/

Indeed, the most probable reason here is RabbitMQ cluster partitioning and recovery, starting at 19:11:15. It was caused by a network interface hiccup; see node-1-10.109.15.4/var/log/messages:
<6>Jan 29 19:11:15 node-1 kernel: [ 5026.455659] NETDEV WATCHDOG: enp0s5 (e1000): transmit queue 0 timed out

So effectively this is another bug: https://bugs.launchpad.net/fuel/+bug/1648318
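
When checking other diagnostic snapshots for the same symptom, the watchdog line can be matched with a regular expression. The sketch below is only a triage aid: the pattern and the name WATCHDOG_RE are assumptions modelled on the single line quoted in this comment, and other kernels or syslog configurations may format the message differently.

```python
import re

# Pattern modelled on the one log line quoted above (an assumption, not a
# general syslog parser).
WATCHDOG_RE = re.compile(
    r"^<\d+>(?P<stamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) kernel: "
    r"\[\s*(?P<uptime>[\d.]+)\] NETDEV WATCHDOG: (?P<iface>\S+) "
    r"\((?P<driver>[^)]+)\): transmit queue (?P<queue>\d+) timed out$"
)

LINE = (
    "<6>Jan 29 19:11:15 node-1 kernel: [ 5026.455659] NETDEV WATCHDOG: "
    "enp0s5 (e1000): transmit queue 0 timed out"
)

match = WATCHDOG_RE.match(LINE)
event = match.groupdict() if match else None
print(event)
```

Comparing the extracted timestamp with the first RabbitMQ partition messages helps confirm whether the interface timeout came first.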

Revision history for this message
Dmitry Mescheryakov (dmitrymex) wrote :

Posted a comment about the repro in https://bugs.launchpad.net/fuel/+bug/1648318; moving the current bug back to its previous Invalid state.

Changed in fuel:
status: Confirmed → Invalid