OSTF test: Update stack actions failed for Heat

Bug #1547041 reported by Andrey Lavrentyev
Affects             Status     Importance  Assigned to  Milestone
Fuel for OpenStack  Confirmed  High        MOS Nova
  8.0.x             Confirmed  High        MOS Nova
  Mitaka            Confirmed  High        MOS Nova

Bug Description

The OSTF test 'Update stack actions: inplace, replace and update whole template' fails for the Heat target component.

The test steps are taken from the acceptance test 'Shut down primary controller on ceph cluster'.

Steps to reproduce:

1. Deploy environment with 3+ controllers and NeutronTUN or NeutronVLAN, all ceph, 2 compute, 2 ceph nodes
2. Shut down primary controller
3. Verify networks
4. Ensure that VIPs are moved to other controller
5. Ensure connectivity to outside world from VM
6. Run OSTF tests

Expected result:
OSTF test 'Update stack actions' passes

Actual result:
OSTF test 'Update stack actions' fails

It's worth mentioning that once the controller was brought back up (turned on), the same test passed.

ISO version: fuel-8.0-570-2016-02-15_13-42-00

[root@nailgun ~]# cat /etc/fuel/version.yaml
VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "8.0"
  api: "1.0"
  build_number: "570"
  build_id: "570"
  fuel-nailgun_sha: "558ca91a854cf29e395940c232911ffb851899c1"
  python-fuelclient_sha: "4f234669cfe88a9406f4e438b1e1f74f1ef484a5"
  fuel-agent_sha: "658be72c4b42d3e1436b86ac4567ab914bfb451b"
  fuel-nailgun-agent_sha: "b2bb466fd5bd92da614cdbd819d6999c510ebfb1"
  astute_sha: "b81577a5b7857c4be8748492bae1dec2fa89b446"
  fuel-library_sha: "c2a335b5b725f1b994f78d4c78723d29fa44685a"
  fuel-ostf_sha: "3bc76a63a9e7d195ff34eadc29552f4235fa6c52"
  fuel-mirror_sha: "fb45b80d7bee5899d931f926e5c9512e2b442749"
  fuelmenu_sha: "78ffc73065a9674b707c081d128cb7eea611474f"
  shotgun_sha: "63645dea384a37dde5c01d4f8905566978e5d906"
  network-checker_sha: "a43cf96cd9532f10794dce736350bf5bed350e9d"
  fuel-upgrade_sha: "616a7490ec7199f69759e97e42f9b97dfc87e85b"
  fuelmain_sha: "d605bcbabf315382d56d0ce8143458be67c53434"

Link to the diagnostic snapshot: https://drive.google.com/open?id=0B5HPBFb7K7gXUUE1TUIzeXgxQVE

Revision history for this message
Andrey Lavrentyev (alavrentyev) wrote :
tags: added: heat
Sergey Kraynev (skraynev) wrote :

In the logs I see the following traceback:
http://paste.openstack.org/show/487437/

It looks like the root cause is not related to Heat.

The original test scenario executes an Update Replace for a Heat resource, which means
that Heat tries to create a new resource (in this case, a Nova server).
The traceback in the logs tells us that something went wrong in Nova, because the VM could not be created.

I suggest asking the Nova team.
My guess is that it may be a simple resource limitation: when you remove one controller, you lose some CPU and memory resources...

Roman Podoliaka (rpodolyaka) wrote :

I checked the requests Sergey found in heat-engine logs. The corresponding instances failed to be scheduled:

http://paste.openstack.org/show/487444/

It was RamFilter that filtered out both available compute nodes, i.e. we simply ran out of available memory (possibly because of the big flavors used, especially if tests run in parallel, or because memory had not yet been freed after the previous test).
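
To illustrate how a RAM-based scheduler filter can reject every compute node, here is a minimal sketch of the check. It is a simplified model written for this comment, not Nova's actual code: the HostState class, the helper names, and all the numbers are assumptions chosen for illustration, and nothing here is taken from the diagnostic snapshot.

```python
from dataclasses import dataclass


@dataclass
class HostState:
    """Hypothetical, simplified view of a compute node's memory."""
    name: str
    total_ram_mb: int
    free_ram_mb: int


def ram_filter_passes(host, requested_ram_mb, ram_allocation_ratio=1.0):
    """Return True if the host can still fit the requested RAM.

    Simplified model: the limit is total RAM scaled by the overcommit
    ratio, and memory already in use counts against that limit.
    """
    limit_mb = host.total_ram_mb * ram_allocation_ratio
    used_mb = host.total_ram_mb - host.free_ram_mb
    return limit_mb - used_mb >= requested_ram_mb


def filter_hosts(hosts, requested_ram_mb, ram_allocation_ratio=1.0):
    """Keep only the hosts that pass the RAM check."""
    return [
        h for h in hosts
        if ram_filter_passes(h, requested_ram_mb, ram_allocation_ratio)
    ]


# Illustrative numbers only: two small compute nodes that are almost full.
hosts = [
    HostState("compute-1", total_ram_mb=3072, free_ram_mb=256),
    HostState("compute-2", total_ram_mb=3072, free_ram_mb=128),
]

# A 512 MB flavor fits on neither node, so scheduling fails.
print([h.name for h in filter_hosts(hosts, requested_ram_mb=512)])

# A 128 MB flavor fits on both nodes.
print([h.name for h in filter_hosts(hosts, requested_ram_mb=128)])
```

In this model the same request starts to succeed as soon as instances from an earlier test are deleted and free_ram_mb grows, which matches the observation that the test passed later without any change to the flavors.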

This failure should have nothing to do with shutting the primary controller down. From what I can see, it's a red herring. The tests pass after bringing the controller up just because time passed and memory had finally been freed on the compute nodes.

tags: added: area-nova
removed: heat
tags: added: area-qa
removed: area-nova
Roman Podoliaka (rpodolyaka) wrote :

Leaving the bug with the mos-nova team to finish the investigation. It should not be a blocker for the 8.0 release (thus, moving to 8.0-updates for now). So far I believe it's a problem with the test: either we use flavors that are too large, or we don't wait long enough for memory to be freed after the previous test, or we run multiple tests in parallel and there is not enough memory for all the VMs booted.
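
If the cause is that the test does not wait long enough for memory to be freed, one possible mitigation on the test side is to poll until enough memory is available before booting the next VM. The sketch below is a generic polling helper, not code from fuel-ostf: wait_for, enough_memory, and the simulated readings are all hypothetical names and values used for illustration.

```python
import time


def wait_for(predicate, timeout=60.0, interval=2.0):
    """Poll predicate() until it returns True or the timeout expires.

    Returns True if the condition was met, False on timeout.
    The predicate is always checked at least once.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Simulated free-memory readings (MB) standing in for a real query
# against the compute nodes; memory is released on the third poll.
readings = iter([128, 128, 1024])


def enough_memory(required_mb=512):
    """Hypothetical check: is there room for the next VM?"""
    return next(readings) >= required_mb


# interval=0 keeps the example fast; a real test would use a few seconds.
print(wait_for(enough_memory, timeout=10, interval=0))
```

A real check would replace the simulated readings with a query of the hypervisors' free memory, and the test would fail with a clear "not enough memory" message on timeout instead of a scheduling error.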

Timur Nurlygayanov (tnurlygayanov) wrote :

Andrey, how much RAM did each compute node have before the OSTF tests were run?

Andrey Lavrentyev (alavrentyev) wrote :

Timur, 3 GB of RAM was used per compute node.

tags: added: area-ostf
removed: area-qa
Dmitry Pyzhov (dpyzhov)
Changed in fuel:
milestone: 9.0 → 10.0
tags: added: area-nova
tags: added: area-heat
Roman Podoliaka (rpodolyaka) wrote :

We gave it another try on a 9.0 environment and it did not reproduce. I still believe the RCA in #3 is correct.

There were a few fixes to the tests recently. I'm tentatively marking this one as a duplicate of #1525200.

Feel free to re-open the bug, if you see it on CI again.
