Queens, Rocky fs020 tempest tests are failing ServersOnMultiNodesTest and TestSecurityGroupsBasicOps

Bug #1857365 reported by Ronelle Landy
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Martin Kopec

Bug Description

Since 12/22 Queens and Rocky fs020 have been failing tempest tests:

Queens:

http://logs.rdoproject.org/00/24300/1/check/periodic-tripleo-ci-centos-7-ovb-1ctlr_2comp-featureset020-queens/fe63f93/logs/tempest.html.gz

Rocky:

http://logs.rdoproject.org/00/24300/1/check/periodic-tripleo-ci-centos-7-ovb-1ctlr_2comp-featureset020-rocky/7db5e8f/logs/tempest.html.gz

Both fail:

tempest.api.compute.admin.test_servers_on_multinodes.ServersOnMultiNodesTest test_create_servers_on_different_hosts_with_list_of_servers

with:

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/tempest/api/compute/admin/test_servers_on_multinodes.py", line 88, in test_create_servers_on_different_hosts
    self.assertNotEqual(self.host01, host02)
  File "/usr/lib/python2.7/site-packages/unittest2/case.py", line 842, in assertNotEqual
    raise self.failureException(msg)
AssertionError: u'overcloud-novacompute-1.localdomain' == u'overcloud-novacompute-1.localdomain'

Queens also fails:

tempest.scenario.test_security_groups_basic_ops.TestSecurityGroupsBasicOps
test_boot_into_disabled_port_security_network_without_secgroup[compute,id-13ccf253-e5ad-424b-9c4a-97b88a026699,network,slow]: fail
test_cross_tenant_traffic[compute,id-e79f879e-debb-440c-a7e4-efeda05b6848,network]: pass
test_in_tenant_traffic[compute,id-63163892-bbf6-4249-aa12-d5ea1f8f421b,network]: pass
test_multiple_security_groups[compute,id-d2f77418-fcc4-439d-b935-72eca704e293,network,slow]: pass
test_port_security_disable_security_group[compute,id-7c811dcc-263b-49a3-92d2-1b4d8405f50c,network,slow]: fail
test_port_update_new_security_group[compute,id-f4d556d7-1526-42ad-bafb-6bebf48568f6,network,slow]: fail

with:

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/tempest/common/utils/__init__.py", line 88, in wrapper
    return f(*func_args, **func_kwargs)
  File "/usr/lib/python2.7/site-packages/tempest/scenario/test_security_groups_basic_ops.py", line 538, in test_port_update_new_security_group
    [new_tenant.security_groups['default']])
  File "/usr/lib/python2.7/site-packages/tempest/scenario/test_security_groups_basic_ops.py", line 291, in _create_server
    message="Failed to boot servers on different "
  File "/usr/lib/python2.7/site-packages/testtools/testcase.py", line 394, in assertNotIn
    self.assertThat(haystack, matcher, message)
  File "/usr/lib/python2.7/site-packages/testtools/testcase.py", line 435, in assertThat
    raise mismatch_error
testtools.matchers._impl.MismatchError: [u'overcloud-novacompute-0.localdomain'] matches Contains(u'overcloud-novacompute-0.localdomain'): Failed to boot servers on different Compute nodes.

Ronelle Landy (rlandy)
tags: added: promotion-blocker
tags: added: ci
Ronelle Landy (rlandy)
Changed in tripleo:
milestone: none → ussuri-1
importance: Undecided → Critical
status: New → Triaged
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-quickstart-extras (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/700443

Revision history for this message
yatin (yatinkarel) wrote :

A possible root cause of the failing tests in queens/rocky is the tempestconf bump to 2.4.0, done in https://review.rdoproject.org/r/#/c/24021/ for queens and https://review.rdoproject.org/r/#/c/24020/ for rocky.

However, the fs020 job passed in https://review.rdoproject.org/r/#/c/24021/, so the test failure looks random.
I cannot find older logs, so I am guessing that these tests (at least the multinode ones, which rely on the min_compute_nodes config) were skipped earlier. Someone from tempest/nova can confirm what the actual issue with these tests is in queens/rocky.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-quickstart-extras (master)

Reviewed: https://review.opendev.org/700443
Committed: https://git.openstack.org/cgit/openstack/tripleo-quickstart-extras/commit/?id=851fdafe3b01e0ae80ef18c9f224912d612ff264
Submitter: Zuul
Branch: master

commit 851fdafe3b01e0ae80ef18c9f224912d612ff264
Author: Ronelle Landy <email address hidden>
Date: Mon Dec 23 13:45:09 2019 -0500

    Add failing tempest test to skip list

    Queens and Rocky started failing fs020 recently.
    Adding failing tempest tests to skip list to
    allow time for failure investigation.

    Change-Id: I7075f06923272bce67f9b5987e1a6c68c3372289
    Related-Bug: #1857365

Revision history for this message
Martin Kopec (mkopec) wrote :

tempest.scenario.test_security_groups_basic_ops.TestSecurityGroupsBasicOps tests:

test_boot_into_disabled_port_security_network_without_secgroup
test_port_security_disable_security_group
test_port_update_new_security_group

^^ Those 3 tests are triggered when network-feature-enabled.port_security is set to True in tempest.conf. Nothing changed in the python-tempestconf 2.4.0 release in that regard. The other condition required to run those 3 tests is that slow tests are being run. Is it possible that the jobs now run slow tests as well?

------

tempest.api.compute.admin.test_servers_on_multinodes.ServersOnMultiNodesTest.test_create_servers_on_different_hosts_with_list_of_servers

The new 2.4.0 release of python-tempestconf contains the following change:
https://review.opendev.org/#/c/691573/
That change sets min_compute_nodes. The test was skipped as min_compute_nodes wasn't set by default before. If it's not desired to run it, the test should be explicitly skipped.
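
As an illustration of the skip behaviour described above, here is a minimal sketch, assuming the multinode test class guards itself with a check on min_compute_nodes. The names FakeConf, MultiNodeTest and run_with are hypothetical stand-ins written for this example, not tempest's actual code.

```python
import unittest


class FakeConf:
    """Hypothetical stand-in for the [compute] section of tempest.conf."""
    min_compute_nodes = 1  # assumed value when the option is not set


class MultiNodeTest(unittest.TestCase):
    """Illustrative test class that needs at least two compute nodes."""

    conf = FakeConf()

    def setUp(self):
        # Skip unless the configuration advertises two or more compute nodes.
        if self.conf.min_compute_nodes < 2:
            self.skipTest("Less than 2 compute nodes, skipping multinode tests.")

    def test_servers_on_different_hosts(self):
        # Placeholder body: the real test boots servers and compares hosts.
        self.assertTrue(True)


def run_with(min_nodes):
    """Run the test with the given min_compute_nodes; return the skip count."""
    MultiNodeTest.conf.min_compute_nodes = min_nodes
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MultiNodeTest)
    result = unittest.TestResult()
    suite.run(result)
    return len(result.skipped)
```

With min_compute_nodes left at 1 the test is skipped. Once a tool such as python-tempestconf sets it to 2 the test starts running, which would match the change in behaviour seen after the 2.4.0 bump.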

I found this BZ regarding the same issue:
https://bugzilla.redhat.com/show_bug.cgi?id=1566148
In that case the CI running the test was missing SameHostFilter and DifferentHostFilter, there is a link to a devstack change which enables them:
https://github.com/openstack/devstack/commit/e0d61118f198e6a46af0956902485098f78e8d26

A year ago, the following tempest change was merged:
https://review.opendev.org/#/c/570207/
Before that change (which is the current state in rocky and queens), all scheduler filters were enabled by default in tempest. Review 570207 changed this because the test_create_servers_on_different_hosts_with_list_of_servers test requires a special filter which is not necessarily set in nova.conf on the overcloud node.

So I see 2 options here:
1. enable the filters and the test should pass
2. backport the 570207 change to rocky and queens which will result in skipping the test anyway

Revision history for this message
yatin (yatinkarel) wrote :

tempest.scenario.test_security_groups_basic_ops.TestSecurityGroupsBasicOps tests:

test_boot_into_disabled_port_security_network_without_secgroup
test_port_security_disable_security_group
test_port_update_new_security_group

^^ Those 3 tests are triggered when network-feature-enabled.port_security is set to True in tempest.conf. Nothing changed in the python-tempestconf 2.4.0 release in that regard. The other condition required to run those 3 tests is that slow tests are being run. Is it possible that the jobs now run slow tests as well?

------
>>> Maybe a transient issue. I looked at some failed jobs but didn't find those tests failing. I found some other security_group test failures, but those tests also failed because of the different host filter.

tempest.api.compute.admin.test_servers_on_multinodes.ServersOnMultiNodesTest.test_create_servers_on_different_hosts_with_list_of_servers

The new 2.4.0 release of python-tempestconf contains the following change:
https://review.opendev.org/#/c/691573/
That change sets min_compute_nodes. The test was skipped as min_compute_nodes wasn't set by default before. If it's not desired to run it, the test should be explicitly skipped.

>> I think this can be discussed with the nova team, as changes might be required in recent releases as well. For now I think the classes which were skipped until now need to be skipped completely via the skip list, as https://review.opendev.org/700443 was incomplete.

I found this BZ regarding the same issue:
https://bugzilla.redhat.com/show_bug.cgi?id=1566148
In that case the CI running the test was missing SameHostFilter and DifferentHostFilter, there is a link to a devstack change which enables them:
https://github.com/openstack/devstack/commit/e0d61118f198e6a46af0956902485098f78e8d26

A year ago, the following tempest change was merged:
https://review.opendev.org/#/c/570207/
Before that change (which is the current state in rocky and queens), all scheduler filters were enabled by default in tempest. Review 570207 changed this because the test_create_servers_on_different_hosts_with_list_of_servers test requires a special filter which is not necessarily set in nova.conf on the overcloud node.

So I see 2 options here:
1. enable the filters and the test should pass
2. backport the 570207 change to rocky and queens which will result in skipping the test anyway

Revision history for this message
chandan kumar (chkumar246) wrote :

This tempest patch, https://review.opendev.org/#/c/570207/, changes the filters setting from all to the default filters.

I checked the tripleo-heat-templates stable/queens and stable/rocky THT files and found the following, which is the list of default filters:
NovaSchedulerDefaultFilters: ['RetryFilter','AvailabilityZoneFilter','ComputeFilter','ComputeCapabilitiesFilter','ImagePropertiesFilter','ServerGroupAntiAffinityFilter','ServerGroupAffinityFilter','PciPassthroughFilter']

On the python-tempestconf side we have 'Override options', which can be used to change the value from all to the above list. Since these filters are available in the deployment, it might pass.

This will also avoid backporting the patch. In RDO, we do not try to backport patches for tempest.
I hope this helps.
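
A rough sketch of what such an override could produce in tempest.conf, assuming the option is scheduler_available_filters in the [compute-feature-enabled] section (the name I believe Queens/Rocky-era tempest uses; please verify before relying on it). The helper render_filter_override is hypothetical and only renders the fragment; it does not call python-tempestconf.

```python
import configparser
import io

# Default scheduler filters quoted above from tripleo-heat-templates.
THT_DEFAULT_FILTERS = [
    "RetryFilter",
    "AvailabilityZoneFilter",
    "ComputeFilter",
    "ComputeCapabilitiesFilter",
    "ImagePropertiesFilter",
    "ServerGroupAntiAffinityFilter",
    "ServerGroupAffinityFilter",
    "PciPassthroughFilter",
]


def render_filter_override(filters):
    """Render a tempest.conf fragment listing only the deployed filters.

    Assumption: the section and option names below match the tempest
    release in use; they are not confirmed by this bug report.
    """
    parser = configparser.ConfigParser()
    parser["compute-feature-enabled"] = {
        "scheduler_available_filters": ",".join(filters),
    }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_filter_override(THT_DEFAULT_FILTERS))
```

The point of listing the filters explicitly instead of all is that DifferentHostFilter and SameHostFilter are absent, so tests that depend on them no longer assume they are available.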

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-quickstart-extras (master)

Fix proposed to branch: master
Review: https://review.opendev.org/701225

Changed in tripleo:
assignee: nobody → Martin Kopec (mkopec)
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-quickstart-extras (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/701291

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-quickstart-extras (master)

Change abandoned by wes hayutin (<email address hidden>) on branch: master
Review: https://review.opendev.org/701291
Reason: In favor of https://review.opendev.org/#/c/701225/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-quickstart-extras (master)

Reviewed: https://review.opendev.org/701225
Committed: https://git.openstack.org/cgit/openstack/tripleo-quickstart-extras/commit/?id=91f8ed6386792bab2e670c8ebd74d3654f6b7231
Submitter: Zuul
Branch: master

commit 91f8ed6386792bab2e670c8ebd74d3654f6b7231
Author: Martin Kopec <email address hidden>
Date: Mon Jan 6 13:43:02 2020 +0000

    Override tempest nova scheduler filters

    In Rocky and Queens tempest contains a default value for nova scheduler
    filters equal to all, which results in the mentioned bug when tests which
    are meant to be executed against specific filters are not enabled in the
    system.

    Overriding filters in tempest to only those which are available in the system
    will skip the tests which are meant for different filters.

    Change-Id: I98532113d3ad1339fdc4869686ad875b304f094e
    Closes-bug: 1857365
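
The effect of the override described in this commit message can be sketched as follows. This is a simplified model of a filter check, assuming that a configured value of all means every filter is treated as enabled. The function name and signature are written for this example and are not copied from tempest.

```python
def is_scheduler_filter_enabled(filter_name, configured_filters):
    """Return True if tests that need `filter_name` should run.

    `configured_filters` models the scheduler filter list from tempest.conf.
    This is an illustrative model, not tempest's implementation.
    """
    if not configured_filters:
        return False
    if "all" in configured_filters:
        # Default in Rocky and Queens per the commit message above.
        return True
    return filter_name in configured_filters


# Filter list reported earlier in this bug for stable/queens and stable/rocky.
DEPLOYED_FILTERS = [
    "RetryFilter",
    "AvailabilityZoneFilter",
    "ComputeFilter",
    "ComputeCapabilitiesFilter",
    "ImagePropertiesFilter",
    "ServerGroupAntiAffinityFilter",
    "ServerGroupAffinityFilter",
    "PciPassthroughFilter",
]

if __name__ == "__main__":
    # With "all", the DifferentHostFilter test runs and can fail.
    print(is_scheduler_filter_enabled("DifferentHostFilter", ["all"]))
    # With the explicit list, the same test would be skipped.
    print(is_scheduler_filter_enabled("DifferentHostFilter", DEPLOYED_FILTERS))
```

Under this model, the override turns a failing test into a skipped one without changing the deployed nova scheduler configuration.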

Changed in tripleo:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-quickstart-extras (master)

Reviewed: https://review.opendev.org/699394
Committed: https://git.openstack.org/cgit/openstack/tripleo-quickstart-extras/commit/?id=f1b00f047569913eb33fe67c6cf714ca0e8b4373
Submitter: Zuul
Branch: master

commit f1b00f047569913eb33fe67c6cf714ca0e8b4373
Author: Soniya Vyas <email address hidden>
Date: Fri Jan 3 15:40:56 2020 +0530

    [Rocky] Removed passing tests from skiplist

    Class neutron_tempest_plugin.scenario has maximum
    number of tests passed. Hence, it is removed and
    only failing tests are kept.

    In addition to above, added two more tests from
    class 'TestSecurityGroupsBasicOps' and a test from
    class 'ServersOnMultiNodesTest'

    Related-bug: #1737940
    Related-bug: #1753209
    Related-bug: #1843259
    Related-bug: #1793482
    Related-bug: #1831223
    Related-bug: #1857365

    Signed-off by: Soniya Vyas<email address hidden>
    Change-Id: I8239bb694187d7f912163742e836a7362cdb1483

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-quickstart-extras (master)

Change abandoned by "Brent Eagles <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/tripleo-quickstart-extras/+/701341
Reason: 2 years old and applies to queens - probably safe to abandon
