periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset020-queens fails tempest run for tempest.api.volume.admin.test_volumes_backup test group

Bug #1830351 reported by Marios Andreou
This bug affects 1 person
Affects   Status         Importance   Assigned to     Milestone
tripleo   Fix Released   Critical     chandan kumar

Bug Description

periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset020-queens fails in the tempest run for the tempest.api.volume.admin.test_volumes_backup test group, with a trace in [1] like:

    ft1.1: tempest.api.volume.admin.test_volumes_backup.VolumesBackupsAdminTest.test_volume_backup_export_import[id-a99c54a1-dd80-4724-8a13-13bf58d4068d]_StringException: pythonlogging:'': {{{
    2019-05-24 07:35:51,606 85373 INFO [tempest.lib.common.rest_client] Request (VolumesBackupsAdminTest:test_volume_backup_export_import): 202 POST http://10.0.0.5:8776/v3/c7f5c7d168c2416a89c7fb32630e0832/volumes 0.448s
      File "/usr/lib/python2.7/site-packages/tempest/lib/common/rest_client.py", line 849, in _error_checker
        resp=resp)
...
    tempest.lib.exceptions.UnexpectedResponseCode: Unexpected response code received
    Details: 503

This blocks promotions, as the job is currently in the promotion criteria [2].

[1] https://logs.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset020-queens/b5ffa14/logs/tempest.html.gz
[2] https://github.com/rdo-infra/ci-config/blob/92dbe83f26daa7e3c578adddc9aabf2f939f3cdf/ci-scripts/dlrnapi_promoter/config/CentOS-7/queens.ini#L27

Marios Andreou (marios-b) wrote :

As discussed with ykarel just now on IRC (freenode #oooq), we can skip this job for now, as the rest of the promotion run was green and we really need a Q promotion today.

12:41 < ykarel> backup = False is needed in Tempestconf
12:42 < ykarel> under volume-feature-enabled
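
For reference, the override ykarel describes could be applied along these lines. This is a minimal sketch using Python's standard configparser module; the starting file content is made up for illustration, and only the section and option names come from the chat above.

```python
import configparser
import io

# Hypothetical starting content; a real tempest.conf has many more sections.
EXISTING_CONF = """\
[service_available]
cinder = True
"""


def disable_volume_backup(conf_text):
    """Return conf_text with [volume-feature-enabled] backup = False set."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    if not parser.has_section("volume-feature-enabled"):
        parser.add_section("volume-feature-enabled")
    parser.set("volume-feature-enabled", "backup", "False")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


updated_conf = disable_volume_backup(EXISTING_CONF)
print(updated_conf)
```

Existing sections are preserved; only the backup option is added or overwritten.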

Marios Andreou (marios-b) wrote :
Changed in tripleo:
assignee: nobody → chandan kumar (chkumar246)
milestone: none → train-1
tags: added: tempest
yatin (yatinkarel) wrote :

It's consistently failing: https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-24hr&job_name=periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset020-queens. Adding the promotion-blocker tag. https://review.rdoproject.org/r/#/c/20920/ (pushed temporarily to get a promotion) needs to be reverted so that future promotions do not ignore fs020 failures (which can be different than the current failures).

It would also be good to reconsider https://review.opendev.org/#/c/649084 (a revert might fix this bug), as puppet-tempestconf-2.2.0 is available in queens.

tags: added: promotion-blocker
removed: tempest
tags: added: tempest
chandan kumar (chkumar246) wrote :

Reverted the patch (which contains the workaround): https://review.opendev.org/#/c/661707/ and testing it here: https://review.rdoproject.org/r/#/c/20326/

chandan kumar (chkumar246) wrote :

The above revert is not working in the gates, but it works in the promotion pipeline. That looks really weird, as the same tempest and tempestconf versions are used there.

chandan kumar (chkumar246) wrote :

@marios, @ykarel, final contingency plan:
1. Update the tempest tag for queens in rdoinfo from 18.0.0 to 19.0.0
   [as volume3 support is available in tempest-19.0.0]
2. Then merge the revert

The other workaround is to hardcode the same volume.backup to false as an extra override in validate-tempest, which does not seem to be a feasible solution.

What do you guys think?

chandan kumar (chkumar246) wrote :

Some more interesting output:
In the passing run, the types volume/volume3 are set; in the failing run, volume2 and volume3 are set.
So where volume2/volume3 are present -> the volume backup check is skipped,
and where volume/volume3 are present -> it passes.

That raises another question: why do volume/volume3 get created only sometimes?
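
One way to compare runs is to pull the service types out of the Keystone token response logged in each tempest.log and list the volume-related ones. The sketch below assumes the response body has the token / catalog / type layout visible in the log excerpt that follows; both sample bodies are hypothetical and heavily abbreviated, and "volumev3" is used as the type of interest because that is what python-tempestconf passes to is_service.

```python
def catalog_service_types(token_body):
    """Return the set of service types listed in a Keystone v3 token body."""
    return {entry["type"] for entry in token_body["token"]["catalog"]}


def volume_service_types(token_body):
    """Return only the catalog types that start with 'volume', sorted."""
    return sorted(t for t in catalog_service_types(token_body)
                  if t.startswith("volume"))


# Hypothetical, abbreviated token bodies for illustration only.
sample_a = {"token": {"catalog": [
    {"type": "object-store", "name": "swift", "endpoints": []},
    {"type": "volume", "name": "cinder", "endpoints": []},
    {"type": "volumev3", "name": "cinderv3", "endpoints": []},
]}}
sample_b = {"token": {"catalog": [
    {"type": "placement", "name": "placement", "endpoints": []},
    {"type": "volumev2", "name": "cinderv2", "endpoints": []},
]}}

for label, body in (("sample_a", sample_a), ("sample_b", sample_b)):
    types = volume_service_types(body)
    print(label, types, "volumev3 present:", "volumev3" in types)
```

Running this against the real token bodies from a passing and a failing job would show directly which volume types each deployment registered.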

passed one
https://logs.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset020-queens/00e8f34/logs/undercloud/home/zuul/tempest/tempest.log.txt.gz#_2019-05-28_07_09_33_221

       Body: <omitted>
    Response - Headers: {'status': '201', u'content-length': '7193', 'content-location': 'http://10.0.0.5:5000/v3/auth/tokens', u'x-subject-token': '<omitted>', u'vary': 'X-Auth-Token', u'server': 'Apache', u'connection': 'close', u'date': 'Tue, 28 May 2019 07:09:32 GMT', u'content-type': 'application/json', u'x-openstack-request-id': 'req-1fb3884a-bb20-4f19-8ed3-5c3fdbef4766'}
        Body: {"token": {"is_domain": false, "methods": ["password"], "roles": [{"id": "14684ad796274aed99d72b50f94c22c9", "name": "heat_stack_owner"}, {"id": "79c172778e4b47d19faa5f828a903a3a", "name": "admin"}], "expires_at": "2019-05-28T08:09:33.000000Z", "project": {"domain": {"id": "default", "name": "Default"}, "id": "5b4840a5140b4fb6926e81463a2ec00b", "name": "admin"}, "catalog": [{"endpoints": [{"region_id": "regionOne", "url": "http://172.18.0.20:8080/v1/AUTH_5b4840a5140b4fb6926e81463a2ec00b", "region": "regionOne", "interface": "internal", "id": "28cc4e52ac564793843797691c81bcd4"}, {"region_id": "regionOne", "url": "http://172.18.0.20:8080", "region": "regionOne", "interface": "admin", "id": "8962ef5d0cfd41139fa3aa667b79702e"}, {"region_id": "regionOne", "url": "http://10.0.0.5:8080/v1/AUTH_5b4840a5140b4fb6926e81463a2ec00b", "region": "regionOne", "interface": "public", "id": "fa8322650d324b21b10487ea2191aece"}], "type": "object-store", "id": "12b9c02a6f1645389754bf4ff0888db0", "name": "swift"}, {"endpoints": [{"region_id": "regionOne", "url": "http://172.17.0.22:8778/placement", "region": "regionOne", "interface": "internal", "id": "040ef39e452b43bc8ac307a0daff3be1"}, {"region_id": "regionOne", "url": "http://10.0.0.5:8778/placement", "region": "regionOne", "interface": "public", "id": "2d891945d81a48c6bb3f4ba401f1172f"}, {"region_id": "regionOne", "url": "http://172.17.0.22:8778/placement", "region": "regionOne", "interface": "admin", "id": "6a2ec796f4964154b5831fea2658458c"}], "type": "placement", "id": "457947346f2b43ddaf33671b9d432143", "name": "placement"}, {"endpoints": [{"region_id": "regionOne", "url": "http://172.17.0.22:8776/v3/5b4840a5140b4fb6926e81463a2ec00b", "region": "regionOne", "interface": "internal", "id": "4864fa63a7c241a3995abefdbab3deb6"}, {"region_id": "regionOne", "url": "http://10.0.0.5:8776/v3/5b4840a5140b4fb6926e81463a2ec00b", "region": "regionOne", "interface": "public", "id": "83569f86daf840e9b707e7bfab64ca36"}, {"region_id": 
"regionOne", "url": "http://172.17.0.22:8776/v3/5b4840a5140b4fb6926e81463a2ec00b", "region": "regionOne", "interface": "admin", "id": "aa484c5157d34768bb8595aa63aa0f7e"}], "type": "volume", "id": "47636c586a2f469ca112...

Alan Bishop (alan-bishop) wrote :

The failures that started around May 7 show a problem occurring in
discover-tempest-config. The logs [1] show it detects the cinder service:

  "Setting [service_available] cinder = True"

[1] http://logs.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset020-queens/fb7d48b/logs/undercloud/home/zuul/tempest.log.txt.gz

But then when it attempts to detect whether the cinder-backup service is
available it stumbles:

  "No volume service found, skipping backup service check"

Even though, later, it reaffirms the volume service is available:

  "Setting [volume-feature-enabled] api_v3 = True"

Here's the python-tempestconf code [2]:

[2] https://github.com/openstack/python-tempestconf/blob/master/config_tempest/services/volume.py#L62

    def post_configuration(self, conf, is_service):
        # Verify if the cinder backup service is enabled
        if not is_service("volumev3"):
            C.LOG.info("No volume service found, "
                       "skipping backup service check")
            return

When is_service("volumev3") fails, it doesn't set [volume-feature-enabled]
backup = False in tempest.conf, and that setting defaults to True.

Now we have a situation where tempest thinks the cinder-backup service is
available, and the tests fail because it isn't part of the overcloud
deployment (cinder-backup is an optional service).
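
To make the failure mode concrete, here is a minimal, self-contained sketch of the early-return pattern quoted above next to one defensive alternative. It is not python-tempestconf code: FakeConf, post_configuration_current, post_configuration_defensive, and no_services are hypothetical names, and the default of True is modelled by hand from the description above.

```python
class FakeConf:
    """Hypothetical stand-in for a tempest.conf writer (not the real API)."""

    # Models the default described above: backup is True unless overridden.
    DEFAULTS = {("volume-feature-enabled", "backup"): True}

    def __init__(self):
        self.values = {}

    def set(self, section, key, value):
        self.values[(section, key)] = value

    def get(self, section, key):
        default = self.DEFAULTS.get((section, key))
        return self.values.get((section, key), default)


def post_configuration_current(conf, is_service):
    """Mirrors the early return quoted above: nothing is written."""
    if not is_service("volumev3"):
        return
    # (backup service detection would happen here)


def post_configuration_defensive(conf, is_service):
    """Assumed alternative: write backup = False before returning early."""
    if not is_service("volumev3"):
        conf.set("volume-feature-enabled", "backup", False)
        return
    # (backup service detection would happen here)


def no_services(name):
    """Simulates a catalog lookup that fails to find any service."""
    return False


current, defensive = FakeConf(), FakeConf()
post_configuration_current(current, no_services)
post_configuration_defensive(defensive, no_services)
print(current.get("volume-feature-enabled", "backup"))    # falls back to default
print(defensive.get("volume-feature-enabled", "backup"))  # explicitly disabled
```

Whether the eventual tempestconf fix took this shape is not stated in this report; the sketch only illustrates why a silent early return leaves the backup tests enabled.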

When things work correctly [3], the post_configuration code checks whether
the cinder-backup service is available, sees that it isn't, and correctly
configures tempest.conf:

  Setting [volume-feature-enabled] backup = False

[3] http://logs.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset020-queens/2a849bc/logs/undercloud/home/zuul/tempest.log.txt.gz

I have no idea why the post_configuration code misbehaves. I tried to
reproduce the problem (with python-tempestconf 2.0.0 and 2.2.0) but can't
get it to fail.

Marios Andreou (marios-b) wrote :

I think this is another bug that was filed for periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset020-queens but is now moved to the 2 comp version (the 1comp is retired) https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-7-ovb-1ctlr_2comp-featureset020-queens

I see a green run on that yesterday. @chandan, did something merge over the weekend to fix it?

Marios Andreou (marios-b) wrote :

Re comment #10: apparently it isn't consistent. From freenode #oooq just now with chandan:

13:22 < chandankumar> marios_|ruck: we need to ingore check jobs, I was testing the queens by updating tempest tag to 19 there
13:23 < marios_|ruck> chandankumar: ack but i see green *periodic* this morning
13:23 < chandankumar> marios_|ruck: I think we can wait till tomorrow, as it sometimes it passes or fails
13:23 < marios_|ruck> chandankumar: https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-7-ovb-1ctlr_2comp-featureset020-queens https://review.rdoproject.org/zuul/build/8471a37d9127445b8eae4c3dc0e59a36
13:23 < marios_|ruck> chandankumar: ah ok it isn't consistent?
13:23 < chandankumar> marios_|ruck: nope, not consistent
13:24 < marios_|ruck> chandankumar: can you please tell launchpad and/or trello about that
13:24 < chandankumar> marios_|ruck: will update the bz soon
13:24 < marios_|ruck> chandankumar: hold on i'll copy paste chate

Martin Kopec (mkopec) wrote :

Why did this review [1] change the type from volumev3 to volume?

[1] https://review.opendev.org/#/c/649084/

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-heat-templates (stable/queens)

Change abandoned by Chandan Kumar (raukadah) (<email address hidden>) on branch: stable/queens
Review: https://review.opendev.org/661707
Reason: not needed, fixed in tempestconf, thanks abishop a lot +++++++++

Changed in tripleo:
milestone: train-1 → train-2
chandan kumar (chkumar246) wrote :
Changed in tripleo:
status: Triaged → Fix Released