tempest cleanup removes all networks

Bug #1812660 reported by Martin Kopec
This bug affects 1 person
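+---------+--------------+------------+--------------+-----------+
| Affects | Status       | Importance | Assigned to  | Milestone |
+---------+--------------+------------+--------------+-----------+
| tempest | Fix Released | Undecided  | Martin Kopec |           |
+---------+--------------+------------+--------------+-----------+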

Bug Description

I had a Packstack Rocky deployment and cloned tempest from the master branch. `tempest cleanup` also removed networks that it was not supposed to remove.

The following two networks were available after the deployment:
$ openstack network list
+--------------------------------------+---------+--------------------------------------+
| ID | Name | Subnets |
+--------------------------------------+---------+--------------------------------------+
| 47bd1566-e13e-42b5-ad4b-fea122f425e9 | private | 98af2db8-2624-4471-8093-e983832e5dd9 |
| b121d356-2296-4100-9880-1a4410b4fd40 | public | 860fe620-6227-4349-9d8a-5947aa0f303b |
+--------------------------------------+---------+--------------------------------------+

After running init-saved-state, the saved_state.json file mentions no networks at all, which is quite suspicious.
$ tempest cleanup --init-saved-state
$ cat saved_state.json
{
  "domains": {
    "default": "Default"
  },
  "flavors": {
    "1": "m1.tiny",
    "2": "m1.small",
    "3": "m1.medium",
    "4": "m1.large",
    "5": "m1.xlarge",
    "a83007fa-4b1b-4957-a731-1d82a6a264e0": "m1.micro",
    "dbc75455-e92d-4a3d-ace7-064ca8fc6324": "m1.nano"
  },
  "images": {},
  "projects": {
    "1efbbf4b524d44f694241e65dac11472": "alt_demo",
    "588d28c2d68e4857b72e7bd549141d55": "demo",
    "bc0183ea801d46ab869f2b40c15b2269": "admin",
    "c90f86114f444c09baa8a126fdf27f8b": "services"
  },
  "roles": {},
  "users": {
    "202c643912194efab607ab7b8e1f46da": "admin",
    "318fcff041f14bab9041759d54bdeecc": "cinder",
    "3b69696b13604b44a47475d383d65704": "nova",
    "534c5b16677044c1be2fe07d452d32dd": "aodh",
    "5ac1446caa9e4b6aa34d41dc7eb2f5d4": "neutron",
    "72cc57a76ea4477e8a33e88ca684d48a": "alt_demo",
    "75b2eb04cebf458ea8e489d16b2bb9b5": "demo",
    "8afab28d27ff466eaddb03bb423bd0a5": "swift",
    "aca4fcee47054713a1e91ee1ba277912": "glance",
    "b19acee3e3aa4c578848801f33c885b7": "gnocchi",
    "e07fcb5e10b245649d245a66d7451a96": "placement",
    "ebcd4caf6e274fb1a1eb395188233fb8": "ceilometer"
  }
}
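
A quick way to confirm that suspicion is to check which resource types the saved state actually records. The snippet below is only an illustrative sketch: the `expected` list is a hypothetical set of keys picked for this example (tempest does not define it), and the inline JSON is a trimmed copy of the file above.

```python
import json

# Trimmed copy of the saved_state.json shown above.
SAVED_STATE = """
{
  "domains": {"default": "Default"},
  "flavors": {"1": "m1.tiny"},
  "images": {},
  "projects": {"588d28c2d68e4857b72e7bd549141d55": "demo"},
  "roles": {},
  "users": {"75b2eb04cebf458ea8e489d16b2bb9b5": "demo"}
}
"""


def missing_resource_types(saved_state, expected):
    """Return the expected resource types that have no key in the saved state."""
    return sorted(set(expected) - set(saved_state))


saved = json.loads(SAVED_STATE)
# Hypothetical list of resource types we would like to see recorded.
expected = ["flavors", "images", "networks", "subnets", "users"]
missing = missing_resource_types(saved, expected)
print(missing)
```

With the file above, neither `networks` nor `subnets` is recorded, so both are reported as missing.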

Then I tried cleanup with the --dry-run option, and the generated file contained nothing to clean. That was good, but it was also odd: the networks were not in saved_state.json, so they should have been reported as leftovers. Does that mean `tempest cleanup` is not able to remove networks?
$ tempest cleanup --dry-run
$ cat dry_run.json
{
  "_projects_to_clean": {},
  "domains": [],
  "flavors": [],
  "images": [],
  "projects": [],
  "roles": [],
  "users": []
}

Then I did a little experiment and ran smoke tests:
$ tempest run --smoke
After a few seconds (let's say 30) I killed the process, so that tempest would leave some networks and other resources behind.

$ openstack network list
+--------------------------------------+---------------------------------------------------------+--------------------------------------+
| ID | Name | Subnets |
+--------------------------------------+---------------------------------------------------------+--------------------------------------+
| 47bd1566-e13e-42b5-ad4b-fea122f425e9 | private | 98af2db8-2624-4471-8093-e983832e5dd9 |
| 69b82de6-2b24-4f15-b31c-64b304e4deb1 | tempest-ServersTestManualDisk-1308724128-network | e82f8798-b764-4556-9dcc-e0d08b20c02b |
| 8fe59e7d-89cb-4aed-82c4-4be235d38fdd | tempest-AttachInterfacesUnderV243Test-283077215-network | 28735895-9539-4475-b73b-70bd35ebe9c3 |
| b121d356-2296-4100-9880-1a4410b4fd40 | public | 860fe620-6227-4349-9d8a-5947aa0f303b |
| d3e43571-f9c6-48e0-99e6-f43f34c95581 | tempest-ServersTestBootFromVolume-577301809-network | 4e5db97f-b7fb-4fac-8de9-ae771f2d685a |
| edbe9f22-0dfb-44d3-ad43-497bbc8b0883 | tempest-ServerActionsTestJSON-293603464-network | 4cbf930e-dd9d-46f8-b7eb-5f15c4d09671 |
+--------------------------------------+---------------------------------------------------------+--------------------------------------+

I tried the dry-run cleanup again:
$ tempest cleanup --dry-run

and I could see that even my private network was present in the dry_run.json file. But why now? Before, it wasn't marked as a leftover resource (even though it should have been, because it was not in saved_state.json), but now it was.

After I did the real cleanup, the private network was gone:
$ openstack network list
+--------------------------------------+--------+--------------------------------------+
| ID | Name | Subnets |
+--------------------------------------+--------+--------------------------------------+
| b121d356-2296-4100-9880-1a4410b4fd40 | public | 860fe620-6227-4349-9d8a-5947aa0f303b |
+--------------------------------------+--------+--------------------------------------+

I guess that the public network wasn't deleted because its ID is also written in my tempest.conf file. But the private network shouldn't have been deleted either, and therefore networks should also be part of the saved_state.json file.
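
To make the expected behaviour concrete, here is a minimal sketch of the set arithmetic I would expect from cleanup: anything that exists now, minus what was in the initial state, minus what is explicitly preserved. The function name `leftover_names` is hypothetical, and the `preserved` set only models my guess that the public network is kept because of tempest.conf; none of this is tempest's actual code.

```python
def leftover_names(current, saved_ids, preserved_ids):
    """Return names of resources that are neither saved nor preserved."""
    return sorted(
        name
        for res_id, name in current.items()
        if res_id not in saved_ids and res_id not in preserved_ids
    )


# Networks present after the interrupted smoke run (IDs from the listing above).
current = {
    "47bd1566-e13e-42b5-ad4b-fea122f425e9": "private",
    "69b82de6-2b24-4f15-b31c-64b304e4deb1": "tempest-ServersTestManualDisk-1308724128-network",
    "8fe59e7d-89cb-4aed-82c4-4be235d38fdd": "tempest-AttachInterfacesUnderV243Test-283077215-network",
    "b121d356-2296-4100-9880-1a4410b4fd40": "public",
    "d3e43571-f9c6-48e0-99e6-f43f34c95581": "tempest-ServersTestBootFromVolume-577301809-network",
    "edbe9f22-0dfb-44d3-ad43-497bbc8b0883": "tempest-ServerActionsTestJSON-293603464-network",
}
# Assumption: only the public network is protected via tempest.conf.
preserved = {"b121d356-2296-4100-9880-1a4410b4fd40"}

# What happened: no networks were saved, so "private" counts as a leftover.
observed = leftover_names(current, set(), preserved)

# What I expected: both pre-existing networks are part of the saved state.
saved = {
    "47bd1566-e13e-42b5-ad4b-fea122f425e9",
    "b121d356-2296-4100-9880-1a4410b4fd40",
}
wanted = leftover_names(current, saved, preserved)
print(observed)
print(wanted)
```

In the first case the private network ends up among the leftovers; in the second only the four tempest-* networks do.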

Martin Kopec (mkopec)
Changed in tempest:
assignee: nobody → Martin Kopec (mkopec)
Revision history for this message
Martin Kopec (mkopec) wrote :

When tempest cleanup queries the cloud to discover the initial state, it iterates only over global services [1][2], and only the resources of those services are listed and saved to saved_state.json.
The cleanup then continues by iterating over projects [3], where every project found is cleaned by the _clean_project method. Cleaning a project means iterating over project_services [4][5] and executing their delete/dry_run methods.
Note that these project_services [5] don't implement a save_state method, and their list methods contain no code that filters out saved resources. That means that when the delete/dry_run method is executed, it acts on all resources found, so delete removes every one of them.

Now to the question of why my private network was deleted.
When a project is cleaned in the _clean_project method and the loop over project_services [4] reaches NetworkSubnetService, its list method also discovers the private subnet, and that's why it's deleted.
The same happens when the loop [4] reaches NetworkService: its list method discovers the private network and it's deleted.
Why?
Because NetworkSubnetService [6] and NetworkService [7] don't implement a save_state method, and their list methods don't exclude the initial resources (subnets and networks) that saved_state.json is supposed to record.

When you look at the definitions of these project_services [8], you'll see that none of them has a save_state method or a list method that excludes initial resources. That's very strange, because those services do implement a delete method. Is that really intended? I would say that every class/service which deletes resources should also have code for discovering their initial state. Otherwise we may end up in the situation that led to this bug.
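
As a rough sketch of what I mean (not tempest's actual classes): a service that records the initial state in save_state and excludes it in list, so that delete only touches resources created afterwards. `FakeNetworkClient` and `NetworkServiceSketch` are hypothetical names made up for this example, and the network IDs are shortened for readability.

```python
class FakeNetworkClient:
    """Stand-in for a real API client (hypothetical, for this example only)."""

    def __init__(self, networks):
        self.networks = dict(networks)

    def list_networks(self):
        return [{"id": i, "name": n} for i, n in self.networks.items()]

    def delete_network(self, network_id):
        del self.networks[network_id]


class NetworkServiceSketch:
    """Illustrative service that records and respects the initial state."""

    def __init__(self, client, saved_state):
        self.client = client
        self.saved_state = saved_state  # dict shaped like saved_state.json

    def save_state(self):
        # Record every network that exists before any test runs.
        self.saved_state["networks"] = {
            net["id"]: net["name"] for net in self.client.list_networks()
        }

    def list(self):
        # Exclude networks that were present in the initial state.
        saved = self.saved_state.get("networks", {})
        return [n for n in self.client.list_networks() if n["id"] not in saved]

    def delete(self):
        # list() returns a fresh list, so deleting while looping is safe.
        for net in self.list():
            self.client.delete_network(net["id"])


client = FakeNetworkClient({"47bd1566": "private", "b121d356": "public"})
state = {}
service = NetworkServiceSketch(client, state)
service.save_state()

# Simulate a leftover network created by an interrupted test run.
client.networks["69b82de6"] = "tempest-ServersTestManualDisk-1308724128-network"
service.delete()
remaining = sorted(client.networks.values())
print(remaining)
```

With this shape, the private and public networks survive the cleanup because they were recorded before the test run, and only the tempest-created network is removed.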

[1] https://github.com/openstack/tempest/blob/c1d32676102cf2e5a04083aa66fe76179412b200/tempest/cmd/cleanup.py#L304
[2] https://github.com/openstack/tempest/blob/c1d32676102cf2e5a04083aa66fe76179412b200/tempest/cmd/cleanup_service.py#L954
[3] https://github.com/openstack/tempest/blob/c1d32676102cf2e5a04083aa66fe76179412b200/tempest/cmd/cleanup.py#L154
[4] https://github.com/openstack/tempest/blob/c1d32676102cf2e5a04083aa66fe76179412b200/tempest/cmd/cleanup.py#L204
[5] https://github.com/openstack/tempest/blob/c1d32676102cf2e5a04083aa66fe76179412b200/tempest/cmd/cleanup_service.py#L925
[6] https://github.com/openstack/tempest/blob/c1d32676102cf2e5a04083aa66fe76179412b200/tempest/cmd/cleanup_service.py#L655
[7] https://github.com/openstack/tempest/blob/c1d32676102cf2e5a04083aa66fe76179412b200/tempest/cmd/cleanup_service.py#L353
[8] https://github.com/openstack/tempest/blob/c1d32676102cf2e5a04083aa66fe76179412b200/tempest/cmd/cleanup_service.py#L353-L677

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tempest (master)

Fix proposed to branch: master
Review: https://review.openstack.org/636196

Changed in tempest:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tempest (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/636387

OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: master
Review: https://review.openstack.org/637442

Martin Kopec (mkopec) wrote :

To fix and improve tempest cleanup, I posted several reviews.

Phase 1 was to remove any unused or deprecated services, which was done by:
https://review.openstack.org/#/c/636387/

Phase 2 was to fix any issues with the current code, e.g. missing save_state methods were implemented in order to avoid unexpected resource removal:
https://review.openstack.org/#/c/636196/

Phase 3 was adding unit tests to exercise the cleanup code:
https://review.openstack.org/#/c/637442/

The last phase, phase 4, will add support for any services which tempest cleanup doesn't support yet but should (if there are any).

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tempest (master)

Reviewed: https://review.openstack.org/636387
Committed: https://git.openstack.org/cgit/openstack/tempest/commit/?id=9e43fd8bbe9355f6cf1bb15d9b0240988a3b2a5c
Submitter: Zuul
Branch: master

commit 9e43fd8bbe9355f6cf1bb15d9b0240988a3b2a5c
Author: Martin Kopec <email address hidden>
Date: Tue Feb 12 16:47:27 2019 +0000

    Remove deprecated services from cleanup

    The patch removes deprecated services from
    cleanup_service.py:
     * NetworkVipService
     * NetworkMemberService
     * NetworkHealthMonitorService
     * NetworkPoolService
     * FloatingIpService
     * SecurityGroupService
    The above services are not used by the cleanup tool, they
    call clients which were already removed from Tempest or
    the clients are marked as deprecated ones.

    Change-Id: I55ddbce64404c67688600dc6b1231d0bd8ff7006
    Related-Bug: #1812660

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tempest (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/637533

OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: master
Review: https://review.openstack.org/637950

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tempest (master)

Reviewed: https://review.openstack.org/636196
Committed: https://git.openstack.org/cgit/openstack/tempest/commit/?id=5a884bfbb0a8436886593a00b8fe5031f3d3ee38
Submitter: Zuul
Branch: master

commit 5a884bfbb0a8436886593a00b8fe5031f3d3ee38
Author: Martin Kopec <email address hidden>
Date: Mon Feb 11 18:10:55 2019 +0000

    Fix tempest cleanup

    Edit service classes so that they discover initial
    state of resources before deleting any.

    Unify service names - f.e. if a service returns resources
    in a list named server_groups, server_groups should be
    the key of initial resources in the saved_state.json.

    When is_preserve is True, security groups in
    NetworkSecGroupService were filtered by networks present
    in tempest.conf, however, these groups are associated
    with a project_id, therefore it should be filtered
    against projects present in tempest.conf.

    Change-Id: I97d0115bbb43a089b33602df7c98e153984ceaf1
    Related-Bug: #1812660
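
The last point of the commit message above (filtering security groups against preserved projects rather than networks) can be pictured with a small sketch. It is an illustration only: the function name `secgroups_to_delete`, the `project_id` dictionary key, and the sample data are assumptions for this example, not code taken from the patch.

```python
def secgroups_to_delete(secgroups, preserved_project_ids):
    """Keep only security groups whose owning project is not preserved."""
    return [
        sg for sg in secgroups
        if sg["project_id"] not in preserved_project_ids
    ]


# Hypothetical sample data; "demo" is assumed here to be a preserved project.
secgroups = [
    {"id": "sg-1", "name": "default",
     "project_id": "588d28c2d68e4857b72e7bd549141d55"},
    {"id": "sg-2", "name": "tempest-secgroup",
     "project_id": "hypothetical-temp-project"},
]
preserved = {"588d28c2d68e4857b72e7bd549141d55"}

deletable = [sg["id"] for sg in secgroups_to_delete(secgroups, preserved)]
print(deletable)
```

Comparing against network IDs instead would never match a security group, which is why the filter has to use the project.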

OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/637442
Committed: https://git.openstack.org/cgit/openstack/tempest/commit/?id=470aca738ca8916f6bc8700c04551372e04aeea6
Submitter: Zuul
Branch: master

commit 470aca738ca8916f6bc8700c04551372e04aeea6
Author: Martin Kopec <email address hidden>
Date: Mon Feb 18 00:05:13 2019 +0000

    Add unit tests for tempest cleanup

    In the previous patches, tempest cleanup got improved
    and new methods were implemented. This review adds
    more unit tests to exercise those changes and to
    improve tempest cleanup test coverage.

    Change-Id: Ibf30162e49a8cf87accdbe7f0a6cc38941873d5e
    Related-Bug: #1812660

OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/637533
Committed: https://git.openstack.org/cgit/openstack/tempest/commit/?id=e681998023d9714da346d117e18ab673b5657f71
Submitter: Zuul
Branch: master

commit e681998023d9714da346d117e18ab673b5657f71
Author: Martin Kopec <email address hidden>
Date: Mon Feb 18 12:34:52 2019 +0000

    Add NetworkSubnetPools to tempest cleanup

    The review adds support for NetworkSubnetPools service,
    so if there are leftover subnet pools, tempest cleanup
    is able to detect them and remove eventually.

    Change-Id: Ieecde490d5eb20e1a894a7bdf3bcf0e7a54c08e2
    Related-Bug: #1812660

Martin Kopec (mkopec) wrote :

The important reviews which solved the issue got merged to master, so I'm gonna mark this as Fix Released.

Changed in tempest:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tempest (master)

Change abandoned by "Martin Kopec <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/tempest/+/637950
