tempest cleanup removes all networks
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tempest | Fix Released | Undecided | Martin Kopec |
Bug Description
I had a Packstack Rocky deployment and cloned tempest from the master branch. Tempest cleanup also removed networks that it was not supposed to remove.
The following two networks were available after the deployment.
$ openstack network list
+------
| ID | Name | Subnets |
+------
| 47bd1566-
| b121d356-
+------
After running init-saved-state, the saved_state.json file mentions no networks at all, which is quite suspicious.
$ tempest cleanup --init-saved-state
$ cat saved_state.json
{
"domains": {
"default": "Default"
},
"flavors": {
"1": "m1.tiny",
"2": "m1.small",
"3": "m1.medium",
"4": "m1.large",
"5": "m1.xlarge",
"a83007fa-
"dbc75455-
},
"images": {},
"projects": {
"1efbbf4b52
"588d28c2d6
"bc0183ea80
"c90f86114f
},
"roles": {},
"users": {
"202c643912
"318fcff041
"3b69696b13
"534c5b1667
"5ac1446caa
"72cc57a76e
"75b2eb04ce
"8afab28d27
"aca4fcee47
"b19acee3e3
"e07fcb5e10
"ebcd4caf6e
}
}
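A quick way to confirm which resource types the initial state records is to inspect the top-level keys of the file. The sketch below is only illustrative: it embeds an abbreviated copy of the JSON above instead of reading saved_state.json from disk, and the key names `networks` and `subnets` are assumptions about what such entries would be called, not names taken from tempest.

```python
import json

# Abbreviated copy of the saved_state.json shown above (most entries omitted).
SAVED_STATE_TEXT = """
{
  "domains": {"default": "Default"},
  "flavors": {"1": "m1.tiny", "2": "m1.small"},
  "images": {},
  "projects": {},
  "roles": {},
  "users": {}
}
"""

# Hypothetical key names for resource types one would expect to be
# recorded as well; these names are an assumption for illustration.
EXPECTED_EXTRA = ("networks", "subnets")


def missing_resource_types(saved_state, expected):
    """Return the expected resource types that have no top-level key."""
    return [name for name in expected if name not in saved_state]


saved_state = json.loads(SAVED_STATE_TEXT)
missing = missing_resource_types(saved_state, EXPECTED_EXTRA)
print(missing)
```

Any resource type reported as missing here is one that cleanup cannot tell apart from a leftover, because nothing about its initial state was recorded.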
Then I ran cleanup with the --dry-run option and the generated file contained nothing to clean. That was good, but it was also odd: the networks are not present in saved_state.json, so they should have been reported as leftovers. Does that mean `tempest cleanup` is not able to remove networks?
$ tempest cleanup --dry-run
$ cat dry_run.json
{
"_projects_
"domains": [],
"flavors": [],
"images": [],
"projects": [],
"roles": [],
"users": []
}
Then I did a little experiment and ran smoke tests:
$ tempest run --smoke
After a few seconds (about 30) I killed the process so that tempest would leave some networks and other resources behind.
$ openstack network list
+------
| ID | Name | Subnets |
+------
| 47bd1566-
| 69b82de6-
| 8fe59e7d-
| b121d356-
| d3e43571-
| edbe9f22-
+------
I ran the dry-run cleanup again:
$ tempest cleanup --dry-run
and I could see that even my private network was present in the dry_run.json file. But why now? Earlier it was not marked as a leftover resource (even though it should have been, because it was not in saved_state.json), but now it is.
After I ran the real cleanup, the network was gone.
$ openstack network list
+------
| ID | Name | Subnets |
+------
| b121d356-
+------
I guess the public network wasn't deleted because its ID is also written in my tempest.conf file. But the private network should not have been deleted, so networks should also be recorded in the saved_state.json file.
Changed in tempest:
assignee: nobody → Martin Kopec (mkopec)
When tempest cleanup queries the cloud to discover the initial state, it iterates only over global services [1][2], so only resources of those services are recorded in the initial state, which is then saved to saved_state.json.
Then the cleanup continues by iterating over projects [3], where every project found is cleaned by the _clean_project method. Cleaning a project means iterating over project_services [4][5] and executing their delete/dry_run methods.
Note that these project_services [5] do not implement a save_state method, and their list methods contain no code that filters out saved resources. That means that when the delete/dry_run method is executed, it acts on all resources found.
Now to the question of why my private network was deleted.
When a project is cleaned in the _clean_project method and the loop over project_services [4] reaches NetworkSubnetService, its list method also discovers the private subnet, and that's why it's deleted.
The same happens when the loop [4] reaches NetworkService: its list method discovers the private network and it's deleted.
Why?
Because NetworkSubnetService [6] and NetworkService do not implement a save_state method, and their list methods don't exclude the resources (subnets and networks) recorded in saved_state.json.
If you look at the definitions of these project_services [8], you'll see that none of them has a save_state method or a list method that excludes initial resources. That's very strange, because those services do implement a delete method. Is that really intended? I would say that every class/service which deletes resources should also have code for their initial discovery. Otherwise we may end up in the situation that led to this bug.
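To illustrate the idea, here is a minimal sketch of a project-level service that records its initial resources and excludes them when listing. It is not tempest's code: `FakeNetworkClient`, `SketchNetworkService`, the `include_saved` parameter, the `networks` key, and the network IDs are all hypothetical names chosen for the example. Only the method names save_state, list, dry_run, and delete follow the comment above.

```python
class FakeNetworkClient:
    """In-memory stand-in for a network API client (hypothetical)."""

    def __init__(self, networks):
        self._networks = {net["id"]: net for net in networks}

    def create_network(self, net_id, name):
        self._networks[net_id] = {"id": net_id, "name": name}

    def list_networks(self):
        return list(self._networks.values())

    def delete_network(self, net_id):
        del self._networks[net_id]


class SketchNetworkService:
    """Sketch of a service that honours the saved initial state."""

    def __init__(self, client, saved_state):
        self.client = client
        self.saved_state = saved_state

    def list(self, include_saved=False):
        networks = self.client.list_networks()
        if include_saved:
            return networks
        # Exclude everything that existed when the state was saved.
        preserved = self.saved_state.get("networks", {})
        return [net for net in networks if net["id"] not in preserved]

    def save_state(self):
        # Record every network currently present as part of the initial state.
        self.saved_state["networks"] = {
            net["id"]: net["name"] for net in self.list(include_saved=True)
        }

    def dry_run(self):
        # Report what delete() would remove, without removing anything.
        return self.list()

    def delete(self):
        for net in self.list():
            self.client.delete_network(net["id"])


# Usage: the pre-existing network survives, the leftover is removed.
client = FakeNetworkClient([{"id": "net-private", "name": "private"}])
saved_state = {}
service = SketchNetworkService(client, saved_state)
service.save_state()
client.create_network("net-leftover", "tempest-leftover")

dry_run_names = [net["name"] for net in service.dry_run()]
service.delete()
remaining_names = [net["name"] for net in client.list_networks()]
print(dry_run_names)
print(remaining_names)
```

The point of the sketch is that list is the single place where saved resources are filtered out, so dry_run and delete can never disagree about what counts as a leftover.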
[1] https://github.com/openstack/tempest/blob/c1d32676102cf2e5a04083aa66fe76179412b200/tempest/cmd/cleanup.py#L304
[2] https://github.com/openstack/tempest/blob/c1d32676102cf2e5a04083aa66fe76179412b200/tempest/cmd/cleanup_service.py#L954
[3] https://github.com/openstack/tempest/blob/c1d32676102cf2e5a04083aa66fe76179412b200/tempest/cmd/cleanup.py#L154
[4] https://github.com/openstack/tempest/blob/c1d32676102cf2e5a04083aa66fe76179412b200/tempest/cmd/cleanup.py#L204
[5] https://github.com/openstack/tempest/blob/c1d32676102cf2e5a04083aa66fe76179412b200/tempest/cmd/cleanup_service.py#L925
[6] https://github.com/openstack/tempest/blob/c1d32676102cf2e5a04083aa66fe76179412b200/tempest/cmd/cleanup_service.py#L655
[7] https://github.com/openstack/tempest/blob/c1d32676102cf2e5a04083aa66fe76179412b200/tempest/cmd/cleanup_service.py#L353
[8] https://github.com/openstack/tempest/blob/c1d32676102cf2e5a04083aa66fe76179412b200/tempest/cmd/cleanup_service.py#L353-L677