Master/Wallaby: Failure deleting stack in fs002 - "One or more ports have an IP allocation from this subnet."

Bug #1925367 reported by Ronelle Landy
This bug affects 1 person
Affects   Status         Importance   Assigned to   Milestone
tripleo   Fix Released   Critical     Unassigned

Bug Description

fs002 master and wallaby jobs are failing when deleting the stack:

Tempest passes and then the stack delete fails as below:

2021-04-21 01:18:58.888294 | primary | FAILED - RETRYING: check for delete command to complete or fail (1 retries left).
2021-04-21 01:19:11.919491 | primary | fatal: [undercloud]: FAILED! => {
2021-04-21 01:19:11.919549 | primary | "attempts": 60,
2021-04-21 01:19:11.919563 | primary | "changed": true,
2021-04-21 01:19:11.919574 | primary | "cmd": "source /home/zuul/stackrc\nopenstack stack show $(cat /home/zuul/overcloud_id) -f yaml\n",
2021-04-21 01:19:11.919585 | primary | "delta": "0:00:02.665803",
2021-04-21 01:19:11.919596 | primary | "end": "2021-04-21 01:19:11.880711",
2021-04-21 01:19:11.919967 | primary | "rc": 0,
2021-04-21 01:19:11.919987 | primary | "start": "2021-04-21 01:19:09.214908"
2021-04-21 01:19:11.919998 | primary | }

2021-04-21 01:19:11.920149 | primary | stack_status: DELETE_FAILED
2021-04-21 01:19:11.920160 | primary | stack_status_reason: 'Resource DELETE failed: Conflict: resources.Networks.resources.InternalApiNetwork.resources.InternalApiSubnet:
2021-04-21 01:19:11.920765 | primary | Unable to complete operation on subnet 6db9b80b-4106-4d28-af45-4a60ae4236fd: One
2021-04-21 01:19:11.920780 | primary | or more ports have an IP allocation from this subnet.
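
The retry output above comes from a task that repeatedly runs `openstack stack show` until the delete finishes. The following is a minimal Python sketch of that kind of polling loop, not the CI task itself. The function name `wait_for_delete`, the `get_status` callable, and the 10-second default delay are assumptions for illustration. Only the 60-attempt count and the `DELETE_FAILED` status come from the log, and the real task's exit condition is not shown in this report.

```python
import time
from typing import Callable

# Heat stack statuses that end a delete operation.
TERMINAL_STATUSES = {"DELETE_COMPLETE", "DELETE_FAILED"}


def wait_for_delete(
    get_status: Callable[[], str],
    retries: int = 60,
    delay: float = 10.0,  # assumed value; the log does not show the delay
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until a terminal status or until retries run out.

    Returns the last status seen, so the caller can tell DELETE_FAILED
    apart from a delete that is still in progress after the last attempt.
    """
    status = ""
    for attempt in range(retries):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if attempt < retries - 1:
            sleep(delay)
    return status
```

In a real job, `get_status` would wrap the `openstack stack show` call from the log and return its `stack_status` field. It is passed in as a callable here so the loop can be exercised without a cloud.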

Example logs:

https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-1ctlr_1comp-featureset002-wallaby/a3e97ca/job-output.txt
https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-1ctlr_1comp-featureset002-master/3e34109/job-output.txt

2021-04-21 11:58:58.022653 | primary | stack_status_reason: 'Resource DELETE failed: Conflict: resources.Networks.resources.InternalApiNetwork.resources.InternalApiSubnet:
2021-04-21 11:58:58.022668 | primary | Unable to complete operation on subnet 8058e6d0-a654-4048-b227-4a897256c3d4: One
2021-04-21 11:58:58.022677 | primary | or more ports have an IP allocation from this subnet.
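
Because the conflict message wraps across several log lines, pulling the subnet ID out of `job-output.txt` takes a little care. The sketch below is one way to do it, assuming the `timestamp | host | message` line prefix seen in the excerpts above. The names `blocked_subnets`, `LINE_PREFIX`, and `SUBNET_CONFLICT` are hypothetical helpers, not part of any TripleO tooling.

```python
import re

# Matches the "2021-04-21 01:19:11.920160 | primary | " prefix on each line.
LINE_PREFIX = re.compile(r"^\S+ \S+ \| \w+ \| ?")

# Matches the conflict text once the wrapped lines are joined back together.
SUBNET_CONFLICT = re.compile(
    r"Unable to complete operation on subnet "
    r"(?P<subnet>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}): "
    r"One or more ports have an IP allocation from this subnet"
)


def blocked_subnets(log_text: str) -> list[str]:
    """Return subnet IDs named in 'ports have an IP allocation' conflicts."""
    # Strip the per-line prefix, then join lines so the wrapped message
    # becomes one continuous string.
    body = " ".join(
        LINE_PREFIX.sub("", line).strip() for line in log_text.splitlines()
    )
    return [match.group("subnet") for match in SUBNET_CONFLICT.finditer(body)]


# Sample taken from the first log excerpt in this report.
SAMPLE = (
    "2021-04-21 01:19:11.920160 | primary | stack_status_reason: 'Resource DELETE failed: "
    "Conflict: resources.Networks.resources.InternalApiNetwork.resources.InternalApiSubnet:\n"
    "2021-04-21 01:19:11.920765 | primary | Unable to complete operation on subnet "
    "6db9b80b-4106-4d28-af45-4a60ae4236fd: One\n"
    "2021-04-21 01:19:11.920780 | primary | or more ports have an IP allocation from this subnet.\n"
)

print(blocked_subnets(SAMPLE))
```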

Ronelle Landy (rlandy)
Changed in tripleo:
status: New → Triaged
importance: Undecided → Critical
milestone: none → wallaby-rc1
tags: added: ci promotion-blocker
Ronelle Landy (rlandy) wrote :

The fs002 jobs are the only ones that delete the overcloud stack. Tempest may have left some resources behind that block stack deletion.

Rabi Mishra (rabi) wrote :

I think it's a regression from https://review.opendev.org/c/openstack/tripleo-heat-templates/+/777259.

The networks/subnets are still created as part of the heat stack, but the ports are now created outside heat. While those ports hold IP allocations, the subnet heat resources cannot be deleted, so the stack deletion fails. I've also commented on the patch.
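
To see which ports are holding a subnet, one option is to filter port records by their `fixed_ips` entries. The sketch below assumes port dictionaries shaped like the Neutron API port resource, where each `fixed_ips` item carries a `subnet_id`. The helper name `ports_blocking_subnet` and the sample port IDs are hypothetical. Only the subnet ID is taken from the log in this report.

```python
def ports_blocking_subnet(ports: list[dict], subnet_id: str) -> list[str]:
    """Return IDs of ports that hold at least one fixed IP from subnet_id."""
    return [
        port["id"]
        for port in ports
        if any(
            ip.get("subnet_id") == subnet_id
            for ip in port.get("fixed_ips", [])
        )
    ]


# Hypothetical data: two ports created outside heat, one on the failing subnet.
SUBNET = "6db9b80b-4106-4d28-af45-4a60ae4236fd"
example_ports = [
    {"id": "port-a", "fixed_ips": [{"subnet_id": SUBNET, "ip_address": "172.17.0.10"}]},
    {"id": "port-b", "fixed_ips": [{"subnet_id": "some-other-subnet", "ip_address": "10.0.0.5"}]},
    {"id": "port-c", "fixed_ips": []},
]

print(ports_blocking_subnet(example_ports, SUBNET))
```

Any port returned here would have to be deleted, or moved off the subnet, before the subnet itself can be removed.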

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-ansible (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tripleo-ansible/+/787468

Rabi Mishra (rabi) wrote :

The assumption that networks/subnets are managed by heat while ports are created on those networks outside heat is incorrect. However, as we plan to move to managing networks outside heat by default soon, I've proposed a few patches:

1. Leave the network resources when deleting (if not managed by heat)
https://review.opendev.org/c/openstack/tripleo-ansible/+/787468

2. Change fs002 to provision networks before heat stack.
https://review.opendev.org/c/openstack/tripleo-quickstart/+/787469

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-ansible (master)
Changed in tripleo:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-ansible (master)

Change abandoned by "Rabi Mishra <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/tripleo-ansible/+/787468
Reason: DeletionPolicy has to be changed in an earlier update, before the resource is managed without heat. Changing it in the same update won't work.

OpenStack Infra (hudson-openstack) wrote :

Change abandoned by "Harald Jensås <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/tripleo-ansible/+/787514
Reason: Let's move forward with Rabi's proposal.

Ronelle Landy (rlandy) wrote :
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-quickstart (master)

Reviewed: https://review.opendev.org/c/openstack/tripleo-quickstart/+/787469
Committed: https://opendev.org/openstack/tripleo-quickstart/commit/b12659df86c820d634926a66bb151d31ca6dcfd3
Submitter: "Zuul (22348)"
Branch: master

commit b12659df86c820d634926a66bb151d31ca6dcfd3
Author: ramishra <email address hidden>
Date: Thu Apr 22 08:58:57 2021 +0530

    Provision networks prior to heat stack in fs002

    Related-Bug: #1925367
    Change-Id: I81d098822af34f16f532aa48fa57025d29c511da

Ronelle Landy (rlandy)
Changed in tripleo:
status: In Progress → Fix Released