tenant_vif_port_id is not removed during tear_down

Bug #1685592 reported by Kaifeng Wang
This bug affects 3 people
Affects: Ironic
Status: Fix Released
Importance: Medium
Assigned to: Vasyl Saienko
Milestone: (none listed)

Bug Description

Normally, the tenant VIF port is removed by the nova ironic driver during tear_down. In some cases, an instance is removed from nova but still resides on the ironic node, leaving the state of the two services unsynchronized.

We manually moved the ironic node to the available state with the deleted operation, but the node could not then be deployed successfully. We found that tenant_vif_port is not removed in this case; the leftover VIF does not reflect the real environment, because it no longer exists in neutron.

Proposed change:
During tear_down in the deploy drivers, check the VIF attachment information and remove it if it exists.
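
A minimal sketch of the proposed change, assuming each port is a plain dict whose `internal_info` may carry the `tenant_vif_port_id` key named in the bug title. The function name `clear_tenant_vifs` and the dict layout are illustrative assumptions, not Ironic's actual objects or API.

```python
TENANT_VIF_KEY = "tenant_vif_port_id"  # key named in the bug title


def clear_tenant_vifs(ports):
    """Remove any tenant VIF record from each port and return the removed IDs.

    `ports` is a list of dicts standing in for a node's ports
    (hypothetical layout, not Ironic's real Port object).
    """
    removed = []
    for port in ports:
        internal_info = port.get("internal_info", {})
        # pop() deletes the key if present and is a no-op otherwise
        vif_id = internal_info.pop(TENANT_VIF_KEY, None)
        if vif_id is not None:
            removed.append(vif_id)
    return removed


ports = [
    {"address": "52:54:00:aa:bb:01", "internal_info": {TENANT_VIF_KEY: "vif-1"}},
    {"address": "52:54:00:aa:bb:02", "internal_info": {}},
]
print(clear_tenant_vifs(ports))
```

Calling the function a second time on the same ports returns an empty list, so it would be safe to run on every tear_down regardless of whether nova already cleaned up.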

Kaifeng Wang (kaifeng)
Changed in ironic:
assignee: nobody → Wang KaiFeng (kaifeng)
Sam Betts (sambetts) wrote :

Hi, thanks for the bug report. Your description implies that you are manually tearing down the node in Ironic instead of using "nova delete". Is this the case?

Changed in ironic:
status: New → Incomplete
Kaifeng Wang (kaifeng) wrote :

Hi Sam, the scenario is:
1. An ironic node is deployed and is hosting an instance.
2. The operator deletes the instance with nova delete.
3. Due to a network issue, the request from nova.virt.ironic to the ironic API is lost.
4. nova marks the instance as error without waiting (it should do so after 2 minutes by default, but that's another issue).
5. nova delete is run on the instance again; this time nova does not call the virt driver, so ironic has no chance to clean up the port information.
6. We manually move the node to the available state (with set-provision-state deleted).
7. When the node is next scheduled for deployment, ironic complains that tenant_vif_port is not found in neutron and fails to deploy.

In such an error state, we can't move the node to the available state, because ironic will unconfigure the tenant network during tear_down, which can't succeed for the same reason.

At first, I added some code to tear_down to clean the internal VIF information from the node's existing ports, but I later realized that this is not a good approach.

I am wondering whether we can remove the related VIF information once we know that the VIF no longer exists in neutron; that might solve the problem.
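
The idea of removing VIF information only when the VIF is known to be gone from neutron could be sketched roughly as follows. This assumes ports are plain dicts and that the set of port IDs neutron still knows about has already been fetched; `prune_stale_vifs` and the data layout are hypothetical names for illustration, not Ironic code.

```python
TENANT_VIF_KEY = "tenant_vif_port_id"  # key named in the bug title


def prune_stale_vifs(ports, neutron_port_ids):
    """Drop VIF records whose ID is not among the IDs neutron still knows.

    `ports` is a list of dicts standing in for a node's ports and
    `neutron_port_ids` is a set of existing neutron port IDs
    (both are assumptions made for this sketch).
    """
    stale = []
    for port in ports:
        internal_info = port.get("internal_info", {})
        vif_id = internal_info.get(TENANT_VIF_KEY)
        # Only touch records that point at a port neutron no longer has
        if vif_id is not None and vif_id not in neutron_port_ids:
            del internal_info[TENANT_VIF_KEY]
            stale.append(vif_id)
    return stale


ports = [
    {"internal_info": {TENANT_VIF_KEY: "vif-gone"}},
    {"internal_info": {TENANT_VIF_KEY: "vif-live"}},
]
print(prune_stale_vifs(ports, neutron_port_ids={"vif-live"}))
```

Unlike an unconditional cleanup in tear_down, this leaves records for VIFs that still exist untouched, so a normal detach path can still handle them.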

Kaifeng Wang (kaifeng) wrote :

I found that this bug has been fixed by vsaienko in master, so we can close it now.

See also: https://github.com/openstack/ironic/blob/master/ironic/common/neutron.py#L91-L93

Changed in ironic:
assignee: Wang KaiFeng (kaifeng) → nobody
Launchpad Janitor (janitor) wrote :

[Expired for Ironic because there has been no activity for 60 days.]

Changed in ironic:
status: Incomplete → Expired
Pierre Riteau (priteau)
Changed in ironic:
status: Expired → Confirmed
Vladyslav Drok (vdrok) wrote :

Hi Pierre! Do you mind explaining why you've reopened this?

Changed in ironic:
status: Confirmed → Incomplete
Pierre Riteau (priteau) wrote :

Hi Vlad. I was seeing this issue while configuring multi-tenant networking with Ironic, so I clicked the "Does this bug affect you?" link to keep track of it. That seems to have reopened the bug automatically, sorry. I am using Ocata, which doesn't have the fix.

I am not seeing the issue anymore now that I have finished setting up the multi-tenant configuration, although I guess I could reproduce it using kaifeng's method. Maybe this fix should be backported? I believe it is https://github.com/openstack/ironic/commit/940c87d6c49be181d969e85d343b745149b6db10; kaifeng's link refers to master, so the line numbers have changed.

Kaifeng Wang (kaifeng) wrote :

Hi Pierre, your link is right. No bug was filed for the fix, so I am attaching the Gerrit link here for reference: https://review.openstack.org/#/c/451691/
If you need this fix for Ocata, I believe a backport is totally fine.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/505367

Ruby Loo (rloo) wrote :

Hi, this is an odd bug. Really, it has been fixed, and we want to backport it, so I'll update the status to reflect that :) Thx Wang & Pierre.

Changed in ironic:
status: Incomplete → Confirmed
status: Confirmed → Fix Released
importance: Undecided → Medium
assignee: nobody → Vasyl Saienko (vsaienko)
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (stable/ocata)

Reviewed: https://review.openstack.org/505367
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=71d1c26376c84783e2c0176b9222e0343df701eb
Submitter: Zuul
Branch: stable/ocata

commit 71d1c26376c84783e2c0176b9222e0343df701eb
Author: Vasyl Saienko <email address hidden>
Date: Thu Mar 30 11:18:59 2017 +0300

    Skip PortNotFound when unbinding port

    There might be cases when user deleted port before doing vif_detach.
    With this patch info message will be shown in the logs for such cases.

    Change-Id: I60deab450b427f1a1b4ccb0bb5963ec30d255d48
    Closes-Bug: #1685592
    (cherry picked from commit 940c87d6c49be181d969e85d343b745149b6db10)
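
The behaviour described in the commit message (treat a missing port as non-fatal and log an info message) might look roughly like the sketch below. It is not the code from the commit: `PortNotFound`, `FakeNeutronClient`, and `unbind_port` are hypothetical stand-ins written for this example, and only the idea of catching the not-found error and logging at info level is taken from the commit message.

```python
import logging

LOG = logging.getLogger(__name__)


class PortNotFound(Exception):
    """Stand-in for the 'port not found' error a neutron client would raise."""


class FakeNeutronClient:
    """Hypothetical in-memory client, used only to make the sketch runnable."""

    def __init__(self, port_ids):
        self.ports = {pid: {"binding:host_id": "node-1"} for pid in port_ids}

    def update_port(self, port_id, body):
        if port_id not in self.ports:
            raise PortNotFound(port_id)
        self.ports[port_id].update(body["port"])


def unbind_port(client, port_id):
    """Clear the binding on a port; return False if the port is already gone."""
    body = {"port": {"binding:host_id": "", "binding:profile": {}}}
    try:
        client.update_port(port_id, body)
    except PortNotFound:
        # The user may have deleted the port before vif_detach; not an error.
        LOG.info("Port %s was not found while unbinding; skipping.", port_id)
        return False
    return True


client = FakeNeutronClient(["port-1"])
print(unbind_port(client, "port-1"), unbind_port(client, "port-deleted"))
```

With the missing port treated as already unbound, tear_down can proceed for a node whose tenant VIF was deleted out from under it, which is the situation described in the scenario earlier in this report.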

tags: added: in-stable-ocata
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ironic 7.0.4

This issue was fixed in the openstack/ironic 7.0.4 release.
