Erroneously determined to be completed when updating a container

Bug #1998756 reported by Ayumu Ueha
This bug affects 1 person
Affects: tacker
Status: Fix Released
Importance: Low
Assigned to: Ayumu Ueha
Milestone: (none)

Bug Description

The following error occurs in the test_container_update_multi_kinds method of the multinode-sol job.

```
ft1.1: tacker.tests.functional.sol_kubernetes.vnflcm.test_kubernetes_container_update.VnfLcmKubernetesContainerUpdate.test_container_update_multi_kinds
testtools.testresult.real._StringException: Traceback (most recent call last):
  File "/home/zuul/src/opendev.org/openstack/tacker/tacker/tests/functional/sol_kubernetes/vnflcm/test_kubernetes_container_update.py", line 98, in test_container_update_multi_kinds
    self.assertNotEqual(before_resource['resourceId'],
  File "/usr/lib/python3.8/unittest/case.py", line 921, in assertNotEqual
    raise self.failureException(msg)
AssertionError: 'vdu1-update-5b9d95d894-l8xnp' == 'vdu1-update-5b9d95d894-l8xnp'
```

From my investigation, this is because the update process has not yet completed, although the LCM process has already finished.

[Before replace API]
  vdu1-update-764bbf8846-q5hzh: phase=RUNNING

[After replace API]
  vdu1-update-764bbf8846-q5hzh: phase=RUNNING
  vdu1-update-5b9d95d894-l8xnp: phase=PENDING <- newly created

[Erroneously determined state]
  vdu1-update-764bbf8846-q5hzh: phase=RUNNING <- not yet deleted
  vdu1-update-5b9d95d894-l8xnp: phase=RUNNING <- has become "RUNNING"

  ==> Judged as wait completed <- wrong!!

[Ideal state for judging as finished]
  vdu1-update-5b9d95d894-l8xnp: phase=RUNNING

  ==> The old pod has been deleted completely.
      (The number of associated Pods equals the Replicas count;
      see the sketch after this list.)
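
To make that ideal condition concrete, here is a minimal sketch of such a completion check, assuming the official kubernetes Python client. The function name wait_for_replace_complete, the label_selector argument, and the interval/timeout defaults are hypothetical, not Tacker's actual implementation.

```
# Minimal sketch (an assumption, not Tacker's code): exit the wait only when
# the number of pods owned by the Deployment equals spec.replicas and every
# one of them is Running, i.e. the old pod is completely gone.
import time

from kubernetes import client


def wait_for_replace_complete(namespace, deployment_name, label_selector,
                              interval=5, timeout=300):
    apps = client.AppsV1Api()
    core = client.CoreV1Api()
    deadline = time.time() + timeout
    while time.time() < deadline:
        deploy = apps.read_namespaced_deployment(deployment_name, namespace)
        pods = core.list_namespaced_pod(
            namespace, label_selector=label_selector).items
        # An old pod that is still terminating keeps phase=Running, so the
        # pod count compared against spec.replicas is what detects removal.
        if (len(pods) == deploy.spec.replicas and
                all(p.status.phase == 'Running' for p in pods)):
            return
        time.sleep(interval)
    raise TimeoutError('replace did not complete in time')
```

A caller would load credentials first, for example with kubernetes.config.load_kube_config() outside a cluster or kubernetes.config.load_incluster_config() inside one.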

Ayumu Ueha (ueha)
Changed in tacker:
assignee: nobody → Ayumu Ueha (ueha)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tacker (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tacker/+/866583

Changed in tacker:
status: New → In Progress
Revision history for this message
Yasufumi Ogawa (yasufum) wrote :

Let me confirm the cause of the failure, because your description is somewhat ambiguous. Is the point of the problem just that an old process is still running although it is expected to be terminated? So, the "[Erroneously determined state]" you mentioned is not actually erroneous, because the status RUNNING itself is not wrong, correct?

Changed in tacker:
importance: Undecided → Low
Revision history for this message
Ayumu Ueha (ueha) wrote :

Thank you for your question.

> Is the point of the problem just that an old process is still running although it is expected to be terminated?

No. The problem is that the old Pod being replaced still remains, but the process stops waiting for the replace to complete.

> So, the "[Erroneously determined state]" you mentioned is not actually erroneous, because the status RUNNING itself is not wrong, correct?

Kubernetes reports a pod's phase as "Running" even while the pod is being removed.
Because of this, the current implementation's check for whether the old pods are really removed and the replace is complete passes even though old Pods that are still being removed remain.

The wait process really needs to exit only after the old pod has been completely removed.
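
To illustrate why "Running" alone is not enough: a pod that is being deleted keeps status.phase == "Running" but has metadata.deletion_timestamp set. A tiny sketch of a stricter per-pod check, assuming the kubernetes Python client (an illustration only, not the merged patch):

```
# Sketch (an assumption, not the actual fix): phase alone cannot distinguish
# a healthy pod from one that is being removed, but deletion_timestamp can.
from kubernetes import client


def is_pod_really_running(pod: client.V1Pod) -> bool:
    return (pod.status.phase == 'Running'
            and pod.metadata.deletion_timestamp is None)
```

Counting only pods that pass such a check, and comparing that count against the Deployment's replicas, is one way to express the "completely removed" condition.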

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tacker (master)

Reviewed: https://review.opendev.org/c/openstack/tacker/+/866583
Committed: https://opendev.org/openstack/tacker/commit/be6d22b40986f0d0030c02cce43057aa8fa2a281
Submitter: "Zuul (22348)"
Branch: master

commit be6d22b40986f0d0030c02cce43057aa8fa2a281
Author: Ayumu Ueha <email address hidden>
Date: Mon Dec 5 14:12:44 2022 +0000

    Fix waiting for pod removal when container update

    This patch fixes the problem that the wait process is aborted even if
    there are pods left to be removed by replace during a container update.

    Closes-Bug: #1998756
    Change-Id: I6bc5c10fe4c7f646b84798d3b4a296721ccea0f8

Changed in tacker:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tacker 9.0.0.0rc1

This issue was fixed in the openstack/tacker 9.0.0.0rc1 release candidate.
