Deleted applications leave behind errored base instances

Bug #2046249 reported by Michele Lo Russo
This bug affects 1 person
Affects: Anbox Cloud
Status: Fix Committed
Importance: Undecided
Assigned to: Jatin Arora

Bug Description

1. A description of the problem

In both our dev VM and our demo server we currently have dangling base instances for deleted applications:

```
ubuntu@vm12:~$ amc ls
+----------------------+--------------------------------+------+--------+------+------+---------------+-----------+-------+----------------+
| ID | APPLICATION | TYPE | STATUS | TAGS | NODE | ADDRESS | ENDPOINTS | VM | STATUS MESSAGE |
+----------------------+--------------------------------+------+--------+------+------+---------------+-----------+-------+----------------+
| cll5f6a7h686018t8e6g | deleted (cll5f627h686018t8e60) | base | error | | lxd0 | 192.168.96.15 | | false | |
+----------------------+--------------------------------+------+--------+------+------+---------------+-----------+-------+----------------+
| clmpdni7h686018t8eq0 | deleted (clmpdni7h686018t8epg) | base | error | | lxd0 | 192.168.96.7 | | false | |
+----------------------+--------------------------------+------+--------+------+------+---------------+-----------+-------+----------------+
| clmpfra7h686018t8erg | deleted (clmpfra7h686018t8er0) | base | error | | lxd0 | 192.168.96.8 | | false | |
+----------------------+--------------------------------+------+--------+------+------+---------------+-----------+-------+----------------+
ubuntu@vm12:~$ amc application ls
+----------------------+-------------------+---------------+--------+------+-----------+--------+---------------------+-------+
| ID | NAME | INSTANCE TYPE | ADDONS | TAGS | PUBLISHED | STATUS | LAST UPDATED | VM |
+----------------------+-------------------+---------------+--------+------+-----------+--------+---------------------+-------+
| ciogteoinebsvq41budg | lorumic-bombsquad | a4.3 | | tag1 | true | ready | 2023-12-11 20:59:11 | false |
+----------------------+-------------------+---------------+--------+------+-----------+--------+---------------------+-------+
| cir4g6ginebsvq41bv50 | lorumic-1list | a4.3 | | | true | ready | 2023-12-11 20:59:11 | false |
+----------------------+-------------------+---------------+--------+------+-----------+--------+---------------------+-------+
```
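To spot such instances without reading the table by eye, the `amc ls` output can be filtered with a small script. This is only a sketch: it assumes the ASCII table layout shown above (column names `APPLICATION`, `TYPE`, `STATUS`, and the `deleted (<id>)` marker), and the function names `parse_amc_table` and `dangling_base_instances` are hypothetical helpers, not part of AMS or `amc`.

```python
# Sketch: find base instances in error state whose application was deleted.
# Assumes the table layout printed by `amc ls` in the listing above.

def parse_amc_table(text):
    """Turn an `amc ls` style ASCII table into a list of dicts keyed by column name."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip shell prompts, blank lines and +----+ separator lines.
        if not (line.startswith("|") and line.endswith("|")):
            continue
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if header is None:
            header = cells  # first table row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def dangling_base_instances(rows):
    """Return IDs of errored base instances that belong to a deleted application."""
    return [
        row["ID"]
        for row in rows
        if row.get("TYPE") == "base"
        and row.get("STATUS") == "error"
        and row.get("APPLICATION", "").startswith("deleted (")
    ]


# Two rows copied from the listing above.
SAMPLE = """
| ID | APPLICATION | TYPE | STATUS | TAGS | NODE | ADDRESS | ENDPOINTS | VM | STATUS MESSAGE |
| cll5f6a7h686018t8e6g | deleted (cll5f627h686018t8e60) | base | error | | lxd0 | 192.168.96.15 | | false | |
| clmpdni7h686018t8eq0 | deleted (clmpdni7h686018t8epg) | base | error | | lxd0 | 192.168.96.7 | | false | |
"""

if __name__ == "__main__":
    print(dangling_base_instances(parse_amc_table(SAMPLE)))
```

The script only reads text, so it can be fed the saved output of `amc ls` and changes nothing in AMS.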

2. A set of steps to reproduce the problem

Not available. It is unclear when and why this problem occurs; it may happen intermittently when an application is deleted.

3. Logs as outlined on https://anbox-cloud.io/docs/howto/troubleshoot/landing

Not available. All instances in this state have no logs stored, and `error_message` is empty for all of them except the one on the demo server, which has the following error message: "Failed to create instance update operation: A matching non-reusable operation has now succeeded".

Simon Fels (morphis)
information type: Public → Private
information type: Private → Public
description: updated
Simon Fels (morphis)
Changed in anbox-cloud:
assignee: nobody → Jatin Arora (jatinarora)
Simon Fels (morphis)
Changed in anbox-cloud:
milestone: none → 1.21.0
assignee: Jatin Arora (jatinarora) → Gary.Wang (gary-wzl77)
Gary.Wang (gary-wzl77) wrote :

Thanks for the report.

I took a closer look at the AMS logs on VM12, and the earliest entries related to the dangling base instance issue are as follows:

```
2024-01-14T14:57:06Z ams.ams[637614]: I0114 14:57:06.114456 637614 housekeeper.go:276] Housekeeper: Processing task clmpdni7h686018t8eqg with status error
2024-01-14T14:57:06Z ams.ams[637614]: I0114 14:57:06.115212 637614 housekeeper.go:276] Housekeeper: Processing task cll5f6a7h686018t8e70 with status error
2024-01-14T14:57:06Z ams.ams[637614]: I0114 14:57:06.120490 637614 housekeeper.go:276] Housekeeper: Processing task clmpfra7h686018t8es0 with status error
2024-01-14T14:57:06Z ams.ams[637614]: I0114 14:57:06.122499 637614 housekeeper.go:327] Housekeeper: Container clmpdni7h686018t8eq0 is already marked as failed
2024-01-14T14:57:06Z ams.ams[637614]: I0114 14:57:06.122703 637614 housekeeper.go:327] Housekeeper: Container cll5f6a7h686018t8e6g is already marked as failed
2024-01-14T14:57:06Z ams.ams[637614]: I0114 14:57:06.123620 637614 housekeeper.go:327] Housekeeper: Container clmpfra7h686018t8erg is already marked as failed

```
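The task and container IDs in these entries can be pulled out with a short script, which makes it easier to match them against the `amc ls` listing. This is a sketch that assumes only the two message formats visible in the excerpt above; the function name `summarize_housekeeper` is a hypothetical helper, and other housekeeper messages are simply ignored.

```python
import re

# Patterns taken from the two message shapes in the excerpt above.
TASK_RE = re.compile(r"Housekeeper: Processing task (\S+) with status (\S+)")
FAILED_RE = re.compile(r"Housekeeper: Container (\S+) is already marked as failed")


def summarize_housekeeper(log_text):
    """Return ({task_id: status}, [container_ids already marked as failed])."""
    tasks = {}
    failed = []
    for line in log_text.splitlines():
        match = TASK_RE.search(line)
        if match:
            tasks[match.group(1)] = match.group(2)
            continue
        match = FAILED_RE.search(line)
        if match:
            failed.append(match.group(1))
    return tasks, failed


# Two lines copied from the excerpt above.
SAMPLE_LOG = """\
2024-01-14T14:57:06Z ams.ams[637614]: I0114 14:57:06.114456 637614 housekeeper.go:276] Housekeeper: Processing task clmpdni7h686018t8eqg with status error
2024-01-14T14:57:06Z ams.ams[637614]: I0114 14:57:06.122499 637614 housekeeper.go:327] Housekeeper: Container clmpdni7h686018t8eq0 is already marked as failed
"""

if __name__ == "__main__":
    tasks, failed = summarize_housekeeper(SAMPLE_LOG)
    print(tasks)
    print(failed)
```

The container IDs returned in the second list are the same IDs shown in the `ID` column of the errored base instances.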

The corresponding application deletion logs, which most likely show what caused this situation, have been truncated. So far, I have not found a reliable way to reproduce this problem. Until we fix it, if the dangling base instances prevent AMS from scheduling new instances, please refer to the workaround below.

https://discourse.ubuntu.com/t/anbox-container-change-disk-size-doesnt-work/41931/8?u=gary-wzl77

Thanks
Gary

Simon Fels (morphis)
Changed in anbox-cloud:
status: New → Incomplete
Michele Lo Russo (lorumic) wrote :

Hi Gary, we have had a lot more occurrences in the last few days, in case you want to have another look. They are piling up on VM12.

Keirthana (keirthana)
Changed in anbox-cloud:
milestone: 1.21.0 → 1.21.2
Michele Lo Russo (lorumic) wrote :

UPDATE: I think I found a way to reproduce this bug reliably.

I'm operating from the dashboard, so I haven't tried with the CLI.

I create a new application with no APK. Immediately after creating it, I delete it while it is still "initializing". We do this quite often for testing purposes, for example when we only need to test something in the application creation flow and don't need the application afterwards.

After deleting it, the application transitions to an "Error" status, so I (impatiently) delete it again. At that point, the application transitions to a "Deleted" status.

After a few seconds, the application disappears from the applications list, but if I go to the instance list, I see a "deleted (<id>)" base instance dangling there for that application I had just deleted.

I hope this helps, please let me know if something is not clear.

Gary.Wang (gary-wzl77)
Changed in anbox-cloud:
status: Incomplete → Triaged
Gary.Wang (gary-wzl77)
Changed in anbox-cloud:
milestone: 1.21.2 → 1.22.0
status: Triaged → Fix Committed
assignee: Gary.Wang (gary-wzl77) → nobody
assignee: nobody → Jatin Arora (jatinarora)