Containers stuck at started status

Bug #2032172 reported by Ashish sawarkar
This bug affects 2 people
Affects: Anbox Cloud
Status: Fix Released
Importance: High
Assigned to: Gary.Wang
Milestone: 1.19.1

Bug Description

Hello Team,

We have run into a sudden issue where Anbox containers are not functioning as expected. Our setup deploys Anbox Cloud on AWS using the marketplace AMI ami-0b8ae058a54209687. Are there any known problems associated with this particular setup?

Attached is the log file.

Simon Fels (morphis) wrote :

Hey Ashish,

Can you describe a bit further what you mean by "are not functioning as expected"?

Can you provide us with the output of

$ sudo anbox-cloud-appliance.buginfo

Thanks!

Changed in anbox-cloud:
status: New → Incomplete
assignee: nobody → Simon Fels (morphis)
Ashish sawarkar (ash-anbox) wrote :

Hi Simon,

Please find the attached bug report file.

Gary.Wang (gary-wzl77) wrote :

Hey Ashish
  Thanks for the attached log.
  - Regarding the issue `Containers stuck at started status` you mentioned above: do you mean that whenever you launch a new container, it gets stuck at the *started* status and never progresses to the *running* status?

  - If possible:
    1. Please enable debug logging for ams:
      $ /snap/anbox-cloud-appliance/current/bin/juju config -m appliance:anbox-cloud ams log_level=debug

    2. Then launch a new container as you did previously.
    3. Once the container has been stuck at the started status for a while, run the following command and share the output with us:
      $ /snap/anbox-cloud-appliance/current/bin/juju ssh -m appliance:anbox-cloud ams/0 "sudo snap logs -n=all ams | grep <new_container_id>"

       Please also dump the details of the lxd0 node:
      $ /snap/anbox-cloud-appliance/current/bin/juju ssh -m appliance:anbox-cloud ams/0 "sudo ETCDCTL_API=3 /snap/etcd/current/bin/etcdctl --debug --cert=/var/snap/ams/common/etcd/client-cert.pem --key=/var/snap/ams/common/etcd/client-key.pem --insecure-transport=true --endpoints=240.12.250.85:2379 get /ams/1.0/nodes/lxd0"
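       (For reference, once the logs are collected, the debug level can presumably be reverted with the same config command; `info` as the target level is an assumption and is not confirmed in this thread.)
      $ /snap/anbox-cloud-appliance/current/bin/juju config -m appliance:anbox-cloud ams log_level=info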

Thanks
Gary

Ashish sawarkar (ash-anbox) wrote :

Hi Gary,

When we start launching containers, they initially start and run properly with the running status.
After some time, we observed that containers get stuck at the started status, then change to the running status, and then go to error.

I have attached screenshots for reference. Below are the details you asked for.

  1. Enable debug logging for ams = done.
  2. Output after a container got stuck at the started status for a long time:

ubuntu@ip-172-31-15-176:~$ amc ls
+----------------------+-------------+---------+---------+------+------+---------------+-----------+
| ID                   | APPLICATION | TYPE    | STATUS  | TAGS | NODE | ADDRESS       | ENDPOINTS |
+----------------------+-------------+---------+---------+------+------+---------------+-----------+
| cjhl2u41jpke7jvo4tug | app         | regular | stopped |      | lxd0 | 192.168.96.1  |           |
+----------------------+-------------+---------+---------+------+------+---------------+-----------+
| cjhl32s1jpke7jvo4tvg | app         | regular | stopped |      | lxd0 | 192.168.96.2  |           |
+----------------------+-------------+---------+---------+------+------+---------------+-----------+
| cjhl33s1jpke7jvo4u0g | app         | regular | stopped |      | lxd0 | 192.168.96.3  |           |
+----------------------+-------------+---------+---------+------+------+---------------+-----------+
| cjhl34s1jpke7jvo4u1g | app         | regular | stopped |      | lxd0 | 192.168.96.4  |           |
+----------------------+-------------+---------+---------+------+------+---------------+-----------+
| cjhl35s1jpke7jvo4u2g | app         | regular | stopped |      | lxd0 | 192.168.96.5  |           |
+----------------------+-------------+---------+---------+------+------+---------------+-----------+
| cjhl37s1jpke7jvo4u3g | app         | regular | stopped |      | lxd0 | 192.168.96.6  |           |
+----------------------+-------------+---------+---------+------+------+---------------+-----------+
| cjhl3941jpke7jvo4u4g | app         | regular | stopped |      | lxd0 | 192.168.96.7  |           |
+----------------------+-------------+---------+---------+------+------+---------------+-----------+
| cjhl39s1jpke7jvo4u5g | app         | regular | stopped |      | lxd0 | 192.168.96.8  |           |
+----------------------+-------------+---------+---------+------+------+---------------+-----------+
| cjhl3as1jpke7jvo4u6g | app         | regular | stopped |      | lxd0 | 192.168.96.9  |           |
+----------------------+-------------+---------+---------+------+------+---------------+-----------+
| cjhl3c41jpke7jvo4u7g | app         | regular | stopped |      | lxd0 | 192.168.96.10 |           |
+----------------------+-------------+---------+---------+------+------+---------------+-----------+
| cjhl3dc1jpke7jvo4u8g | app         | regular | error   |      | lxd0 | 192.168.96.11 |           |
+----------------------+-------------+---------+---------+------+------+---------------+-----------+
| cjhl3ec1jpke7jvo4u9g | app         | regular | error   |      | lxd0 | 192.168.96.12 | ...

Ashish sawarkar (ash-anbox) wrote :

In the attached screenshot you can see that initially the containers were running fine, and then suddenly we started getting these errors.

Ashish sawarkar (ash-anbox) wrote :

Hi Gary,

Updated logs.

ubuntu@ip-172-31-15-176:~$ /snap/anbox-cloud-appliance/current/bin/juju ssh -m appliance:anbox-cloud ams/0 "sudo snap logs -n=all ams | grep cjhlobc1jpk5c0216ea0"
2023-08-21T12:44:29Z ams.ams[14142]: I0821 12:44:29.348722 14142 orchestrator.go:245] Orchestrator: Received update for container cjhlobc1jpk5c0216ea0 (status created desired running)
2023-08-21T12:44:29Z ams.ams[14142]: I0821 12:44:29.350827 14142 worker.go:451] Worker: Found new regular container cjhlobc1jpk5c0216ea0 to launch
2023-08-21T12:44:29Z ams.ams[14142]: I0821 12:44:29.354174 14142 worker.go:324] Worker: Scheduled container cjhlobc1jpk5c0216ea0 onto node lxd0
2023-08-21T12:44:29Z ams.ams[14142]: I0821 12:44:29.355536 14142 orchestrator.go:245] Orchestrator: Received update for container cjhlobc1jpk5c0216ea0 (status prepared desired running)
2023-08-21T12:44:29Z ams.ams[14142]: I0821 12:44:29.371411 14142 container.go:218] Launcher: Processing task cjhlobc1jpk5c0216eag for container cjhlobc1jpk5c0216ea0
2023-08-21T12:44:29Z ams.ams[14142]: I0821 12:44:29.408677 14142 orchestrator.go:245] Orchestrator: Received update for container cjhlobc1jpk5c0216ea0 (status prepared desired running)
2023-08-21T12:44:30Z ams.ams[14142]: I0821 12:44:30.297790 14142 container.go:478] Launcher: Container cjhlobc1jpk5c0216ea0 is now fully initialized
2023-08-21T12:44:30Z ams.ams[14142]: I0821 12:44:30.297823 14142 container.go:494] Launcher: Doing actual start of container cjhlobc1jpk5c0216ea0
2023-08-21T12:44:30Z ams.ams[14142]: I0821 12:44:30.297910 14142 container.go:186] Launcher: Waiting for container cjhlobc1jpk5c0216ea0 to switch to running status
2023-08-21T12:44:30Z ams.ams[14142]: I0821 12:44:30.297935 14142 orchestrator.go:245] Orchestrator: Received update for container cjhlobc1jpk5c0216ea0 (status started desired running)
2023-08-21T12:44:33Z ams.ams[14142]: I0821 12:44:33.156648 14142 container.go:502] Launcher: Successfully started container cjhlobc1jpk5c0216ea0
2023-08-21T12:54:41Z ams.ams[14142]: I0821 12:54:41.624321 14142 container.go:656] Backend: Got status error from container cjhlobc1jpk5c0216ea0
2023-08-21T12:54:41Z ams.ams[14142]: E0821 12:54:41.625582 14142 container.go:718] Backend: Container cjhlobc1jpk5c0216ea0 reported an error: service exited with status 0
2023-08-21T12:54:43Z ams.ams[14142]: I0821 12:54:43.790562 14142 housekeeper.go:170] Housekeeper: Fetching anbox logs directory from container cjhlobc1jpk5c0216ea0
2023-08-21T12:54:44Z ams.ams[14142]: I0821 12:54:44.887032 14142 housekeeper.go:381] Housekeeper: Updated task cjhlobc1jpk5c0216eag for object cjhlobc1jpk5c0216ea0
2023-08-21T12:54:44Z ams.ams[14142]: I0821 12:54:44.887186 14142 container.go:201] Launcher: Status of container cjhlobc1jpk5c0216ea0 was updated to error
2023-08-21T12:54:44Z ams.ams[14142]: W0821 12:54:44.887261 14142 trace.go:83] Trace[2116291573]: "Launching container cjhlobc1jpk5c0216ea0" (started: 2023-08-21 12:44:29.371415267 +0000 UTC m=+13.176393515) (total time: 10m15.515796499s):
2023-08-21T12:54:44Z ams.ams[14142]: W0821 12:54:44.887273 14142 trace.go:83] Trace[2116291573]: [1.866466ms] [1.866466ms] Found applicatio...


Simon Fels (morphis) wrote :

Can you also provide us with the system.log and android.log* files from the failed container?

You can extract them via

$ amc show-log <container id> system.log
$ amc show-log <container id> android.log
$ amc show-log <container id> android.log.1

Ashish sawarkar (ash-anbox) wrote :

Hi Simon,

I have attached all three files as requested.

Simon Fels (morphis) wrote :

Thanks Ashish! We will have a look and let you know what we find.

Gary.Wang (gary-wzl77) wrote :

Hey Ashish
  I can confirm that the problem occurs on ami-0b8ae058a54209687 (Anbox Cloud 1.18.2, arm64) after the Ubuntu kernel rolled from 5.15.0-1031-aws to 6.2.0-1009-aws in your case. The issue is that after the kernel upgrade, userfaultfd syscalls are disallowed for unprivileged users.
  It is also specific to the Android 13 image. We are going to fix it in the next patch release (1.19.1).

  As an immediate step, there are two options:
  a) If you still want to use applications built on top of the Android 13 image, you have to downgrade the kernel to 5.13.
     To downgrade the kernel, please refer to the following post [1].
     After downgrading the kernel and rebooting the VM, rebuild the anbox dkms modules:
      $ sudo dpkg-reconfigure anbox-modules-dkms-118
      $ sudo modprobe virt_wifi
      $ sudo modprobe anbox_sync
  b) Rebuild the application on top of the Android 12 image. This lets you run applications on the latest rolling kernel (6.2.0-1009-aws) without downgrading the kernel version.

  With this, containers should work as normal.
  Could you please give it a try?

Thanks.
Gary

[1] https://discourse.ubuntu.com/t/how-to-downgrade-the-kernel-on-ubuntu-20-04-to-the-5-4-lts-version/26459
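
For reference, one way to check whether the running kernel restricts userfaultfd for unprivileged users is to read the vm.unprivileged_userfaultfd sysctl. This is only a diagnostic sketch: the exact meaning of the value differs between kernel versions, and changing it is not the fix recommended above.

$ sysctl vm.unprivileged_userfaultfd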

Simon Fels (morphis)
Changed in anbox-cloud:
assignee: Simon Fels (morphis) → Gary.Wang (gary-wzl77)
milestone: none → 1.19.1
importance: Undecided → High
status: Incomplete → Triaged
Gary.Wang (gary-wzl77)
Changed in anbox-cloud:
status: Triaged → In Progress
Simon Fels (morphis)
Changed in anbox-cloud:
status: In Progress → Fix Committed
Ashish sawarkar (ash-anbox) wrote :

Hello Team,

Is this bug fixed? Can we use the latest AMI now?

Simon Fels (morphis) wrote :

Hey Ashish,

The fix hasn't been released yet; it has just been committed to our internal repositories. It will roll out with the 1.19.1 release in mid-September. See https://anbox-cloud.io/docs/ref/roadmap for more details.

Also, which AMI you use doesn't matter much, as the snaps are upgraded independently and are not pinned per AMI. You can run `snap refresh anbox-cloud-appliance` on any AMI to get the latest version. See also https://snapcraft.io/docs/keeping-snaps-up-to-date
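
For completeness, a quick way to check which appliance revision and channel are currently installed before and after refreshing (a small sketch using standard snap commands; the grep pattern is just a convenience):

$ snap info anbox-cloud-appliance | grep -E 'tracking|installed'
$ sudo snap refresh anbox-cloud-appliance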

Gary.Wang (gary-wzl77) wrote :

Just a heads up.
This issue has been fixed in the Anbox Cloud 1.19.1 release.
Please check the release announcement[1] for details.

Thanks
Gary

[1] https://discourse.ubuntu.com/t/anbox-cloud-1-19-1-has-been-released/38595

Changed in anbox-cloud:
status: Fix Committed → Fix Released