Docker failed to start due to invalid local registry ""

Bug #1832220 reported by ChenjieXu
This bug affects 1 person
Affects    Status   Importance  Assigned to  Milestone
StarlingX  Invalid  Low         Mingyuan Qi

Bug Description

Brief Description
-----------------
During the installation of StarlingX, the Docker local registry is set to "" after the first unlock. After the second lock/unlock, Docker fails to start because the local registry entry is invalid.

Severity
--------
Major

Steps to Reproduce
------------------
- Deploy StarlingX AIO SIMPLEX/DUPLEX with a local registry
- After executing ansible, check /etc/docker/daemon.json. It should look like:
  {
    "insecure-registries" : [ "edgehost01.sh.intel.com:5000" ]
  }
- After the first unlock, check /etc/docker/daemon.json again. It now looks like:
  {
    "insecure-registries" : [ "" ]
  }
- After the second lock/unlock, Docker fails to start. You can check the Docker status and the active alarms with the following commands:
  sudo systemctl status docker
  source /etc/platform/openrc
  fm alarm-list
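
The daemon.json check in the steps above can be scripted. The following is a minimal sketch, not part of StarlingX: the helper name find_invalid_registries is hypothetical, and it works on the file content as text so that it can be tried without root access. It uses the two daemon.json contents quoted in this report.

```python
import json


def find_invalid_registries(daemon_json_text):
    """Return the "insecure-registries" entries that are empty or not strings.

    Hypothetical helper for illustration only.
    """
    config = json.loads(daemon_json_text)
    entries = config.get("insecure-registries", [])
    return [e for e in entries if not isinstance(e, str) or not e.strip()]


# Content after executing ansible (from the report).
good = '{ "insecure-registries" : [ "edgehost01.sh.intel.com:5000" ] }'
# Content after the first unlock (from the report).
bad = '{ "insecure-registries" : [ "" ] }'

print(find_invalid_registries(good))  # []
print(find_invalid_registries(bad))   # ['']
```

A non-empty result after the first unlock indicates that the next lock/unlock will leave Docker unable to start.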

Expected Behavior
------------------
- The Docker registry is written to /etc/docker/daemon.json correctly.
- Docker starts correctly every time.

Actual Behavior
----------------
- Wrong docker registry "" in /etc/docker/daemon.json after first unlock.
- After the second lock/unlock, docker will fail to restart.

Reproducibility
---------------
100%

System Configuration
--------------------
AIO SIMPLEX & DUPLEX

Branch/Pull Time/Commit
-----------------------
stx master as of 20190524T013000Z

Last Pass
---------
Unclear

Timestamp/Logs
--------------
controller-1:~$ sudo cat /etc/docker/daemon.json
{
    "insecure-registries" : [ "" ]
}

controller-1:~$ sudo systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/docker.service.d
           └─docker-stx-override.conf
        /usr/lib/systemd/system/docker.service.d
           └─starlingx-docker-override.conf
   Active: failed (Result: exit-code) since Mon 2019-06-10 18:34:45 UTC; 1h 38min ago
     Docs: https://docs.docker.com
 Main PID: 73369 (code=exited, status=1/FAILURE)

cat /var/log/daemon.log | grep "invalid" -A 5
2019-06-10T16:00:16.110 controller-1 dockerd[76570]: info insecure registry is not valid: invalid host ""
2019-06-10T16:00:16.113 controller-1 systemd[1]: notice docker.service: main process exited, code=exited, status=1/FAILURE
2019-06-10T16:00:16.156 controller-1 systemd[1]: err Failed to start Docker Application Container Engine.
2019-06-10T16:00:16.156 controller-1 systemd[1]: notice Unit docker.service entered failed state.
2019-06-10T16:00:16.156 controller-1 systemd[1]: warning docker.service failed.
2019-06-10T16:00:18.132 controller-1 systemd[1]: info Reloading.

Test Activity
-------------
Developer Testing

ChenjieXu (midone) wrote :

Workaround for AIO simplex:
  Add the correct local registry to the file /etc/docker/daemon.json before the unlock.

However, this workaround doesn't work for controller-1 in a Duplex deployment.
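
As a rough illustration of the simplex workaround, the sketch below drops empty entries from "insecure-registries" and adds the intended registry. The function name repair_daemon_config is hypothetical, the registry address is the one quoted in this report, and the code operates on the file content as text; writing the result back to /etc/docker/daemon.json (which needs root) is left out.

```python
import json


def repair_daemon_config(daemon_json_text, registry):
    """Return daemon.json text with empty registry entries removed
    and `registry` present in "insecure-registries".

    Hypothetical helper for illustration only.
    """
    config = json.loads(daemon_json_text)
    entries = [
        e for e in config.get("insecure-registries", [])
        if isinstance(e, str) and e.strip()
    ]
    if registry not in entries:
        entries.append(registry)
    config["insecure-registries"] = entries
    return json.dumps(config, indent=4)


broken = '{ "insecure-registries" : [ "" ] }'
print(repair_daemon_config(broken, "edgehost01.sh.intel.com:5000"))
```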

Ghada Khalil (gkhalil) wrote :

As per Cindy, Mingyuan will help Chenjie with this issue. It may be procedural, given that the local registry is one of the configurations verified by sanity.

tags: added: stx.containers
Ghada Khalil (gkhalil)
Changed in starlingx:
assignee: nobody → Mingyuan Qi (myqi)
Cristopher Lemus (cjlemusc) wrote :

We had an issue with the exact same behavior; we needed to make a small change to our localhost.yml.

BEFORE (this one fails)
docker_registries:
 - 192.168.80.60

AFTER (this one works)
docker_registries:
  unified: 192.168.80.60

With that, docker/daemon.json is properly populated on simplex and all other configurations propagate a correct daemon.json.

Hopefully it'll help.
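
The difference between the two forms above is that the failing one is a list and the working one is a mapping with a "unified" key. A minimal sketch of a pre-flight check follows. It is an assumption-laden illustration, not the playbook's own validation: the function name check_docker_registries is hypothetical, and the dictionaries stand in for what a YAML parser would return for each localhost.yml variant.

```python
def check_docker_registries(overrides):
    """Return a list of problems found in the docker_registries override.

    Hypothetical check for illustration only.
    """
    value = overrides.get("docker_registries")
    if value is None:
        return []
    if not isinstance(value, dict):
        return [
            "docker_registries must be a mapping such as "
            "'unified: <ip-or-domain-name>', got " + type(value).__name__
        ]
    problems = []
    for name, host in value.items():
        if not isinstance(host, str) or not host.strip():
            problems.append("registry '%s' has an empty or invalid host" % name)
    return problems


# Parsed form of the BEFORE override (a list).
before = {"docker_registries": ["192.168.80.60"]}
# Parsed form of the AFTER override (a mapping with "unified").
after = {"docker_registries": {"unified": "192.168.80.60"}}

print(check_docker_registries(before))
print(check_docker_registries(after))  # []
```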

Tee Ngo (teewrs) wrote :

This is an old build that does not have the fix for private registries, which was merged on May 30th and is available from the May 31st build onward.

Please upgrade to the latest master build. To provision a unified private registry, add the following to the host override file (i.e. localhost.yml):

docker_registries:
  unified: <ip-or-domain-name>

If this registry is insecure, add the following line to the override file:

is_secure_registry: True

Ghada Khalil (gkhalil)
Changed in starlingx:
status: New → Incomplete
Dariush Eslimi (deslimi) wrote :

Marking as incomplete until the reporter tests with a new load.

Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Medium
Tee Ngo (teewrs) wrote :

Amendment to my previous comment:

If this registry is "insecure", add the following line to the override file:

is_secure_registry: False

By default, this flag is True.
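
To make the flag's default concrete, here is a small sketch. It is only a reading of the comments in this thread, not the playbook's actual logic: the function name expected_insecure_registries is hypothetical, and the assumption that registries from docker_registries end up under "insecure-registries" when is_secure_registry is False is inferred from the daemon.json shown in the bug description.

```python
def expected_insecure_registries(overrides):
    """Return the hosts one would expect under "insecure-registries".

    Assumption for illustration: a missing is_secure_registry counts as
    True, and only an explicit False marks the registries as insecure.
    """
    secure = overrides.get("is_secure_registry", True)
    if secure:
        return []
    registries = overrides.get("docker_registries", {})
    return [host for host in registries.values() if host]


with_flag = {
    "docker_registries": {"unified": "192.168.80.60"},
    "is_secure_registry": False,
}
without_flag = {"docker_registries": {"unified": "192.168.80.60"}}

print(expected_insecure_registries(with_flag))     # ['192.168.80.60']
print(expected_insecure_registries(without_flag))  # []
```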

ChenjieXu (midone) wrote :

Hi all,

Thank you for your suggestions! I'm using the 0607 ISO image and the following configuration now:
docker_registries:
  unified: <ip-or-domain-name>

This time this bug doesn't occur.

Ghada Khalil (gkhalil) wrote :

Closing this bug as Chenjie confirmed that the issue does not happen on a more recent load.

Changed in starlingx:
importance: Medium → Low
status: Incomplete → Invalid