Zun

Docker client error to join the network

Bug #1671713 reported by Shunli Zhou
This bug affects 1 person
Affects: Zun
Importance: Medium
Assigned to: Pradeep Kumar Singh

Bug Description

When I run 'zun start xxx', the Docker client throws an exception:

2017-03-08 15:55:26.950 ERROR zun.compute.manager [req-e719322a-67a4-4f4e-abe1-f656a9670a25 admin admin] Error occurred while calling Docker start API: Docker internal error: 409 Client Error: Conflict ("cannot join network of a non running container: 33a59b411a165a4e061612c9da13e5383a6277b4cb0cd72ffd8626057541a656").

The Zun API does check the container status, as shown below:
    @pecan.expose('json')
    @exception.wrap_pecan_controller_exception
    def start(self, container_id, **kw):
        container = _get_container(container_id)
        check_policy_on_container(container.as_dict(), "container:start")
        utils.validate_container_state(container, 'start')

However, the check uses the status stored in the DB, which may not match the container's actual status in Docker.
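To illustrate the mismatch, here is a minimal sketch of refreshing the cached status from Docker before validating it. This is not Zun's actual code: `refresh_status`, `docker_state_to_status`, the status strings, and `FakeDockerClient` are hypothetical names for illustration. The only assumption about the real client is that it offers an `inspect_container(id)` call returning a dict with a `State` entry, as docker-py's low-level client does.

```python
from types import SimpleNamespace


def docker_state_to_status(state):
    """Map the 'State' dict from Docker's inspect output to a status string.

    The status strings are assumptions for this sketch.
    """
    # A paused container also reports Running=True, so check Paused first.
    if state.get('Paused'):
        return 'Paused'
    if state.get('Running'):
        return 'Running'
    return 'Stopped'


def refresh_status(container, docker_client):
    """Overwrite the cached (DB) status with what Docker reports right now."""
    info = docker_client.inspect_container(container.container_id)
    container.status = docker_state_to_status(info.get('State', {}))
    return container


class FakeDockerClient:
    """Hypothetical stand-in for a Docker client so the sketch runs without a daemon."""

    def __init__(self, states):
        self._states = states

    def inspect_container(self, container_id):
        return {'Id': container_id, 'State': self._states[container_id]}


# The DB record still says 'Running', but the container is actually down.
container = SimpleNamespace(container_id='33a59b41', status='Running')
client = FakeDockerClient({'33a59b41': {'Running': False, 'Paused': False}})
refresh_status(container, client)
print(container.status)  # Stopped
```

With the status refreshed first, a call such as utils.validate_container_state(container, 'start') would operate on the state Docker actually reports rather than on a stale DB value.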

Shunli Zhou (shunliz) wrote :

I would prefer to file a blueprint that adds a periodic task to sync the container status from Docker to the DB.
Any suggestions?

Changed in zun:
assignee: nobody → Shunli Zhou (shunliz)
Pradeep Kumar Singh (pradeep-singh-u) wrote :

@shunliz,

I think you can add a real status check before each validation call, something like
https://github.com/openstack/zun/blob/master/zun/api/controllers/v1/containers.py#L145.

There is a BP, https://blueprints.launchpad.net/zun/+spec/keep-container-alive;
status syncing could be part of that BP.
What do you think?

hongbin (hongbin034) wrote :

I registered a BP for periodic tasks: https://blueprints.launchpad.net/zun/+spec/periodic-task , but I don't think a periodic task can completely resolve the problem. I think what Pradeep suggested might fix this problem directly.

Shunli Zhou (shunliz) wrote :

@Pradeep, @hongbin, it's good to see there are BPs that cover my concern. Let's discuss at the meeting how to implement the status sync and which BP it fits better.

I would like to unassign this bug so that whoever implements the status sync BP can fix it. Thanks.

Changed in zun:
assignee: Shunli Zhou (shunliz) → nobody
Shunli Zhou (shunliz) wrote :

I reinstalled devstack and cannot reproduce this bug. Marking as invalid.

Changed in zun:
status: New → Invalid
Changed in zun:
status: Invalid → Confirmed
assignee: nobody → Pradeep Kumar Singh (pradeep-singh-u)
importance: Undecided → Medium
Pradeep Kumar Singh (pradeep-singh-u) wrote :

I am concerned about the exception below:

2017-03-08 15:55:26.950 ERROR zun.compute.manager [req-e719322a-67a4-4f4e-abe1-f656a9670a25 admin admin] Error occurred while calling Docker start API: Docker internal error: 409 Client Error: Conflict ("cannot join network of a non running container: 33a59b411a165a4e061612c9da13e5383a6277b4cb0cd72ffd8626057541a656").

I reproduced this error with the following steps:
1) Created a container with 'zun run'.
2) Restarted the Docker daemon.
3) Checked the status of all containers: both the sandbox and the user containers were stopped.
4) Tried to start the container with 'zun start', and got the above exception in the compute logs.

The reason is that 'zun start' does not start the sandbox container, which is why we get the error 'cannot join network of a non running container'.

Possible solutions:
1) Use a restart policy for the sandbox container in the nova-docker driver.
2) Start the sandbox container before starting the user container if the sandbox container is stopped. The open question is how to check the sandbox container's status: through the Nova APIs or through the Docker APIs. Calling the Nova APIs may add more performance overhead than calling the Docker API.
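Solution 2 could look roughly like the sketch below, which checks the sandbox through a Docker-style client and starts it first when needed. It is only an illustration, not the proposed fix: `is_running`, `start_with_sandbox`, and `FakeDockerClient` are hypothetical names, and the fake client merely imitates the 409 conflict so the example runs without a Docker daemon. The real client is assumed to expose `inspect_container(id)` and `start(id)`, as docker-py's low-level client does.

```python
def is_running(docker_client, container_id):
    """Return True if Docker reports the container as running."""
    state = docker_client.inspect_container(container_id).get('State', {})
    return bool(state.get('Running'))


def start_with_sandbox(docker_client, sandbox_id, container_id):
    """Start the sandbox first if it is stopped, then the user container.

    Returns the IDs that were started, in order.
    """
    started = []
    if not is_running(docker_client, sandbox_id):
        docker_client.start(sandbox_id)
        started.append(sandbox_id)
    docker_client.start(container_id)
    started.append(container_id)
    return started


class FakeDockerClient:
    """Hypothetical stand-in that imitates the network-join conflict."""

    def __init__(self, running, network_parent):
        self.running = dict(running)
        # Maps a container to the sandbox whose network it joins.
        self.network_parent = network_parent

    def inspect_container(self, cid):
        return {'Id': cid, 'State': {'Running': self.running[cid]}}

    def start(self, cid):
        parent = self.network_parent.get(cid)
        if parent is not None and not self.running[parent]:
            raise RuntimeError(
                'cannot join network of a non running container: ' + parent)
        self.running[cid] = True


# After a daemon restart, both the sandbox and the user container are stopped.
client = FakeDockerClient({'sandbox': False, 'user': False},
                          {'user': 'sandbox'})
order = start_with_sandbox(client, 'sandbox', 'user')
print(order)  # ['sandbox', 'user']
```

Checking the sandbox through the Docker client keeps the extra work to one inspect call per start, which is the overhead argument made above for preferring the Docker API over Nova.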

@Hongbin, @Shunli,

What do you think?

Shunli Zhou (shunliz) wrote :

@Pradeep Kumar Singh, good catch. You finally reproduced it.

I'm a little confused: the zun & kuryr integration has no code merged, and in the zun code I only saw the sandbox and the container being created separately. So when does the container try to join the sandbox network?

Of your solutions, I prefer 2, using the Docker API.

Pradeep Kumar Singh (pradeep-singh-u) wrote :
OpenStack Infra (hudson-openstack) wrote : Fix proposed to zun (master)

Fix proposed to branch: master
Review: https://review.openstack.org/446358

hongbin (hongbin034)
Changed in zun:
status: Confirmed → In Progress