Docker doesn't work since Containerd integration

Bug #1567096 reported by bugproxy on 2016-04-06
This bug affects 1 person
Affects               Importance   Assigned to
golang-1.6 (Ubuntu)   High         Taco Screen team
  Trusty              High         Unassigned
  Xenial              High         Unassigned

Bug Description

-- Problem Description --
Docker build hangs indefinitely when run using a 1.11.0 binary built with go 1.6 on ppc64le after the containerd integration. Doing the same thing with a binary built with gccgo works.

Looking at the differences in the docker logs shows that the containerd "exit" event never happens when using a binary built with gc. fsnotify, the file system notification library for go, doesn't seem to receive the correct event when a file is either written to or closed, which I believe is what's causing this issue.

Link to fsnotify issue which shows some failing tests : https://github.com/fsnotify/fsnotify/issues/130

I have a patch that fixes the errors when I run fsnotify. I am preparing it for submission now and should be out there as a golang CL this morning.

Do you want the patch so you can rebuild golang with it? If fsnotify is a separate package then you will have to rebuild it with the new golang.

Here's the CL link if you want to get the patch for ppc64le: https://go-review.googlesource.com/#/c/21582/
Go to the upper right where it says download and I think if you select patch file it will give you the patch.

We'll update with more info after testing the patch Lynn submitted, but we wanted to let Canonical know about this issue in the meantime since 1.11 is about to GA upstream.

bugproxy (bugproxy) on 2016-04-06
tags: added: architecture-ppc64le bugnameltc-139981 severity-critical targetmilestone-inin---
Changed in ubuntu:
assignee: nobody → Taco Screen team (taco-screen-team)
Luciano Chavez (lnx1138) on 2016-04-06
affects: ubuntu → golang (Ubuntu)
Michael Hudson-Doyle (mwhudson) wrote :

I'm happy to take this patch into the Ubuntu packaging once upstream is happy with it.

Jon Grimm (jgrimm) wrote :

marked 16.04 milestone just so i don't lose track of wanting to get this in if it settles in upstream.

Changed in golang (Ubuntu):
milestone: none → ubuntu-16.04
importance: Undecided → High
Michael Hudson-Doyle (mwhudson) wrote :

The patch is upstream now, but my understanding is that it is only required for docker.io 1.11.x, not anything that is in the archive now. If that's the case then I propose not updating the golang-1.6 package yet again this close to Xenial release, but make sure that the patch is SRUed to Xenial (ideally by SRUing a 1.6.2 which includes the fix) before we try to SRU docker.io 1.11.x. Objections?

affects: golang (Ubuntu) → golang-1.6 (Ubuntu)

------- Comment From <email address hidden> 2016-04-14 09:01 EDT-------
This bug affects anything that uses one of the epoll* syscalls that returns information in the EpollEvent structure that was incorrectly defined for ppc64/ppc64le in golang. Without this fix those syscalls will return incorrect event information, and many of the tests in fsnotify fail. It wasn't found until upstream Docker hit it, but I think it could affect more than just Docker.

The fix consists only of correctly defining the data structure used by these syscalls and only for ppc64le/ppc64 so I don't think it is very risky.

On the other hand, I just requested that it get into go 1.6.2. I think it should go in, but I don't know what kind of timeline we are talking about for Ubuntu 16.04 golang to move to that.

Michael Hudson-Doyle (mwhudson) wrote :

It looks like 1.6.2 is going to be released around the same time as 16.04. I'll try to arrange for that to get into Xenial as soon as possible after release. I'll include your patch whether or not 16.04 does :-)

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package golang-1.6 - 1.6.2-0ubuntu2

---------------
golang-1.6 (1.6.2-0ubuntu2) yakkety; urgency=medium

  * Add d/patches/0002-no-pie-when-race.patch to fix amd64 FTBFS. (LP: #1574916)

 -- Michael Hudson-Doyle <email address hidden> Tue, 26 Apr 2016 13:57:00 +1200

Changed in golang-1.6 (Ubuntu):
status: New → Fix Released
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-04-28 15:15 EDT-------
Hi Canonical,

I don't know the context of the last comment, and also see that the epoll fix hasn't made it in yet, so want to sync up and make sure 1.6.2 from upstream is what you're pulling in and patching.

Thanks!

- Christy

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-04-28 15:35 EDT-------
I understood that this comment means that Canonical has golang version 1.6.2 in 16.10:
$ go version
go version go1.6.2 linux/ppc64le

On the other side, 16.04 still has 1.6.1
$ go version
go version go1.6.1 linux/ppc64le

Yes, I need to go through the SRU process to get Go 1.6.2 into Xenial
(and Trusty). Hopefully I can get to that today.

Hi, can you write a test case for this problem that's simple enough to include in an SRU bug report? I tried a bit and failed (for example, the golang-fsnotify package from xenial still fails tests with go 1.6.2, although fewer than go 1.6.1).

------- Comment From <email address hidden> 2016-04-29 10:15 EDT-------
Hi, here's a very simple testcase to verify that the EpollEvent structure is correct in the src/syscall directory:

package main

import (
	"fmt"
	"reflect"
	"syscall"
)

func main() {
	// With the fix, syscall.EpollEvent on ppc64/ppc64le has four fields.
	var ee syscall.EpollEvent
	fmt.Printf("EpollEvent fields (should be 4): %d\n", reflect.ValueOf(ee).NumField())
}

As far as the fsnotify tests working, sometime after I submitted my fix there have been changes in fsnotify to use golang.org/x/sys/unix instead of the stdlib syscall package, so I'm not sure if that is affecting your results or not. A fix for EpollEvent for that directory has been submitted.

That's not really the sort of test I wanted; I want something I can
show to the SRU team that will motivate including Go 1.6.2 in Ubuntu
16.04. This bug was originally about docker not working, can you
provide reproduction steps for that?

As far as I can tell, everything in yakkety should be new enough to
run the golang-fsnotify tests, but they still fail on ppc64el for me
(TestInotifyRemoveTwice fails with "no error on removing invalid file"
and TestInotifyInnerMapLength hangs until the 10 minute timeout kills
it).

Cheers,
mwh

------- Comment From <email address hidden> 2016-05-02 13:45 EDT-------
Here is a bit more detail on my earlier comment:

You need fsnotify commit 836bfd to see the problem with go1.6.1. If the fsnotify package is built with this commit using go1.6.1 and the test built with go1.6.1, then there will be several failures and a hang in TestInotifyInnerMapLength:

./fsnotify.test -test.v
=== RUN TestPollerWithBadFd
--- PASS: TestPollerWithBadFd (0.00s)
=== RUN TestPollerWithData
--- FAIL: TestPollerWithData (0.00s)
inotify_poller_test.go:85: expected poller to return true
=== RUN TestPollerWithWakeup
--- PASS: TestPollerWithWakeup (0.00s)
=== RUN TestPollerWithClose
--- FAIL: TestPollerWithClose (0.00s)
inotify_poller_test.go:119: expected poller to return true
=== RUN TestPollerWithWakeupAndData
--- FAIL: TestPollerWithWakeupAndData (0.00s)
inotify_poller_test.go:140: expected poller to return true
=== RUN TestPollerConcurrent
--- FAIL: TestPollerConcurrent (0.05s)
inotify_poller_test.go:197: expected true
=== RUN TestInotifyCloseRightAway
--- PASS: TestInotifyCloseRightAway (0.05s)
=== RUN TestInotifyCloseSlightlyLater
--- PASS: TestInotifyCloseSlightlyLater (0.10s)
=== RUN TestInotifyCloseSlightlyLaterWithWatch
--- PASS: TestInotifyCloseSlightlyLaterWithWatch (0.10s)
=== RUN TestInotifyCloseAfterRead
--- PASS: TestInotifyCloseAfterRead (0.10s)
=== RUN TestInotifyCloseCreate
--- FAIL: TestInotifyCloseCreate (0.05s)
inotify_test.go:136: Took too long to wait for event
=== RUN TestInotifyStress
--- FAIL: TestInotifyStress (5.00s)
inotify_test.go:238: Expected at least 50 creates, got 0
=== RUN TestInotifyRemoveTwice
--- PASS: TestInotifyRemoveTwice (0.00s)
=== RUN TestInotifyInnerMapLength
<hangs here>

However, if you switch to using go1.6.2, rebuild the fsnotify package and testcase from this same fsnotify commit id and run the test, it passes:
boger@ampere:~/fsnotify/src/github.com/fsnotify/fsnotify$ go version
go version go1.6.2 linux/ppc64le
boger@ampere:~/fsnotify/src/github.com/fsnotify/fsnotify$ go test -c
boger@ampere:~/fsnotify/src/github.com/fsnotify/fsnotify$ ./fsnotify.test
PASS

If you change to use the latest commit for fsnotify (containing the switch to use x/sys/unix for the header files), rebuild the fsnotify package and the test, that seems to work for both go1.6.1 and go1.6.2, since it is no longer using the header file from the golang directories but from the golang/x directories.

Hm, that's not what I see, but I am running in a yakkety chroot on a
wily system -- could there be a dependence on kernel version here?

------- Comment From <email address hidden> 2016-05-03 11:47 EDT-------
(In reply to comment #35)
> That's not really the sort of test I wanted; I want something I can
> show to the SRU team that will motivate including Go 1.6.2 in Ubuntu
> 16.04. This bug was originally about docker not working, can you
> provide reproduction steps for that?
>
> As far as I can tell, everything in yakkety should be new enough to
> run the golang-fsnotify tests, but they still fail on ppc64el for me
> (TestInotifyRemoveTwice fails with "no error on removing invalid file"
> and TestInotifyInnerMapLength hangs until the 10 minute timeout kills
> it).

If you have an installation of Ubuntu 16.04, you can install go & docker, build docker from upstream source, copy all the docker binaries (dockerd, docker, docker-containerd, docker-containerd-ctr, docker-containerd-shim, and docker-runc) into /usr/bin/ , and then run a container (docker run -it ppc64le/ubuntu echo hi) and it will hang and fail to exit.

====
Or, you can build docker in a container and then run (the way the CI tests work):

0. Install docker
1. checkout and patch docker master:

diff --git a/Dockerfile.ppc64le b/Dockerfile.ppc64le
index 208c3a5..3fa36a0 100644
--- a/Dockerfile.ppc64le
+++ b/Dockerfile.ppc64le
@@ -73,9 +73,9 @@ RUN cd /usr/local/lvm2 \
 ## BUILD GOLANG 1.6
 # NOTE: ppc64le has compatibility issues with older versions of go, so make sure the version >= 1.6
-ENV GO_VERSION 1.6.2
+ENV GO_VERSION 1.6.1
 ENV GO_DOWNLOAD_URL https://golang.org/dl/go${GO_VERSION}.src.tar.gz
-ENV GO_DOWNLOAD_SHA256 787b0b750d037016a30c6ed05a8a70a91b2e9db4bd9b1a2453aa502a63f1bccc
+ENV GO_DOWNLOAD_SHA256 1d4b53cdee51b2298afcf50926a7fa44b286f0bf24ff8323ce690a66daa7193f
 ENV GOROOT_BOOTSTRAP /usr/local
 RUN curl -fsSL "$GO_DOWNLOAD_URL" -o golang.tar.gz

2. build the docker dev container:
$ docker build -t docker:1.6.1 -f Dockerfile.ppc64le .

3. Run the docker dev container
$ docker run -it --privileged docker:1.6.1 /bin/bash

4. From the container, build the docker binary
root@05f8c2e2a546:/go/src/github.com/docker/docker# ./hack/make.sh binary

5. Run a docker container:
root@05f8c2e2a546:/go/src/github.com/docker/docker# cd bundles/latest/binary-daemon
root@05f8c2e2a546:/go/src/github.com/docker/docker# ./docker &
root@05f8c2e2a546:/go/src/github.com/docker/docker# cd ../docker-client
root@05f8c2e2a546:/go/src/github.com/docker/docker# ./docker run -it ppc64le/ubuntu echo hi
hi
[infinite cursor]

You can see that the container runs, but never exits.

==

Is that helpful?

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-05-04 10:42 EDT-------
I've tried several Ubuntu distros and it passes everywhere I've tried, even on 15.10.

Just to be sure: fsnotify and everything that depends on go has been rebuilt with the new go1.6.2?

Hello bugproxy, or anyone else affected,

Accepted golang-1.6 into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/golang-1.6/1.6.2-0ubuntu5~16.04 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in golang-1.6 (Ubuntu Xenial):
status: New → Fix Committed
tags: added: verification-needed

I've verified (finally!) that 1.6.2 fixes the observed problems with docker.

tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package golang-1.6 - 1.6.2-0ubuntu5~16.04

---------------
golang-1.6 (1.6.2-0ubuntu5~16.04) xenial; urgency=medium

  * Backport new upstream release to Xenial to fix epoll on ppc64el and s390x.
    (LP: #1567096, #1591010)

 -- Michael Hudson-Doyle <email address hidden> Fri, 10 Jun 2016 15:34:17 +1200

Changed in golang-1.6 (Ubuntu Xenial):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for golang-1.6 has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Martin Pitt (pitti) wrote :

As for trusty: This requires a full rebuild and smoke runtime test of all Go packages in trusty to ensure that the new Go compiler does not introduce regressions. Please lay out a test plan/rebuild PPAs etc. here.

There aren't any such packages (unless I'm missing something). I'm not updating the default version of Go in trusty, but rather the special golang-1.6 package that was uploaded to allow juju to use it -- but juju has not used it yet.

Changed in golang-1.6 (Ubuntu Trusty):
status: New → In Progress

Hello bugproxy, or anyone else affected,

Accepted golang-1.6 into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/golang-1.6/1.6.2-0ubuntu5~14.04 in a few hours, and then in the -proposed repository.

Changed in golang-1.6 (Ubuntu Trusty):
status: In Progress → Fix Committed
tags: removed: verification-done
tags: added: verification-needed

The fix for this bug has been awaiting testing feedback in the -proposed repository for trusty for more than 90 days. Please test this fix and update the bug appropriately with the results. In the event that the fix for this bug is still not verified 15 days from now, the package will be removed from the -proposed repository.

tags: added: removal-candidate
Changed in golang-1.6 (Ubuntu Xenial):
importance: Undecided → High
Changed in golang-1.6 (Ubuntu Trusty):
importance: Undecided → High

The version of golang-1.6 in the proposed pocket of Trusty that was purported to fix this bug report has been removed because the bugs that were to be fixed by the upload were not verified in a timely (105 days) fashion.

Changed in golang-1.6 (Ubuntu Trusty):
status: Fix Committed → Won't Fix

------- Comment From <email address hidden> 2016-12-16 12:24 EDT-------
Hi Canonical,

This somehow fell off of everyone's radar. Can we get another shot at getting it in? We'll have someone look at it ASAP.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-05-29 18:04 EDT-------
This bug has not been touched in over two years, so I am rejecting it. If you feel this is in error, please reopen and justify.
