Kubernetes test suite fails on mainline kernel 4.11+

Bug #1741887 reported by Chris Glass
This bug affects 1 person
Affects             Status      Importance  Assigned to       Milestone
linux-gke (Ubuntu)  Incomplete  High        Joseph Salisbury
Artful              Incomplete  High        Joseph Salisbury
Bionic              Incomplete  High        Joseph Salisbury

Bug Description

The to-be-introduced 4.13-based kernel fails the Kubernetes test suite (https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-ubuntudev2-k8sbeta-default/15?log#log).

After a rough bisect (using kernels from http://kernel.ubuntu.com/~kernel-ppa/mainline/ ), it seems that a kernel change introduced in 4.11 breaks those tests: 4.10 is good, while 4.11 and later, including the 4.11-rc* builds, are bad.
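A quick way to see whether a given machine falls in the range reported above is to compare its kernel release string against 4.11. The sketch below is illustrative only: the helper name kernel_in_bad_range is made up here, and it encodes nothing more than the rough bisect result from this report (4.10 good, 4.11 and later bad). It cannot tell whether a later kernel carries a fix.

```shell
#!/bin/sh
# Sketch only: classify a kernel release string using the rough bisect
# result from this report (4.10 good, 4.11+ bad).
# kernel_in_bad_range is a hypothetical helper name, not an existing tool.
kernel_in_bad_range() {
    release=$1
    major=${release%%.*}        # "4.13.0-17-generic" -> "4"
    rest=${release#*.}          # "4.13.0-17-generic" -> "13.0-17-generic"
    minor=${rest%%[!0-9]*}      # keep only the leading digits -> "13"
    minor=${minor:-0}           # fall back to 0 if no digits were found
    if [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 11 ]; }; then
        echo "suspect"
    else
        echo "not-suspect"
    fi
}

# Usage on the running kernel: kernel_in_bad_range "$(uname -r)"
```

A "suspect" result only means the kernel is in the reported range; the reproducer below is still the actual test.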

The minimal reproducer (doesn't have to be on a cloud image - my bionic desktop reproduces it just as well):

sudo apt install docker.io
sudo docker run -d gcr.io/kubernetes-e2e-test-images/hostexec-amd64:1.0
(note down returned hash)
sudo docker exec -it (first few chars of returned hash) /bin/sh
# (Inside the docker prompt)
timeout -t 1 cat

On kernels exhibiting the problem, the "timeout/cat" command above terminates the container, and docker logs shows:

# sudo docker logs (first few chars of returned hash)
/bin/sh: can't open /fifo: Interrupted system call

On kernels that do not exhibit the problem, the container does not terminate after the "timeout" command, and the shell returns to a normal command prompt.
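The good/bad decision described above can be scripted once the manual reproducer has been run. The following is a minimal sketch, assuming the container hash from the "docker run" step is still at hand. The helper names classify_result and check_container are hypothetical, and the check relies only on the two symptoms named in this report: the container no longer running, or the "Interrupted system call" line in its logs.

```shell
#!/bin/sh
# Sketch only: after running the manual reproducer, decide whether the
# container shows the failure described in this report.
# classify_result and check_container are hypothetical helper names.

# classify_result RUNNING LOGS
#   RUNNING: "true" or "false", as printed by docker inspect
#   LOGS:    the container's log output
classify_result() {
    case "$2" in
        *"Interrupted system call"*) echo "bad"; return 0 ;;
    esac
    if [ "$1" = "true" ]; then
        echo "good"
    else
        echo "bad"
    fi
}

# check_container ID: ask docker for the two inputs and classify them.
check_container() {
    running=$(sudo docker inspect -f '{{.State.Running}}' "$1") || return 1
    logs=$(sudo docker logs "$1" 2>&1)
    classify_result "$running" "$logs"
}

# Usage, after the timeout/cat step:
#   check_container (first few chars of returned hash)
```

Keeping the classification separate from the docker calls makes it possible to check the logic without a container.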

Changed in linux-gke (Ubuntu):
importance: Undecided → High
assignee: nobody → Joseph Salisbury (jsalisbury)
status: New → In Progress
tags: added: performing-bisect
Revision history for this message
Chris Glass (tribaal) wrote :

Just tried with v4.15-rc7 from http://kernel.ubuntu.com/~kernel-ppa/mainline/ and the problem persists.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

I started a kernel bisect between v4.10 final and v4.11-rc1. The kernel bisect will require testing of about 13 test kernels.

I built the first test kernel, up to the following commit:
caa59428971d5ad81d19512365c9ba580d83268c

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1741887/caa59428971d5ad81d19512365c9ba580d83268c

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance
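For readers wondering where "about 13 test kernels" comes from: each bisect step halves the remaining commit range, so the number of kernels to test is roughly log2 of the number of commits between the good and bad points. The sketch below shows that arithmetic. The helper name bisect_steps is hypothetical, and 8192 is an illustrative commit count chosen because it gives 13 steps, not the actual size of the range bisected here.

```shell
#!/bin/sh
# Sketch only: estimate how many kernels a bisect needs to test.
# Each test halves the remaining range, so the count is roughly
# log2 of the number of commits between the good and bad points.
# bisect_steps is a hypothetical helper name.
bisect_steps() {
    n=$1
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$((n / 2))            # one test kernel halves the range
        steps=$((steps + 1))
    done
    echo "$steps"
}

# In a kernel git tree, the commit count could come from:
#   git rev-list --count <good>..<bad>
bisect_steps 8192   # prints 13
```

git bisect prints a similar estimate of the remaining steps after each good/bad answer, so the figure may shift slightly as the bisect proceeds.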

Revision history for this message
Chris Glass (tribaal) wrote :

Kernel from #2 is good.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :
Revision history for this message
Chris Glass (tribaal) wrote :

Kernel from #4 is good.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :
Revision history for this message
Chris Glass (tribaal) wrote :

kernel from #6 is bad.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :
Revision history for this message
Chris Glass (tribaal) wrote :

kernel from #8 is good.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :
Revision history for this message
Chris Glass (tribaal) wrote :

kernel in #10 is bad.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :
Revision history for this message
Chris Glass (tribaal) wrote :

Kernel in #12 is good.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

I built the first test kernel, up to the following commit:
601109c5c74a10e6b89465cb6aa31a40d1efc8e3

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1741887/601109c5c74a10e6b89465cb6aa31a40d1efc8e3

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

comment #14 should read the "Next" test kernel and not "first".

Revision history for this message
Chris Glass (tribaal) wrote :

Kernel in #14 is good.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :
Revision history for this message
Chris Glass (tribaal) wrote :

Kernel in #17 is bad

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

f1ef09fde17f9b77ca1435a5b53a28b203afb81c was good.

Next kernel is ready:

http://kernel.ubuntu.com/~jsalisbury/lp1741887/d6cffbbe9a7e51eb705182965a189457c17ba8a3

Revision history for this message
Chris Glass (tribaal) wrote :

Kernel in #19 is bad.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :
Revision history for this message
Chris Glass (tribaal) wrote :

Kernel in #21 is bad

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :
Revision history for this message
Chris Glass (tribaal) wrote :

Kernel in #23 is good.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :
Revision history for this message
Chris Glass (tribaal) wrote :

Kernel in #25 is good.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :
Revision history for this message
Chris Glass (tribaal) wrote :

Kernel in #27 was good.

Revision history for this message
Chris Glass (tribaal) wrote :

Looks like the failure there is due to the following commit in the linux kernel:

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=c6c70f4455d1eda91065e93cc4f7eddf4499b105

Changed in linux-gke (Ubuntu Artful):
status: New → In Progress
importance: Undecided → High
assignee: nobody → Joseph Salisbury (jsalisbury)
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

I noticed this bug was still open. I built a test kernel with a revert of commit c6c70f4455d.

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1741887

Can you test this kernel and see if it resolves this bug?

It might also be worth testing the v4.16-rc4 mainline kernel to see if a fix already exists:
http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.16-rc4/

Changed in linux-gke (Ubuntu Artful):
status: In Progress → Incomplete
Changed in linux-gke (Ubuntu Bionic):
status: In Progress → Incomplete