Comment 24 for bug 1996678

Mauricio Faria de Oliveira (mfo) wrote :

Hi Hans,

Thanks for the pointer to the synthetic reproducer!

It gave accurate and consistent results across the kernel
versions reported to exhibit (or not exhibit) the issue.

The Azure test kernel with the 3 patches [1] that address
the issue shows the same (good) results as the Azure kernel
from before the regression was introduced.

P.S.: the issue isn't strictly caused by having that patch in,
since later kernel versions (e.g., 5.15) include it without
exhibiting this issue; it is caused by having that patch in
without also having these other patches (which, e.g., 5.15 has).

[1] https://lists.ubuntu.com/archives/kernel-team/2022-November/135069.html

...

Test Results from 4x VMs on Azure (2x 4vCPU/16G and 2x 8vCPU/32G)

Test Steps follow below; essentially, run the for-loop with curl
10 times, and count how many times it doesn't finish / is stuck
(i.e., epoll wait didn't return/finish).

1) original/"good" kernel: 0% error rate
-- 5.4.0-1094-azure #100-Ubuntu SMP Mon Oct 17 03:14:36 UTC 2022

VM1: 0/10
VM2: 0/10
VM3: 0/10
VM4: 0/10

2) regression/"bad" kernel: 60%-80% error rate
-- 5.4.0-1095-azure #101-Ubuntu SMP Thu Oct 20 15:50:47 UTC 2022

VM1: 8/10
VM2: 7/10
VM3: 7/10
VM4: 6/10

3) candidate/"test" kernel: 0% error rate
-- 5.4.0-1098-azure #104-Ubuntu SMP Wed Nov 23 21:19:57 UTC 2022

VM1: 0/10
VM2: 0/10
VM3: 0/10
VM4: 0/10
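
The per-kernel error rates quoted above are just the lowest and highest per-VM ratio. A minimal sketch of that arithmetic, in case someone wants to repeat it on their own counts; the helper name rate_range is hypothetical and not part of the original test steps:

```shell
# rate_range: takes stuck/total results (one argument per VM) and prints
# the lowest and highest per-VM error rate as whole percentages.
# The function body runs in a subshell, so its variables stay local.
rate_range() (
    min=100 max=0
    for r in "$@"; do
        # "8/10" -> 8 * 100 / 10 = 80
        pct=$(( ${r%/*} * 100 / ${r#*/} ))
        if [ "$pct" -lt "$min" ]; then min=$pct; fi
        if [ "$pct" -gt "$max" ]; then max=$pct; fi
    done
    if [ "$min" -eq "$max" ]; then
        echo "${min}%"
    else
        echo "${min}%-${max}%"
    fi
)
```

For example, the "bad" kernel counts would be passed as: rate_range 8/10 7/10 7/10 6/10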

...

Test Steps/Criteria on Focal:

Install go 1.19:

$ sudo snap install --channel=1.19/stable --classic go

Create test programs:

$ cat <<EOF >main.go
package main

import (
        "fmt"
        "github.com/prometheus/procfs/sysfs"
        "log"
        "net/http"
)

func main() {

        fs, err := sysfs.NewFS("/sys")
        if err != nil {
                panic(err)
        }
        netDevices, err := fs.NetClassDevices()
        if err != nil {
                panic(err)
        }

        http.HandleFunc("/", func(writer http.ResponseWriter, request *http.Request) {
                for _, device := range netDevices {
                        _, err := fs.NetClassByIface(device)
                        if err != nil {
                                panic(err)
                        }
                }
                fmt.Printf(". ")
                writer.WriteHeader(200)
                writer.Write([]byte("ok"))
        })
        log.Fatal(http.ListenAndServe(":9100", nil))
}
EOF

$ cat <<EOF >go.mod
module app

go 1.19

require (
        github.com/prometheus/procfs v0.8.0 // indirect
        golang.org/x/sync v0.0.0-20220601150217-0de741cfad7f // indirect
)
EOF

Fetch test deps:

$ go mod download github.com/prometheus/procfs
$ go get <email address hidden>

Start test program:

$ go run main.go &

Exercise it:

$ for i in {0..10000} ; do curl localhost:9100/metrics ; done

Test Criteria:

PASS = for-loop finishes.
FAIL = for-loop doesn't finish.

# reference: https://github.com/prometheus/node_exporter/issues/2500#issuecomment-1304847221
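
The PASS/FAIL criteria above rely on someone noticing that the for-loop hangs. A minimal sketch of automating that check, assuming GNU coreutils timeout(1) is installed; the helper name count_stuck and the time limit are illustrative choices, not part of the original steps:

```shell
# count_stuck TRIALS LIMIT CMD [ARGS...]
# Runs CMD TRIALS times, each under `timeout LIMIT`, and prints
# "stuck/trials", where "stuck" counts the runs that hit the time limit.
# The function body runs in a subshell, so its variables stay local.
count_stuck() (
    trials=$1
    limit=$2
    shift 2
    stuck=0
    i=0
    while [ "$i" -lt "$trials" ]; do
        rc=0
        timeout "$limit" "$@" >/dev/null 2>&1 || rc=$?
        # GNU timeout exits with status 124 when the limit is reached.
        if [ "$rc" -eq 124 ]; then
            stuck=$((stuck + 1))
        fi
        i=$((i + 1))
    done
    echo "$stuck/$trials"
)
```

It could be used along these lines (the 300-second limit is an assumption and must be longer than a healthy run takes):

$ count_stuck 10 300 bash -c 'for i in {0..10000} ; do curl -s localhost:9100/metrics ; done'

Note that this sketch counts any run exceeding the limit as stuck, and it does not restart the test program between trials.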

Stack traces on FAIL / for-loop not finished:

azureuser@ktest-3:~$ sudo grep -l epoll /proc/$(pidof go)/task/*/stack | xargs sudo grep -H ^
/proc/33267/task/33287/stack:[<0>] ep_poll+0x3bb/0x410
/proc/33267/task/33287/stack:[<0>] do_epoll_wait+0xb8/0xd0
/proc/33267/task/33287/stack:[<0>] __x64_sys_epoll_pwait+0x4c/0xa0
/proc/33267/task/33287/stack:[<0>] do_syscall_64+0x5e/0x200
/proc/33267/task/33287/stack:[<0>] entry_SYSCALL_64_after_hwframe+0x5c/0xc1

azureuser@ktest-3:~$ sudo grep -l epoll /proc/$(pidof go)/task/*/stack | xargs sudo grep -H ^
/proc/1193/task/1193/stack:[<0>] ep_poll+0x3bb/0x410
/proc/1193/task/1193/stack:[<0>] do_epoll_wait+0xb8/0xd0
/proc/1193/task/1193/stack:[<0>] __x64_sys_epoll_pwait+0x4c/0xa0
/proc/1193/task/1193/stack:[<0>] do_syscall_64+0x5e/0x200
/proc/1193/task/1193/stack:[<0>] entry_SYSCALL_64_after_hwframe+0x5c/0xc1

azureuser@ktest-3:~$ sudo grep -l epoll /proc/$(pidof go)/task/*/stack | xargs sudo grep -H ^
/proc/1173/task/1193/stack:[<0>] ep_poll+0x3bb/0x410
/proc/1173/task/1193/stack:[<0>] do_epoll_wait+0xb8/0xd0
/proc/1173/task/1193/stack:[<0>] __x64_sys_epoll_pwait+0x4c/0xa0
/proc/1173/task/1193/stack:[<0>] do_syscall_64+0x5e/0x200
/proc/1173/task/1193/stack:[<0>] entry_SYSCALL_64_after_hwframe+0x5c/0xc1
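
To condense output like the above into just the threads that have an ep_poll frame, a small filter can help. This is a sketch only; the helper name epoll_waiters is hypothetical, and it assumes the `path:frame` line format produced by `grep -H ^` as shown above:

```shell
# epoll_waiters: reads `grep -H ^` output for /proc/PID/task/TID/stack
# files on stdin and prints, once each, the stack file paths that
# contain an ep_poll frame.
epoll_waiters() {
    awk -F: 'index($2, " ep_poll+") > 0 { print $1 }' | sort -u
}
```

For example:

$ sudo grep -l epoll /proc/$(pidof go)/task/*/stack | xargs sudo grep -H ^ | epoll_waiters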