test_bpf.sh from ubuntu_kernel_selftests.net from linux ADT test failure with linux/4.15.0-149.153 i386 (Segmentation fault)

Bug #1934414 reported by Kelsey Skunberg
This bug affects 1 person
Affects               Status        Importance  Assigned to                    Milestone
ubuntu-kernel-tests   Fix Released  Undecided   Unassigned
linux (Ubuntu)        Invalid       Undecided   Unassigned
Bionic                Fix Released  Low         Thadeu Lima de Souza Cascardo

Bug Description

[Impact]
The net selftests fail on i386, which either costs engineers time to investigate or leads to new BPF failures being ignored.

[Fix]
The fix is only to tests.

[Test case]
Run test_bpf.sh from tools/testing/selftests/net/.

[Potential regression]
Tests could fail on other architectures.

====================================

This is a scripted bug report about ADT failures while running linux tests for linux/4.15.0-149.153 on bionic. Whether this is caused by the dep8 tests of the tested source or the kernel has yet to be determined.

Testing failed on:
    i386: https://autopkgtest.ubuntu.com/results/autopkgtest-bionic/bionic/i386/l/linux/20210702_010055_445cd@/log.gz

03:37:02 DEBUG| [stdout] selftests: test_bpf.sh
03:37:02 DEBUG| [stdout] ========================================
03:37:03 ERROR| [stderr] Segmentation fault
03:37:03 DEBUG| [stdout] test_bpf: [FAIL]
03:37:03 DEBUG| [stdout] not ok 1..8 selftests: test_bpf.sh [FAIL]
03:37:03 DEBUG| [stdout] selftests: netdevice.sh
03:37:03 DEBUG| [stdout] ========================================

tags: added: kernel-adt-failure
description: updated
summary: - linux ADT test failure with linux/4.15.0-149.153
+ test_bpf.sh from ubuntu_kernel_selftests.net from linux ADT test failure
+ with linux/4.15.0-149.153
description: updated
Kelsey Skunberg (kelsey-skunberg) wrote : Re: test_bpf.sh from ubuntu_kernel_selftests.net from linux ADT test failure with linux/4.15.0-149.153

Also seeing this failure through regression testing on 4.15.0-149.153 i386 lowlatency and generic

Krzysztof Kozlowski (krzk) wrote :

Dmesg error is:
[ 157.476410] test_bpf: #48 INT: ALU MIX jited:0 167 PASS
[ 157.482773] test_bpf: #49 INT: shifts by register jited:0 317 PASS
[ 157.488472] test_bpf: #50 INT: DIV + ABS jited:0 103 140 PASS
[ 157.495519] test_bpf: #51 INT: DIV by zero jited:0 74
[ 157.496424] divide error: 0000 [#1] SMP PTI

Thadeu Lima de Souza Cascardo (cascardo) wrote :

This comes from the BPF changes done on 4.15.0-147.

i386 uses the interpreter, while other arches have JIT. The JIT cannot be disabled on the arches that support it and have CONFIG_BPF_JIT_ALWAYS_ON, which is our default.

The interpreter change that causes this has been identified. The interpreter no longer checks whether the source register is zero before a division, relying instead on either eBPF fixup_bpf_calls or classic BPF bpf_convert_filter to test for it.

The test_bpf module, though, does not go through either of these conversions, which leads to the test failure. A partial revert of the interpreter changes would fix it.
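
To illustrate the mechanism described above, here is a minimal userspace sketch. It is not kernel code: the toy instruction set, `toy_run`, `toy_fixup` and `toy_div_by_zero` are all hypothetical names invented for this example, and the shape of the inserted guard (return 0 when the divisor is zero) is an assumption chosen to match the "returns 0" behavior discussed in this report. The interpreter trusts that a conversion pass has already guarded every division; a program that skips that pass, as test_bpf does, hits the trap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy instruction set; NOT the kernel's struct bpf_insn. */
enum toy_op { TOY_MOV_IMM, TOY_DIV_REG, TOY_JNEZ, TOY_EXIT };

struct toy_insn {
    enum toy_op op;
    int dst;       /* destination register index (0 or 1) */
    int src;       /* source register index (0 or 1) */
    uint32_t imm;  /* value for MOV_IMM, instructions to skip for JNEZ */
};

enum toy_status { TOY_OK, TOY_TRAP };

/*
 * Interpreter that relies on the loader having guarded divisions.
 * The explicit zero test only models the CPU's divide error, because
 * a real division by zero is undefined behavior in C.
 */
enum toy_status toy_run(const struct toy_insn *prog, size_t len, uint32_t *ret)
{
    uint32_t reg[2] = { 0, 0 };

    assert(prog != NULL && ret != NULL);
    for (size_t pc = 0; pc < len; pc++) {
        const struct toy_insn *i = &prog[pc];

        switch (i->op) {
        case TOY_MOV_IMM:
            reg[i->dst] = i->imm;
            break;
        case TOY_DIV_REG:
            if (reg[i->src] == 0)
                return TOY_TRAP;   /* stands in for "divide error" */
            reg[i->dst] /= reg[i->src];
            break;
        case TOY_JNEZ:
            if (reg[i->src] != 0)
                pc += i->imm;      /* skip the next imm instructions */
            break;
        case TOY_EXIT:
            *ret = reg[0];
            return TOY_OK;
        }
    }
    return TOY_TRAP;               /* ran off the end without EXIT */
}

/*
 * Conversion pass: before every DIV, insert "if src == 0, return 0".
 * Assumes the input is straight-line code (no jumps to re-target).
 * Returns the new length, or 0 if the output buffer is too small.
 */
size_t toy_fixup(const struct toy_insn *in, size_t len,
                 struct toy_insn *out, size_t cap)
{
    size_t n = 0;

    for (size_t k = 0; k < len; k++) {
        if (in[k].op == TOY_DIV_REG) {
            if (n + 3 > cap)
                return 0;
            out[n++] = (struct toy_insn){ TOY_JNEZ, 0, in[k].src, 2 };
            out[n++] = (struct toy_insn){ TOY_MOV_IMM, 0, 0, 0 };
            out[n++] = (struct toy_insn){ TOY_EXIT, 0, 0, 0 };
        }
        if (n + 1 > cap)
            return 0;
        out[n++] = in[k];
    }
    return n;
}

/* r0 = 10; r1 = 0; r0 /= r1; exit */
const struct toy_insn toy_div_by_zero[] = {
    { TOY_MOV_IMM, 0, 0, 10 },
    { TOY_MOV_IMM, 1, 0, 0 },
    { TOY_DIV_REG, 0, 1, 0 },
    { TOY_EXIT,    0, 0, 0 },
};
```

Running `toy_div_by_zero` directly through `toy_run` traps, which mirrors the test module feeding the interpreter an unconverted program. Passing it through `toy_fixup` first yields a program that exits cleanly with 0.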

We have yet to determine if any other legitimate path for either eBPF or cBPF will skip any of these conversions. If none do, we may leave this for a later cycle.

Cascardo.

Changed in linux (Ubuntu):
status: New → Invalid
Changed in linux (Ubuntu Bionic):
status: New → In Progress
assignee: nobody → Thadeu Lima de Souza Cascardo (cascardo)
Thadeu Lima de Souza Cascardo (cascardo) wrote :

So, test_bpf is not doing either conversion, and any other use in the kernel should be going through something like sk_attach*, bpf_prog_create*, or an fd that was created from BPF_PROG_LOAD. Any user that accepts BPF programs without those checks is liable to cause issues worse than a simple div-by-zero.

So far, I could not find any such user by code inspection.

I also tested that a SO_ATTACH_FILTER (classic BPF) that does a div-by-zero behaves the same way on 4.15.0-145, 4.15.0-147 and 4.15.0-149, that is, the program will have an "exception", which means it returns 0. With sockets, this means the packet is dropped. That is expected behavior.

Marking this bug priority as low as a result.
Cascardo.

Changed in linux (Ubuntu Bionic):
importance: Undecided → Low
Po-Hsu Lin (cypressyew)
tags: added: 4.15 bionic i386 sru-20210621 ubuntu-kernel-selftests
Po-Hsu Lin (cypressyew)
summary: test_bpf.sh from ubuntu_kernel_selftests.net from linux ADT test failure
- with linux/4.15.0-149.153
+ with linux/4.15.0-149.153 i386 (Segmentation fault)
Stefan Bader (smb)
tags: added: sru-20210719
Kleber Sacilotto de Souza (kleber-souza) wrote :

As Thadeu mentioned, this happens because the test_bpf module doesn't go through the verifier, which causes the failure when the code is not jitted. Upstream fixed the testcase with 21ccaf21497b (bpf: add further test cases around div/mod and others) by replacing "INT: DIV by zero" with "INT: DIV by -1" in test_bpf and moving/adding the div by zero tests to tools/testing/selftests/bpf/test_verifier.c.

Changing the div test from zero to -1 in bionic is trivial; however, this would leave the zero case uncovered. Making the proper fix would require backporting 21ccaf21497b, which in turn would need at least 87c1793b1b7f (bpf: add couple of test cases for div/mod by zero) and 111e6b45315c (selftests/bpf: make test_verifier run most programs), but possibly many others.

Thadeu Lima de Souza Cascardo (cascardo) wrote :

Backporting everything necessary would risk many potential regressions. That would include support for BPF calls and the change to make unknown opcode handling more robust.

The former would be commit cc8b0b92a1699bc32f7fec71daa2bfc90de43a4d "bpf: introduce function calls (function boundaries)", which would open a big can of worms that we don't want to open.

The latter is commit 5e581dad4fec0e6d062740dc35b8dc248b39d224 "bpf: make unknown opcode handling more robust". Though it might add more robustness, it changes the table in the interpreter, so I decided not to backport it now and to fix up the error messages in test_verifier instead.

By the way, commit 21ccaf21497b72f42133182716a42dbf573d314b "bpf: add further test cases around div/mod and others" should have been three, or at least two, different commits. As it only changes tests, I am fine with simply doing a partial backport of it.

I have it ready and tested, will send a patchset to the list.

Cascardo.

description: updated
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux/4.15.0-163.171 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Kleber Sacilotto de Souza (kleber-souza) wrote :

Confirmed that test_bpf.sh now passes on i386 with bionic/linux 4.15.0-163.171:

https://autopkgtest.ubuntu.com/results/autopkgtest-bionic/bionic/i386/l/linux/20211117_200907_95566@/log.gz

18:27:04 INFO | START ubuntu_kernel_selftests.net:test_bpf.sh ubuntu_kernel_selftests.net:test_bpf.sh timestamp=1637173624 timeout=1800 localtime=Nov 17 18:27:04
18:27:04 DEBUG| Persistent state client._record_indent now set to 2
18:27:04 DEBUG| Persistent state client.unexpected_reboot now set to ('ubuntu_kernel_selftests.net:test_bpf.sh', 'ubuntu_kernel_selftests.net:test_bpf.sh')
18:27:04 DEBUG| Waiting for pid 10439 for 1800 seconds
18:27:04 DEBUG| Running 'make run_tests -C net TEST_PROGS=test_bpf.sh TEST_GEN_PROGS='' TEST_CUSTOM_PROGS='''
18:27:04 DEBUG| [stdout] make: Entering directory '/tmp/autopkgtest.U5vLyA/build.Xnj/src/autotest/client/tmp/ubuntu_kernel_selftests/src/linux/tools/testing/selftests/net'
18:27:04 DEBUG| [stdout] make --no-builtin-rules ARCH=x86 -C ../../../.. headers_install
18:27:04 DEBUG| [stdout] make[1]: Entering directory '/tmp/autopkgtest.U5vLyA/build.Xnj/src/autotest/client/tmp/ubuntu_kernel_selftests/src/linux'
18:27:05 DEBUG| [stdout] CHK include/generated/uapi/linux/version.h
18:27:05 DEBUG| [stdout] make[1]: Leaving directory '/tmp/autopkgtest.U5vLyA/build.Xnj/src/autotest/client/tmp/ubuntu_kernel_selftests/src/linux'
18:27:05 DEBUG| [stdout] TAP version 13
18:27:05 DEBUG| [stdout] selftests: test_bpf.sh
18:27:05 DEBUG| [stdout] ========================================
18:27:28 DEBUG| [stdout] test_bpf: ok
18:27:28 DEBUG| [stdout] ok 1..1 selftests: test_bpf.sh [PASS]
18:27:28 DEBUG| [stdout] make: Leaving directory '/tmp/autopkgtest.U5vLyA/build.Xnj/src/autotest/client/tmp/ubuntu_kernel_selftests/src/linux/tools/testing/selftests/net'
18:27:28 INFO | GOOD ubuntu_kernel_selftests.net:test_bpf.sh ubuntu_kernel_selftests.net:test_bpf.sh timestamp=1637173648 localtime=Nov 17 18:27:28 completed successfully
18:27:28 INFO | END GOOD ubuntu_kernel_selftests.net:test_bpf.sh ubuntu_kernel_selftests.net:test_bpf.sh timestamp=1637173648 localtime=Nov 17 18:27:28

tags: added: verification-done-bionic
removed: verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-163.171

---------------
linux (4.15.0-163.171) bionic; urgency=medium

  * bionic/linux: 4.15.0-163.171 -proposed tracker (LP: #1949874)

  * Packaging resync (LP: #1786013)
    - [Packaging] update Ubuntu.md
    - debian/dkms-versions -- update from kernel-versions (main/2021.11.08)

  * Unable to build net/reuseport_bpf and other tests in ubuntu_kernel_selftests
    on Bionic with make command (LP: #1949889)
    - selftests: Fix loss of test output in run_kselftests.sh
    - selftests: Makefile set KSFT_TAP_LEVEL to prevent nested TAP headers
    - selftests: fix headers_install circular dependency
    - selftests: fix bpf build/test workflow regression when KBUILD_OUTPUT is set
    - selftests: vm: Fix test build failure when built by itself

  * KVM emulation failure when booting into VM crash kernel with multiple CPUs
    (LP: #1948862)
    - KVM: x86: Properly reset MMU context at vCPU RESET/INIT

  * aufs: kernel bug with apparmor and fuseblk (LP: #1948470)
    - SAUCE: aufs: bugfix, stop omitting path->mnt

  * ebpf: bpf_redirect fails with ip6 gre interfaces (LP: #1947164)
    - net: handle ARPHRD_IP6GRE in dev_is_mac_header_xmit()

  * require CAP_NET_ADMIN to attach N_HCI ldisc (LP: #1949516)
    - Bluetooth: hci_ldisc: require CAP_NET_ADMIN to attach N_HCI ldisc

  * ACL updates on OCFS2 are not revalidated (LP: #1947161)
    - ocfs2: fix remounting needed after setfacl command

  * ppc64 BPF JIT mod by 1 will not return 0 (LP: #1948351)
    - powerpc/bpf: Fix BPF_MOD when imm == 1

  * Drop "UBUNTU: SAUCE: cachefiles: Page leaking in
    cachefiles_read_backing_file while vmscan is active" (LP: #1947709)
    - Revert "UBUNTU: SAUCE: cachefiles: Page leaking in
      cachefiles_read_backing_file while vmscan is active"
    - cachefiles: Fix page leak in cachefiles_read_backing_file while vmscan is
      active

  * Some test in ubuntu_bpf test_verifier failed on i386 Bionic kernel
    (LP: #1788578)
    - bpf: fix context access in tracing progs on 32 bit archs

  * test_bpf.sh from ubuntu_kernel_selftests.net from linux ADT test failure
    with linux/4.15.0-149.153 i386 (Segmentation fault) (LP: #1934414)
    - selftests/bpf: make test_verifier run most programs
    - bpf: add couple of test cases for div/mod by zero
    - bpf: add further test cases around div/mod and others

  * Bionic update: upstream stable patchset 2021-11-02 (LP: #1949512)
    - usb: gadget: r8a66597: fix a loop in set_feature()
    - usb: musb: tusb6010: uninitialized data in tusb_fifo_write_unaligned()
    - cifs: fix incorrect check for null pointer in header_assemble
    - xen/x86: fix PV trap handling on secondary processors
    - usb-storage: Add quirk for ScanLogic SL11R-IDE older than 2.6c
    - USB: serial: cp210x: add ID for GW Instek GDM-834x Digital Multimeter
    - staging: greybus: uart: fix tty use after free
    - Re-enable UAS for LaCie Rugged USB3-FW with fk quirk
    - USB: serial: mos7840: remove duplicated 0xac24 device ID
    - USB: serial: option: add Telit LN920 compositions
    - USB: serial: option: remove duplicate USB device ID
    - USB: serial: option: add device id for Foxco...


Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Po-Hsu Lin (cypressyew)
Changed in ubuntu-kernel-tests:
status: New → Fix Released