[Regression] ubuntu_bpf failed to build on Groovy

Bug #1917609 reported by Po-Hsu Lin
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
ubuntu-kernel-tests
Fix Released
Undecided
Ian May
linux (Ubuntu)
Invalid
Undecided
Unassigned
Groovy
Fix Released
Medium
Unassigned

Bug Description

[Impact]
Upstream commit d3bec0138bfbe58606fc1d6f57a4cdc1a20218db (bpf: Zero-fill re-used per-cpu map element) was applied to Groovy as part of an upstream stable update. This patch fixes a bpf issue and at the same time adds new selftests to verify these changes. However, the selftests can't be compiled on 5.8 due to missing helper functions that were added only later. The bpf selftest build fails with errors such as:

/usr/bin/ld: /tmp/autopkgtest.IzBxE1/build.8NX/src/autotest/client/tmp/ubuntu_kernel_selftests/src/linux/tools/testing/selftests/bpf/map_init.test.o: in function `prog_run_insert_elem':
  /tmp/autopkgtest.IzBxE1/build.8NX/src/autotest/client/tmp/ubuntu_kernel_selftests/src/linux/tools/testing/selftests/bpf/prog_tests/map_init.c:89: undefined reference to `ASSERT_OK'

[Fix]
The proposed fix it to partially revert this commit by removing the selftests parts.

[Testcase]
On a groovy/linux repo:

$ make -C tools/testing/selftests TARGETS=bpf SKIP_TARGETS=

[Where problems could occur]
By removing the selftests we could be introducing a regression with the bpf code change which would be likely unoticed during the tests.

==== Original Description ====

Issue found on 5.8.0-1024.26-azure / 5.8.0-1022.23-oracle

Test failed to build, I didn't see any meaningful error message, this might need to be double checked.

    HDRINST usr/include/asm/fcntl.h
    HDRINST usr/include/asm/termbits.h
    HDRINST usr/include/asm/errno.h
    INSTALL ./usr/include
  make[1]: Leaving directory '/home/ubuntu/autotest/client/tmp/ubuntu_bpf/src/linux'
  make: Leaving directory '/home/ubuntu/autotest/client/tmp/ubuntu_bpf/src/linux/tools/testing/selftests'
  stderr:
  make: *** [Makefile:159: all] Error 1

Please find attachment for the complete test log.

== Process to reproduce this ==
1. git clone git://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/groovy
2. cd groovy/tools/testing/selftests
3. make TARGETS=bpf
You will see the build failed if it's affected by this.

I think this is affecting the ubuntu_kernel_selftests as well, as the net test would require the bpf test to be built first.

CVE References

Revision history for this message
Po-Hsu Lin (cypressyew) wrote :
tags: added: 5.8 groovy kqa-blocker sru-20210222 ubuntu-bpf
Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

This issue can be found in oracle 5.8.0-1021.22, but not oracle 5.8.0-1020.21

Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

Tested on Groovy master-next branch, make TARGETS=bpf does not work.
But it works with master branch.

This is a regression.

summary: - ubuntu_bpf failed to build on Groovy
+ [Regression] ubuntu_bpf failed to build on Groovy
Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

A quick bisect against groovy shows:
dd773889e74605c1a0a97d701f589733800c9dc8 is the first bad commit
commit dd773889e74605c1a0a97d701f589733800c9dc8
Author: David Verbeiren <email address hidden>
Date: Wed Nov 4 12:23:32 2020 +0100

    bpf: Zero-fill re-used per-cpu map element

    BugLink: https://bugs.launchpad.net/bugs/1908555

    [ Upstream commit d3bec0138bfbe58606fc1d6f57a4cdc1a20218db ]

    Zero-fill element values for all other cpus than current, just as
    when not using prealloc. This is the only way the bpf program can
    ensure known initial values for all cpus ('onallcpus' cannot be
    set when coming from the bpf program).

    The scenario is: bpf program inserts some elements in a per-cpu
    map, then deletes some (or userspace does). When later adding
    new elements using bpf_map_update_elem(), the bpf program can
    only set the value of the new elements for the current cpu.
    When prealloc is enabled, previously deleted elements are re-used.
    Without the fix, values for other cpus remain whatever they were
    when the re-used entry was previously freed.

    A selftest is added to validate correct operation in above
    scenario as well as in case of LRU per-cpu map element re-use.

    Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements")
    Signed-off-by: David Verbeiren <email address hidden>
    Signed-off-by: Alexei Starovoitov <email address hidden>
    Acked-by: Matthieu Baerts <email address hidden>
    Acked-by: Andrii Nakryiko <email address hidden>
    Link: https://<email address hidden>
    Signed-off-by: Sasha Levin <email address hidden>
    Signed-off-by: Kamal Mostafa <email address hidden>
    Signed-off-by: Ian May <email address hidden>

 kernel/bpf/hashtab.c | 30 ++-
 tools/testing/selftests/bpf/prog_tests/map_init.c | 214 ++++++++++++++++++++++
 tools/testing/selftests/bpf/progs/test_map_init.c | 33 ++++
 3 files changed, 275 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/map_init.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_map_init.c

$ git tag --contains dd773889e74605c1a0a97d701f589733800c9dc8
Ubuntu-5.8.0-44.50
Ubuntu-5.8.0-45.51

Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

$ git bisect log
git bisect start
# good: [5639486106644d7703887ca16739a831eec67fa1] UBUNTU: Ubuntu-5.8.0-41.46
git bisect good 5639486106644d7703887ca16739a831eec67fa1
# bad: [2ec0e0f3a56409f7285bd2a657da49d985ed2c46] UBUNTU: [Config] Enforce CONFIG_DRM_BOCHS=m
git bisect bad 2ec0e0f3a56409f7285bd2a657da49d985ed2c46
# bad: [110d7233381da63e7e21910eb61a3aaf5e024ea4] samples/ftrace: Mark my_tramp[12]? global
git bisect bad 110d7233381da63e7e21910eb61a3aaf5e024ea4
# good: [f5ed4ba307483eb3188efa37a04825c31afdf70e] NFSD: Fix use-after-free warning when doing inter-server copy
git bisect good f5ed4ba307483eb3188efa37a04825c31afdf70e
# bad: [08e848150a7eecf983736e761be07844773f1246] iio: accel: kxcjk1013: Replace is_smo8500_device with an acpi_type enum
git bisect bad 08e848150a7eecf983736e761be07844773f1246
# bad: [19f81b4b50a1337e4d554f732c5ab2ee0ba6b7bc] qed: fix ILT configuration of SRC block
git bisect bad 19f81b4b50a1337e4d554f732c5ab2ee0ba6b7bc
# bad: [e981f03cb066472ea2d889cdfefe8672629a73cc] mmc: renesas_sdhi_core: Add missing tmio_mmc_host_free() at remove
git bisect bad e981f03cb066472ea2d889cdfefe8672629a73cc
# bad: [1da3544787a1ad2ff438c2fce0562100e6707121] erofs: derive atime instead of leaving it empty
git bisect bad 1da3544787a1ad2ff438c2fce0562100e6707121
# bad: [b2c873e5bac3b81ea97bd638995f7c6bb47dc96a] xfs: fix a missing unlock on error in xfs_fs_map_blocks
git bisect bad b2c873e5bac3b81ea97bd638995f7c6bb47dc96a
# bad: [550ce4814c2aa6d125c7e7e8d791ecef5ef10e5a] selftest: fix flower terse dump tests
git bisect bad 550ce4814c2aa6d125c7e7e8d791ecef5ef10e5a
# bad: [e282b884d7e0f26ebe7e60a72df5cf5b23b2df10] r8169: fix potential skb double free in an error path
git bisect bad e282b884d7e0f26ebe7e60a72df5cf5b23b2df10
# good: [2e5ecf0671032f3c046e0d4694c2886b25bf127f] tools/bpftool: Fix attaching flow dissector
git bisect good 2e5ecf0671032f3c046e0d4694c2886b25bf127f
# bad: [dd773889e74605c1a0a97d701f589733800c9dc8] bpf: Zero-fill re-used per-cpu map element
git bisect bad dd773889e74605c1a0a97d701f589733800c9dc8
# first bad commit: [dd773889e74605c1a0a97d701f589733800c9dc8] bpf: Zero-fill re-used per-cpu map element

Po-Hsu Lin (cypressyew)
description: updated
Revision history for this message
Kleber Sacilotto de Souza (kleber-souza) wrote :

The same commit also broke focal a while ago, though at first sight it seems in a different way: bug 1906866.

The solution at the time was to partially revert this commit to remove the tools/testing/selftests part.

Changed in linux (Ubuntu Groovy):
status: New → Confirmed
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1917609

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
status: Confirmed → Invalid
description: updated
Revision history for this message
Kleber Sacilotto de Souza (kleber-souza) wrote :
Changed in linux (Ubuntu Groovy):
status: Confirmed → In Progress
Stefan Bader (smb)
Changed in linux (Ubuntu Groovy):
importance: Undecided → Medium
Changed in linux (Ubuntu Groovy):
status: In Progress → Fix Committed
Stefan Bader (smb)
tags: removed: kqa-blocker
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-groovy' to 'verification-done-groovy'. If the problem still exists, change the tag 'verification-needed-groovy' to 'verification-failed-groovy'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-groovy
Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

Hello,
I can still see this failure on Groovy Azure kernel (5.8.0-1027.29-azure), do we have this patch applied to groovy variants?

Revision history for this message
Kelsey Steele (kelsey-steele) wrote :

Verified the patch is in G/azure (5.8.0-1027.29-azure)

Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

Hello Kelsey,
I tried to run the ubuntu_bpf test on a Groovy VM as well, but it's still not working with the following error message:

  HDRINST usr/include/asm/ioctls.h
  HDRINST usr/include/asm/sockios.h
  HDRINST usr/include/asm/termios.h
  HDRINST usr/include/asm/poll.h
  INSTALL ./usr/include
make[1]: Leaving directory '/home/ubuntu/autotest/client/tmp/ubuntu_bpf/src/linux'
make: *** [Makefile:159: all] Error 1

Please find attachment for the complete build log.

I think it's worthy to discuss about this Klebers.

Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

Oh! interestingly with SKIP_TARGETS= added to the make command it looks like this is working now, we might need to adjust our test case.

Revision history for this message
Marcelo Cerri (mhcerri) wrote :

Still happening with linux-azure 5.8.0-1027.29.

tags: added: azure sru-20210315
Revision history for this message
Ian May (ian-may) wrote :

Looks like the following commit impacted bpf on groovy

670b0872ad88481a5c2bdaba7bd29defc755ce64 UBUNTU: SAUCE: selftests: Skip BPF seftests by default

this patch causes the following to no longer work
sudo make -C linux/tools/testing/selftests TARGETS=bpf clean all

Confirmed the following does still work
sudo make -C linux/tools/testing/selftests/bpf clean all

preparing patch for autotest-client-tests/ubuntu_bpf

Po-Hsu Lin (cypressyew)
Changed in ubuntu-kernel-tests:
assignee: nobody → Po-Hsu Lin (cypressyew)
status: New → In Progress
Po-Hsu Lin (cypressyew)
Changed in ubuntu-kernel-tests:
assignee: Po-Hsu Lin (cypressyew) → nobody
Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

Nice catch on that cause 670b0872ad88 ("UBUNTU: SAUCE: selftests:
Skip BPF seftests by default"), in which we make the bpf subset as
a default skip target by default. (This explained why this works in comment #13)

I prefer to keep TARGETS=bpf and add an empty SKIP_TARGETS= to the
make command to override the default value, so that we can still
align with how we start a kernel-selftests subset (like in the
ubuntu_kernel_selftests).

e.g.
utils.make('-C linux/tools/testing/selftests TARGETS=bpf SKIP_TARGETS= clean all')

A SKIP_TARGETS= might be helpful for pepole to know this is for
overriding the default value as well.

Thanks
Sam

Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

Since it's just a tool issue for the moment, and I can run the bpf test manually with the corrected make command. I will flip the status to verified.
Thanks

tags: added: verification-done-groovy
removed: verification-needed-groovy
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (28.3 KiB)

This bug was fixed in the package linux - 5.8.0-49.55

---------------
linux (5.8.0-49.55) groovy; urgency=medium

  * groovy/linux: 5.8.0-49.55 -proposed tracker (LP: #1921053)

  * selftests: bpf verifier fails after sanitize_ptr_alu fixes (LP: #1920995)
    - bpf: Simplify alu_limit masking for pointer arithmetic
    - bpf: Add sanity check for upper ptr_limit
    - bpf, selftests: Fix up some test_verifier cases for unprivileged

  * Packaging resync (LP: #1786013)
    - update dkms package versions

  * improper memcg accounting causes NULL pointer derefs (LP: #1918668)
    - SAUCE: Revert "mm: memcg/slab: optimize objcg stock draining"

  * kernel: Enable CONFIG_BPF_LSM on Ubuntu (LP: #1905975)
    - [Config] Enable CONFIG_BPF_LSM

  * Groovy update: upstream stable patchset 2021-03-10 (LP: #1918516)
    - gpio: mvebu: fix pwm .get_state period calculation
    - HID: wacom: Correct NULL dereference on AES pen proximity
    - media: v4l2-subdev.h: BIT() is not available in userspace
    - RDMA/vmw_pvrdma: Fix network_hdr_type reported in WC
    - kernel/io_uring: cancel io_uring before task works
    - io_uring: dont kill fasync under completion_lock
    - objtool: Don't fail on missing symbol table
    - mm/page_alloc: add a missing mm_page_alloc_zone_locked() tracepoint
    - mm: fix a race on nr_swap_pages
    - tools: Factor HOSTCC, HOSTLD, HOSTAR definitions
    - iwlwifi: provide gso_type to GSO packets
    - tty: avoid using vfs_iocb_iter_write() for redirected console writes
    - ACPI: sysfs: Prefer "compatible" modalias
    - kernel: kexec: remove the lock operation of system_transition_mutex
    - ALSA: hda/realtek: Enable headset of ASUS B1400CEPE with ALC256
    - ALSA: hda/via: Apply the workaround generically for Clevo machines
    - parisc: Enable -mlong-calls gcc option by default when !CONFIG_MODULES
    - media: cec: add stm32 driver
    - media: hantro: Fix reset_raw_fmt initialization
    - media: rc: fix timeout handling after switch to microsecond durations
    - media: rc: ite-cir: fix min_timeout calculation
    - media: rc: ensure that uevent can be read directly after rc device register
    - ARM: dts: tbs2910: rename MMC node aliases
    - ARM: dts: ux500: Reserve memory carveouts
    - ARM: dts: imx6qdl-gw52xx: fix duplicate regulator naming
    - wext: fix NULL-ptr-dereference with cfg80211's lack of commit()
    - ASoC: AMD Renoir - refine DMI entries for some Lenovo products
    - drm/i915: Always flush the active worker before returning from the wait
    - drm/i915/gt: Always try to reserve GGTT address 0x0
    - drivers/nouveau/kms/nv50-: Reject format modifiers for cursor planes
    - net: usb: qmi_wwan: added support for Thales Cinterion PLSx3 modem family
    - s390: uv: Fix sysfs max number of VCPUs reporting
    - s390/vfio-ap: No need to disable IRQ after queue reset
    - PM: hibernate: flush swap writer after marking
    - x86/entry: Emit a symbol for register restoring thunk
    - efi/apple-properties: Reinstate support for boolean properties
    - drivers: soc: atmel: Avoid calling at91_soc_init on non AT91 SoCs
    - drivers: soc: atmel: add null entry at the end of at91_soc_allowed_list[]
   ...

Changed in linux (Ubuntu Groovy):
status: Fix Committed → Fix Released
Revision history for this message
Po-Hsu Lin (cypressyew) wrote :
Changed in ubuntu-kernel-tests:
assignee: nobody → Ian (ian-may)
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.