[wily][regression] systemtap script compilation broken by new kernels

Bug #1545330 reported by apport hater on 2016-02-13
This bug affects 5 people
Affects               Status         Importance   Assigned to   Milestone
linux (Ubuntu)                       Undecided    Tim Gardner
  Wily                               Undecided    Tim Gardner
  Xenial                             Undecided    Tim Gardner
systemtap (Fedora)    Fix Released   Undecided
systemtap (Ubuntu)                   High         Unassigned
  Wily                               Undecided    Unassigned
  Xenial                             High         Unassigned

Bug Description

The following errors appear when compiling any systemtap script:

In file included from include/linux/mutex.h:15:0,
                 from /tmp/stapbdpxn3/stap_a0ec17f995e8f89d672d8c2eb7fe7c24_1693_src.c:25:
include/linux/spinlock_types.h:55:14: error: ‘__ARCH_SPIN_LOCK_UNLOCKED’ undeclared here (not in a function)
  .raw_lock = __ARCH_SPIN_LOCK_UNLOCKED, \
              ^
include/linux/spinlock_types.h:79:15: note: in expansion of macro ‘__RAW_SPIN_LOCK_INITIALIZER’
  { { .rlock = __RAW_SPIN_LOCK_INITIALIZER(lockname) } }
               ^
include/linux/spinlock_types.h:82:16: note: in expansion of macro ‘__SPIN_LOCK_INITIALIZER’
  (spinlock_t ) __SPIN_LOCK_INITIALIZER(lockname)
                ^
include/linux/mutex.h:111:18: note: in expansion of macro ‘__SPIN_LOCK_UNLOCKED’
   , .wait_lock = __SPIN_LOCK_UNLOCKED(lockname.wait_lock) \
                  ^
include/linux/mutex.h:117:27: note: in expansion of macro ‘__MUTEX_INITIALIZER’
  struct mutex mutexname = __MUTEX_INITIALIZER(mutexname)

Upstream fix:
https://www.sourceware.org/git/gitweb.cgi?p=systemtap.git;a=commitdiff;h=320e1ecb16427b5769f0f5a097d80823ee1fb765

Description of problem:

I was trying to run the following simple probe:

ruben@wodan: ~$ sudo stap -v -e 'probe vfs.read {printf("read performed\n"); exit()}'
Pass 1: parsed user script and 122 library script(s) using 237480virt/49484res/7192shr/42152data kb, in 190usr/70sys/250real ms.
Pass 2: analyzed script: 1 probe(s), 1 function(s), 4 embed(s), 0 global(s) using 376932virt/183612res/8544shr/181604data kb, in 2170usr/1540sys/4358real ms.
Pass 3: translated to C into "/tmp/stapbdpxn3/stap_a0ec17f995e8f89d672d8c2eb7fe7c24_1693_src.c" using 376932virt/183860res/8792shr/181604data kb, in 10usr/390sys/417real ms.
In file included from include/linux/mutex.h:15:0,
                 from /tmp/stapbdpxn3/stap_a0ec17f995e8f89d672d8c2eb7fe7c24_1693_src.c:25:
include/linux/spinlock_types.h:55:14: error: ‘__ARCH_SPIN_LOCK_UNLOCKED’ undeclared here (not in a function)
  .raw_lock = __ARCH_SPIN_LOCK_UNLOCKED, \
              ^
include/linux/spinlock_types.h:79:15: note: in expansion of macro ‘__RAW_SPIN_LOCK_INITIALIZER’
  { { .rlock = __RAW_SPIN_LOCK_INITIALIZER(lockname) } }
               ^
include/linux/spinlock_types.h:82:16: note: in expansion of macro ‘__SPIN_LOCK_INITIALIZER’
  (spinlock_t ) __SPIN_LOCK_INITIALIZER(lockname)
                ^
include/linux/mutex.h:111:18: note: in expansion of macro ‘__SPIN_LOCK_UNLOCKED’
   , .wait_lock = __SPIN_LOCK_UNLOCKED(lockname.wait_lock) \
                  ^
include/linux/mutex.h:117:27: note: in expansion of macro ‘__MUTEX_INITIALIZER’
  struct mutex mutexname = __MUTEX_INITIALIZER(mutexname)
                           ^
/tmp/stapbdpxn3/stap_a0ec17f995e8f89d672d8c2eb7fe7c24_1693_src.c:26:8: note: in expansion of macro ‘DEFINE_MUTEX’
 static DEFINE_MUTEX(module_refresh_mutex);
        ^
scripts/Makefile.build:258: recipe for target '/tmp/stapbdpxn3/stap_a0ec17f995e8f89d672d8c2eb7fe7c24_1693_src.o' failed
make[1]: *** [/tmp/stapbdpxn3/stap_a0ec17f995e8f89d672d8c2eb7fe7c24_1693_src.o] Error 1
Makefile:1386: recipe for target '_module_/tmp/stapbdpxn3' failed
make: *** [_module_/tmp/stapbdpxn3] Error 2
WARNING: kbuild exited with status: 2
Pass 4: compiled C into "stap_a0ec17f995e8f89d672d8c2eb7fe7c24_1693.ko" in 8650usr/5580sys/15546real ms.
Pass 4: compilation failed. [man error::pass4]

Version-Release number of selected component (if applicable):
systemtap-2.8-1.fc23.x86_64
ruben@wodan: ~$ uname -r
4.2.0-0.rc0.git4.1.fc23.x86_64

I'm seeing the same thing, looking into it now.

It looks like this was caused by the addition of qspinlocks in the following kernel commit:

====
commit a33fda35e3a7655fb7df756ed67822afb5ed5e8d
Author: Waiman Long <email address hidden>
Date: Fri Apr 24 14:56:30 2015 -0400

    locking/qspinlock: Introduce a simple generic 4-byte queued spinlock
====

With this change we need to include linux/module.h before including linux/mutex.h.

To fix your immediate problem, add the following text to /usr/share/systemtap/runtime/linux/runtime_defines.h:

====
#ifndef _LINUX_RUNTIME_DEFINES_H_
#define _LINUX_RUNTIME_DEFINES_H_

#include <linux/module.h>
#include <linux/mutex.h>

#endif /* _LINUX_RUNTIME_DEFINES_H_ */
====

I'm not 100% sure if that's the final fix, but it will get you going again.
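
The ordering requirement described above (linux/module.h must come before linux/mutex.h) can be checked mechanically on a C file, for example on generated source kept around with stap -k. The following is only a minimal sketch: the function name include_order_ok is hypothetical, and it looks only at direct, top-level #include lines in the one file it is given, not at headers pulled in transitively.

```shell
#!/bin/sh
# Hypothetical helper (not part of systemtap): succeed when <linux/module.h>
# is included before <linux/mutex.h> in the given C file, or when mutex.h is
# not included directly at all. Fail when mutex.h comes first or module.h is
# missing.
include_order_ok() {
    awk '
        # Remember the line number of the first occurrence of each include.
        /^[ \t]*#[ \t]*include[ \t]*<linux\/module\.h>/ { if (!module) module = NR }
        /^[ \t]*#[ \t]*include[ \t]*<linux\/mutex\.h>/  { if (!mutex)  mutex  = NR }
        END {
            if (!mutex) exit 0
            exit ((module && module < mutex) ? 0 : 1)
        }
    ' "$1"
}

# Example use (the path is a placeholder, not a real file):
#   include_order_ok /path/to/generated_src.c || echo "mutex.h is included before module.h"
```

A failing check only says that this one file has the includes in the problematic order; whether that actually breaks the build still depends on the kernel headers in use.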

The final fix ended up being a small tweak to our code generation - commit 320e1ec.

<https://www.sourceware.org/git/gitweb.cgi?p=systemtap.git;a=commitdiff;h=320e1ecb16427b5769f0f5a097d80823ee1fb765>

Thanks for the quick fix David!

Hi David, would it be possible to do a F21 update too? It's also broken by this bug. Thanks.

(In reply to Marcelo Ricardo Leitner from comment #5)
> Hi David, would it be possible to do a F21 update too? It's also broken by
> this bug. Thanks.

Thanks for letting us know. I'll see if we can't get an update out soon for f21 (and f22).

(In reply to Marcelo Ricardo Leitner from comment #5)
> Hi David, would it be possible to do a F21 update too? It's also broken by
> this bug. Thanks.

Marcelo,

Have you got a reproducer for this bug on f21? The one in comment #1 seems to work fine there.

Oops. Then it depends on the kernel version. I'm running a custom one, 4.2.0-rc8+, and the reproducer in comment #0 triggers it here (grepped for brevity):

[root@localhost ~]# stap -v -e 'probe vfs.read {printf("read performed\n"); exit()}' 2>&1 | grep error
include/linux/spinlock_types.h:55:14: error: ‘__ARCH_SPIN_LOCK_UNLOCKED’ undeclared here (not in a function)
Pass 4: compilation failed. [man error::pass4]
[root@localhost ~]# uname -r
4.2.0-rc8+
[root@localhost ~]# cat /etc/fedora-release
Fedora release 21 (Twenty One)
[root@localhost ~]# rpm -q systemtap
systemtap-2.8-1.fc21.x86_64

I also cannot reproduce it with 4.1.6-100.fc21.x86_64.
Having checked now, the fix seems to be compatible with both kernels; it is just not strictly needed for f21. I hope it's okay to include it anyway.

systemtap-2.8-2.fc21 has been submitted as an update to Fedora 21. https://bodhi.fedoraproject.org/updates/FEDORA-2015-16095

systemtap-2.8-2.fc21 has been pushed to the Fedora 21 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
$ su -c 'yum --enablerepo=updates-testing update systemtap'
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2015-16095

systemtap-2.9-1.fc21 has been submitted as an update to Fedora 21. https://bodhi.fedoraproject.org/updates/FEDORA-2015-9ef098b6d4

systemtap-2.9-1.fc21 has been pushed to the Fedora 21 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
$ su -c 'dnf --enablerepo=updates-testing update systemtap'
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2015-9ef098b6d4

systemtap-2.9-1.fc21 has been pushed to the Fedora 21 stable repository. If problems still persist, please make note of it in this bug report.

apport hater (g112) on 2016-02-13
description: updated
apport hater (g112) wrote :
Changed in systemtap (Ubuntu):
importance: Undecided → High
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in systemtap (Ubuntu):
status: New → Confirmed
Dan Streetman (ddstreet) wrote :

I've sent a kernel patch to correct this also, which IMHO is the correct fix.

https://lkml.org/lkml/2016/2/19/533

summary: - [wily][regression] completely broken by new kernels
+ [wily][regression] systemtap script compilation broken by new kernels
Tim Gardner (timg-tpi) on 2016-02-29
Changed in linux (Ubuntu Wily):
assignee: nobody → Tim Gardner (timg-tpi)
status: New → In Progress
Changed in linux (Ubuntu Xenial):
assignee: nobody → Tim Gardner (timg-tpi)
status: New → In Progress
Tim Gardner (timg-tpi) on 2016-03-01
Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Wily):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-10.25

---------------
linux (4.4.0-10.25) xenial; urgency=low

  [ Tim Gardner ]

  * Release Tracking Bug
    - LP: #1552247

  * linux: 4.4.0-9.X fails yama ptrace restrictions tests (LP: #1551894)
    - security: let security modules use PTRACE_MODE_* with bitmasks

  * [wily][regression] systemtap script compilation broken by new kernels (LP: #1545330)
    - SAUCE: (noup) locking/qspinlock: Move __ARCH_SPIN_LOCK_UNLOCKED to qspinlock_types.h

  * [Feature]SD/SDIO/eMMC support for Broxton-P (LP: #1520454)
    - mmc: sdhci: 64-bit DMA actually has 4-byte alignment
    - mmc: sdhci: Fix DMA descriptor with zero data length

  * Miscellaneous Ubuntu changes
    - SAUCE: (noup) cgroup: fix and restructure error handling in copy_cgroup_ns()

 -- Tim Gardner <email address hidden> Mon, 29 Feb 2016 13:04:14 -0700

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Kamal Mostafa (kamalmostafa) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-wily' to 'verification-done-wily'.

If verification is not done within 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-wily
Dan Streetman (ddstreet) on 2016-03-23
tags: added: verification-done-wily
removed: verification-needed-wily
Tim Gardner (timg-tpi) wrote :

UBUNTU: Ubuntu-4.2.0-35.40

Changed in linux (Ubuntu Wily):
status: Fix Committed → Fix Released

AFAICT systemtap works well in xenial with the latest version:

systemtap (2.9-2ubuntu2) xenial; urgency=medium

  * d/p/0001-Fix-PR9497-by-updating-the-runtime-to-handle-linux-4.patch:
    Fix stap compilation after kernel 4.4 commit 7523e4dc50. (LP: #1557673)

Changed in systemtap (Ubuntu Xenial):
status: Confirmed → Fix Released
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in systemtap (Ubuntu Wily):
status: New → Confirmed

https://bugs.launchpad.net/ubuntu/+source/systemtap/+bug/1683876

A similar problem appears whenever there is a new HWE kernel (especially on LTS releases).

Changed in systemtap (Fedora):
importance: Unknown → Undecided
status: Unknown → Fix Released
Dan Streetman (ddstreet) on 2019-04-08
Changed in systemtap (Ubuntu Wily):
status: Confirmed → Won't Fix
Frank Ch. Eigler (fche) wrote :

By the way, a simple diagnostic for whether any particular version of systemtap has been ported to a kernel is to run

% stap -V
Systemtap translator/driver (version 4.1/0.174, rpm 4.1-0.20190327git2ede4cecb20c.fc28)
Copyright (C) 2005-2019 Red Hat, Inc. and others
This is free software; see the source for copying conditions.
tested kernel versions: 2.6.18 ... 5.0-rc3
enabled features: AVAHI BOOST_STRING_REF DYNINST BPF JAVA PYTHON2 PYTHON3 LIBRPM LIBSQLITE3 LIBVIRT LIBXML2 NLS NSS READLINE

Note the "tested kernel versions" line. If your kernel is newer than that, you'll need to switch to a fresher upstream systemtap version.
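
The comparison described above can be scripted. This is a rough sketch rather than an official tool: the function names major_minor and kernel_newer_than_tested are made up for illustration, only the major.minor part of each version is compared, and suffixes such as -rc3 are ignored, so a 5.0 release kernel counts as covered by an upper bound of 5.0-rc3.

```shell
#!/bin/sh
# Hypothetical helpers (not part of systemtap).

# Print "MAJOR MINOR" for a version string such as "4.2.0-35-generic" or "5.0-rc3".
major_minor() {
    printf '%s\n' "$1" | sed -n 's/^\([0-9][0-9]*\)\.\([0-9][0-9]*\).*/\1 \2/p'
}

# kernel_newer_than_tested STAP_V_OUTPUT KERNEL_RELEASE
# Exit 0 if the kernel's major.minor is above the upper bound on the
# "tested kernel versions" line, 1 if it is not, 2 if either input
# could not be parsed.
kernel_newer_than_tested() {
    upper=$(printf '%s\n' "$1" |
        sed -n 's/^tested kernel versions: .* \.\.\. \(.*\)$/\1/p')
    [ -n "$upper" ] || return 2
    # After this: $1 $2 = tested major minor, $3 $4 = kernel major minor.
    set -- $(major_minor "$upper") $(major_minor "$2")
    [ $# -eq 4 ] || return 2
    [ "$3" -gt "$1" ] || { [ "$3" -eq "$1" ] && [ "$4" -gt "$2" ]; }
}

# Example use on a live system:
#   if kernel_newer_than_tested "$(stap -V 2>&1)" "$(uname -r)"; then
#       echo "kernel is newer than this systemtap was tested with"
#   fi
```

Passing the stap -V text in as an argument, instead of running stap inside the function, keeps the check usable on saved output from another machine.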
