perf top problem on z with Ubuntu 18.04

Bug #1828166 reported by bugproxy on 2019-05-08
Affects                   Importance   Assigned to
Ubuntu on IBM z Systems   High         Canonical Kernel Team
linux (Ubuntu)            Undecided    Skipper Bug Screeners
  Bionic                  High         Unassigned
  Cosmic                  Undecided    Unassigned
  Disco                   High         Unassigned

Bug Description

SRU Justification:
==================

[Impact]

* The perf top tool hangs and shows error messages, like 'Not enough memory for annotating'

[Fix]

* b9c0a64901d5bdec6eafd38d1dc8fa0e2974fccb b9c0a64 "perf annotate: Fix s390 gap between kernel end and module start"

* 12a6d2940b5f02b4b9f71ce098e3bb02bc24a9ea 12a6d29 "perf record: Fix module size on s390"

Disco also needs the following as a prerequisite:

* 6738028dd57df064b969d8392c943ef3b3ae705d 6738028 "perf record: Fix s390 missing module symbol and warning for non-root users"

[Test Case]

* start a benchmark (mem_alloc, but it doesn't really matter what)

* execute perf top in a second terminal

* the output of perf top is correct

* now stop the benchmark

* and perf top shows an error message, like "Not enough memory for annotating '__irf_end' symbol!)"

* and perf top can't be exited anymore

[Regression Potential]

* The regression potential can be considered as medium since this happens only while using the perf top tool

* and just 3 files are changed, and one of them is arch/s390/util/machine.c

* but the symbol and machine headers in tools/perf/util are also modified, and several lines of code are added

[Other Info]

* cherry-pick was possible to bionic-master-next and to disco-master-next (but used '--strategy=recursive -X theirs' for disco)

* adding the patches to disco is to avoid regressions
_________________________

perf top hangs and shows error messages

---uname output---
Linux weather 4.15.0-46-generic #49-Ubuntu SMP Wed Feb 6 09:32:27 UTC 2019 s390x s390x s390x GNU/Linux

---Steps to Reproduce---
I start a benchmark (mem_alloc, but it really doesn't matter which) and then run perf top in a second terminal; the output from perf top is correct. When I stop the benchmark, perf top shows an error message (Not enough memory for annotating '__irf_end' symbol!) and I can no longer quit perf top.

The following analysis took place:
No problem with the current kernel.
Bisecting the perf tool identified the following commit:

commit edeb0c90df3581b821a764052d185df985f8b8dc (HEAD, refs/bisect/bad)
Author: Arnaldo Carvalho de Melo <email address hidden>
Date: Tue Oct 16 17:08:29 2018 -0300

    perf tools: Stop fallbacking to kallsyms for vdso symbols lookup

When you apply this patch the issue is gone, however it is contained in these versions:

git tag --contains edeb0c90df3581b821
v4.19
v4.20
....

The level I was debugging was kernel 4.15 which does not contain this patch.

Default Comment by Bridge

tags: added: architecture-s39064 bugnameltc-177443 severity-high targetmilestone-inin18041
Changed in ubuntu:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
affects: ubuntu → linux (Ubuntu)
Frank Heimes (frank-heimes) wrote :

Hence, this needs to land in bionic (and cosmic) - disco (and eoan) kernels are already > 4.20

Changed in ubuntu-z-systems:
status: New → Triaged
importance: Undecided → High
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)

------- Comment From <email address hidden> 2019-05-08 03:52 EDT-------
(In reply to comment #11)
> Hence, this needs to land in bionic (and cosmic) - disco (and eoan) kernels
> are already > 4.20

Bionic aka 18.04 LTS should be priority

description: updated
Changed in linux (Ubuntu):
status: New → Fix Released

Hello IBM,

I have built test kernels with the requested patch applied which can be found at:

Bionic:
https://people.canonical.com/~ksouza/lp1828166/bionic/

Cosmic:
https://people.canonical.com/~ksouza/lp1828166/cosmic/

Please note that at least linux-image- and linux-modules- need to be installed to get a functional kernel.

Please let us know if these test kernels fix the issue reported.

Thank you.

Changed in linux (Ubuntu Bionic):
status: New → In Progress
Changed in linux (Ubuntu Cosmic):
status: New → In Progress
Changed in ubuntu-z-systems:
status: Triaged → In Progress
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-05-09 03:53 EDT-------
Talked to Thomas Richter about getting a fresh Ubuntu 18.04 image on which we can test the provided changes; he will provide this image later today (tessia). I can't use the image where I found this issue for this test.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-05-09 08:59 EDT-------
OK, I got the image with 18.04 and could recreate the problem.
Then I installed all packages with apt install ./package-name and did a reboot to activate the new kernel.
Questions:
- perf (binary) is no longer available via PATH after package install. Did the package install of linux-tools-common remove perf?

with
dpkg -c linux-tools-common_4.15.0-49.53+lp1828166_all.deb
drwxr-xr-x root/root 0 2019-05-08 16:19 ./
drwxr-xr-x root/root 0 2019-05-08 16:19 ./usr/
drwxr-xr-x root/root 0 2019-05-08 16:19 ./usr/share/
drwxr-xr-x root/root 0 2019-05-08 16:19 ./usr/share/doc/
drwxr-xr-x root/root 0 2019-05-08 16:19 ./usr/share/doc/linux-tools-common/
-rw-r--r-- root/root 194820 2019-05-08 16:17 ./usr/share/doc/linux-tools-common/changelog.Debian.gz
-rw-r--r-- root/root 1292 2019-05-08 16:17 ./usr/share/doc/linux-tools-common/copyright

Should perf be included here?

Frank Heimes (frank-heimes) wrote :

Hi, perf should be in the package linux-tools-common, and each kernel version has a dedicated linux-tools-common package version.
$ dpkg -S $(which perf)
linux-tools-common: /usr/bin/perf
But on disk it's usually in /usr/bin.

Please can you check if it's in with:
$ dpkg -L linux-tools-common | grep perf$
/usr/bin/perf

In case it's not, can you just take it temporarily from the most recent bionic package?
wget http://launchpadlibrarian.net/417515745/linux-tools-common_4.15.0-48.51_all.deb
dpkg-deb -x ./linux-tools-common_4.15.0-48.51_all.deb ./tmp
I think that would be okay for this pre-test, since the perf tool wasn't touched, but just the kernel.

The verification during the SRU process later of course needs to be with the proper package.

Frank Heimes (frank-heimes) wrote :

Looks like a dedicated linux-tools-common package is needed, because perf is too tightly coupled to the kernel ...

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-05-09 11:05 EDT-------
using perf from package recommended by Frank Heimes results in:
root@m3545035:/home/theurich/people.canonical.com/~ksouza/lp1828166/bionic/tmp/usr/bin# ./perf
WARNING: perf not found for kernel 4.15.0-49

You may need to install the following packages for this specific kernel:
linux-tools-4.15.0-49-generic
linux-cloud-tools-4.15.0-49-generic

You may also want to install one of the following packages to keep up to date:
linux-tools-generic
linux-cloud-tools-generic

My understanding is that I need a perf binary that matches the kernel (which was provided by kleber-souza).

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-05-10 03:06 EDT-------
sure ... and this package does not have the perf binary included

Hello Klaus,

The packages I provided indeed don't ship the perf binaries with it, my apologies.

I'm building the packages again in another build environment which will produce the perf files we need, I will post a link to the new packages soon.

Thank you.

Klaus,

Please download the new packages from the following URL:
https://people.canonical.com/~ksouza/lp1828166/bionic/lp/

In addition to the linux-image and linux-modules packages, please also install:
linux-tools-common_4.15.0-49.53+lp1828166_all.deb
linux-tools-4.15.0-49-generic_4.15.0-49.53+lp1828166_s390x.deb
linux-tools-4.15.0-49_4.15.0-49.53+lp1828166_s390x.deb

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-05-13 11:48 EDT-------
thanks, now I got the perf binary back.

I still see
Not enough memory for annotating '__irf_end' symbol!
though

Hi,

Can you please provide the output of 'uname -a'?

Also, on the original bug report it states that the issue is fixed with the mentioned patch applied. On top of which kernel was this patch applied and tested?

Thank you.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-05-14 08:35 EDT-------
I have debugged this issue Klaus reported.

I have installed the following system:
root@s8360046:~/linux/tools/perf# uname -a
Linux s8360046 4.18.0-17-generic #18-Ubuntu SMP Wed Mar 13 14:29:38 UTC 2019 s390x s390x s390x GNU/Linux
root@s8360046:~/linux/tools/perf#

and ran into the issue Klaus reported by running command
root@s8360046:~/ # perf top
in one window and in another window running this command while perf top was running:

root@s8360046:~/ # for i in $(seq 15); do find / >/dev/null 2>&1; echo $i; done

The issue shows up pretty quickly.

I then cloned the linux kernel tree and built the perf tool from
the downloaded linux kernel in directory linux/tools/perf. I did the same test with the locally compiled
perf tool and the issue did not show up. To me it looked like the latest version did not have this problem.

Next I bisected the linux kernel tree (which also includes the perf tool) and the bisecting process
stopped at the mentioned commit. That's how I arrived at the commit id stated in comment 4.

Obviously it did not fix the issue, otherwise the patch would have worked. Sorry for that.
It looks like we are back to start again.

PS: I'm not sure but I have the feeling the issue only shows up when the debuginfo packages for the kernel are not installed.

Frank Heimes (frank-heimes) wrote :

Changed to Incomplete while waiting for new analysis results.

Changed in ubuntu-z-systems:
status: In Progress → Incomplete
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-05-24 08:19 EDT-------
I have debugged this issue and created three patches. Two have been accepted by the perf maintainer
and the third is currently under review.
Once the patches are upstream I will post the commit id's for further processing by Ubuntu.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-06-04 02:52 EDT-------
Two patches are now upstream and available in Linux 5.2-rc3:

commit 6738028dd57df064b969d8392c943ef3b3ae705d
("perf record: Fix s390 missing module symbol and warning for non-root users")

commit ed9adb2035b5be5896d465b19040262be5f4a824
("perf machine: Read also the end of the kernel")

The second one is from Jiri Olsa (Red Hat), who investigated a similar issue.

These 2 patches should be sufficient to solve the issue.

Changed in ubuntu-z-systems:
status: Incomplete → Triaged
Frank Heimes (frank-heimes) wrote :

Thx for the analysis and the sharing of the commit IDs for the two patches from kernel 5.2-rc3 in comment #19.
Since you mentioned three patches in comment #18 I assume the 3rd one became obsolete?

With that I assume that:
"perf record: Fix s390 missing module symbol and warning for non-root users"
6738028dd57df064b969d8392c943ef3b3ae705d
and
"perf machine: Read also the end of the kernel"
ed9adb2035b5be5896d465b19040262be5f4a824
need to be applied to Disco, (Cosmic) and Bionic.
They will automatically land in Eoan, since kernel 5.2 is Eoan's target kernel.

Is it correct that:
edeb0c90df3581b821a764052d185df985f8b8dc
"perf tools: Stop fallbacking to kallsyms for vdso symbols lookup"
does not need to be considered anymore (neither for cosmic nor for bionic)?

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-06-04 04:14 EDT-------
> Since you mentioned three patches in comment #18 I assume the 3rd one became obsolete?
Yes.

> Is it correct that:
> edeb0c90df3581b821a764052d185df985f8b8dc
> "perf tools: Stop fallbacking to kallsyms for vdso symbols lookup"
> does not need to be considered anymore (neither for cosmic nor for bionic)?

Yes

Frank Heimes (frank-heimes) wrote :

Thx for the confirmation

Changed in ubuntu-z-systems:
status: Triaged → Confirmed
Manoj Iyer (manjo) on 2019-06-17
Changed in linux (Ubuntu):
status: Fix Released → Confirmed
Frank Heimes (frank-heimes) wrote :

Just tried to cherry-pick 1) 6738028 and 2) ed9adb2 from 5.2-rc3 to bionic's master-next tree:
git://kernel.ubuntu.com/ubuntu/ubuntu-bionic --branch master-next
1) applied cleanly, but 2) did not - I got 3 conflicts while cherry-picking ed9adb2
Is there something still missing?
(I used bionic to check, because it's the oldest release this should be SRUed to, and master-next is needed, because things progress and some patches might already have been accepted on top of master, even if not yet released.)

Hello Frank,

On 6/19/19 7:13 AM, Frank Heimes wrote:
> Just tried to cherry-pick 1) 6738028 and 2) ed9adb2 from 5.2-rc3 to bionic's master-next tree:
> git://kernel.ubuntu.com/ubuntu/ubuntu-bionic --branch master-next
> 1) applied cleanly, but 2) did not - I got 3 conflicts while cherry-picking ed9adb2
> Is there something still missing?
> (I used bionic to check, because it's the oldest release this should be SRUed to, and master-next is needed, because things progress and some patches might already have been accepted on top of master, even if not yet released.)
>

Sorry I am late getting back to you. Yes, Kleber set out to do what
you tried and found the same. He was hoping to finish this before his
vacation but was unable to. I asked Khaled to try and finish that
trello card and of course he ran into problems and ran out of time due
to impact on our SRU from the SACK Panic release and having to get new
kernels ready for July 1 SRU release in the two weeks we had left.
Khaled is hoping to get back to this card tomorrow, Friday, June 21.

Will keep you updated but hope this is resolved and fix-committed for
SRU 2019.07.01

Terry


------- Comment From <email address hidden> 2019-07-05 07:48 EDT-------
I have tried for the last 2 days to find a patch series that makes it possible to apply the required patch, which failed above.

The difference between upstream linux kernel 5.2.0rc7 and the ubuntu
18.04 kernel I downloaded from
url = git://kernel.ubuntu.com/ubuntu/ubuntu-bionic
branch master-next
in regards to the function machine__create_kernel_maps() in file tools/perf/util/machine.c
is very big.

Command
git log --oneline -L:machine__create_kernel_maps:tools/perf/util/machine.c
identifies the difference in applied patches. The following patches
are missing in the Ubuntu version:

Commit-ID Description
ed9adb2035b5 perf machine: Read also the end of the kernel
977c7a6d1e26 perf machine: Update kernel map address and re-order properly
1c5aae7710bb perf machine: Create maps for x86 PTI entry trampolines
3183f8ca304f perf symbols: Unify symbol maps
ee05d21791db perf machine: Set main kernel end address properly
1fb87b8e9599 perf machine: Don't search for active kernel start in __machine__create_kernel_maps

The very first missing patch:
Commit-ID Description
1fb87b8e9599 perf machine: Don't search for active kernel start in __machine__create_kernel_maps

does not apply, because this patch moves a function named machine__set_kernel_mmap()
within file tools/perf/util/machine.c, but this function is missing
in the Ubuntu 18.04 master-next tree.
It turned out that this tree also misses patch:

Commit-ID Description
05db6ff73d80 perf machine: Generalize machine__set_kernel_mmap()

Having found the starting patch, I could apply the following patch sequence:
root@s8360046:~/ubuntu-bionic# patch -p1 < ../patches/0001-perf-machine-Generalize-machine__set_kernel_mmap.patch
patching file tools/perf/util/machine.c
Hunk #1 succeeded at 1255 (offset -7 lines).
Hunk #2 succeeded at 1370 (offset -5 lines).
root@s8360046:~/ubuntu-bionic# patch -p1 < ../patches/0001-perf-machine-Don-t-search-for-active-kernel-start-in.patch
patching file tools/perf/util/machine.c
Hunk #1 succeeded at 849 (offset -7 lines).
Hunk #2 succeeded at 861 (offset -7 lines).
Hunk #3 succeeded at 1212 (offset -7 lines).
Hunk #4 succeeded at 1254 (offset -7 lines).
patching file tools/perf/util/machine.h
Hunk #1 succeeded at 239 (offset 1 line).
root@s8360046:~/ubuntu-bionic# patch -p1 < ../patches/0001-perf-machine-Set-main-kernel-end-address-properly.patch
patching file tools/perf/util/machine.c
Hunk #1 succeeded at 1020 (offset 1 line).
Hunk #2 succeeded at 1227 (offset 1 line).
Hunk #3 succeeded at 1254 (offset 1 line).
root@s8360046:~/ubuntu-bionic# patch -p1 < ../patches/0001-perf-machine-Update-kernel-map-address-and-re-order-.patch
patching file tools/perf/util/machine.c
Hunk #1 succeeded at 1223 with fuzz 1 (offset -198 lines).
Hunk #2 succeeded at 1269 with fuzz 1 (offset -198 lines).
Hunk #3 succeeded at 1381 (offset -226 lines).
root@s8360046:~/ubuntu-bionic# patch -p1 < ../patches/0001-perf-machine-Read-also-the-end-of-the-kernel.patch
patching file tools/perf/util/machine.c
Hunk #1 succeeded at 821 (offset -103 lines).
Hunk #2 succeeded at 847 (offset -103 lines).
Hunk #3 succeeded at ...


Changed in linux (Ubuntu Cosmic):
status: In Progress → Invalid

Hello IBM,

I have built a kernel for tests with the patches mentioned in the previous comment:

perf record: Fix s390 missing module symbol and warning for non-root users
perf machine: Read also the end of the kernel
perf machine: Update kernel map address and re-order properly
perf machine: Set main kernel end address properly
perf machine: Don't search for active kernel start in __machine__create_kernel_maps
perf machine: Generalize machine__set_kernel_mmap()

The packages for s390x can be found at:
https://people.canonical.com/~ksouza/lp1828166/bionic/

Could you please test the kernel from the link above if it fixes the problem?

Thank you.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-07-16 08:52 EDT-------
Ran tests on m3545035 (Ubuntu 18.04 with latest updates). First recreated the problem (Not enough memory for annotating '__irf_end' symbol! and a hang) with perf top and a workload. Then installed the provided packages and rebooted to activate the new kernel:

root@m3545035:~# uname -a
Linux m3545035 4.15.0-55-generic #60+lp1828166.2-Ubuntu SMP Fri Jul 12 12:49:26 UTC 2019 s390x s390x s390x GNU/Linux

ran perf top again and could still recreate the problem: Not enough memory for annotating '__irf_end' symbol! and a hang.

What I see: perf -v gives me 4.15.18 before and after linux-tools* install. Is that expected? The perf fixes do not come with a new version of perf?

Hi Klaus,

The same version of perf is expected.

I was able to reproduce the issue and I can confirm that I see the same behavior as you described even with the patched test kernel.

Frank Heimes (frank-heimes) wrote :

I can also confirm that the issue still exists, but only if I run perf top as root or with sudo.
Running perf top as a regular user is fine, even without the patched packages!

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-07-17 04:49 EDT-------
from LoZ and ZaaS performance point of view we need perf for root to analyze system behavior

------- Comment From <email address hidden> 2019-07-17 04:55 EDT-------
As the developer for perf on s390, I would like to investigate this issue further.

Can I get the source code of the kernel subdirectory tools (taken out of the linux tree) to
build perf and take a closer look at the issue?
For debugging this I need to build the perf tool myself.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-07-17 11:06 EDT-------
How can I get the source code of the kernel tree and the perf tool?

I have tried this command and it did not get the sources:

root@s8360046:~# uname -r
4.15.0-55-generic
root@s8360046:~# apt-get source linux-image-$(uname -r)
Reading package lists... Done
E: You must put some 'source' URIs in your sources.list
root@s8360046:~#

Thanks for your help

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-07-17 11:24 EDT-------
(In reply to comment #45)
> How can I get the source code of the kernel tree and the perf tool?
>
> I have tried this command and it did not get the sources:
>
> root@s8360046:~# uname -r
> 4.15.0-55-generic
> root@s8360046:~# apt-get source linux-image-$(uname -r)
> Reading package lists... Done
> E: You must put some 'source' URIs in your sources.list
> root@s8360046:~#
>
> Thanks for your help

Check the file /etc/apt/sources.list on your system, and look for lines beginning with 'deb-src'
that are commented out. Uncomment them, and try again.
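
The advice above can be sketched in a few lines of Python. This is only an illustration: the function name `enable_deb_src` and the sample text are made up for this example, and the sketch works on the file content as a string rather than editing /etc/apt/sources.list in place.

```python
import re

# Matches a commented-out source entry such as "# deb-src http://... bionic main".
COMMENTED_DEB_SRC = re.compile(r"^\s*#\s*(deb-src\s.*)$")


def enable_deb_src(sources_list_text: str) -> str:
    """Return the text with every commented-out deb-src line uncommented.

    Hypothetical helper for illustration; other comments and the
    plain "deb" lines are left untouched.
    """
    result = []
    for line in sources_list_text.splitlines():
        match = COMMENTED_DEB_SRC.match(line)
        result.append(match.group(1) if match else line)
    return "\n".join(result) + "\n"


# Sample content (an assumption, not copied from the affected system).
sample = (
    "deb http://us.ports.ubuntu.com/ubuntu-ports/ bionic main\n"
    "# deb-src http://us.ports.ubuntu.com/ubuntu-ports/ bionic main\n"
)
print(enable_deb_src(sample), end="")
```

After changing the real file, the local package index still has to be refreshed before 'apt-get source' can see the new entries.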

Frank Heimes (frank-heimes) wrote :

Sorry, I missed comment #31, just saw now #32.

Please have a look here on how to install the build env. and the kernel sources:
https://wiki.ubuntu.com/Kernel/BuildYourOwnKernel
Just note that the deb-src line in the sources.list file is different for non-x86 architectures (like s390x).
It must be:
deb-src http://us.ports.ubuntu.com/ubuntu-ports/ bionic main
deb-src http://us.ports.ubuntu.com/ubuntu-ports/ bionic-updates main
instead of:
deb-src http://archive.ubuntu.com/ubuntu bionic main
deb-src http://archive.ubuntu.com/ubuntu bionic-updates main

Frank Heimes (frank-heimes) wrote :

And afterwards you can get the package that contains perf like this:

$ dpkg -S $(which perf)
linux-tools-common: /usr/bin/perf
$ apt-get source linux-tools-common

For the linux-tools source package that matches your running kernel, do:
apt-get source linux-tools-$(uname -r)

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-07-18 03:46 EDT-------
I have this /etc/apt/sources.list file with the following deb-src lines

root@s8360046:~# fgrep bionic /etc/apt/sources.list | fgrep deb-src
deb-src http://us.ports.ubuntu.com/ubuntu-ports bionic main restricted
deb-src http://us.ports.ubuntu.com/ubuntu-ports bionic-updates main restricted
deb-src http://us.ports.ubuntu.com/ubuntu-ports bionic universe
deb-src http://us.ports.ubuntu.com/ubuntu-ports bionic-updates universe
deb-src http://us.ports.ubuntu.com/ubuntu-ports bionic multiverse
deb-src http://us.ports.ubuntu.com/ubuntu-ports bionic-updates multiverse
deb-src http://us.ports.ubuntu.com/ubuntu-ports bionic-backports main restricted universe multiverse
# deb-src http://archive.canonical.com/ubuntu bionic partner
deb-src http://us.ports.ubuntu.com/ubuntu-ports bionic-security main restricted
deb-src http://us.ports.ubuntu.com/ubuntu-ports bionic-security universe
deb-src http://us.ports.ubuntu.com/ubuntu-ports bionic-security multiverse
root@s8360046:~# apt-get source linux-image-$(uname -r)
Reading package lists... Done
E: You must put some 'source' URIs in your sources.list
root@s8360046:~# uname -r
4.15.0-55-generic
root@s8360046:~#

But this is not really the issue.

To debug this further I need the linux kernel tree (tarball preferred)
or at least the subdirectory tools inside the linux kernel tree:

root@s8360046:~# ls -ld ubuntu-*
drwxr-xr-x 31 root root 4096 Jul 5 13:37 ubuntu-bionic
drwxr-xr-x 29 root root 4096 Jul 17 14:17 ubuntu-disco
root@s8360046:~# ls -ld ubuntu-bionic/tools/perf
drwxr-xr-x 14 root root 4096 Jul 5 13:47 ubuntu-bionic/tools/perf <---- perf tool source directory.
root@s8360046:~#

I need this from the build done by kleber-souza (see entry 38 dated on 2019-07-15 10:07:50 CDT)
with the kernel named
Linux s8360046 4.15.0-55-generic #60+lp1828166.2-Ubuntu SMP Fri Jul 12 12:49:26
From this tree I would like to get a tar file of the tools directory.

Thanks a lot.

Frank Heimes (frank-heimes) wrote :

'apt-get source' should work - just double checked it here:

# just these two deb-src lines should be sufficient:
$ grep ^deb-src /etc/apt/sources.list
deb-src http://us.ports.ubuntu.com/ubuntu-ports/ bionic main
deb-src http://us.ports.ubuntu.com/ubuntu-ports/ bionic-updates main

$ apt-get source linux-image-$(uname -r)
Reading package lists... Done
Picking 'linux' as source package instead of 'linux-image-4.15.0-54-generic'
NOTICE: 'linux' packaging is maintained in the 'Git' version control system at:
git://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/bionic
Please use:
git clone git://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/bionic
to retrieve the latest (possibly unreleased) updates to the package.
Skipping already downloaded file 'linux_4.15.0-54.58.dsc'
Skipping already downloaded file 'linux_4.15.0.orig.tar.gz'
Skipping already downloaded file 'linux_4.15.0-54.58.diff.gz'
Need to get 0 B of source archives.
Skipping unpack of already unpacked source in linux-4.15.0
$

Did you do the required "apt update" after modifying the sources.list file - to update the local archive package index?

But 'klebers' will share the sources he used ...

Hi Thomas,

The tarball with the sources I used to build linux 4.15.0-55.60+lp1828166.2 can be found here:

https://people.canonical.com/~ksouza/lp1828166/bionic/linux-source-4.15.0.tar.bz2

md5sum:
58e725a4c20dcf3fb0b209e3d6155e74 linux-source-4.15.0.tar.bz2
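
A minimal sketch of how the downloaded tarball could be checked against the md5sum above. It assumes the file was saved under its original name in the current directory; the helper names `md5_of_file` and `verify` are made up for this example.

```python
import hashlib

EXPECTED_MD5 = "58e725a4c20dcf3fb0b209e3d6155e74"  # checksum quoted in this comment
TARBALL = "linux-source-4.15.0.tar.bz2"            # assumed download location


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected: str) -> bool:
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


# Example call, once the tarball has been downloaded:
#   verify(TARBALL, EXPECTED_MD5)
```

Reading in chunks keeps memory use small even for a large source tarball.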

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-07-19 07:37 EDT-------
Thanks for your help.

I have installed
root@s8360046:~/linux-4.15.0/tools/perf# uname -a
Linux s8360046 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:54:05 UTC 2019 s390x s390x s390x GNU/Linux
root@s8360046:~/linux-4.15.0/tools/perf#

and downloaded the corresponding linux kernel source tree from us.ports.ubuntu.com.
So the linux kernel I am using matches the sources and I built the perf tool.

I can also recreate the issue and have started debugging it.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-07-22 05:04 EDT-------
During execution of command 'perf top' the error message:

Not enough memory for annotating '__irf_end' symbol!)

is emitted from this call sequence:
__cmd_top
perf_top__mmap_read
perf_top__mmap_read_idx
perf_event__process_sample
hist_entry_iter__add
hist_iter__top_callback
perf_top__record_precise_ip
hist_entry__inc_addr_samples
symbol__inc_addr_samples
symbol__get_annotation
symbol__alloc_hist

In this function the size of symbol __irf_end is calculated. The size
of a symbol is the difference between its start and end address.
When the symbol was read the first time, it was:
symbol__new: __irf_end 0xe954d0-0xe954d0
which is correct and maps with /proc/kallsyms:

root@s8360046:~/linux-4.15.0/tools/perf# fgrep _irf_end /proc/kallsyms
0000000000e954d0 t __irf_end
root@s8360046:~/linux-4.15.0/tools/perf#

In function symbol__alloc_hist() the end of symbol __irf_end is
symbol__alloc_hist sym:__irf_end start:0xe954d0 end:0x3ff800045a8
which is identical with the first module entry in /proc/kallsyms

This results in a symbol size of __irf_end for histogram analyses of
70334140059072 bytes and a malloc() for this requested size fails.
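
The arithmetic can be checked with a short Python sketch. It takes the start address of __irf_end and the address of the first module symbol from the /proc/kallsyms output quoted in this comment. The factor of about 16 bytes per address is an inference from the reported number, not something stated in this thread.

```python
# Addresses taken from the /proc/kallsyms output quoted in this report.
IRF_END_START = 0x0000000000E954D0        # __irf_end
FIRST_MODULE_SYMBOL = 0x000003FF800045A8  # fc_get_event_number [scsi_transport_fc]

# Allocation size quoted in the analysis.
REPORTED_ALLOC_BYTES = 70334140059072


def symbol_size(start: int, end: int) -> int:
    """Size of a symbol: the difference between its end and start address."""
    return end - start


gap = symbol_size(IRF_END_START, FIRST_MODULE_SYMBOL)
print(f"inflated symbol size: {gap} bytes ({gap / 2**40:.2f} TiB)")

# Assumption: the histogram needs a fixed number of bytes per address.
# The reported figure is consistent with roughly 16 bytes per address.
bytes_per_address = REPORTED_ALLOC_BYTES // gap
print(f"approximate bytes per address: {bytes_per_address}")
```

With the correct end address (equal to the start address), the same calculation gives a size of zero, so no oversized allocation would be requested.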

The root cause of this is function
__dso__load_kallsyms()
+-> symbols__fixup_end()

Function symbols__fixup_end() enlarges the last symbol in the
kallsyms map
# fgrep __irf_end /proc/kallsyms
0000000000e954d0 t __irf_end
#

to the start address of the first module:
# cat /proc/kallsyms | sort | egrep ' [tT] '
....
0000000000e952d0 T __security_initcall_end
0000000000e954d0 T __initramfs_size
0000000000e954d0 t __irf_end
000003ff800045a8 T fc_get_event_number [scsi_transport_fc]
000003ff800045d0 t store_fc_vport_disable [scsi_transport_fc]
000003ff800046a8 T scsi_is_fc_rport [scsi_transport_fc]
000003ff800046d0 t fc_target_setup [scsi_transport_fc]

On s390 the kernel is located around memory address 0x200, 0x10000
or 0x100000, depending on linux version. Modules, however, start
somewhere around 0x3ff xxxx xxxx.

This is different than x86 and produces a large gap for which
histogram allocation fails. On x86 modules simply follow the kernel
and the gap is minor, just some pages to adjust to a page or segment
boundary.
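
The effect described above can be modelled in Python. This is a simplified sketch, not a transcription of perf's C code or of any upstream patch: the `Symbol` class, the function names, and the 4096-byte page size are assumptions for illustration, and the special handling of the very last symbol is left out. The first function stretches each zero-sized symbol to the start of the next one; the second shows one possible way to avoid bridging the gap between the last kernel symbol and the first module symbol.

```python
from dataclasses import dataclass
from typing import List

PAGE_SIZE = 4096  # assumption for this sketch


@dataclass
class Symbol:
    name: str
    start: int
    end: int


def kallsyms_excerpt() -> List[Symbol]:
    """Text symbols from the sorted /proc/kallsyms excerpt quoted above."""
    entries = [
        (0x0000000000E952D0, "__security_initcall_end"),
        (0x0000000000E954D0, "__initramfs_size"),
        (0x0000000000E954D0, "__irf_end"),
        (0x000003FF800045A8, "fc_get_event_number [scsi_transport_fc]"),
        (0x000003FF800045D0, "store_fc_vport_disable [scsi_transport_fc]"),
    ]
    # kallsyms carries no sizes, so every symbol starts out with end == start.
    return [Symbol(name, addr, addr) for addr, name in entries]


def is_module_symbol(sym: Symbol) -> bool:
    # Module symbols carry a "[module_name]" suffix in kallsyms.
    return "[" in sym.name


def round_up(value: int, alignment: int) -> int:
    return (value + alignment - 1) // alignment * alignment


def fixup_end_generic(symbols: List[Symbol]) -> None:
    """Extend every zero-sized symbol up to the start of the next symbol."""
    for prev, curr in zip(symbols, symbols[1:]):
        if prev.end == prev.start and prev.end != curr.start:
            prev.end = curr.start


def fixup_end_gap_aware(symbols: List[Symbol]) -> None:
    """Like the generic fixup, but end the last kernel symbol at a page boundary."""
    for prev, curr in zip(symbols, symbols[1:]):
        if prev.end != prev.start or prev.end == curr.start:
            continue
        if not is_module_symbol(prev) and is_module_symbol(curr):
            prev.end = round_up(prev.start, PAGE_SIZE)
        else:
            prev.end = curr.start


def find(symbols: List[Symbol], name: str) -> Symbol:
    return next(sym for sym in symbols if sym.name == name)


for fixup in (fixup_end_generic, fixup_end_gap_aware):
    symbols = kallsyms_excerpt()
    fixup(symbols)
    irf_end = find(symbols, "__irf_end")
    print(f"{fixup.__name__}: __irf_end size = {irf_end.end - irf_end.start} bytes")
```

In this model the generic variant reproduces the multi-terabyte size of __irf_end, while the gap-aware variant keeps it below one page.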

This kernel mapping is identical when I run
# ./perf record -- true
# ./perf report -D | fgrep MAP
0 0xe8 [0x50]: PERF_RECORD_MMAP -1/0: [0x200(0x3ff800043a8) @ 0x200]: x [kernel.kallsyms]_text
where the kernel map is extremely large.
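
To make the size of that map concrete, the record can be parsed with a small Python sketch. It treats the record as a single line and reads the bracketed fields as start address, length, and page offset. The regular expression and the helper name `parse_mmap` are assumptions for this example and not part of perf.

```python
import re

# Kernel map record as printed by "perf report -D" in this comment.
MMAP_LINE = (
    "0 0xe8 [0x50]: PERF_RECORD_MMAP -1/0: "
    "[0x200(0x3ff800043a8) @ 0x200]: x [kernel.kallsyms]_text"
)

# Fields inside the brackets: start(length) @ page offset.
MMAP_RE = re.compile(
    r"PERF_RECORD_MMAP\s+(?P<pid>-?\d+)/(?P<tid>-?\d+):\s+"
    r"\[(?P<start>0x[0-9a-f]+)\((?P<length>0x[0-9a-f]+)\)\s+@\s+"
    r"(?P<pgoff>0x[0-9a-f]+)\]:\s+(?P<prot>\S)\s+(?P<name>.+)$"
)


def parse_mmap(line: str) -> dict:
    """Extract start, length, page offset and name from an MMAP record line."""
    match = MMAP_RE.search(line)
    if match is None:
        raise ValueError("not a PERF_RECORD_MMAP line")
    return {
        "start": int(match.group("start"), 16),
        "length": int(match.group("length"), 16),
        "pgoff": int(match.group("pgoff"), 16),
        "name": match.group("name"),
    }


record = parse_mmap(MMAP_LINE)
end = record["start"] + record["length"]
print(f"{record['name']}: {record['length'] / 2**40:.2f} TiB, ends at {end:#x}")
```

The computed end of the map coincides with the address of the first module symbol in the /proc/kallsyms listing above.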

I will post a patch to the linux kernel mailing list for discussion and fixing the issue today.
When it is upstream we can backport it to 4.18

Frank Heimes (frank-heimes) wrote :

Hi Thomas, thanks for looking into this and doing this analysis.
Sounds reasonable to bring it upstream first and do a backport afterwards.
Since 18.10/cosmic with kernel 4.18 is EOL in between, we just need to get it into bionic 4.15 and disco 5.0 - but I assume you've meant that anyway ...

Changed in ubuntu-z-systems:
status: Confirmed → Incomplete
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-08-12 02:14 EDT-------
The patches have been picked up by the perf maintainer and are now upstream in the linux kernel v5.3-rc4

b9c0a64 perf annotate: Fix s390 gap between kernel end and module start
12a6d29 perf record: Fix module size on s390

Changed in ubuntu-z-systems:
status: Incomplete → Triaged
Frank Heimes (frank-heimes) wrote :

I checked whether the two commits can be cleanly cherry-picked onto bionic-master-next (and disco-master-next), and it looks good (with a little hiccup on disco).

I updated the SRU justification in the bug description and new test packages will be built soon (thx klebers).

description: updated
Changed in linux (Ubuntu Disco):
status: New → In Progress
Changed in linux (Ubuntu Bionic):
importance: Undecided → High
Changed in linux (Ubuntu Disco):
importance: Undecided → High
Changed in ubuntu-z-systems:
status: Triaged → In Progress

Hello IBM,

I have built test kernels for Bionic and Disco with the patches mentioned:

b9c0a64 perf annotate: Fix s390 gap between kernel end and module start
12a6d29 perf record: Fix module size on s390

For Disco, I have also backported the following, which seemed to be a pre-req:

3376b6b7fb57 perf record: Fix s390 missing module symbol and warning for non-root users

The debian packages can be found at:

Bionic (4.15.0-59.66+lp1828166.20190814):
https://people.canonical.com/~ksouza/lp1828166/20190814/bionic/

Disco (5.0.0-26.27+lp1828166.20190814):
https://people.canonical.com/~ksouza/lp1828166/20190814/disco/

I have tested these kernels locally and I was not able to reproduce the issue anymore. Could you please test them to confirm the issue is fixed?

Thank you.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-08-20 02:40 EDT-------
I have installed the kernel and perf tool. Now it works OK. I ran the perf top command for nearly 20 minutes
and generated a good workload. No core dump or OOM messages anymore.
I could not reproduce the error, and I used the same system as before when I debugged the issue.

I consider the issue fixed.

The final test would be in Klaus's environment, but I do not have access to it.

------- Comment From <email address hidden> 2019-08-20 02:42 EDT-------
Reassigned

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-08-22 09:02 EDT-------
I talked to Thomas Richter and we agreed that his test is covering all aspects. So we can go ahead to bring this into the distros

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-08-22 09:10 EDT-------
@Can: Verification done by IBM. Please work on the integration... Many thx in advance

Hello IBM,

Thank you for the feedback. We will submit the patches for integration into the next SRU cycle.

description: updated

These patches have already been committed to Eoan (LP: #1841110).

Changed in linux (Ubuntu):
status: Confirmed → Fix Committed
Khaled El Mously (kmously) wrote :

Note: The fixes for this problem were applied under a different bug number via stable updates:

For Bionic: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1840520

For Disco: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1841994

Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Disco):
status: In Progress → Fix Committed
Changed in ubuntu-z-systems:
status: In Progress → Fix Committed
description: updated

The fixes for Bionic and Disco have been released, as stated on the stable update bugs mentioned on comment #50. Therefore I'm marking this bug report manually as 'Fix Released'.

Thank you.

Changed in linux (Ubuntu Disco):
status: Fix Committed → Fix Released
Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Frank Heimes (frank-heimes) wrote :

With the last update (and the patches already in eoan) this ticket can be closed as Fix Released.

Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
Changed in ubuntu-z-systems:
status: Fix Committed → Fix Released
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-10-14 07:24 EDT-------
IBM Bugzilla status-> closed, Fix Released with Eoan
