ubuntu_lttng_smoke_test failed on Yakkety s390x (zVM)

Bug #1671063 reported by Po-Hsu Lin
This bug affects 1 person
Affects               Status        Importance  Assigned to
ltt-control (Ubuntu)  Fix Released  Medium      Colin Ian King
  Xenial              Fix Released  Medium      Colin Ian King
  Yakkety             Fix Released  Medium      Colin Ian King

Bug Description

== SRU Request [Xenial, Yakkety + Zesty] ==

Without this minor fix, the lttng tool receives a "Not enough memory" error
when in fact this is not the case.

== How to reproduce ==

lttng list --kernel --syscall

produces:
Error: Unable to list system calls: Not enough memory
Error: Command error

With the fix, this does not occur.
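
The reproduction step above could be scripted roughly as follows. This is only a sketch: the helper names shows_spurious_nomem and check_syscall_listing are hypothetical, and it assumes the error is printed on stderr (which the strace output later in this report suggests) with exactly the wording quoted above.

```python
import shutil
import subprocess

# Error text quoted in this report.
NOMEM_ERROR = "Unable to list system calls: Not enough memory"


def shows_spurious_nomem(stderr_text):
    """Return True if the text contains the error quoted in this report."""
    return NOMEM_ERROR in stderr_text


def check_syscall_listing():
    """Run the reproducer.

    Returns True if the error shows up, False if it does not,
    and None if the lttng command is not installed.
    """
    if shutil.which("lttng") is None:
        return None
    result = subprocess.run(
        ["lttng", "list", "--kernel", "--syscall"],
        capture_output=True,
        text=True,
        check=False,
    )
    return shows_spurious_nomem(result.stderr)
```

Calling check_syscall_listing() on an affected system would be expected to return True before the fix and False afterwards; it typically needs root and a running lttng-sessiond.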

== Fix ==

https://lists.lttng.org/pipermail/lttng-dev/2017-March/026959.html

== Regression Potential ==

Minimal, low risk.

------------------------------------------------

Two test cases failed on this arch:
== lttng smoke test list kernel events ==
PASSED (lttng list --kernel)
Error: Unable to list system calls: Not enough memory
Error: Command error
FAILED (lttng list --kernel --syscall)
FAILED (lttng list --kernel --syscall more output expected)

== lttng smoke test trace open/close system calls ==
...
FAILED (did not trace any open system calls)

Please see the attachment for the complete log.

ProblemType: Bug
DistroRelease: Ubuntu 16.10
Package: linux-image-4.8.0-40-generic 4.8.0-40.43
ProcVersionSignature: Ubuntu 4.8.0-40.43-generic 4.8.17
Uname: Linux 4.8.0-40-generic s390x
NonfreeKernelModules: zfs zunicode zcommon znvpair zavl
AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access '/dev/snd/': No such file or directory
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.3-0ubuntu8.2
Architecture: s390x
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 not found.
Date: Wed Mar 8 06:39:24 2017
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lspci:

Lsusb: Error: command ['lsusb'] failed with exit code 1:
PciMultimedia:

ProcFB: Error: [Errno 2] No such file or directory: '/proc/fb'
ProcKernelCmdLine: root=/dev/mapper/kl02vg01-root crashkernel=196M debug udev.log-priority=debug rd.udev.log_priority=debug BOOT_IMAGE=0
RelatedPackageVersions:
 linux-restricted-modules-4.8.0-40-generic N/A
 linux-backports-modules-4.8.0-40-generic N/A
 linux-firmware 1.161.1
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)

Po-Hsu Lin (cypressyew) wrote :
Changed in linux (Ubuntu):
assignee: nobody → Colin Ian King (colin-king)
importance: Undecided → Medium
Po-Hsu Lin (cypressyew) wrote :

Note: this is not a regression; it can be reproduced with 4.8.0-38-generic as well.

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1671063

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Colin Ian King (colin-king) wrote :

sudo strace -f lttng list --kernel --syscall

....
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/lttng/client-lttng-sessiond"}, 110) = 0
geteuid() = 0
getegid() = 0
sendmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=13156}], msg_iovlen=1, msg_control=[{cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS, cmsg_data={pid=8563, uid=0, gid=0}}], msg_controllen=28, msg_flags=0}, 0) = 13156
recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\0\0\0\4\0\0\0\32\0\0\0\0\0\0\0\0\0\0\0\0", iov_len=20}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 20
shutdown(3, SHUT_RDWR) = 0
close(3) = 0
write(2, "Error: Unable to list system cal"..., 54Error: Unable to list system calls: Not enough memory
) = 54
geteuid() = 0
getuid() = 0
getegid() = 0
getgid() = 0

This seems wrong - there are no ENOMEM syscall returns to trigger this failure.
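
As a rough cross-check, the 20-byte reply in the recvmsg() line above can be decoded by hand. The sketch below assumes the reply is five big-endian 32-bit words (s390x is big-endian) and that the first two words are the command type and the return code; that field layout is an assumption made for illustration, not something taken from the lttng-tools sources. The names REPLY and decode_reply are hypothetical.

```python
import struct

# Reply bytes copied from the recvmsg() line of the strace output.
# strace prints non-printable bytes as octal escapes, so "\32" is 0o32 == 26.
REPLY = b"\0\0\0\4\0\0\0\32" + b"\0" * 12


def decode_reply(raw):
    """Split the 20-byte reply into five big-endian unsigned 32-bit words."""
    words = struct.unpack(">5I", raw)
    # Assumption: word 0 is the command type, word 1 the return code.
    return {"cmd_type": words[0], "ret_code": words[1], "rest": words[2:]}


reply = decode_reply(REPLY)
print(reply["cmd_type"], reply["ret_code"])  # prints "4 26"
```

Under those assumptions the reply carries command 4 and return code 26, which matches the "Processing client command 4" and "retcode: Not enough memory (26)" lines in the lttng-sessiond log further down. That would mean the error is reported by the session daemon rather than by a failing syscall in the lttng client.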

Colin Ian King (colin-king) wrote :

https://bugs.lttng.org/issues/1091

The end of the sudo lttng-sessiond -vvv output shows that we have an allocation (zmalloc) failure:

DEBUG1 - 14:25:37.954060 [10037/10081]: Woken up but nothing in the UST command queue (in thread_dispatch_ust_registration() at main.c:1891)
DEBUG1 - 14:25:37.954072 [10037/10082]: Got the wait shm fd 33 (in get_wait_shm() at shm.c:115)
DEBUG1 - 14:25:37.954074 [10037/10086]: Updating kernel poll set (in update_kernel_poll() at main.c:889)
DEBUG1 - 14:25:37.954100 [10037/10086]: Thread kernel polling (in thread_manage_kernel() at main.c:1099)
DEBUG1 - 14:25:37.953832 [10037/10087]: [load-session-thread] Load session (in thread_load_session() at load-session-thread.c:91)
DEBUG1 - 14:25:37.954079 [10037/10082]: Futex wait update active 1 (in futex_wait_update() at futex.c:66)
DEBUG1 - 14:25:37.954132 [10037/10082]: Accepting application registration (in thread_registration_apps() at main.c:2145)
DEBUG3 - 14:25:37.953889 [10037/10084]: [ust-thread] Manage notify polling (in ust_thread_manage_notify() at ust-thread.c:69)
DEBUG1 - 14:25:42.577842 [10037/10080]: Wait for client response (in thread_manage_clients() at main.c:4444)
DEBUG1 - 14:25:42.577865 [10037/10080]: Receiving data from client ... (in thread_manage_clients() at main.c:4489)
DEBUG1 - 14:25:42.577999 [10037/10080]: Processing client command 4 (in process_client_msg() at main.c:2980)
DEBUG1 - 14:25:42.578010 [10037/10080]: Syscall table listing. (in syscall_table_list() at syscall.c:280)
PERROR - 14:25:42.578021 [10037/10080]: syscall table list zmalloc: Cannot allocate memory (in syscall_table_list() at syscall.c:291)
DEBUG3 - 14:25:42.578027 [10037/10080]: Destroying syscall hash table. (in destroy_syscall_ht() at syscall.c:159)
DEBUG1 - 14:25:42.578030 [10037/10080]: Missing llm structure. Allocating one. (in process_client_msg() at main.c:4124)
DEBUG1 - 14:25:42.578032 [10037/10080]: Sending response (size: 20, retcode: Not enough memory (26)) (in thread_manage_clients()
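
To pick the failing line out of a long -vvv log like the one above, a small filter can help. This is a minimal sketch that assumes every complete line follows the shape seen in this excerpt, "LEVEL - time [pid/tid]: message (in function() at file:line)"; the names LOG_LINE, parse_log_line and find_errors are hypothetical, and truncated lines are simply skipped.

```python
import re

# Line shape inferred from the log excerpt in this report (an assumption).
LOG_LINE = re.compile(
    r"^(?P<level>[A-Z]+\d?) - (?P<time>[\d:.]+) "
    r"\[(?P<pid>\d+)/(?P<tid>\d+)\]: "
    r"(?P<message>.*?) "
    r"\(in (?P<function>\w+)\(\) at (?P<file>[\w.-]+):(?P<line>\d+)\)$"
)


def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def find_errors(lines):
    """Return the parsed PERROR entries among the given log lines."""
    parsed = (parse_log_line(line) for line in lines)
    return [entry for entry in parsed if entry and entry["level"] == "PERROR"]


SAMPLE = [
    "DEBUG1 - 14:25:42.578010 [10037/10080]: Syscall table listing. (in syscall_table_list() at syscall.c:280)",
    "PERROR - 14:25:42.578021 [10037/10080]: syscall table list zmalloc: Cannot allocate memory (in syscall_table_list() at syscall.c:291)",
    "DEBUG3 - 14:25:42.578027 [10037/10080]: Destroying syscall hash table. (in destroy_syscall_ht() at syscall.c:159)",
]

for entry in find_errors(SAMPLE):
    print(entry["function"], entry["file"], entry["line"], entry["message"])
```

On the three sample lines this reports only the zmalloc failure in syscall_table_list() at syscall.c:291.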

Changed in linux (Ubuntu):
status: Incomplete → In Progress
Jonathan Rajotte Julien (joraj) wrote :

Hi Colin,

As you indicated it seems that an actual allocation problem occurred on:

syscall table list zmalloc: Cannot allocate memory (in syscall_table_list() at syscall.c:291)

From the code:

/*
 * Allocate at least the number of total syscall we have even if some of
 * them might not be valid. The count below will make sure to return the
 * right size of the events array.
 */
events = zmalloc(syscall_table_nb_entry * sizeof(*events));
if (!events) {
        PERROR("syscall table list zmalloc");
        ret = -LTTNG_ERR_NOMEM;
        goto error;
}

syscall_table_nb_entry is assigned in syscall_init_table.

Could you attach the full sessiond log and strace log? Also, if possible, could you determine the value of syscall_table_nb_entry?

Cheers

Jonathan Rajotte Julien (joraj) wrote :

Hi Colin,

Please take a look at the following patch https://lists.lttng.org/pipermail/lttng-dev/2017-March/026959.html

Could you report to us if it fixes your issue?

Please note that s390 is not supported/tested by LTTng-modules.

You probably encountered this bug because no syscall tracepoints are defined on the lttng-modules side. If you wish to add them, please take a look at this git subtree [1] and the README [2], which describe how to generate the headers and add them for the s390 architecture.

Cheers

[1] https://github.com/lttng/lttng-modules/tree/master/instrumentation/syscalls
[2] https://github.com/lttng/lttng-modules/blob/master/instrumentation/syscalls/README

Colin Ian King (colin-king) wrote :

Thanks Jonathan, I'll get around to that first thing tomorrow morning. Much appreciated.

Colin Ian King (colin-king) wrote :

Note: this fixes the issue for me; that is, I don't see the zmalloc failures now.

description: updated
description: updated
Brian Murray (brian-murray) wrote :

Strangely this didn't get marked Fix Released by the janitor.

ltt-control (2.9.3-1ubuntu1) zesty; urgency=medium

  * Fix lttng list --kernel --syscall failure (LP: #1671063)

 -- Colin Ian King <email address hidden> Tue, 14 Mar 2017 13:39:11 +0000

Changed in linux (Ubuntu):
status: In Progress → Fix Released
tags: added: verification-needed
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Po-Hsu, or anyone else affected,

Accepted ltt-control into yakkety-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ltt-control/2.8.1-1ubuntu1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Brian Murray (brian-murray) wrote :

Oh, it didn't get changed because this has a linux not ltt-control bug task.

affects: linux (Ubuntu) → ltt-control (Ubuntu)
Changed in ltt-control (Ubuntu Yakkety):
status: New → Fix Committed
Brian Murray (brian-murray) wrote :

Hello Po-Hsu, or anyone else affected,

Accepted ltt-control into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ltt-control/2.7.1-2ubuntu1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in ltt-control (Ubuntu Xenial):
status: New → Fix Committed
Changed in ltt-control (Ubuntu Xenial):
assignee: nobody → Colin Ian King (colin-king)
Changed in ltt-control (Ubuntu Yakkety):
assignee: nobody → Colin Ian King (colin-king)
Changed in ltt-control (Ubuntu Xenial):
importance: Undecided → Medium
Changed in ltt-control (Ubuntu Yakkety):
importance: Undecided → Medium
Colin Ian King (colin-king) wrote :

Tested on s390x Xenial and Yakkety against the -proposed lttng tools in universe; the issue is fixed.

tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ltt-control - 2.7.1-2ubuntu1

---------------
ltt-control (2.7.1-2ubuntu1) xenial; urgency=medium

  * Fix lttng list --kernel --syscall failure (LP: #1671063)

 -- Colin Ian King <email address hidden> Tue, 14 Mar 2017 13:39:11 +0000

Changed in ltt-control (Ubuntu Xenial):
status: Fix Committed → Fix Released
Brian Murray (brian-murray) wrote : Update Released

The verification of the Stable Release Update for ltt-control has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ltt-control - 2.8.1-1ubuntu1

---------------
ltt-control (2.8.1-1ubuntu1) yakkety; urgency=medium

  * Fix lttng list --kernel --syscall failure (LP: #1671063)

 -- Colin Ian King <email address hidden> Tue, 14 Mar 2017 13:39:11 +0000

Changed in ltt-control (Ubuntu Yakkety):
status: Fix Committed → Fix Released