Ubuntu 18.04 [ briggs ]: "ipcs" command fails with error "invalid structure member offset" in crash prompt.

Bug #1765660 reported by bugproxy on 2018-04-20
Affects                             Importance   Assigned to
The Ubuntu-power-systems project    High         Canonical Kernel Team
crash (Ubuntu)                      High         Canonical Kernel Team
  Xenial                            Undecided    Unassigned
  Artful                            Undecided    Unassigned
  Bionic                            High         Thadeu Lima de Souza Cascardo
  Cosmic                            High         Canonical Kernel Team

Bug Description

[Impact]
Some analysis is not possible on new kernels. In this case, the ipcs command cannot be used with the bionic GA kernel on ppc64le.

[Test Case]
The ipcs command has been run after the fix was applied to crash, on a live amd64 system, with success.
More tests will be done with the version of the package from -proposed on amd64 and ppc64el, including memory commands, and bt.

[Regression Potential]
New crash versions may have bugs, and some commands may not work with older kernels. The smoke test helps a little, but more testing is desirable.

In case of regressions, analysis of kernel dumps will require that the crash file be moved to a system where crash doesn't have the regression. But live analysis won't be possible in that case.
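A smoke test along the lines of the test case above could be scripted roughly as follows. This is only a sketch under stated assumptions: the command list is hypothetical (the ipcs variants come from this report, the others are common crash commands chosen for illustration), the helper names are invented, and it assumes crash reads commands from standard input when started with -s.

```python
import subprocess

# Hypothetical smoke-test list. The ipcs variants are the ones exercised in
# this report; the remaining commands are illustrative picks.
SMOKE_COMMANDS = [
    "sys", "bt", "ps", "kmem -i",
    "ipcs", "ipcs -s", "ipcs -m", "ipcs -M", "ipcs -q",
]

ERROR_MARKER = "invalid structure member offset"


def build_input(commands):
    """Return the text fed to crash on stdin: one command per line, then exit."""
    return "\n".join(list(commands) + ["exit"]) + "\n"


def run_smoke_test(vmlinux, dump=None, crash_bin="crash"):
    """Run crash non-interactively and report whether the error marker appeared.

    With dump=None crash is pointed at the live system, as in the amd64 test
    described above. Returns (exit_status, saw_offset_error, combined_output).
    """
    argv = [crash_bin, "-s", vmlinux]
    if dump is not None:
        argv.append(dump)
    proc = subprocess.run(
        argv,
        input=build_input(SMOKE_COMMANDS),
        capture_output=True,
        text=True,
    )
    output = proc.stdout + proc.stderr
    return proc.returncode, ERROR_MARKER in output, output
```

For example, `run_smoke_test("/usr/lib/debug/boot/vmlinux-4.15.0-13-generic", "dump.201803272257")` would exercise the dump from this report. Checking for the error string rather than the exit status is a deliberate choice here, since the failing ipcs runs in the logs return to the crash prompt instead of aborting the session.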

---------------------------------

== Comment: #0 - PAVITHRA R. PRAKASH <> - 2018-03-29 01:14:47 ==
---Problem Description---

Ubuntu 18.04: "ipcs" command fails with error "invalid structure member offset" in crash prompt.

---Environment--

System Name : ltc-briggs2
Model/Type : P8
Platform : BML

---Uname output---

root@ltc-briggs2:~# uname -a
Linux ltc-briggs2 4.15.0-13-generic #14-Ubuntu SMP Sat Mar 17 13:43:15 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

---Steps to reproduce--

1. Configure kdump.
2. Trigger a crash.
3. Run crash on the captured dump.

---Logs----

root@ltc-briggs2:~# dpkg -l|grep makedumpfile
ii makedumpfile 1:1.6.3-1 ppc64el VMcore extraction tool
root@ltc-briggs2:~# dpkg -l|grep crash
ii apport 2.20.9-0ubuntu1 all automatically generate crash reports for debugging
ii crash 7.2.1-1 ppc64el kernel debugging utility, allowing gdb like syntax
ii kdump-tools 1:1.6.3-1 ppc64el scripts and tools for automating kdump (Linux crash dumps)
ii python3-apport 2.20.9-0ubuntu1 all Python 3 library for Apport crash report handling
root@ltc-briggs2:~# dpkg -l|grep kexec
ii kexec-tools 1:2.0.16-1ubuntu1 ppc64el tools to support fast kexec reboots
root@ltc-briggs2:~#

03272257# crash /usr/lib/debug/boot/vmlinux-4.15.0-13-generic dump.201803272257

crash 7.2.1
Copyright (C) 2002-2017 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "powerpc64le-unknown-linux-gnu"...

      KERNEL: /usr/lib/debug/boot/vmlinux-4.15.0-13-generic
    DUMPFILE: dump.201803272257 [PARTIAL DUMP]
        CPUS: 160
        DATE: Tue Mar 27 22:56:58 2018
      UPTIME: 00:04:07
LOAD AVERAGE: 1.06, 0.53, 0.20
       TASKS: 1734
    NODENAME: ltc-briggs2
     RELEASE: 4.15.0-13-generic
     VERSION: #14-Ubuntu SMP Sat Mar 17 13:43:15 UTC 2018
     MACHINE: ppc64le (2926 Mhz)
      MEMORY: 512 GB
       PANIC: "sysrq: SysRq : Trigger a crash"
         PID: 7420
     COMMAND: "bash"
        TASK: c000003e56c7c600 [THREAD_INFO: c000003e56cb0000]
         CPU: 41
       STATE: TASK_RUNNING (SYSRQ)

crash> ?

*              files          mach           repeat         timer
alias          foreach        mod            runq           tree
ascii          fuser          mount          search         union
bt             gdb            net            set            vm
btop           help           p              sig            vtop
dev            ipcs           ps             struct         waitq
dis            irq            pte            swap           whatis
eval           kmem           ptob           sym            wr
exit           list           ptov           sys            q
extend         log            rd             task

crash version: 7.2.1 gdb version: 7.6
For help on any command above, enter "help <command>".
For help on input options, enter "help input".
For help on output options, enter "help output".

crash> ipcs
SHMID_KERNEL KEY SHMID UID PERMS BYTES NATTCH STATUS

ipcs: invalid structure member offset: idr_top
      FILE: ipcs.c LINE: 628 FUNCTION: idr_find()

[/usr/bin/crash] error trace: 70374748930 => 703747482d8 => 703746e8e98 => 703745b1488

ipcs: invalid structure member offset: idr_top
      FILE: ipcs.c LINE: 628 FUNCTION: idr_find()

crash> q

== Comment: #3 - PAVITHRA R. PRAKASH <> - 2018-04-09 00:36:52 ==
(In reply to comment #1)
> Hi Pavithra,
>
> Have you checked in older kernel versions, whether "ipcs" command
> was working ...I think, it is because there were too many changes
> to kernel's IPC code over several kernel versions..possibly crash tool
> don't have the updated changes w.r.t new kernel IPC code
>
> also please check if following "options" are not working:
> >>>
> char *help_ipcs[] = {
> "ipcs",
> "System V IPC facilities",
> "[-smMq] [-n pid|task] [id | addr]",
>
> " This command provides information on the System V IPC facilities. With
> no",
> " arguments, the command will display kernel usage of all three
> factilities.",
> " ",
> " -s show semaphore arrays.",
> " -m show shared memory segments.",
> " -M show shared memory segments with additional details.",
> " -q show message queues.",
> " id show the data associated with this resource ID.",
> " addr show the data associated with this virtual address of a",
> " shmid_kernel, sem_array or msq_queue.",
> "",
> " For kernels supporting namespaces, the -n option may be used to",
> " display the IPC facilities with respect to the namespace of a",
> " specified task:\n",
> " -n pid a process PID.",
> " -n task a hexadecimal task_struct pointer.",
> >>>
>
> Thanks!!

Issue is observed even with old kernel.

      KERNEL: /usr/lib/debug/boot/vmlinux-4.15.0-12-generic
    DUMPFILE: dump.201804090031 [PARTIAL DUMP]
        CPUS: 160
        DATE: Mon Apr 9 00:30:53 2018
      UPTIME: 00:04:24
LOAD AVERAGE: 2.75, 1.56, 0.64
       TASKS: 1738
    NODENAME: ltc-briggs2
     RELEASE: 4.15.0-12-generic
     VERSION: #13-Ubuntu SMP Wed Mar 7 21:37:03 UTC 2018
     MACHINE: ppc64le (2926 Mhz)
      MEMORY: 512 GB
       PANIC: "sysrq: SysRq : Trigger a crash"
         PID: 6995
     COMMAND: "bash"
        TASK: c000003f70dcfd00 [THREAD_INFO: c000003f70e78000]
         CPU: 64
       STATE: TASK_RUNNING (SYSRQ)

crash> ipcs
SHMID_KERNEL KEY SHMID UID PERMS BYTES NATTCH STATUS

ipcs: invalid structure member offset: idr_top
      FILE: ipcs.c LINE: 628 FUNCTION: idr_find()

[/usr/bin/crash] error trace: 198c7e38930 => 198c7e382d8 => 198c7dd8e98 => 198c7ca1488

ipcs: invalid structure member offset: idr_top
      FILE: ipcs.c LINE: 628 FUNCTION: idr_find()

crash>

---------------------------------------------------------------------------------------------------------------------------------

crash> ipcs
SHMID_KERNEL KEY SHMID UID PERMS BYTES NATTCH STATUS

ipcs: invalid structure member offset: idr_top
      FILE: ipcs.c LINE: 628 FUNCTION: idr_find()

[/usr/bin/crash] error trace: 69b16c28930 => 69b16c282d8 => 69b16bc8e98 => 69b16a91488

ipcs: invalid structure member offset: idr_top
      FILE: ipcs.c LINE: 628 FUNCTION: idr_find()

crash> ipcs -s
SEM_ARRAY KEY SEMID UID PERMS NSEMS

ipcs: invalid structure member offset: idr_top
      FILE: ipcs.c LINE: 628 FUNCTION: idr_find()

[/usr/bin/crash] error trace: 69b16c2ace0 => 69b16c282d8 => 69b16bc8e98 => 69b16a91488

ipcs: invalid structure member offset: idr_top
      FILE: ipcs.c LINE: 628 FUNCTION: idr_find()

crash> ipcs -m
SHMID_KERNEL KEY SHMID UID PERMS BYTES NATTCH STATUS

ipcs: invalid structure member offset: idr_top
      FILE: ipcs.c LINE: 628 FUNCTION: idr_find()

[/usr/bin/crash] error trace: 69b16c28930 => 69b16c282d8 => 69b16bc8e98 => 69b16a91488

ipcs: invalid structure member offset: idr_top
      FILE: ipcs.c LINE: 628 FUNCTION: idr_find()

crash> ipcs -M

ipcs: invalid structure member offset: idr_top
      FILE: ipcs.c LINE: 628 FUNCTION: idr_find()

[/usr/bin/crash] error trace: 69b16c28930 => 69b16c282d8 => 69b16bc8e98 => 69b16a91488

ipcs: invalid structure member offset: idr_top
      FILE: ipcs.c LINE: 628 FUNCTION: idr_find()

crash> ipcs -q
MSG_QUEUE KEY MSQID UID PERMS USED-BYTES MESSAGES
(none allocated)
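To summarise which ipcs options fail in a transcript like the one above, a small parser can attribute each error to the command typed at the prompt. This is a minimal sketch: the function name and the sample text are illustrative, and it assumes the "crash> " prompt and the error wording shown in these logs.

```python
import re

# Matches lines such as "ipcs: invalid structure member offset: idr_top".
ERROR_RE = re.compile(r"^(\w+): invalid structure member offset: (\w+)$")

# Shortened excerpt in the shape of the logs in this report.
SAMPLE = """\
crash> ipcs -s
SEM_ARRAY KEY SEMID UID PERMS NSEMS

ipcs: invalid structure member offset: idr_top
      FILE: ipcs.c LINE: 628 FUNCTION: idr_find()

ipcs: invalid structure member offset: idr_top
      FILE: ipcs.c LINE: 628 FUNCTION: idr_find()

crash> ipcs -q
MSG_QUEUE KEY MSQID UID PERMS USED-BYTES MESSAGES
(none allocated)
"""


def failing_commands(transcript):
    """Map each command typed at the crash prompt to the members reported invalid."""
    failures = {}
    current = None
    for raw in transcript.splitlines():
        line = raw.strip()
        if line.startswith("crash>"):
            # Remember the most recent command; an empty prompt resets it.
            current = line[len("crash>"):].strip() or None
            continue
        match = ERROR_RE.match(line)
        if match and current is not None:
            failures.setdefault(current, set()).add(match.group(2))
    return failures
```

Applied to the sample, only `ipcs -s` is reported as failing, which matches the log above, where `ipcs -q` is the one variant that completes.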

bugproxy (bugproxy) on 2018-04-20
tags: added: architecture-ppc64le bugnameltc-166199 severity-high targetmilestone-inin1804
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → crash (Ubuntu)
Changed in ubuntu-power-systems:
importance: Undecided → High
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
tags: added: triage-g

------- Comment From <email address hidden> 2018-04-20 11:30 EDT-------
The old idr structure layout was removed by the following kernel commit:

commit 0a835c4f090af2c76fc2932c539c3b32fd21fbbb
Author: Matthew Wilcox <email address hidden>
Date: Tue Dec 20 10:27:56 2016 -0500

Reimplement IDR and IDA using the radix tree
The IDR is very similar to the radix tree. It has some functionality that
the radix tree did not have (alloc next free, cyclic allocation, a
callback-based for_each, destroy tree), which is readily implementable on
top of the radix tree. A few small changes were needed in order to use a
tag to represent nodes with free space below them. More extensive
changes were needed to support storing NULL as a valid entry in an IDR.
Plain radix trees still interpret NULL as a not-present entry.
The IDA is reimplemented as a client of the newly enhanced radix tree. As
in the current implementation, it uses a bitmap at the last level of the
tree.

It seems that crash was not adjusted to this change, which is what causes this error.
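The failure mode can be illustrated with a toy model of a member-offset lookup. This is not crash's code: the layouts, offsets, and function names below are simplified assumptions that only show why a lookup of the idr member top stops resolving once the structure is redefined.

```python
INVALID_OFFSET = -1

# Illustrative layouts only: member name -> byte offset. The member names
# follow the kernel's struct idr before and after the rework; the offsets
# are made up for the example.
OLD_IDR = {"hint": 0, "top": 8, "layers": 16}
NEW_IDR = {"idr_rt": 0, "idr_next": 16}


class OffsetError(LookupError):
    """Raised when a tool asks for a member the structure no longer has."""


def member_offset(layout, member):
    """Return the member's offset, or INVALID_OFFSET if it does not exist."""
    return layout.get(member, INVALID_OFFSET)


def checked_offset(layout, struct, member):
    """Look up an offset and fail loudly when it is invalid."""
    offset = member_offset(layout, member)
    if offset == INVALID_OFFSET:
        raise OffsetError(
            f"invalid structure member offset: {struct}_{member}"
        )
    return offset
```

With the old layout, `checked_offset(OLD_IDR, "idr", "top")` resolves normally. With the new one the same call raises an error naming idr_top, which mirrors the message in the logs: a tool that still walks the old layout has nothing to resolve against a 4.15 kernel.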

Changed in ubuntu-power-systems:
status: New → Triaged
Manoj Iyer (manjo) on 2018-04-23
Changed in crash (Ubuntu):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Canonical Kernel Team (canonical-kernel-team)
importance: Undecided → High
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-04-24 15:08 EDT-------
Created a bug upstream at https://github.com/crash-utility/crash/issues/23

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-04-30 10:50 EDT-------
There is a fix made available upstream now. We basically need to cherry pick it:

https://github.com/crash-utility/crash/commit/759dc0c50dc6cc3199e56be57cb57d41812b0397

description: updated
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package crash - 7.2.1-1ubuntu1

---------------
crash (7.2.1-1ubuntu1) cosmic; urgency=medium

  * Add patch to fix ipcs command (LP: #1765660).

 -- Thadeu Lima de Souza Cascardo <email address hidden> Mon, 14 May 2018 16:05:51 -0300

Changed in crash (Ubuntu Cosmic):
status: New → Fix Released
Changed in crash (Ubuntu Bionic):
status: New → In Progress
assignee: nobody → Thadeu Lima de Souza Cascardo (cascardo)
importance: Undecided → High
Manoj Iyer (manjo) on 2018-06-18
Changed in crash (Ubuntu Artful):
status: New → Won't Fix
Changed in crash (Ubuntu Xenial):
status: New → Won't Fix
Changed in ubuntu-power-systems:
status: Triaged → In Progress

I have been working on the backport for this to bionic and xenial, but in order to do so, we are taking care of some build failures on the latest version that we are going to add to cosmic. I'll update when there is more progress on that.

Regards.
Cascardo.

There is a package on ppa:cascardo/ppa that should fix the problem. Can you test it and report back?

Cascardo.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-07-04 02:32 EDT-------
(In reply to comment #16)
> There is a package on ppa:cascardo/ppa that should fix the problem. Can you
> test it and report back?
>

Hi Cascardo,

Issue not reproducible with crash-7.2.3+real-1

Thanks
Hari

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-07-08 13:34 EDT-------
Closed as fixed here.

Manoj Iyer (manjo) wrote :

Looks like the patch mentioned in this bug is already present in bionic

0a835c4f090a Reimplement IDR and IDA using the radix tree

Closing out the bionic track as fix released.

Changed in crash (Ubuntu Bionic):
status: In Progress → Fix Released
Changed in ubuntu-power-systems:
status: In Progress → Fix Released
Robie Basak (racb) wrote :

This is in the SRU queue for Bionic. Does it need rejecting from there if the bug is already fixed in Bionic? If not, then please could you clarify and fix the bug statuses?

Yeah, this doesn't sound right. Bionic has 7.2.1-1, while the fix has been applied to 7.2.2-1. The verification mentioned on comment #7 was against a ppa.

We can't ask IBM to properly verify it on bionic until we get this package on -proposed. Let me move the task back to In Progress.
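The ordering of the package versions mentioned in this thread can be checked with a simplified version comparison. This is a sketch of the Debian ordering rules written for illustration, with invented helper names; the authoritative tool is `dpkg --compare-versions`.

```python
DIGITS = "0123456789"


def _order(ch):
    """Sort weight of one non-digit character; '' stands for end of string."""
    if ch == "~":
        return -1
    if ch == "" or ch in DIGITS:
        return 0
    if ch.isascii() and ch.isalpha():
        return ord(ch)
    return ord(ch) + 256


def _compare_part(a, b):
    """Compare two upstream or revision strings; returns -1, 0 or 1."""
    i = j = 0
    while i < len(a) or j < len(b):
        # Compare the non-digit run character by character.
        while (i < len(a) and a[i] not in DIGITS) or (j < len(b) and b[j] not in DIGITS):
            wa = _order(a[i] if i < len(a) else "")
            wb = _order(b[j] if j < len(b) else "")
            if wa != wb:
                return -1 if wa < wb else 1
            i += 1
            j += 1
        # Compare the following digit run numerically.
        start_i, start_j = i, j
        while i < len(a) and a[i] in DIGITS:
            i += 1
        while j < len(b) and b[j] in DIGITS:
            j += 1
        na = int(a[start_i:i] or "0")
        nb = int(b[start_j:j] or "0")
        if na != nb:
            return -1 if na < nb else 1
    return 0


def split_version(version):
    """Split '[epoch:]upstream[-revision]' into (epoch, upstream, revision)."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare_versions(a, b):
    """Return -1, 0 or 1 if a sorts before, equal to, or after b."""
    epoch_a, up_a, rev_a = split_version(a)
    epoch_b, up_b, rev_b = split_version(b)
    if epoch_a != epoch_b:
        return -1 if epoch_a < epoch_b else 1
    return _compare_part(up_a, up_b) or _compare_part(rev_a, rev_b)
```

Under these rules the versions in this report order as 7.2.1-1 (bionic), then 7.2.1-1ubuntu1, then 7.2.2-1, then 7.2.3+real-1 (the PPA build that was tested), so the bionic package sorts before every version named here as carrying the fix.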

Thanks.
Cascardo.

Changed in crash (Ubuntu Bionic):
status: Fix Released → In Progress

Okay, so Manoj was confused as he looked into a linux commit, while we are talking about crash here. That explains the confusion. On the IBM side, it seems to have been closed either because the fix has already landed on cosmic, or because of the ppa test.

Cascardo.

Manoj Iyer (manjo) wrote :

Sorry for all the confusion I might have created. cascardo, thanks for fixing my errors.

Changed in ubuntu-power-systems:
status: Fix Released → In Progress
Łukasz Zemczak (sil2100) wrote :

Since the performed SRU is a straight backport from what is in cosmic, I would prefer if the test case included more than just checking the ipcs command. Maybe some other more general testing scenarios to make sure we didn't regress? There's tons of changes (4296 insertions(+), 884 deletions(-)) in comparison to what was in bionic before. Also, looks like the Regression Potential field is lying, quoting: "The patch, however, should touch only the broken command, and has been tested." - the SRU here touches far more than just the broken command I guess? Or does it not?

Sorry about the lying in the regression potential. It probably referred to a previous backport attempt, that only applied the respective patch. As we discussed a few months ago, kexec/kdump/crash would be backported because of their relationship with the linux package, and its siblings linux-lts, linux-hwe, etc. That doesn't mean we shouldn't or can't test the packages more thoroughly. That's why I introduced ADT support to kdump, and would like to work on more testing for those packages. Right now, I can do more manual testing for this, and add my results to the bug.

Thanks for bringing this up.
Cascardo.

description: updated
Steve Langasek (vorlon) wrote :

The bug description now says "but more testing may be desirable." Please be explicit and detailed, here. We need a test plan for what that testing is going to be, otherwise there's no reason to expect that it will happen.

Changed in crash (Ubuntu Bionic):
status: In Progress → Incomplete

I have done some manual testing on both amd64 and ppc64el, with mem commands, backtrace, and other ones.

Cascardo.

description: updated
Dimitri John Ledkov (xnox) wrote :

Ok, so we are now waiting for this to be accepted into -proposed from the queue, as I do see crash in the bionic unapproved queue at https://launchpad.net/ubuntu/bionic/+queue?queue_state=1&queue_text=crash

Changed in crash (Ubuntu Bionic):
status: Incomplete → Confirmed

Can someone from the SRU team approve this package in the queue?

Cascardo.

On 12/18/2018 06:35 AM, Thadeu Lima de Souza Cascardo wrote:
> Can someone from the SRU team approve this package in the queue?
>
> Cascardo.
>
Thank you Thadeu!

Terry

description: updated

So, the SRU team asked for more detail on the tests. I will post it after January 15th.

Terry Rudd (terrykrudd) wrote :

On 12/21/2018 10:40 AM, Thadeu Lima de Souza Cascardo wrote:
> So, the SRU team asked for more detail on the tests. I will post it
> after January 15th.
>

Thank you Thadeu! Have a great holiday break!

Terry
