JFS filesystem is crashy: multiple panics and I/O errors

Bug #1711009 reported by Mathieu Trudel-Lapierre on 2017-08-16
This bug affects 1 person
Affects         Importance  Assigned to
linux (Ubuntu)  High        Unassigned
Artful          High        Unassigned

Bug Description

Loading initial ramdisk ...
OF stdout device is: /vdevice/vty@71000000
Preparing to boot Linux version 4.11.0-13-generic (buildd@bos01-ppc64el-022) (gcc version 6.4.0 20170724 (Ubuntu 6.4.0-2ubuntu1) ) #19-Ubuntu SMP Thu Aug 3 15:11:49 UTC 2017 (Ubuntu 4.11.0-13.19-generic 4.11.12)
Detected machine type: 0000000000000101
command line: BOOT_IMAGE=/boot/vmlinux-4.11.0-13-generic root=UUID=8c6162e8-9bb1-4171-a901-30e001222a14 ro splash quiet
Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
Calling ibm,client-architecture-support... done
memory layout at init:
  memory_limit : 0000000000000000 (16 MB aligned)
  alloc_bottom : 0000000007620000
  alloc_top : 0000000030000000
  alloc_top_hi : 0000000780000000
  rmo_top : 0000000030000000
  ram_top : 0000000780000000
instantiating rtas at 0x000000002fff0000... done
prom_hold_cpus: skipped
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x0000000007630000 -> 0x00000000076309e6
Device tree struct 0x0000000007640000 -> 0x0000000007650000
Quiescing Open Firmware ...
Booting Linux via __start() @ 0x0000000002000000 ...
 -> smp_release_cpus()
spinning_secondaries = 7
 <- smp_release_cpus()
Linux ppc64le

I got multiple panics on artful:

[ 11.772333] ERROR: (device dm-3): dtSearch [jfs]: stack overrun!
[ 11.772333]
[ 11.772384] bn = 0, index = 0
[ 11.772408] bn = 4307af, index = 0
[ 11.772432] bn = 0, index = 0
[ 11.772456] bn = 4307af, index = 0
[ 11.772493] bn = 0, index = 0
[ 11.772537] bn = 4307af, index = 0
[ 11.772573] bn = 0, index = 0
[ 11.772599] bn = c00000077b994800, index = 0
/sbin/init: error while loading shared libraries: liblz4.so.1: cannot open shared object file: Input/output error
[ 11.772964] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00
[ 11.772964]
[ 11.773043] CPU: 2 PID: 1 Comm: init Not tainted 4.11.0-13-generic #19-Ubuntu
[ 11.773102] Call Trace:
[ 11.773167] [c00000077e10bc30] [c000000000b831bc] dump_stack+0xb0/0xf0 (unreliable)
[ 11.773243] [c00000077e10bc70] [c000000000b7fbc8] panic+0x144/0x318
[ 11.773311] [c00000077e10bd00] [c0000000000ea5b8] do_exit+0xc98/0xca0
[ 11.773373] [c00000077e10bdd0] [c0000000000ea690] do_group_exit+0x60/0x100
[ 11.773424] [c00000077e10be10] [c0000000000ea754] SyS_exit_group+0x24/0x30
[ 11.773470] [c00000077e10be30] [c00000000000b184] system_call+0x38/0xe0
[ 11.775434] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00
[ 11.775434]
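
For anyone triaging the log above, here is a minimal Python sketch that pulls the `bn = ..., index = ...` entries out of the dtSearch dump and decodes the `exitcode` from the panic line. The function names are hypothetical, and the decoding assumes the usual Linux wait-status layout (exit status in bits 8-15, signal in the low 7 bits). Under that assumption 0x00007f00 is exit status 127, which would be consistent with the dynamic loader giving up on liblz4.so.1. The dump holds eight entries, mostly alternating between block 0 and 0x4307af; reading that as the search cycling instead of descending is only a guess.

```python
import re

# Lines copied from the log above.
LOG = """\
[ 11.772384] bn = 0, index = 0
[ 11.772408] bn = 4307af, index = 0
[ 11.772432] bn = 0, index = 0
[ 11.772456] bn = 4307af, index = 0
[ 11.772493] bn = 0, index = 0
[ 11.772537] bn = 4307af, index = 0
[ 11.772573] bn = 0, index = 0
[ 11.772599] bn = c00000077b994800, index = 0
"""

BN_RE = re.compile(r"bn = ([0-9a-f]+), index = (\d+)")


def parse_stack_dump(text):
    """Return (bn, index) pairs; bn is assumed to be printed in hex."""
    return [(int(bn, 16), int(idx)) for bn, idx in BN_RE.findall(text)]


def decode_exitcode(code):
    """Decode a wait-style status (assumed Linux encoding)."""
    signal = code & 0x7F
    if signal == 0:
        return ("exited", (code >> 8) & 0xFF)
    return ("signaled", signal)


entries = parse_stack_dump(LOG)
for bn, idx in entries:
    print(f"bn={bn:#x} index={idx}")
print(decode_exitcode(0x00007F00))
```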

And also:

[ 12.338435] Unable to handle kernel paging request for data at address 0x00000000
[ 12.338754] Faulting instruction address: 0xd0000000043be02c
[ 12.338951] Oops: Kernel access of bad area, sig: 11 [#1]
[ 12.338998] SMP NR_CPUS=2048
[ 12.338999] NUMA
[ 12.339053] pSeries
[ 12.339166] Modules linked in: ip_tables x_tables autofs4 jfs dm_service_time ibmveth ibmvscsi crc32c_vpmsum scsi_dh_emc scsi_dh_rdac scsi_dh_alua dm_multipath
[ 12.339305] CPU: 2 PID: 2135 Comm: tty Not tainted 4.11.0-13-generic #19-Ubuntu
[ 12.339441] task: c000000005562c00 task.stack: c000000006f78000
[ 12.339512] NIP: d0000000043be02c LR: d0000000043bdfbc CTR: c0000000001458f0
[ 12.339614] REGS: c000000006f7b4b0 TRAP: 0300 Not tainted (4.11.0-13-generic)
[ 12.339707] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>
[ 12.339714] CR: 44008244 XER: 00000000
[ 12.339822] CFAR: 00003fff9aa23f14 DAR: 0000000000000000 DSISR: 40000000 SOFTE: 1
[ 12.339822] GPR00: d0000000043bdfbc c000000006f7b730 d0000000043d8c90 c000000778e765f8
[ 12.339822] GPR04: d0000000043cf770 d0000000043d2b58 00000000000005f1 0000000000000000
[ 12.339822] GPR08: 0000000000000000 003ffff80000102d 0000000000100002 d0000000043ca6e0
[ 12.339822] GPR12: c0000000001458f0 c00000000fb81200 0000000000000000 0000000000000000
[ 12.339822] GPR16: c00000077b8c81d8 c00000077b8c80c8 c000000778e91200 c00000077b8c8300
[ 12.339822] GPR20: c000000006f7ba30 c000000778e911fe c000000778c38800 c000000006f7b950
[ 12.339822] GPR24: 0000000000100002 0000000000100002 0000000000000000 d0000000043d2b38
[ 12.339822] GPR28: 0000000000001000 f000000001ddf440 c00000076b096480 c000000778e765b8
[ 12.340436] NIP [d0000000043be02c] __get_metapage+0x204/0x6b0 [jfs]
[ 12.340541] LR [d0000000043bdfbc] __get_metapage+0x194/0x6b0 [jfs]
[ 12.340585] Call Trace:
[ 12.340608] [c000000006f7b730] [d0000000043bdf78] __get_metapage+0x150/0x6b0 [jfs] (unreliable)
[ 12.340671] [c000000006f7b810] [d0000000043b9210] dtSearch+0x4d8/0xa20 [jfs]
[ 12.340731] [c000000006f7b930] [d0000000043a3a88] jfs_lookup+0x80/0x100 [jfs]
[ 12.340789] [c000000006f7ba60] [c0000000003683c0] lookup_slow+0xe0/0x240
[ 12.340840] [c000000006f7bae0] [c00000000036ccb8] walk_component+0x268/0x3f0
[ 12.340896] [c000000006f7bb60] [c00000000036d018] link_path_walk+0x1d8/0x5c0
[ 12.340950] [c000000006f7bc00] [c00000000036f19c] path_openat+0xbc/0x3b0
[ 12.340997] [c000000006f7bc80] [c000000000370d8c] do_filp_open+0xec/0x160
[ 12.341042] [c000000006f7bdb0] [c000000000355cdc] do_sys_open+0x1cc/0x380
[ 12.341087] [c000000006f7be30] [c00000000000b184] system_call+0x38/0xe0
[ 12.341130] Instruction dump:
[ 12.341161] 7fc9f214 39200001 fbdf0030 f93f0028 e93d0000 712a1000 41820424 ebdd0030
[ 12.341218] 41920034 e8fd0008 811e0000 e95f0038 <e8c70000> 38e80001 81060090 90fe0000
[ 12.341281] ---[ end trace f65cb3ebef53fce5 ]---
[ 12.343113]
[ 12.408425] systemd-journald[2059]: Received request to flush runtime journal from PID 1
[ 12.447465] Adding 1670336k swap on /swapfile. Priority:-1 extents:1 across:1670336k FS
[ 12.494156] crypto_register_alg 'aes' = 0
[ 12.495040] crypto_register_alg 'cbc(aes)' = 0
[ 12.495535] crypto_register_alg 'ctr(aes)' = 0
[ 12.496075] crypto_register_alg 'xts(aes)' = 0
[ 13.210945] ------------[ cut here ]------------
[ 13.211085] kernel BUG at /build/linux-OXomHO/linux-4.11.0/arch/powerpc/lib/locks.c:34!
[ 13.211421] Oops: Exception in kernel mode, sig: 5 [#2]
[ 13.211475] SMP NR_CPUS=2048
[ 13.211475] NUMA
[ 13.211517] pSeries
[ 13.211794] Modules linked in: vmx_crypto ip_tables x_tables autofs4 jfs dm_service_time ibmveth ibmvscsi crc32c_vpmsum scsi_dh_emc scsi_dh_rdac scsi_dh_alua dm_multipath
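
To make the oops above easier to scan, the following Python sketch extracts `symbol+0xoffset/0xsize [module]` frames from the trace and keeps the ones that belong to the jfs module. It is only an illustration: `parse_frames` is a hypothetical helper, and the regular expression assumes the frame format seen in this particular log.

```python
import re

# A few lines copied from the oops above.
SAMPLE = """\
[ 12.340436] NIP [d0000000043be02c] __get_metapage+0x204/0x6b0 [jfs]
[ 12.340671] [c000000006f7b810] [d0000000043b9210] dtSearch+0x4d8/0xa20 [jfs]
[ 12.340731] [c000000006f7b930] [d0000000043a3a88] jfs_lookup+0x80/0x100 [jfs]
[ 12.340789] [c000000006f7ba60] [c0000000003683c0] lookup_slow+0xe0/0x240
"""

FRAME_RE = re.compile(
    r"([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)(?: \[(\w+)\])?"
)


def parse_frames(text):
    """Return (symbol, offset, size, module) tuples; module is None for built-in code."""
    frames = []
    for m in FRAME_RE.finditer(text):
        frames.append(
            (m.group(1), int(m.group(2), 16), int(m.group(3), 16), m.group(4))
        )
    return frames


frames = parse_frames(SAMPLE)
jfs_frames = [f for f in frames if f[3] == "jfs"]
for symbol, offset, size, module in jfs_frames:
    print(f"{module}: {symbol} +{offset:#x} of {size:#x}")
```

On the sample lines this lists `__get_metapage`, `dtSearch` and `jfs_lookup`, matching the path `jfs_lookup` -> `dtSearch` -> `__get_metapage` visible in the trace.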

The crashes are easily reproducible on ppc64el when installing artful from the latest mini.iso. As soon as I manage to boot to a proper command line, I'll try to run apport-collect.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1711009

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: artful

I can't run apport-collect; it looks like there is nowhere to save the apport data in this kind of case, so I can't transmit it anywhere. I'll see if I can reproduce this on amd64, which should make it easier.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.13 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.13-rc4

Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: kernel-key
Changed in linux (Ubuntu Artful):
status: Confirmed → Triaged
Seth Forshee (sforshee) wrote :

Note too that we now have a 4.12 kernel in artful-release, so you might give that a spin.

I can't run apport-collect in that environment, but this is trivial to reproduce in any ppc64el qemu VM: just use JFS as the filesystem for the root partition.

I can provide scripts to help set up the VM.

Changed in linux (Ubuntu):
status: Triaged → Confirmed
tags: added: kernel-da-key
removed: kernel-key

This bug was nominated against a series that is no longer supported, i.e. artful. The bug task representing the artful nomination is being closed as Won't Fix.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu Artful):
status: Confirmed → Won't Fix