ocfs2 shared volume causing hang
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Confirmed | Undecided | Unassigned |
linux-signed (Ubuntu) | Confirmed | Undecided | Unassigned |
Bug Description
This VM is running on a VMware-backed system with a multi-writer shared volume mounted via ocfs2.
With kernel version 4.4.0-187 or 4.4.0-189, any attempt to write to the volume results in kernel errors:
root@hostname:
Killed
With kernel version 4.4.0-186 this issue does not occur.
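For anyone trying to confirm this on their own node, here is a minimal sketch of a write probe. It performs the write in a child process with a timeout, so the login shell is not the process that ends up killed or stuck. The function name `probe_write`, the probe file name, and the 10 second timeout are made up for illustration and are not part of the original report.

```python
import os
import subprocess
import sys


def probe_write(directory, timeout_s=10.0):
    """Attempt one small write under `directory` from a child process.

    Returns "ok" if the child wrote, synced and removed the file,
    "failed" if the child exited non-zero or was killed by a signal,
    and "hung" if it did not finish within `timeout_s` seconds.
    """
    # Hypothetical probe file name, chosen only for this sketch.
    target = os.path.join(directory, "ocfs2-write-probe.tmp")
    child_code = (
        "import os, sys\n"
        "path = sys.argv[1]\n"
        "with open(path, 'wb') as f:\n"
        "    f.write(b'probe')\n"
        "    f.flush()\n"
        "    os.fsync(f.fileno())\n"
        "os.remove(path)\n"
    )
    child = subprocess.Popen([sys.executable, "-c", child_code, target])
    try:
        return_code = child.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Best effort only: a task in uninterruptible sleep will not die,
        # so we deliberately do not wait for it again.
        child.kill()
        return "hung"
    return "ok" if return_code == 0 else "failed"
```

Calling `probe_write("/path/to/ocfs2/mount")` on an affected kernel would be expected to return "failed" or "hung" rather than "ok", though that is an assumption based on the symptoms described here. On a healthy filesystem it should return "ok".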
root@hostname:~# lsb_release -rd
Description: Ubuntu 16.04.7 LTS
Release: 16.04
package
ii linux-image-
What I expect to happen: writing to an ocfs2 shared disk should not crash.
What happens: writing to an ocfs2 shared disk crashes the terminal/ssh session, makes the VM unusable, and logs errors in dmesg.
[Sep 3 04:09] general protection fault: 0000 [#1] SMP
[ +0.000346] Modules linked in: mptctl ocfs2 quota_tree ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs vmw_balloon joydev input_leds serio_raw shpchp i2c_piix4 vmw_vsock_
[ +0.001534] CPU: 0 PID: 1553 Comm: java Not tainted 4.4.0-189-generic #219-Ubuntu
[ +0.000049] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
[ +0.000067] task: ffff880073863900 ti: ffff8800798e0000 task.ti: ffff8800798e0000
[ +0.000047] RIP: 0010:[<
[ +0.000086] RSP: 0018:ffff880079
[ +0.000035] RAX: 0000000000020000 RBX: 6d612d7972616e69 RCX: 00000000ffffffff
[ +0.000046] RDX: 0000000080000000 RSI: 0000000000000009 RDI: 6d612d7972616ef1
[ +0.000045] RBP: ffff8800798e3998 R08: 0000000000000000 R09: 0000000000000000
[ +0.000045] R10: ffff880075729060 R11: 0000000000008000 R12: 6d612d7972616ef1
[ +0.000044] R13: ffff8800748aeaa8 R14: ffff880076399650 R15: 00000000ffffffff
[ +0.000046] FS: 00007f9a50ad970
[ +0.000051] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0.000037] CR2: 00007f9a1d4f2000 CR3: 0000000079936000 CR4: 0000000000160670
[ +0.000160] Stack:
[ +0.000019] ffff8800798e39b8 ffffffff8123953e 0000000000000009 ffff8800748ae000
[ +0.000055] ffff8800798e3a38 ffffffffc057bb77 ffffffffc053309a 0000000000000000
[ +0.000054] ffff88007958e498 ffff88007958e4e0 0000000100000000 ffff88007958e3c8
[ +0.000055] Call Trace:
[ +0.000025] [<ffffffff81239
[ +0.000084] [<ffffffffc057b
[ +0.000088] [<ffffffffc0533
[ +0.000072] [<ffffffffc0520
[ +0.000075] [<ffffffffc0564
[ +0.000074] [<ffffffffc0564
[ +0.001659] [<ffffffffc0564
[ +0.001326] [<ffffffffc0547
[ +0.001622] [<ffffffffc052a
[ +0.001571] [<ffffffffc0548
[ +0.001318] [<ffffffff8122b
[ +0.001218] [<ffffffff8122d
[ +0.001255] [<ffffffff8123a
[ +0.001167] [<ffffffff8122f
[ +0.001132] [<ffffffff8121e
[ +0.001131] [<ffffffff8123d
[ +0.001072] [<ffffffff8121d
[ +0.001040] [<ffffffff8121d
[ +0.001028] [<ffffffff81869
[ +0.001041] Code: e2 fe 8d 72 02 0f b7 f6 0f 1f 80 00 00 00 00 eb d2 66 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 b8 00 00 02 00 <f0> 0f c1 07 89 c2 c1 ea 10 66 39 c2 75 02 5d c3 41 89 d0 0f b7
[ +0.003174] RIP [<ffffffff81869
[ +0.001007] RSP <ffff8800798e3998>
[ +0.002867] ---[ end trace 6fd21d2a8e763939 ]---
-------
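A note on the register values in the general protection fault above, offered as a reading aid rather than a diagnosis. RBX and RDI are not canonical x86-64 addresses, which is consistent with a general protection fault rather than a page fault. The small sketch below (helper name `pointer_as_ascii` is made up) shows that the RBX bytes are printable ASCII and that RDI sits at a fixed offset from RBX.

```python
import struct


def pointer_as_ascii(value):
    """Render a 64-bit register value as the 8 bytes it would occupy in
    little-endian memory, replacing non-printable bytes with '.'."""
    raw = struct.pack("<Q", value)
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# Values copied from the first trace (RBX and RDI lines).
rbx = 0x6D612D7972616E69
rdi = 0x6D612D7972616EF1

print(pointer_as_ascii(rbx))  # inary-am
print(hex(rdi - rbx))         # 0x88
```

The same subtraction on the later traces (RBX ffffffffc046a000 / RDI ffffffffc046a088, and RBX 0000000200000000 / RDI 0000000200000088) also gives 0x88. One possible interpretation is that the code is taking a lock at offset 0x88 inside a structure whose pointer is stale or has been overwritten, in the first case with string data. That is a guess from the register dump only.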
[Sep 4 03:56] (java,5847,
[ +0.000098] (java,5847,
[ +0.000047] (java,5847,
[ +0.000046] (java,5847,
[ +0.000049] (java,5847,
[ +0.000040] (java,5847,
[ +8.987127] (java,2088,
[ +0.000093] (java,2088,
[ +0.000146] (java,2088,
[ +0.000047] (java,2088,
[ +0.000040] (java,2088,
[ +0.000039] (java,2088,
[ +9.060788] (java,2090,
[ +0.000065] (java,2090,
[ +0.000062] (java,2090,
[ +0.000047] (java,2090,
[ +0.000039] (java,2090,
[ +0.000039] (java,2090,
[Sep 4 03:57] (java,2088,
[ +0.000060] (java,2088,
[ +0.000045] (java,2088,
[ +0.000044] (java,2088,
[ +0.000039] (java,2088,
[ +0.000037] (java,2088,
[ +15.482181] (java,2088,
[ +0.000095] (java,2088,
[ +0.000157] (java,2088,
[ +0.000162] (java,2088,
[ +0.000148] (java,2088,
[ +0.000148] (java,2088,
[Sep 4 04:13] (java,5848,
[ +0.000097] (java,5848,
[ +0.000191] (java,5848,
[ +0.000902] (java,5848,
[ +0.000771] (java,5848,
[ +0.000726] (java,5848,
[ +18.618350] (java,5846,
[ +0.000767] (java,5846,
[ +0.000645] (java,5846,
[ +0.000639] (java,5846,
[ +0.000650] (java,5846,
[ +0.000631] (java,5846,
[Sep 4 04:14] (java,5848,
[ +0.000717] (java,5848,
[ +0.000671] (java,5848,
[ +0.000677] (java,5848,
[ +0.000677] (java,5848,
[ +0.000692] (java,5848,
[ +6.859084] (java,5849,
[ +0.001405] (java,5849,
[ +0.001280] (java,5849,
[ +0.001289] (java,5849,
[ +0.001327] (java,5849,
[ +0.001296] (java,5849,
[ +15.674889] (java,5848,
[ +0.000767] (java,5848,
[ +0.000693] (java,5848,
[ +0.000651] (java,5848,
[ +0.000641] (java,5848,
[ +0.000680] (java,5848,
[ +8.095135] (java,5849,
[ +0.000872] (java,5849,
[ +0.000666] (java,5849,
[ +0.000655] (java,5849,
[ +0.000648] (java,5849,
[ +0.000645] (java,5849,
[Sep 4 04:17] (java,2071,
[ +0.001019] (java,2071,
[ +0.000387] (java,2071,
[ +0.000353] (java,2071,
[ +0.000345] (java,2071,
[ +0.000343] (java,2071,
[ +36.327665] (java,2076,
[ +0.000794] (java,2076,
[ +0.000658] (java,2076,
[ +0.000653] (java,2076,
[ +0.000716] (java,2076,
[ +0.000652] (java,2076,
[Sep 4 04:18] (java,2077,
[ +0.000434] (java,2077,
[ +0.000346] (java,2077,
[ +0.000337] (java,2077,
[ +0.000349] (java,2077,
[ +0.000333] (java,2077,
[ +18.813455] (java,2069,
[ +0.000750] (java,2069,
[ +0.000706] (java,2069,
[ +0.000705] (java,2069,
[ +0.000642] (java,2069,
[ +0.000674] (java,2069,
[Sep 4 04:19] (java,2071,
[ +0.000777] (java,2071,
[ +0.000658] (java,2071,
[ +0.000650] (java,2071,
[ +0.000657] (java,2071,
[ +0.000644] (java,2071,
[ +34.510080] (java,2072,
[ +0.000811] (java,2072,
[ +0.000676] (java,2072,
[ +0.000663] (java,2072,
[ +0.000654] (java,2072,
[ +0.000686] (java,2072,
[Sep 4 07:12] BUG: unable to handle kernel paging request at ffffffffc046a088
[ +0.000540] IP: [<ffffffff81869
[ +0.000472] PGD 1e0f067 PUD 1e11067 PMD 79abe067 PTE 77bc9061
[ +0.000475] Oops: 0003 [#1] SMP
[ +0.000455] Modules linked in: mptctl ocfs2 quota_tree ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs vmw_balloon joydev input_leds serio_raw shpchp i2c_piix4 vmw_vsock_
[ +0.004701] CPU: 0 PID: 2090 Comm: java Not tainted 4.4.0-189-generic #219-Ubuntu
[ +0.000653] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
[ +0.001325] task: ffff880078e0d580 ti: ffff88007792c000 task.ti: ffff88007792c000
[ +0.000695] RIP: 0010:[<
[ +0.000730] RSP: 0018:ffff880077
[ +0.000738] RAX: 0000000000020000 RBX: ffffffffc046a000 RCX: 00000000ffffffff
[ +0.000731] RDX: 0000000080000000 RSI: 0000000000000009 RDI: ffffffffc046a088
[ +0.000741] RBP: ffff88007792f998 R08: 0000000000000000 R09: 0000000000000000
[ +0.000747] R10: ffff880078f15c00 R11: 0000000000008000 R12: ffffffffc046a088
[ +0.000745] R13: ffff8800768e8aa8 R14: ffff8800748bcc50 R15: 00000000ffffffff
[ +0.000763] FS: 00007fd5da2ae70
[ +0.000804] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0.000779] CR2: ffffffffc046a088 CR3: 0000000078e2e000 CR4: 0000000000160670
[ +0.000868] Stack:
[ +0.000799] ffff88007792f9b8 ffffffff8123953e 0000000000000009 ffff8800768e8000
[ +0.000811] ffff88007792fa38 ffffffffc0594b77 ffffffffc054c09a 0000000000000000
[ +0.000850] ffff880079193c98 ffff880079193ce0 0000000100000000 ffff880079193bc8
[ +0.000851] Call Trace:
[ +0.000832] [<ffffffff81239
[ +0.001077] [<ffffffffc0594
[ +0.000884] [<ffffffffc054c
[ +0.000866] [<ffffffffc0539
[ +0.000928] [<ffffffffc057d
[ +0.000934] [<ffffffffc057d
[ +0.000896] [<ffffffffc057d
[ +0.000886] [<ffffffffc0560
[ +0.000891] [<ffffffffc0543
[ +0.000924] [<ffffffffc0561
[ +0.000860] [<ffffffff8122b
[ +0.000834] [<ffffffff8122d
[ +0.000812] [<ffffffff810bc
[ +0.000787] [<ffffffff8122f
[ +0.000785] [<ffffffff81864
[ +0.000753] [<ffffffff81864
[ +0.000723] [<ffffffff8123d
[ +0.000732] [<ffffffff8121d
[ +0.000692] [<ffffffff81864
[ +0.000672] [<ffffffff8121d
[ +0.000636] [<ffffffff81869
[ +0.000640] Code: e2 fe 8d 72 02 0f b7 f6 0f 1f 80 00 00 00 00 eb d2 66 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 b8 00 00 02 00 <f0> 0f c1 07 89 c2 c1 ea 10 66 39 c2 75 02 5d c3 41 89 d0 0f b7
[ +0.002030] RIP [<ffffffff81869
[ +0.000658] RSP <ffff88007792f998>
[ +0.000639] CR2: ffffffffc046a088
[ +0.001869] ---[ end trace a0b6053f39d0bb33 ]---
-------
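For reading the "Oops: 0003" line in the trace above: on x86 the number after "Oops:" in a paging-request report is the page-fault error code, and its low bits can be decoded as in the sketch below. The function and table names are made up for illustration. This does not apply to the "general protection fault: 0000" line, which is a different exception.

```python
# (mask, meaning when set, meaning when clear) for x86 page-fault error codes.
PAGE_FAULT_BITS = [
    (0x01, "protection violation (page present)", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_oops_code(code_hex):
    """Turn the hex string printed after 'Oops:' into readable flags."""
    code = int(code_hex, 16)
    flags = []
    for mask, if_set, if_clear in PAGE_FAULT_BITS:
        if code & mask:
            flags.append(if_set)
        elif if_clear is not None:
            flags.append(if_clear)
    return flags


print(decode_oops_code("0003"))
# ['protection violation (page present)', 'write access', 'kernel mode']
```

So 0003 here reads as a kernel-mode write to a page that is mapped but not writable. That fits the "PTE 77bc9061" line, where bit 1 (the writable bit) of the entry is clear. The 0002 code in the 4.15 trace further down decodes to a kernel-mode write to a page that is not present.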
[Sep 8 14:37] INFO: task ls:3402 blocked for more than 120 seconds.
[ +0.000509] Not tainted 4.4.0-189-generic #219-Ubuntu
[ +0.000241] "echo 0 > /proc/sys/
[ +0.000291] ls D ffff880033757978 0 3402 2166 0x00000000
[ +0.000017] ffff880033757978 ffff88003375794c ffff88007a869c80 ffff880073c28e40
[ +0.000006] ffff880033758000 7fffffffffffffff ffff880033757b18 ffff880073c28e40
[ +0.000004] 0000000000000000 ffff880033757990 ffffffff81864f15 ffff880033757b20
[ +0.000005] Call Trace:
[ +0.000028] [<ffffffff81864
[ +0.000015] [<ffffffff81868
[ +0.000007] [<ffffffff81865
[ +0.000018] [<ffffffff810b3
[ +0.000088] [<ffffffffc0526
[ +0.000034] [<ffffffffc0527
[ +0.000006] [<ffffffff811fb
[ +0.000032] [<ffffffffc052b
[ +0.000008] [<ffffffff81229
[ +0.000004] [<ffffffff81229
[ +0.000005] [<ffffffff8122c
[ +0.000007] [<ffffffff8135d
[ +0.000006] [<ffffffff8122c
[ +0.000005] [<ffffffff8122f
[ +0.000008] [<ffffffff8123d
[ +0.000006] [<ffffffff8121d
[ +0.000007] [<ffffffff8106e
[ +0.000005] [<ffffffff8121d
[ +0.000006] [<ffffffff81869
-------
[ +0.000214] (cp,2020,
[ +0.000614] (cp,2020,
[ +0.000224] (cp,2020,
[ +0.000227] (cp,2020,
[ +0.000206] (cp,2020,
[ +0.000206] (cp,2020,
[ +0.000202] (cp,2020,
[ +0.000200] (cp,2020,
[ +0.000204] (cp,2020,
[ +0.000207] (cp,2020,
[ +0.000204] (cp,2020,
[ +0.000205] (cp,2020,
-------
ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: linux-image-
ProcVersionSign
Uname: Linux 4.4.0-187-generic x86_64
ApportVersion: 2.20.1-0ubuntu2.24
Architecture: amd64
Date: Wed Sep 9 14:37:18 2020
InstallationDate: Installed on 2016-10-21 (1419 days ago)
InstallationMedia: Ubuntu-Server 16.04.1 LTS "Xenial Xerus" - Release amd64 (20160719)
ProcEnviron:
LANGUAGE=en_CA:en
TERM=xterm
PATH=(custom, no user)
LANG=en_CA.UTF-8
SHELL=/bin/bash
SourcePackage: linux-signed
UpgradeStatus: No upgrade log present (probably fresh install)
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
I see the same problem on Ubuntu 18.04 with kernels 4.15.0-115 and 4.15.0-117. There is massive trouble on ocfs2-based mounts: writing isn't possible. This is a serious issue, as all clustered systems are not working. Processes that try to write to the filesystem hang and cannot be killed; only a reboot clears them.
Linux 10397-w2 4.15.0-117-generic #118-Ubuntu SMP Fri Sep 4 20:02:41 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Description: Ubuntu 18.04.5 LTS
Release: 18.04
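To check a fleet of nodes quickly, something like the following sketch could flag machines running one of the kernel builds named in this report. The set below contains only the versions mentioned here (4.4.0-187, 4.4.0-189, 4.15.0-115, 4.15.0-117) and is not an authoritative list of affected kernels. The function names are made up for illustration.

```python
import platform

# Kernel builds reported as affected in this bug (not an exhaustive list).
REPORTED_AFFECTED = {"4.4.0-187", "4.4.0-189", "4.15.0-115", "4.15.0-117"}


def base_release(release):
    """Strip the flavour suffix: '4.15.0-117-generic' -> '4.15.0-117'."""
    return "-".join(release.split("-")[:2])


def is_reported_affected(release=None):
    """Return True if `release` (default: the running kernel, as reported
    by platform.release()) is one of the builds named in this report."""
    if release is None:
        release = platform.release()
    return base_release(release) in REPORTED_AFFECTED


print(is_reported_affected("4.15.0-117-generic"))  # True
print(is_reported_affected("4.4.0-186-generic"))   # False
```

Called without an argument, `is_reported_affected()` checks the kernel the script is running on.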
Sep 10 08:33:15 10397-w2 kernel: [55129.020995] BUG: unable to handle kernel paging request at 0000000200000088
Sep 10 08:33:15 10397-w2 kernel: [55129.021604] IP: _raw_spin_lock+0x10/0x30
Sep 10 08:33:15 10397-w2 kernel: [55129.021942] PGD 0 P4D 0
Sep 10 08:33:15 10397-w2 kernel: [55129.022162] Oops: 0002 [#1] SMP PTI
Sep 10 08:33:15 10397-w2 kernel: [55129.022464] Modules linked in: ocfs2 quota_tree ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 nf_log_ipv6 ip6table_filter ip6_tables xt_comment ipt_REJECT nf_reject_ipv4 xt_owner xt_tcpudp xt_conntrack nf_log_ipv4 nf_log_common xt_LOG iptable_filter nf_conntrack_ftp nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ppdev pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd intel_cstate intel_rapl_perf serio_raw intel_pch_thermal shpchp ie31200_edac wmi parport_pc parport mac_hid video acpi_pad sch_fq_codel drbd lru_cache ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456
Sep 10 08:33:15 10397-w2 kernel: [55129.028575] async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear raid1 e1000e ahci nvme ptp psmouse libahci nvme_core pps_core
Sep 10 08:33:15 10397-w2 kernel: [55129.029958] CPU: 6 PID: 22360 Comm: rsync Not tainted 4.15.0-117-generic #118-Ubuntu
Sep 10 08:33:15 10397-w2 kernel: [55129.030627] Hardware name: FUJITSU D3417-B1/D3417-B1, BIOS V5.0.0.11 R1.21.0.SR.2 for D3417-B1x 07/14/2017
Sep 10 08:33:15 10397-w2 kernel: [55129.031566] RIP: 0010:_raw_spin_lock+0x10/0x30
Sep 10 08:33:15 10397-w2 kernel: [55129.031979] RSP: 0018:ffffb763063c7a68 EFLAGS: 00010246
Sep 10 08:33:15 10397-w2 kernel: [55129.032430] RAX: 0000000000000000 RBX: 0000000200000000 RCX: 00000000ffffffff
Sep 10 08:33:15 10397-w2 kernel: [55129.033045] RDX: 0000000000000001 RSI: 0000000000000008 RDI: 0000000200000088
Sep 10 08:33:15 10397-w2 kernel: [55129.047260] RBP: ffffb763063c7a68 R08: 0000000000000000 R09: 0000000000000000
Sep 10 08:33:15 10397-w2 kernel: [55129.061473] R10: 0000000000000005 R11: 0000000000000c65 R12: 0000000200000088
Sep 10 08:33:15 10397-w2 kernel: [55129.075691] R13: ffff8c862039c348 R14: ffff8c861ee34a08 R15: 00000000ffffffff
Sep 10 08:33:15 10397-w2 kernel: [55129.089801] FS: 00007f38b6f12b80(0000) GS:ffff8c866e580000(0000) knlGS:0000000000000000
Sep 10 08:33:15 10397-w2 kernel: [55129.117370] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 10 08:33:15 10397-w2 kernel: [55129.131552] CR2: 0000000200000088 CR3: 0000000969088003 CR4: 00000000003606e0
Sep ...