File System inaccessible after 18.04 upgrade

Bug #1788004 reported by Chris Claggett
This bug affects 1 person
Affects: zfs-linux (Ubuntu)
Status: Won't Fix
Importance: Medium
Assigned to: Unassigned
Milestone: —

Bug Description

The zpool was functioning before upgrading to Ubuntu 18.04. Unfortunately, I didn't confirm the pool was working before upgrading it, so I can't roll back. I can no longer access the file system, though a scrub did succeed.

A simple 'ls -la' segfaults, and this shows up in dmesg:
[ 320.449464] detected buffer overflow in memmove
[ 320.449489] ------------[ cut here ]------------
[ 320.449492] kernel BUG at /build/linux-wuhukg/linux-4.15.0/lib/string.c:1052!
[ 320.449503] invalid opcode: 0000 [#1] SMP PTI
[ 320.449506] Modules linked in: thunderbolt nls_iso8859_1 zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) snd_hda_codec_hdmi intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_hda_codec_ca0132 irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel snd_hda_intel snd_hda_codec aes_x86_64 crypto_simd snd_hda_core glue_helper cryptd intel_cstate snd_hwdep intel_rapl_perf snd_pcm intel_wmi_thunderbolt snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer snd soundcore input_leds mac_hid mei_me mei shpchp acpi_pad intel_pch_thermal sch_fq_codel vhba(OE) parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid i915 mxm_wmi e1000e i2c_algo_bit ptp drm_kms_helper pps_core syscopyarea sysfillrect
[ 320.449599] sysimgblt fb_sys_fops alx drm mdio ahci libahci wmi video
[ 320.449617] CPU: 3 PID: 3202 Comm: updatedb.mlocat Tainted: P W OE 4.15.0-32-generic #35-Ubuntu
[ 320.449621] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./Z170X-Gaming 7, BIOS F8 08/26/2016
[ 320.449632] RIP: 0010:fortify_panic+0x13/0x22
[ 320.449636] RSP: 0018:ffffa85fc5d27898 EFLAGS: 00010282
[ 320.449641] RAX: 0000000000000023 RBX: ffffa85fc5d279a8 RCX: 0000000000000000
[ 320.449645] RDX: 0000000000000000 RSI: ffff97dc51d96498 RDI: ffff97dc51d96498
[ 320.449649] RBP: ffffa85fc5d27898 R08: 0000000000000000 R09: 000000000000058c
[ 320.449652] R10: ffff97dc2916501c R11: ffffffffb895380d R12: 0000000000000001
[ 320.449656] R13: ffff97dc2a717460 R14: 0000000000000070 R15: ffff97dc21e40300
[ 320.449661] FS: 00007fb9bd9c4540(0000) GS:ffff97dc51d80000(0000) knlGS:0000000000000000
[ 320.449665] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 320.449669] CR2: 00007fa69800e778 CR3: 0000000812c88005 CR4: 00000000003606e0
[ 320.449673] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 320.449677] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 320.449680] Call Trace:
[ 320.449779] zfs_acl_node_read.constprop.15+0x33d/0x340 [zfs]
[ 320.449867] zfs_zaccess_aces_check+0x96/0x380 [zfs]
[ 320.449874] ? mutex_lock+0x12/0x40
[ 320.449879] ? _cond_resched+0x19/0x40
[ 320.449929] ? arc_buf_access+0x14a/0x270 [zfs]
[ 320.449977] ? dbuf_rele_and_unlock+0x1a8/0x4b0 [zfs]
[ 320.450056] zfs_zaccess_common+0xda/0x1f0 [zfs]
[ 320.450062] ? _cond_resched+0x19/0x40
[ 320.450139] zfs_zaccess+0xd4/0x3d0 [zfs]
[ 320.450235] ? rrw_enter_read_impl+0xae/0x160 [zfs]
[ 320.450327] zfs_lookup+0x1c5/0x3a0 [zfs]
[ 320.450415] zpl_lookup+0xc9/0x1e0 [zfs]
[ 320.450424] lookup_slow+0xab/0x170
[ 320.450432] walk_component+0x1c3/0x470
[ 320.450438] ? _cond_resched+0x19/0x40
[ 320.450444] ? mutex_lock+0x12/0x40
[ 320.450451] path_lookupat+0x84/0x1f0
[ 320.450458] ? ___slab_alloc+0xf2/0x4b0
[ 320.450545] ? zfs_readdir+0x267/0x460 [zfs]
[ 320.450552] ? ___slab_alloc+0xf2/0x4b0
[ 320.450560] filename_lookup+0xb6/0x190
[ 320.450569] ? __check_object_size+0xaf/0x1b0
[ 320.450578] ? strncpy_from_user+0x4d/0x170
[ 320.450586] user_path_at_empty+0x36/0x40
[ 320.450592] ? user_path_at_empty+0x36/0x40
[ 320.450600] vfs_statx+0x76/0xe0
[ 320.450606] ? _cond_resched+0x19/0x40
[ 320.450611] ? dput.part.22+0x2d/0x1e0
[ 320.450619] SYSC_newlstat+0x3d/0x70
[ 320.450628] ? SyS_poll+0x9b/0x130
[ 320.450635] ? SyS_poll+0x9b/0x130
[ 320.450643] SyS_newlstat+0xe/0x10
[ 320.450651] do_syscall_64+0x73/0x130
[ 320.450659] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 320.450664] RIP: 0033:0x7fb9bd4d2815
[ 320.450669] RSP: 002b:00007ffe3d613f38 EFLAGS: 00000246 ORIG_RAX: 0000000000000006
[ 320.450675] RAX: ffffffffffffffda RBX: 00005632906b5059 RCX: 00007fb9bd4d2815
[ 320.450679] RDX: 00007ffe3d613fb0 RSI: 00007ffe3d613fb0 RDI: 00005632906b5059
[ 320.450684] RBP: 00005632906ab600 R08: 0000000000000073 R09: 0000000000000073
[ 320.450688] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe3d614190
[ 320.450692] R13: 0000000000000000 R14: 00007ffe3d614180 R15: 0000000000000000
[ 320.450697] Code: e0 4c 89 e2 e8 61 6a 00 00 42 c6 04 20 00 48 89 d8 5b 41 5c 5d c3 0f 0b 55 48 89 fe 48 c7 c7 f0 10 1a b8 48 89 e5 e8 7f cf 76 ff <0f> 0b 90 90 90 90 90 90 90 90 90 90 90 90 90 55 31 c9 48 89 fa
[ 320.450788] RIP: fortify_panic+0x13/0x22 RSP: ffffa85fc5d27898
[ 320.450794] ---[ end trace 8e2de80852d9eb6a ]---

Segfault, zpool status, and dataset layout:
chris@God:/storage/media$ ls -la
Segmentation fault
chris@RED:/storage$ sudo zpool status && sudo zfs list -t all
  pool: storage
 state: ONLINE
  scan: scrub repaired 0B in 17h8m with 0 errors on Thu Aug 16 00:14:21 2018
config:

 NAME STATE READ WRITE CKSUM
 storage ONLINE 0 0 0
   raidz2-0 ONLINE 0 0 0
     ata-WDC_WD4000F9YZ-09N20L0_WD-WCC5D0027402 ONLINE 0 0 0
     ata-WDC_WD4002FYYZ-01B7CB0_K4KAWS9B ONLINE 0 0 0
     ata-WDC_WD4002FYYZ-01B7CB0_K4K8WXMB ONLINE 0 0 0
     ata-WDC_WD4000F9YZ-09N20L0_WD-WCC5D0039836 ONLINE 0 0 0
     ata-WDC_WD4000F9YZ-09N20L0_WD-WCC4A0025837 ONLINE 0 0 0
     ata-WDC_WD4000F9YZ-09N20L1_WD-WMC130338481 ONLINE 0 0 0
     ata-WDC_WD4000F9YZ-09N20L0_WD-WCC5D0047299 ONLINE 0 0 0
     ata-WDC_WD4002FYYZ-01B7CB0_K4J9X2KB ONLINE 0 0 0
     ata-WDC_WD4002FYYZ-01B7CB1_K3GX8UHL ONLINE 0 0 0
     ata-WDC_WD4000F9YZ-09N20L1_WD-WMC130E1FNR9 ONLINE 0 0 0
 logs
   ata-OCZ_SOLID_SSD_MK0209030A4760013-part4 ONLINE 0 0 0
 cache
   ata-OCZ_SOLID_SSD_MK0209030A4760013-part5 ONLINE 0 0 0

errors: No known data errors
NAME            USED  AVAIL  REFER  MOUNTPOINT
storage        18.9T  7.89T   201K  /storage
storage/media  18.6T  7.89T  18.6T  /storage/media
storage/steve   282G  7.89T   282G  /storage/steve

System info:
Description: Ubuntu 18.04.1 LTS
Release: 18.04

Package:
zfsutils-linux:
  Installed: 0.7.5-1ubuntu16.3
  Candidate: 0.7.5-1ubuntu16.3
  Version table:
 *** 0.7.5-1ubuntu16.3 500
        500 http://us.archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages
        100 /var/lib/dpkg/status
     0.7.5-1ubuntu15 500
        500 http://us.archive.ubuntu.com/ubuntu bionic/main amd64 Packages

This is my first time reporting a bug; let me know if anything else would be helpful.

Revision history for this message
Chris Siebenmann (cks) wrote :

Based on inspecting the ZFS source code, this looks like a ZFS inode where there is a mismatch between the claimed size of the inode's (ZFS) ACL and how it is stored. If you're willing to do some work, it might be possible to narrow this down to a specific inode and identify what's wrong with it, which could help fix this (in Ubuntu and upstream). Unfortunately this will be a somewhat involved process using zdb, because you need to dump on-disk structures. If you're interested and willing to tackle this, your best approach is probably to ask for help on the ZFS on Linux mailing list, because it will be easier to give the details there.
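
For illustration, a rough sketch of how that narrowing-down might start; the file name and object number below are hypothetical placeholders, and 'zdb -O' assumes a zdb build that supports that option:

# trace ls and note the last lstat() before the segfault
strace -e trace=lstat ls -la /storage/media 2>&1 | tail
# resolve that name to its object number inside the dataset
sudo zdb -O storage/media suspect-file
# dump the object's on-disk znode, including its ACL
# (12345 is a placeholder object number from the previous step)
sudo zdb -dddd storage/media 12345

Interpreting the resulting znode/ACL dump is the part best worked out on the mailing list.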

It is possible that a 'zfs send | zfs receive' sequence will allow you to recover a usable version of the filesystem (and I believe you can do this without the filesystem being mounted, and thus without the risk of a panic). If you have, say, a spare external USB drive, it might be worth trying this.
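
A minimal sketch of that recovery path, assuming the pool layout above; the spare-drive device /dev/sdX and the 'rescue' pool and snapshot names are placeholders:

# snapshot the damaged filesystem (does not require it to be mounted)
sudo zfs snapshot storage/media@rescue
# create a temporary pool on the spare drive
sudo zpool create rescue /dev/sdX
# replicate the snapshot to the spare pool
sudo zfs send storage/media@rescue | sudo zfs receive rescue/media

Because 'zfs send' reads the snapshot directly rather than going through the POSIX lookup path that panics here, it should be able to copy the data while the filesystem stays unmounted.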

Revision history for this message
Chris Claggett (sadiesue43) wrote :

Thanks for taking a look at this, Chris. I've started a thread on the ZoL mailing list here:

http://list.zfsonlinux.org/pipermail/zfs-discuss/2018-September/032181.html

I'll be updating it with more information today.

Revision history for this message
Chris Claggett (sadiesue43) wrote :

A ZoL bug has also been opened for this, with more information:
https://github.com/zfsonlinux/zfs/issues/7910

Revision history for this message
Colin Ian King (colin-king) wrote :

It looks like the upstream bug report has been closed as stale due to inactivity, so I'm going to close this bug as well. If it requires more attention, please feel free to re-open this issue.

Changed in zfs-linux (Ubuntu):
importance: Undecided → Medium
status: New → Won't Fix