zfs send encrypt causes kernel NULL pointer dereference

Bug #1940083 reported by Simon Déziel
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Confirmed
High
Dimitri John Ledkov

Bug Description

zfs send -I works well:

# uname -a
Linux sdeziel-desktop 5.11.0-27-generic #29~20.04.1-Ubuntu SMP Wed Aug 11 15:58:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
# zfs send -I data/simon/musique@laptop-2020-11-30 data/simon/musique@o | wc -c
5660616
# dmesg -c
#

but when sending encrypted data sets, some snapshot combinations cause problem:

# zfs send -wI data/simon/musique@laptop-2020-11-30 data/simon/musique@o | wc -c
# # hung as nothing is written to the pipe.

# dmesg -c
[ 1179.862792] BUG: kernel NULL pointer dereference, address: 0000000000000030
[ 1179.862798] #PF: supervisor read access in kernel mode
[ 1179.862801] #PF: error_code(0x0000) - not-present page
[ 1179.862803] PGD 0 P4D 0
[ 1179.862806] Oops: 0000 [#5] SMP PTI
[ 1179.862809] CPU: 1 PID: 13834 Comm: zfs Tainted: P D IO 5.11.0-27-generic #29~20.04.1-Ubuntu
[ 1179.862813] Hardware name: /D54250WYK, BIOS WYLPT10H.86A.0054.2019.0902.1752 09/02/2019
[ 1179.862815] RIP: 0010:dmu_dump_write+0x22c/0x320 [zfs]
[ 1179.862922] Code: d0 48 89 43 58 49 8b 45 60 48 89 43 38 49 8b 45 68 48 89 43 40 49 8b 45 70 48 89 43 48 49 8b 45 78 48 89 43 50 e9 9f fe ff ff <49> 8b 45 30 45 85 c0 74 39 48 85 c0 78 04 80 4b 31 02 48 8d 53 70
[ 1179.862924] RSP: 0018:ffffa51ae62678f0 EFLAGS: 00010206
[ 1179.862927] RAX: d1141450686f75df RBX: ffff952730fdde00 RCX: 0000000000000000
[ 1179.862929] RDX: 00000000000080f7 RSI: 0000000000000013 RDI: ffff952730fddf38
[ 1179.862931] RBP: ffffa51ae6267938 R08: 0000000001000000 R09: 0000000000020000
[ 1179.862933] R10: 00000000000080f7 R11: 0000000000020000 R12: ffffa51ae6267ab8
[ 1179.862935] R13: 0000000000000000 R14: 0000000000020000 R15: 0000000000000000
[ 1179.862937] FS: 00007fa6a03ac7c0(0000) GS:ffff95298fa80000(0000) knlGS:0000000000000000
[ 1179.862939] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1179.862941] CR2: 0000000000000030 CR3: 000000025a648001 CR4: 00000000001706e0
[ 1179.862943] Call Trace:
[ 1179.862945] ? wait_woken+0x80/0x80
[ 1179.862951] do_dump+0x5da/0x900 [zfs]
[ 1179.863035] ? spl_kmem_free+0x28/0x30 [spl]
[ 1179.863046] dmu_send_impl+0xdf6/0x1570 [zfs]
[ 1179.863129] ? dbuf_rele_and_unlock+0x32e/0x6d0 [zfs]
[ 1179.863204] ? dbuf_rele+0x3d/0x50 [zfs]
[ 1179.863279] ? dmu_buf_rele+0xe/0x10 [zfs]
[ 1179.863354] dmu_send_obj+0x24c/0x360 [zfs]
[ 1179.863435] ? mze_find+0xd4/0xf0 [zfs]
[ 1179.863551] zfs_ioc_send+0x11d/0x2c0 [zfs]
[ 1179.863665] ? zfs_ioc_send+0x2c0/0x2c0 [zfs]
[ 1179.863778] zfsdev_ioctl_common+0x679/0x930 [zfs]
[ 1179.863892] ? __kmalloc_node+0x40d/0x4e0
[ 1179.863899] zfsdev_ioctl+0x57/0xe0 [zfs]
[ 1179.864008] __x64_sys_ioctl+0x91/0xc0
[ 1179.864012] do_syscall_64+0x38/0x90
[ 1179.864018] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1179.864023] RIP: 0033:0x7fa6a09a050b
[ 1179.864026] Code: 0f 1e fa 48 8b 05 85 39 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 55 39 0d 00 f7 d8 64 89 01 48
[ 1179.864028] RSP: 002b:00007ffdb9687fb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1179.864032] RAX: ffffffffffffffda RBX: 000055dd6be41f60 RCX: 00007fa6a09a050b
[ 1179.864033] RDX: 00007ffdb968b5c0 RSI: 0000000000005a1c RDI: 0000000000000003
[ 1179.864035] RBP: 00007ffdb968b580 R08: 0000000000005a1c R09: 0000000000000003
[ 1179.864037] R10: 0000000000000009 R11: 0000000000000246 R12: 00007ffdb968b5c0
[ 1179.864038] R13: 000055dd6be41f70 R14: 0000000000000000 R15: 0000000000000001
[ 1179.864041] Modules linked in: vhost_vsock vmw_vsock_virtio_transport_common vhost vhost_iotlb vsock ccm nf_log_ipv6 ip6table_filter ip6table_nat ip6t_rpfilter ip6table_mangle ip6table_raw ip6_tables nf_log_ipv4 nf_log_common xt_owner xt_LOG xt_conntrack iptable_filter xt_MASQUERADE xt_nat xt_tcpudp iptable_nat nf_nat iptable_mangle iptable_raw bpfilter nls_iso8859_1 zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) intel_rapl_msr spl(O) intel_rapl_common snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio x86_pkg_temp_thermal snd_hda_codec_hdmi intel_powerclamp coretemp snd_hda_intel snd_intel_dspcfg mei_hdcp kvm_intel soundwire_intel at24 soundwire_generic_allocation kvm soundwire_cadence snd_hda_codec snd_hda_core soundwire_bus crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_soc_core rapl snd_usb_audio snd_compress ac97_bus intel_cstate snd_pcm_dmaengine iwldvm uvcvideo videobuf2_vmalloc btusb snd_usbmidi_lib btrtl snd_hwdep btbcm mac80211
[ 1179.864103] snd_seq_midi libarc4 efi_pstore videobuf2_memops btintel snd_seq_midi_event videobuf2_v4l2 wmi_bmof snd_rawmidi iwlwifi videobuf2_common bluetooth snd_pcm ecdh_generic snd_seq videodev ecc snd_seq_device mc snd_timer cfg80211 mei_me snd i2c_i801 mei e1000e lpc_ich i2c_smbus soundcore sch_fq_codel nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 msr ip_tables x_tables autofs4 btrfs blake2b_generic libcrc32c xor raid6_pq hid_generic usbhid hid dm_crypt aesni_intel i915 glue_helper i2c_algo_bit crypto_simd drm_kms_helper cryptd syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core drm ahci libahci xhci_pci wmi xhci_pci_renesas mac_hid video
[ 1179.864157] CR2: 0000000000000030
[ 1179.864159] ---[ end trace 8e7cbf6bc2c0c353 ]---
[ 1179.864524] RIP: 0010:dmu_dump_write+0x22c/0x320 [zfs]
[ 1179.864610] Code: d0 48 89 43 58 49 8b 45 60 48 89 43 38 49 8b 45 68 48 89 43 40 49 8b 45 70 48 89 43 48 49 8b 45 78 48 89 43 50 e9 9f fe ff ff <49> 8b 45 30 45 85 c0 74 39 48 85 c0 78 04 80 4b 31 02 48 8d 53 70
[ 1179.864613] RSP: 0018:ffffa51ac8f878f0 EFLAGS: 00010206
[ 1179.864616] RAX: d1141450686f75df RBX: ffff9526877d4a00 RCX: 0000000000000000
[ 1179.864618] RDX: 00000000000080f7 RSI: 0000000000000013 RDI: ffff9526877d4b38
[ 1179.864620] RBP: ffffa51ac8f87938 R08: 0000000001000000 R09: 0000000000020000
[ 1179.864622] R10: 00000000000080f7 R11: 0000000000020000 R12: ffffa51ac8f87ab8
[ 1179.864623] R13: 0000000000000000 R14: 0000000000020000 R15: 0000000000000000
[ 1179.864625] FS: 00007fa6a03ac7c0(0000) GS:ffff95298fa80000(0000) knlGS:0000000000000000
[ 1179.864628] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1179.864630] CR2: 0000000000000030 CR3: 000000025a648001 CR4: 00000000001706e0

Additional information:

$ lsb_release -rd
Description: Ubuntu 20.04.2 LTS
Release: 20.04

$ zfs --version
zfs-0.8.3-1ubuntu12.12
zfs-kmod-2.0.2-1ubuntu5

$ apt-cache policy linux-image-5.11.0-27-generic zfsutils-linux
linux-image-5.11.0-27-generic:
  Installed: 5.11.0-27.29~20.04.1
  Candidate: 5.11.0-27.29~20.04.1
  Version table:
 *** 5.11.0-27.29~20.04.1 500
        500 http://us.archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages
        100 /var/lib/dpkg/status
zfsutils-linux:
  Installed: 0.8.3-1ubuntu12.12
  Candidate: 0.8.3-1ubuntu12.12
  Version table:
 *** 0.8.3-1ubuntu12.12 500
        500 http://us.archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages
        100 /var/lib/dpkg/status
     0.8.3-1ubuntu12.9 500
        500 http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages
     0.8.3-1ubuntu12 500
        500 http://us.archive.ubuntu.com/ubuntu focal/main amd64 Packages
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu27.18
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC2: sdeziel 2406 F.... pulseaudio
 /dev/snd/pcmC2D0p: sdeziel 2406 F...m pulseaudio
 /dev/snd/controlC0: sdeziel 2406 F.... pulseaudio
 /dev/snd/controlC1: sdeziel 2406 F.... pulseaudio
CasperMD5CheckResult: skip
DistroRelease: Ubuntu 20.04
InstallationDate: Installed on 2021-05-02 (109 days ago)
InstallationMedia: Ubuntu 20.04.2 LTS "Focal Fossa" - Release amd64 (20210421)
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
Package: linux (not installed)
ProcEnviron:
 LANGUAGE=en_CA:en
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=en_CA.UTF-8
 SHELL=/bin/bash
ProcFB: 0 i915drmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.11.0-27-generic root=UUID=4c80434d-cb91-4b02-8a1d-c5e8951ad8a6 ro rootflags=subvol=@ quiet splash kaslr nmi_watchdog=0 pti=on vsyscall=none vt.handoff=7
ProcVersionSignature: Ubuntu 5.11.0-27.29~20.04.1-generic 5.11.22
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-5.11.0-27-generic N/A
 linux-backports-modules-5.11.0-27-generic N/A
 linux-firmware 1.187.16
Tags: focal
Uname: Linux 5.11.0-27-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: N/A
_MarkForUpload: True
dmi.bios.date: 09/02/2019
dmi.bios.release: 4.6
dmi.bios.vendor: Intel Corp.
dmi.bios.version: WYLPT10H.86A.0054.2019.0902.1752
dmi.board.name: D54250WYK
dmi.board.vendor: Intel Corporation
dmi.board.version: H13922-303
dmi.chassis.type: 3
dmi.modalias: dmi:bvnIntelCorp.:bvrWYLPT10H.86A.0054.2019.0902.1752:bd09/02/2019:br4.6:svn:pn:pvr:rvnIntelCorporation:rnD54250WYK:rvrH13922-303:cvn:ct3:cvr:

affects: zfs-linux (Ubuntu) → linux (Ubuntu)
Changed in linux (Ubuntu):
assignee: nobody → Colin Ian King (colin-king)
importance: Undecided → High
status: New → In Progress
Revision history for this message
Colin Ian King (colin-king) wrote :

I've read of issues such as this occurring because of raw sends being used when the large_blocks feature is not enabled. Can you attach the output of the following commnand:

sudo zpool get all POOLNAME

where POOLNAME is the name of your pool.

Changed in linux (Ubuntu):
status: In Progress → Incomplete
Revision history for this message
Simon Déziel (sdeziel) wrote :
Changed in linux (Ubuntu):
status: Incomplete → New
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1940083

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: hirsute
Revision history for this message
Simon Déziel (sdeziel) wrote : AlsaInfo.txt

apport information

tags: added: apport-collected focal
description: updated
Revision history for this message
Simon Déziel (sdeziel) wrote : CRDA.txt

apport information

Revision history for this message
Simon Déziel (sdeziel) wrote : CurrentDmesg.txt

apport information

Revision history for this message
Simon Déziel (sdeziel) wrote : IwConfig.txt

apport information

Revision history for this message
Simon Déziel (sdeziel) wrote : Lspci.txt

apport information

Revision history for this message
Simon Déziel (sdeziel) wrote : Lspci-vt.txt

apport information

Revision history for this message
Simon Déziel (sdeziel) wrote : Lsusb.txt

apport information

Revision history for this message
Simon Déziel (sdeziel) wrote : Lsusb-t.txt

apport information

Revision history for this message
Simon Déziel (sdeziel) wrote : Lsusb-v.txt

apport information

Revision history for this message
Simon Déziel (sdeziel) wrote : ProcCpuinfo.txt

apport information

Revision history for this message
Simon Déziel (sdeziel) wrote : ProcCpuinfoMinimal.txt

apport information

Revision history for this message
Simon Déziel (sdeziel) wrote : ProcInterrupts.txt

apport information

Revision history for this message
Simon Déziel (sdeziel) wrote : ProcModules.txt

apport information

Revision history for this message
Simon Déziel (sdeziel) wrote : RfKill.txt

apport information

Revision history for this message
Simon Déziel (sdeziel) wrote : UdevDb.txt

apport information

Revision history for this message
Simon Déziel (sdeziel) wrote : WifiSyslog.txt

apport information

Revision history for this message
Simon Déziel (sdeziel) wrote : acpidump.txt

apport information

Changed in linux (Ubuntu):
status: Incomplete → New
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Revision history for this message
Simon Déziel (sdeziel) wrote :

You didn't ask but I found https://github.com/openzfs/zfs/issues/12275 that seemed similar (to my untrained eye) and tried to pipe the problematic stream to zstreamdump -vvvvv, maybe there is something useful in there? Let me know if there is anything I can do to help, thanks!

Revision history for this message
Colin Ian King (colin-king) wrote :

Good find. I'll try and get onto this first thing next week; I'm currently on vacation.

Changed in linux (Ubuntu):
assignee: Colin Ian King (colin-king) → nobody
assignee: nobody → Dimitri John Ledkov (xnox)
Revision history for this message
Simon Déziel (sdeziel) wrote :

Another of my FSes has a snapshot affected by this I/O error:

root@sdeziel-lemur:~# zfs --version
zfs-0.8.3-1ubuntu12.13
zfs-kmod-2.0.2-1ubuntu5.1

root@sdeziel-lemur:~# zfs send -vw -I 'data/rootfs-backup'@'syncoid_sdeziel-lemur_2021-10-08:09:17:03' 'data/rootfs-backup'@'syncoid_sdeziel-lemur_2021-10-13:15:22:46' | cat >/dev/null
send from @syncoid_sdeziel-lemur_2021-10-08:09:17:03 to data/rootfs-backup@syncoid_sdeziel-lemur_2021-10-09:12:17:04 estimated size is 318M
send from @syncoid_sdeziel-lemur_2021-10-09:12:17:04 to data/rootfs-backup@syncoid_sdeziel-lemur_2021-10-13:15:22:46 estimated size is 1.23G
total estimated size is 1.54G
TIME SENT SNAPSHOT data/rootfs-backup@syncoid_sdeziel-lemur_2021-10-09:12:17:04
warning: cannot send 'data/rootfs-backup@syncoid_sdeziel-lemur_2021-10-09:12:17:04': Input/output error
TIME SENT SNAPSHOT data/rootfs-backup@syncoid_sdeziel-lemur_2021-10-13:15:22:46
16:07:54 1.06G data/rootfs-backup@syncoid_sdeziel-lemur_2021-10-13:15:22:46

Revision history for this message
Simon Déziel (sdeziel) wrote :

Oops, the above comment might be unrelated, please ignore it as there seems to be something else going on:

root@sdeziel-lemur:~# zpool status -v
  pool: data
 state: ONLINE
status: One or more devices has experienced an error resulting in data
 corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
 entire pool from backup.
   see: http://zfsonlinux.org/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 0 days 00:00:23 with 0 errors on Sun Aug 15 18:13:40 2021
config:

 NAME STATE READ WRITE CKSUM
 data ONLINE 0 0 0
   5b3ea18d-deb8-4267-8b74-0f9227bf30a9 ONLINE 0 0 24

errors: Permanent errors have been detected in the following files:

        data/rootfs-backup@syncoid_sdeziel-lemur_2021-10-09:12:17:04:<0x10f54>

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.