Attempt to map rbd image from ceph jewel/luminous hangs

Bug #1728739 reported by Billy Olsen
This bug affects 3 people
Affects          Status        Importance  Assigned to  Milestone
linux (Ubuntu)   Fix Released  Medium      Unassigned
  Xenial         Fix Released  Medium      Unassigned

Bug Description

[Impact]

Attempting to map an rbd image from a Jewel or Luminous Ceph cluster with optimal tunables fails when using the 4.4 LTS (Xenial) kernel client, due to a feature set mismatch.

The Jewel release of Ceph introduced a new set of CRUSH tunables. Kernel client support for these tunables first appeared in the 4.5 Linux kernel and is thus not available in the 16.04 LTS 4.4 kernel. Attempts to map RBD images as block devices fail because the 4.4 kernel client does not understand the new tunables:

(from kern.log)

Oct 30 21:19:05 ceph-7 kernel: [ 815.674075] Key type ceph registered
Oct 30 21:19:05 ceph-7 kernel: [ 815.676862] libceph: loaded (mon/osd proto 15/24)
Oct 30 21:19:05 ceph-7 kernel: [ 815.678970] rbd: loaded (major 251)
Oct 30 21:19:05 ceph-7 kernel: [ 815.689556] libceph: mon0 10.5.0.19:6789 feature set mismatch, my 106b84a842a42 < server's 40106b84a842a42, missing 400000000000000
Oct 30 21:19:05 ceph-7 kernel: [ 815.692897] libceph: mon0 10.5.0.19:6789 missing required protocol features

Support for the new CRUSH tunables was added in upstream kernel 4.5; see http://www.spinics.net/lists/ceph-devel/msg28421.html.
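
The three hex values in the "feature set mismatch" line are feature bitmasks, and the "missing" value is simply the bits set in the server's mask but not in the client's. A minimal sketch of that arithmetic in Python (the function name and regular expression are illustrative, not part of any Ceph tool):

```python
import re

# Matches the libceph "feature set mismatch" message shown in kern.log.
MISMATCH_RE = re.compile(
    r"feature set mismatch, my (?P<mine>[0-9a-f]+) < "
    r"server's (?P<server>[0-9a-f]+), missing (?P<missing>[0-9a-f]+)"
)


def missing_feature_bits(line):
    """Return the bit positions the server requires but the client lacks."""
    match = MISMATCH_RE.search(line)
    if match is None:
        return []
    mine = int(match.group("mine"), 16)
    server = int(match.group("server"), 16)
    # Bits present in the server mask and absent from the client mask.
    missing = server & ~mine
    return [bit for bit in range(missing.bit_length()) if (missing >> bit) & 1]


line = (
    "libceph: mon0 10.5.0.19:6789 feature set mismatch, "
    "my 106b84a842a42 < server's 40106b84a842a42, missing 400000000000000"
)
print(missing_feature_bits(line))  # prints [58]
```

For the log line in this report the only missing bit is bit 58 (0x400000000000000), which the verification comment later in this report attributes to missing TUNABLES5 support.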

[Test Case]

1. Deploy a Jewel or Luminous Ceph cluster.
2. Create rbd image suitable for the kernel client:
  $ rbd create --pool rbd --image-feature layering --size 1G test
3. Map the rbd image to the local server:
  $ rbd map --pool rbd test

[Regression Potential]

Minimal. The code change is limited to the kernel rbd driver, and the new code should primarily affect clients connecting to clusters that use the new tunables options.

[Additional Info]

A workaround is to change the CRUSH tunables configured for the Ceph cluster to a legacy profile (hammer or lower) via:

$ ceph osd crush tunables hammer

However, with the tunables set to hammer, the cluster cannot take advantage of the newer placement strategies, which reduce the amount of data movement throughout the cluster.
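
Before and after applying the workaround, it can help to check whether the cluster's CRUSH map still demands the newer feature. A minimal sketch, assuming the JSON printed by `ceph osd crush show-tunables` contains a `require_feature_tunables5` field (verify this against the output on your release; the sample strings below are abbreviated illustrations, not captured from a real cluster):

```python
import json


def needs_tunables5(show_tunables_json):
    """Return True if the tunables JSON says clients must support TUNABLES5.

    The field name require_feature_tunables5 is an assumption about the
    output of `ceph osd crush show-tunables`; a missing field counts as False.
    """
    tunables = json.loads(show_tunables_json)
    return bool(tunables.get("require_feature_tunables5", 0))


# Hypothetical, abbreviated samples for illustration only.
jewel_sample = '{"profile": "jewel", "chooseleaf_stable": 1, "require_feature_tunables5": 1}'
hammer_sample = '{"profile": "hammer", "chooseleaf_stable": 0, "require_feature_tunables5": 0}'

print(needs_tunables5(jewel_sample))
print(needs_tunables5(hammer_sample))
```

If the check reports that the feature is required, a 4.4 kernel client without the fix would be expected to hit the feature set mismatch described above.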
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Oct 31 01:23 seq
 crw-rw---- 1 root audio 116, 33 Oct 31 01:23 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.20.1-0ubuntu2.10
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: N/A
DistroRelease: Ubuntu 16.04
Ec2AMI: ami-00000001
Ec2AMIManifest: FIXME
Ec2AvailabilityZone: nova
Ec2InstanceType: m1.small
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
IwConfig: Error: [Errno 2] No such file or directory
Lsusb: Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: OpenStack Foundation OpenStack Nova
Package: linux (not installed)
PciMultimedia:

ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB:

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-98-generic root=UUID=d7006b2f-ace6-464d-8b21-17180b3ed360 ro console=tty1 console=ttyS0
ProcVersionSignature: Ubuntu 4.4.0-98.121-generic 4.4.90
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-98-generic N/A
 linux-backports-modules-4.4.0-98-generic N/A
 linux-firmware N/A
RfKill: Error: [Errno 2] No such file or directory
Tags: xenial ec2-images
Uname: Linux 4.4.0-98-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: True
dmi.bios.date: 04/01/2014
dmi.bios.vendor: SeaBIOS
dmi.bios.version: 1.10.1-1ubuntu1~cloud0
dmi.chassis.type: 1
dmi.chassis.vendor: QEMU
dmi.chassis.version: pc-i440fx-zesty
dmi.modalias: dmi:bvnSeaBIOS:bvr1.10.1-1ubuntu1~cloud0:bd04/01/2014:svnOpenStackFoundation:pnOpenStackNova:pvr15.0.6:cvnQEMU:ct1:cvrpc-i440fx-zesty:
dmi.product.name: OpenStack Nova
dmi.product.version: 15.0.6
dmi.sys.vendor: OpenStack Foundation

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1728739

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Billy Olsen (billy-olsen) wrote : CurrentDmesg.txt

apport information

tags: added: apport-collected ec2-images xenial
description: updated
Billy Olsen (billy-olsen) wrote : JournalErrors.txt

apport information

Billy Olsen (billy-olsen) wrote : Lspci.txt

apport information

Billy Olsen (billy-olsen) wrote : ProcCpuinfo.txt

apport information

Billy Olsen (billy-olsen) wrote : ProcCpuinfoMinimal.txt

apport information

Billy Olsen (billy-olsen) wrote : ProcInterrupts.txt

apport information

Billy Olsen (billy-olsen) wrote : ProcModules.txt

apport information

Billy Olsen (billy-olsen) wrote : UdevDb.txt

apport information

Billy Olsen (billy-olsen) wrote : WifiSyslog.txt

apport information

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Changed in linux (Ubuntu):
status: Confirmed → Triaged
importance: Undecided → Medium
Changed in linux (Ubuntu Xenial):
importance: Undecided → Medium
status: New → Triaged
Stefan Bader (smb)
Changed in linux (Ubuntu):
status: Triaged → Fix Released
Stefan Bader (smb)
Changed in linux (Ubuntu Xenial):
status: Triaged → Fix Committed
Khaled El Mously (kmously) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
Billy Olsen (billy-olsen) wrote :

Successfully verified this with kernel 4.4.0-102-generic in the xenial-proposed repository.

I verified mapping rbd devices against the following Ceph versions:

0.94.10 Hammer release (from trusty-kilo cloud-archive)
10.2.7 Jewel release (from xenial-updates)
12.2.0 Luminous release (from xenial-pike cloud-archive)

All versions had optimal tunables set, which previously prevented rbd mapping from completing for Jewel and Luminous releases due to missing TUNABLES5 support. All tests successfully mapped, formatted, and mounted.

Relevant kernel log snippets:

Nov 29 18:59:29 juju-49feb5-ceph-6 kernel: [ 0.000000] Linux version 4.4.0-102-generic (buildd@lgw01-amd64-055) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5) ) #125-Ubuntu SMP Tue Nov 21 15:15:11 UTC 2017 (Ubuntu 4.4.0-102.125-generic 4.4.98)

==== RBD Mapping to Ceph Hammer Cluster v0.94.10 (trusty-kilo cloud-archive) ====
Nov 29 19:01:45 juju-49feb5-ceph-6 kernel: [ 148.314991] Key type ceph registered
Nov 29 19:01:45 juju-49feb5-ceph-6 kernel: [ 148.317382] libceph: loaded (mon/osd proto 15/24)
Nov 29 19:01:45 juju-49feb5-ceph-6 kernel: [ 148.323276] rbd: loaded (major 251)
Nov 29 19:01:45 juju-49feb5-ceph-6 kernel: [ 148.343492] libceph: client44139 fsid 6547bd3e-1397-11e2-82e5-53567c8d32dc
Nov 29 19:01:45 juju-49feb5-ceph-6 kernel: [ 148.346357] libceph: mon2 10.5.0.3:6789 session established
Nov 29 19:01:45 juju-49feb5-ceph-6 kernel: [ 148.415717] rbd: rbd0: added with size 0x280000000
Nov 29 19:02:04 juju-49feb5-ceph-6 kernel: [ 167.519366] EXT4-fs (rbd0): mounted filesystem with ordered data mode. Opts: (null)

==== RBD Mapping to Ceph Jewel Cluster v10.2.7 (xenial/trusty-mitaka cloud-archive) ====
Nov 29 19:17:48 juju-49feb5-ceph-6 kernel: [ 1111.874853] libceph: client64310 fsid 6547bd3e-1397-11e2-82e5-53567c8d32dc
Nov 29 19:17:48 juju-49feb5-ceph-6 kernel: [ 1111.876730] libceph: mon1 10.5.0.14:6789 session established
Nov 29 19:17:48 juju-49feb5-ceph-6 kernel: [ 1111.924995] rbd: rbd0: added with size 0x280000000
Nov 29 19:17:54 juju-49feb5-ceph-6 kernel: [ 1117.704119] EXT4-fs (rbd0): mounted filesystem with ordered data mode. Opts: (null)

==== RBD Mapping to Ceph Luminous Cluster v12.2.0 (xenial-ocata cloud-archive) ====
Nov 29 19:41:42 juju-49feb5-ceph-6 kernel: [ 2545.672870] libceph: client4216 fsid 6547bd3e-1397-11e2-82e5-53567c8d32dc
Nov 29 19:41:42 juju-49feb5-ceph-6 kernel: [ 2545.679350] libceph: mon1 10.5.0.12:6789 session established
Nov 29 19:41:42 juju-49feb5-ceph-6 kernel: [ 2545.717272] rbd: rbd0: added with size 0x280000000
Nov 29 19:42:04 juju-49feb5-ceph-6 kernel: [ 2567.088206] EXT4-fs (rbd0): mounted filesystem with ordered data mode. Opts: (null)

tags: added: verification-done-xenial
removed: verification-needed-xenial
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-103.126

---------------
linux (4.4.0-103.126) xenial; urgency=low

  * linux: 4.4.0-103.126 -proposed tracker (LP: #1736181)

  * CVE-2017-1000405
    - mm, thp: Do not make page table dirty unconditionally in touch_p[mu]d()

  * CVE-2017-16939
    - netlink: add a start callback for starting a netlink dump
    - ipsec: Fix aborted xfrm policy dump crash

linux (4.4.0-102.125) xenial; urgency=low

  * linux: 4.4.0-102.125 -proposed tracker (LP: #1733541)

  * tar -x sometimes fails on overlayfs (LP: #1728489)
    - ovl: check if all layers are on the same fs
    - ovl: persistent inode number for directories

  * NVMe timeout is too short (LP: #1729119)
    - nvme: update timeout module parameter type

  * Set PANIC_TIMEOUT=10 on Power Systems (LP: #1730660)
    - [Config]: Set PANIC_TIMEOUT=10 on ppc64el

  * Cannot pair BLE remote devices when using combo BT SoC (LP: #1731467)
    - Bluetooth: increase timeout for le auto connections

  * CIFS errors on 4.4.0-98, but not on 4.4.0-97 with same config (LP: #1729337)
    - SMB3: Validate negotiate request must always be signed

  * Plantronics P610 does not support sample rate reading (LP: #1719853)
    - ALSA: usb-audio: Add sample rate quirk for Plantronics P610

  * Invalid btree pointer causes the kernel NULL pointer dereference
    (LP: #1729256)
    - xfs: reinit btree pointer on attr tree inactivation walk

  * Samba mount/umount in docker container triggers kernel Oops (LP: #1729637)
    - ipv6: only call ip6_route_dev_notify() once for NETDEV_UNREGISTER
    - ipv6: fix NULL dereference in ip6_route_dev_notify()

  * [kernel] tty/hvc: Use opal irqchip interface if available (LP: #1728098)
    - tty/hvc: Use opal irqchip interface if available

  * Device hotplugging with MPT SAS cannot work for VMWare ESXi (LP: #1730852)
    - scsi: mptsas: Fixup device hotplug for VMWare ESXi

  * NMI watchdog: BUG: soft lockup on Guest upon boot (KVM) (LP: #1727331)
    - KVM: PPC: Book3S: Treat VTB as a per-subcore register, not per-thread

  * Attempt to map rbd image from ceph jewel/luminous hangs (LP: #1728739)
    - crush: ensure bucket id is valid before indexing buckets array
    - crush: ensure take bucket value is valid
    - crush: add chooseleaf_stable tunable
    - crush: decode and initialize chooseleaf_stable
    - libceph: advertise support for TUNABLES5
    - libceph: MOSDOpReply v7 encoding

  * Xenial update to 4.4.98 stable release (LP: #1732698)
    - adv7604: Initialize drive strength to default when using DT
    - video: fbdev: pmag-ba-fb: Remove bad `__init' annotation
    - PCI: mvebu: Handle changes to the bridge windows while enabled
    - xen/netback: set default upper limit of tx/rx queues to 8
    - drm: drm_minor_register(): Clean up debugfs on failure
    - KVM: PPC: Book 3S: XICS: correct the real mode ICP rejecting counter
    - iommu/arm-smmu-v3: Clear prior settings when updating STEs
    - powerpc/corenet: explicitly disable the SDHC controller on kmcoge4
    - ARM: omap2plus_defconfig: Fix probe errors on UARTs 5 and 6
    - crypto: vmx - disable preemption to enable vsx in aes_ctr.c
    - iio: trigger: free trigger...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
David Coronel (davecore) wrote :

FYI, this patch seems to have introduced a regression, see https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1737033

Syed Armani (dce3062) wrote :

I am still seeing this issue with kernel version 4.4.0-104:

root@cc-swi01-tky1:~# uname -r
4.4.0-104-generic

root@cc-swi01-tky1:~# ceph --version
ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable)

root@cc-swi01-tky1:~# sudo rbd create image01 --size 1024 --pool testpool

root@cc-swi01-tky1:~# rbd ls -p testpool
image01

root@cc-swi01-tky1:~# sudo rbd map image01 --pool testpool --name client.admin
rbd: sysfs write failed
In some cases useful info is found in syslog - try "dmesg | tail".
rbd: map failed: (110) Connection timed out

# dmesg | tail
[516822.815270] libceph: mon0 10.X.8.201:6789 feature set mismatch, my 106b84a842a42 < server's 40106b84a842a42, missing 400000000000000
[516822.842737] libceph: mon0 10.X.8.201:6789 missing required protocol features

Aaron (lloyd-peterson) wrote :

Same issue for me:

root@myserver:~# uname -r
4.4.0-112-generic
root@myserver:~# ceph version
ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable)
root@myserver:~# rbd create testimage --size 10G
root@myserver:~# rbd ls -l
NAME SIZE PARENT FMT PROT LOCK
testimage 10240M 2
root@myserver:~# rbd map testimage
rbd: sysfs write failed
In some cases useful info is found in syslog - try "dmesg | tail".
rbd: map failed: (110) Connection timed out
root@myserver:~# dmesg | tail
[ 1392.958472] libceph: mon0 51.254.215.26:6789 feature set mismatch, my 106b84a842a42 < server's 40106b84a842a42, missing 400000000000000
[ 1392.958721] libceph: mon0 51.254.215.26:6789 missing required protocol features

... same last two messages repeat

Aaron (lloyd-peterson) wrote :

Actually, upgrading my kernel to 4.13 fixed the issue for me.

root@myserver:~$ uname -r
4.13.0-32-generic
root@myserver:~$ rbd ls
testimage
root@myserver:~$ rbd map testimage
/dev/rbd0
root@myserver:~$
