ceph-osd fails to initialize when encrypt is enabled

Bug #1604501 reported by Chris Holcombe
This bug affects 5 people
Affects                             Status        Importance  Assigned to  Milestone
Ceph OSD Charm                      Won't Fix     Medium      Unassigned
OpenStack Ceph Charm (Retired)      Won't Fix     High        Unassigned
ceph (Juju Charms Collection)       Invalid       High        Unassigned
ceph (Ubuntu)                       Fix Released  Medium      Unassigned
  Xenial                            Triaged       Medium      Unassigned
  Yakkety                           Won't Fix     High        Unassigned
  Zesty                             Triaged       Medium      Unassigned
  Artful                            Fix Released  Medium      Unassigned
  Bionic                            Fix Released  Medium      Unassigned
ceph-osd (Juju Charms Collection)   Invalid       High        Unassigned

Bug Description

The config-key put command is called without a cephx user which causes the command to fail. Error log information is:
2016-07-16 05:07:14 INFO mon-relation-changed 2016-07-16 05:07:14.918436 7f67d2797700 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring: (2) No such file or directory
2016-07-16 05:07:14 INFO mon-relation-changed 2016-07-16 05:07:14.919173 7f67d2797700 -1 monclient(hunting): ERROR: missing keyring, cannot use cephx for authentication
2016-07-16 05:07:14 INFO mon-relation-changed 2016-07-16 05:07:14.919315 7f67d2797700 0 librados: client.admin initialization error (2) No such file or directory
2016-07-16 05:07:14 INFO mon-relation-changed Error connecting to cluster: ObjectNotFound
2016-07-16 05:07:14 INFO mon-relation-changed Traceback (most recent call last):
2016-07-16 05:07:14 INFO mon-relation-changed File "/usr/sbin/ceph-disk", line 9, in <module>
2016-07-16 05:07:14 INFO mon-relation-changed load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
2016-07-16 05:07:14 INFO mon-relation-changed File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 4965, in run
2016-07-16 05:07:14 INFO mon-relation-changed main(sys.argv[1:])
2016-07-16 05:07:14 INFO mon-relation-changed File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 4918, in main
2016-07-16 05:07:14 INFO mon-relation-changed main_catch(args.func, args)
2016-07-16 05:07:14 INFO mon-relation-changed File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 4943, in main_catch
2016-07-16 05:07:14 INFO mon-relation-changed func(args)
2016-07-16 05:07:14 INFO mon-relation-changed File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 1774, in main
2016-07-16 05:07:14 INFO mon-relation-changed Prepare.factory(args).prepare()
2016-07-16 05:07:14 INFO mon-relation-changed File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 1762, in prepare
2016-07-16 05:07:14 INFO mon-relation-changed self.prepare_locked()
2016-07-16 05:07:14 INFO mon-relation-changed File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 1793, in prepare_locked
2016-07-16 05:07:14 INFO mon-relation-changed self.lockbox.prepare()
2016-07-16 05:07:14 INFO mon-relation-changed File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 2360, in prepare
2016-07-16 05:07:14 INFO mon-relation-changed self.populate()
2016-07-16 05:07:14 INFO mon-relation-changed File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 2305, in populate
2016-07-16 05:07:14 INFO mon-relation-changed self.create_key()
2016-07-16 05:07:14 INFO mon-relation-changed File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 2264, in create_key
2016-07-16 05:07:14 INFO mon-relation-changed base64_key,
2016-07-16 05:07:14 INFO mon-relation-changed File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 439, in command_check_call
2016-07-16 05:07:14 INFO mon-relation-changed return subprocess.check_call(arguments)
2016-07-16 05:07:14 INFO mon-relation-changed File "/usr/lib/python2.7/subprocess.py", line 540, in check_call
2016-07-16 05:07:14 INFO mon-relation-changed raise CalledProcessError(retcode, cmd)
2016-07-16 05:07:14 INFO mon-relation-changed subprocess.CalledProcessError: Command '['/usr/bin/ceph', 'config-key', 'put', 'dm-crypt/osd/e4d2604d-1e38-4563-88ae-c447ffba95e9/luks', '61H4+6InRku1kvSqg23ckK0EsoFL1csn18ONWL8a+1s7r8wLzPUTrRRmuq4D1o1/GZ9UvFVxytPq4pZeA73ZtCeNPWlbIoKeAhZ/gbK6g1YXPjpICOmxx7aSckIO212faMiHG+jLbIAzeekhK7AKT+rxGWXYh2wYVX3rxn4dKik=']' returned non-zero exit status 1
2016-07-16 05:07:14 INFO worker.uniter.jujuc server.go:172 running hook tool "juju-log" ["-l" "ERROR" "Unable to initialize device: /dev/vdb"]
2016-07-16 05:07:14 DEBUG worker.uniter.jujuc server.go:173 hook context id "ceph-osd/0-mon-relation-changed-4972863589887456037"; dir "/var/lib/juju/agents/unit-ceph-osd-0/charm"
2016-07-16 05:07:14 ERROR juju-log mon:1: Unable to initialize device: /dev/vdb
2016-07-16 05:07:14 INFO mon-relation-changed Traceback (most recent call last):
2016-07-16 05:07:14 INFO mon-relation-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/mon-relation-changed", line 614, in <module>
2016-07-16 05:07:14 INFO mon-relation-changed hooks.execute(sys.argv)
2016-07-16 05:07:14 INFO mon-relation-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/charmhelpers/core/hookenv.py", line 715, in execute
2016-07-16 05:07:14 INFO mon-relation-changed self._hooks[hook_name]()
2016-07-16 05:07:14 INFO mon-relation-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/mon-relation-changed", line 545, in mon_relation
2016-07-16 05:07:14 INFO mon-relation-changed prepare_disks_and_activate()
2016-07-16 05:07:14 INFO mon-relation-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/mon-relation-changed", line 454, in prepare_disks_and_activate
2016-07-16 05:07:14 INFO mon-relation-changed config('osd-encrypt'))
2016-07-16 05:07:14 INFO mon-relation-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/ceph.py", line 997, in osdize
2016-07-16 05:07:14 INFO mon-relation-changed reformat_osd, ignore_errors, encrypt)
2016-07-16 05:07:14 INFO mon-relation-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/ceph.py", line 1052, in osdize_dev
2016-07-16 05:07:14 INFO mon-relation-changed raise e
2016-07-16 05:07:14 INFO mon-relation-changed subprocess.CalledProcessError: Command '['ceph-disk', 'prepare', '--dmcrypt', '--fs-type', u'xfs', u'/dev/vdb']' returned non-zero exit status 1

Tags: cpe-onsite
Changed in ceph-radosgw (Juju Charms Collection):
assignee: nobody → Chris Holcombe (xfactor973)
affects: ceph-radosgw (Juju Charms Collection) → ceph-osd (Juju Charms Collection)
Changed in ceph-osd (Juju Charms Collection):
assignee: Chris Holcombe (xfactor973) → nobody
assignee: nobody → Chris Holcombe (xfactor973)
importance: Undecided → High
assignee: Chris Holcombe (xfactor973) → nobody
assignee: nobody → Chris Holcombe (xfactor973)
Chris Holcombe (xfactor973) wrote :

ceph-disk prepare doesn't appear to have a flag to set the cephx user to use. I will have to create an admin key that only permits the config-key put command. In the meantime I'm going to file a bug against ceph-disk prepare.
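
For illustration: the ceph CLI itself accepts a cephx identity through its --name and --keyring options, so a config-key put issued with an explicit identity might look like the sketch below. The client.bootstrap-osd name and the keyring path are assumptions (common defaults), not something this report confirms, and whichever identity is used still needs mon caps that allow config-key put. The helper name is hypothetical.

```python
def config_key_put_cmd(key, value,
                       name="client.bootstrap-osd",
                       keyring="/var/lib/ceph/bootstrap-osd/ceph.keyring"):
    """Build a `ceph config-key put` command with an explicit cephx identity.

    `name` and `keyring` are assumed defaults for illustration only.
    """
    return [
        "ceph",
        "--name", name,        # cephx entity to authenticate as
        "--keyring", keyring,  # keyring file holding that entity's key
        "config-key", "put", key, value,
    ]


# Key path copied from the failing command in the log; the value is a placeholder.
cmd = config_key_put_cmd(
    "dm-crypt/osd/e4d2604d-1e38-4563-88ae-c447ffba95e9/luks",
    "<base64 key>",
)
print(" ".join(cmd))
```

This only builds the argument list; passing it to subprocess.check_call would run it against a live cluster.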

Ryan Beisner (1chb1n)
Changed in ceph-osd (Juju Charms Collection):
status: New → Confirmed
milestone: none → 16.07
Ryan Beisner (1chb1n) wrote :

Update: This affects Trusty-Mitaka and Xenial-Mitaka (Jewel). ceph-disk prepare succeeds with Hammer and earlier (Trusty-Liberty or earlier appears to be OK). Newton is untested.

Ryan Beisner (1chb1n)
Changed in ceph (Juju Charms Collection):
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Chris Holcombe (xfactor973)
milestone: none → 16.07
Chris Holcombe (xfactor973) wrote :
Liam Young (gnuoy)
Changed in ceph-osd (Juju Charms Collection):
milestone: 16.07 → 16.10
Changed in ceph (Juju Charms Collection):
milestone: 16.07 → 16.10
Ryan Beisner (1chb1n) wrote :

Related, pasting from #openstack-meeting discussion today for reference:

<icey> https://github.com/ceph/ceph/pull/10382 and http://tracker.ceph.com/issues/17421

James Page (james-page)
Changed in ceph-osd (Juju Charms Collection):
milestone: 16.10 → 17.01
Changed in ceph (Juju Charms Collection):
milestone: 16.10 → 17.01
James Page (james-page)
Changed in ceph (Ubuntu):
status: New → Triaged
importance: Undecided → High
assignee: nobody → Chris Holcombe (xfactor973)
Changed in ceph (Ubuntu Yakkety):
status: New → Triaged
Changed in ceph (Ubuntu Xenial):
status: New → Triaged
Changed in ceph (Ubuntu Yakkety):
importance: Undecided → High
Changed in ceph (Ubuntu Xenial):
importance: Undecided → High
James Page (james-page)
Changed in ceph-osd (Juju Charms Collection):
status: Confirmed → Triaged
Changed in ceph (Juju Charms Collection):
status: Confirmed → Triaged
James Page (james-page) wrote :

This is a target for v10.2.6 which will be up for release next.

Changed in ceph (Juju Charms Collection):
milestone: 17.01 → none
Changed in ceph-osd (Juju Charms Collection):
milestone: 17.01 → none
James Page (james-page)
Changed in charm-ceph:
assignee: nobody → Chris Holcombe (xfactor973)
importance: Undecided → High
status: New → Triaged
Changed in ceph (Juju Charms Collection):
status: Triaged → Invalid
Changed in charm-ceph-osd:
assignee: nobody → Chris Holcombe (xfactor973)
importance: Undecided → High
status: New → Triaged
Changed in ceph-osd (Juju Charms Collection):
status: Triaged → Invalid
James Page (james-page) wrote :

OK, so as of the Luminous release it appears that the default bootstrap-osd key profile is granted permission to store/retrieve dm-crypt keys in the ceph-mon cluster:

  https://github.com/ceph/ceph/commit/88ce18da901b7e9aad621d22839fd28de0af9c51

James Page (james-page) wrote :

(access is scoped to a specific path prefix for keys which is good).

Changed in ceph (Ubuntu):
status: Triaged → Fix Released
assignee: Chris Holcombe (xfactor973) → nobody
Changed in ceph (Juju Charms Collection):
assignee: Chris Holcombe (xfactor973) → nobody
Changed in ceph (Ubuntu Zesty):
assignee: Chris Holcombe (xfactor973) → nobody
Changed in ceph-osd (Juju Charms Collection):
assignee: Chris Holcombe (xfactor973) → nobody
Changed in charm-ceph:
assignee: Chris Holcombe (xfactor973) → nobody
Changed in charm-ceph-osd:
assignee: Chris Holcombe (xfactor973) → nobody
Changed in charm-ceph:
status: Triaged → Won't Fix
Changed in charm-ceph-osd:
importance: High → Medium
Changed in ceph (Ubuntu Yakkety):
status: Triaged → Won't Fix
James Page (james-page) wrote :

I think the keying permissions are OK now; however, ceph-disk tries to zap the block device twice, resulting in:

2017-10-20 16:07:45 INFO juju-log mon:1: osdize cmd: ['ceph-disk', 'prepare', '--dmcrypt', '--fs-type', u'xfs', '--zap-disk', '--filestore', u'/dev/vdb']
2017-10-20 16:07:46 DEBUG mon-relation-changed Creating new GPT entries.
2017-10-20 16:07:46 DEBUG mon-relation-changed Setting name!
2017-10-20 16:07:46 DEBUG mon-relation-changed partNum is 4
2017-10-20 16:07:46 DEBUG mon-relation-changed REALLY setting name!
2017-10-20 16:07:46 DEBUG mon-relation-changed The operation has completed successfully.
2017-10-20 16:07:46 DEBUG mon-relation-changed mke2fs 1.42.13 (17-May-2015)
2017-10-20 16:07:46 DEBUG mon-relation-changed Creating filesystem with 10240 1k blocks and 2560 inodes
2017-10-20 16:07:46 DEBUG mon-relation-changed Filesystem UUID: d3dd2964-ea6c-4812-8218-3264dbfd6406
2017-10-20 16:07:46 DEBUG mon-relation-changed Superblock backups stored on blocks:
2017-10-20 16:07:46 DEBUG mon-relation-changed 8193
2017-10-20 16:07:46 DEBUG mon-relation-changed
2017-10-20 16:07:46 DEBUG mon-relation-changed Allocating group tables: done
2017-10-20 16:07:46 DEBUG mon-relation-changed Writing inode tables: done
2017-10-20 16:07:46 DEBUG mon-relation-changed Creating journal (1024 blocks): done
2017-10-20 16:07:46 DEBUG mon-relation-changed Writing superblocks and filesystem accounting information: done
2017-10-20 16:07:46 DEBUG mon-relation-changed
2017-10-20 16:07:47 DEBUG mon-relation-changed creating /var/lib/ceph/osd-lockbox/94bad3e2-b9a3-45d8-aa9f-7bae30a36167/keyring
2017-10-20 16:07:47 DEBUG mon-relation-changed added entity client.osd-lockbox.94bad3e2-b9a3-45d8-aa9f-7bae30a36167 auth auth(auid = 18446744073709551615 key=AQBSH+pZetCjNxAAeJ2OsQ8y1dIreKEjdwkCXQ== with 0 caps)
2017-10-20 16:07:47 DEBUG mon-relation-changed creating /var/lib/ceph/osd-lockbox/94bad3e2-b9a3-45d8-aa9f-7bae30a36167/osd_keyring
2017-10-20 16:07:47 DEBUG mon-relation-changed added entity osd.0 auth auth(auid = 18446744073709551615 key=AQBSH+pZpYkSMxAAWXV25MOEBx2ghn2KN9JmkQ== with 0 caps)
2017-10-20 16:07:48 DEBUG mon-relation-changed Warning: The kernel is still using the old partition table.
2017-10-20 16:07:48 DEBUG mon-relation-changed The new table will be used at the next reboot or after you
2017-10-20 16:07:48 DEBUG mon-relation-changed run partprobe(8) or kpartx(8)
2017-10-20 16:07:48 DEBUG mon-relation-changed The operation has completed successfully.
2017-10-20 16:07:48 DEBUG mon-relation-changed wipefs: error: /dev/vdb5: probing initialization failed: Device or resource busy
2017-10-20 16:07:48 DEBUG mon-relation-changed ceph-disk: Error: Command '['/sbin/wipefs', '--all', '/dev/vdb5']' returned non-zero exit status 1
2017-10-20 16:07:48 ERROR juju-log mon:1: Unable to initialize device: /dev/vdb
2017-10-20 16:07:48 DEBUG mon-relation-changed Traceback (most recent call last):
2017-10-20 16:07:48 DEBUG mon-relation-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/mon-relation-changed", line 541, in <module>
2017-10-20 16:07:48 DEBUG mon-relation-changed hooks.execute(sys.argv)...


James Page (james-page) wrote :

Looks like the issue is here:

 https://github.com/ceph/ceph/blob/luminous/src/ceph-disk/ceph_disk/main.py#L2085

The lockbox is prepared first, and then the main data prepare tries to re-zap the disk, but /dev/<device>5 has already been mounted by the first step.
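
To check for this condition before zapping, one could look for mounted partitions of the device in /proc/mounts. The following is a minimal diagnostic sketch, not part of ceph-disk or the charm; the function name and the sample mounts line are made up for illustration (the lockbox path mirrors the one in the log above).

```python
def mounted_partitions(device, mounts_text):
    """Return (partition, mountpoint) pairs for mounted partitions of `device`.

    `mounts_text` is expected in /proc/mounts format: source, mountpoint, ...
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        source, mountpoint = fields[0], fields[1]
        if not source.startswith(device):
            continue
        # Accept suffixes such as "5" (vdb5) or "p1" (nvme0n1p1), but not
        # the bare device itself or a different disk like /dev/vdbc.
        suffix = source[len(device):]
        if suffix.lstrip("p").isdigit():
            found.append((source, mountpoint))
    return found


sample = (
    "/dev/vda1 / ext4 rw,relatime 0 0\n"
    "/dev/vdb5 /var/lib/ceph/osd-lockbox/94bad3e2-b9a3-45d8-aa9f-7bae30a36167 "
    "ext4 rw,relatime 0 0\n"
)
busy = mounted_partitions("/dev/vdb", sample)
print(busy)
```

On a real host the text would come from reading /proc/mounts; a non-empty result means wipefs on that partition is likely to fail with "Device or resource busy".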

Changed in ceph (Ubuntu):
status: Fix Released → Triaged
Changed in ceph (Ubuntu Xenial):
importance: High → Medium
Changed in ceph (Ubuntu Zesty):
importance: High → Medium
Changed in ceph (Ubuntu Artful):
importance: High → Medium
James Page (james-page) wrote :

Raised a new issue upstream:

  http://tracker.ceph.com/issues/21879

James Page (james-page) wrote :

Pre-zapping the block device and not passing --zap-disk works around the issue.

Dmitrii Shcherbakov (dmitriis) wrote :

James,

I already raised one a few months ago: http://tracker.ceph.com/issues/20555. It seems we have gone through identical investigations.

I can confirm that pre-zapping via wipefs and not passing --zap-disk to ceph-disk worked for me as well (documented in 20555).
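
The workaround described in the last two comments could be sketched roughly as follows. This is an illustration only, not the charm's actual code: the function names are hypothetical, the ceph-disk arguments are copied from the osdize log above minus --zap-disk, and whether wipefs --all alone is a sufficient pre-zap in every case is an assumption. Both commands are destructive, so the sketch defaults to a dry run.

```python
import subprocess


def prezap_and_prepare_cmds(device, fs_type="xfs"):
    """Return the pre-zap command and a ceph-disk prepare without --zap-disk."""
    zap = ["wipefs", "--all", device]
    prepare = [
        "ceph-disk", "prepare", "--dmcrypt",
        "--fs-type", fs_type,
        "--filestore", device,
    ]
    return [zap, prepare]


def run(cmds, dry_run=True):
    """Print the commands, or execute them in order when dry_run is False."""
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.check_call(cmd)  # raises CalledProcessError on failure


cmds = prezap_and_prepare_cmds("/dev/vdb")
run(cmds)  # dry run: only prints what would be executed
```

Zapping before ceph-disk runs means the lockbox partition does not exist yet, so nothing on the device is mounted when the signatures are wiped.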

James Page (james-page)
Changed in ceph (Ubuntu Artful):
status: Triaged → Fix Released
Changed in ceph (Ubuntu Bionic):
status: Triaged → Fix Released
tags: added: cpe-onsite
James Page (james-page) wrote :

Later ceph releases (and charms) do support native ceph block device encryption.

Changed in charm-ceph-osd:
status: Triaged → Won't Fix