ephemeral0 of /dev/sda1 triggers 'did not find entry for sda1 in /sys/block'

Bug #1263294 reported by Robert Collins on 2013-12-21
This bug affects 4 people
Affects     Importance  Assigned to
cloud-init  Medium      Unassigned
tripleo     High        Unassigned

Bug Description

This is due to line 227 of ./cloudinit/config/cc_mounts.py::

    short_name = os.path.basename(device)
    sys_path = "/sys/block/%s" % short_name

    if not os.path.exists(sys_path):
        LOG.debug("did not find entry for %s in /sys/block", short_name)
        return None

The sys path for /dev/sda1 is /sys/block/sda/sda1.
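
To illustrate the layout described above, here is a minimal sketch (not the cloud-init code) that looks first for a whole-disk entry and then for a partition nested under its parent disk. The function name `sys_block_path` and the `sys_block` parameter are assumptions made for this example; the parameter only exists so the lookup can be pointed at a fake tree.

```python
import os


def sys_block_path(device, sys_block="/sys/block"):
    """Return the sysfs directory for a disk or partition, or None.

    Hypothetical helper for illustration; not part of cloud-init.
    """
    short_name = os.path.basename(device)
    if not os.path.isdir(sys_block):
        return None

    # Whole disk, e.g. /dev/sda -> /sys/block/sda
    direct = os.path.join(sys_block, short_name)
    if os.path.isdir(direct):
        return direct

    # Partition, e.g. /dev/sda1 -> /sys/block/sda/sda1
    for disk in sorted(os.listdir(sys_block)):
        candidate = os.path.join(sys_block, disk, short_name)
        if os.path.isdir(candidate):
            return candidate
    return None
```

Scanning the parent directories avoids guessing the disk name from the partition name, at the cost of one directory listing.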

Robert Collins (lifeless) wrote :

This can calculate the correct /sys/block path.

    short_name = os.path.basename(device)
    if short_name[-1].isdigit():
        for offset in range(1, len(short_name)):
            if not short_name[-offset].isdigit():
                break
        sys_path = "/sys/block/%s/%s" % (short_name[:-offset+1], short_name)
    else:
        sys_path = "/sys/block/%s" % short_name

Robert Collins (lifeless) wrote :

Well, in some cases. The code needs tests badly ;)
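
As a rough sketch of the kind of tests meant here, the snippet below restates the digit-stripping idea as a pure function and checks it against a few device names. The function name `guess_sys_block_path` is hypothetical, and it uses `rstrip` rather than the loop above, so it is a stand-in for the proposal and not the same code. The last two assertions record cases where the heuristic gives the wrong answer: `mmcblk0` is a whole disk whose name ends in a digit, and its partitions are named `mmcblk0p1`.

```python
import os
import string


def guess_sys_block_path(device):
    """Guess the sysfs path by stripping trailing digits (hypothetical helper)."""
    short_name = os.path.basename(device)
    disk = short_name.rstrip(string.digits)
    if disk and disk != short_name:
        # Name ends in digits: assume it is a partition of `disk`.
        return "/sys/block/%s/%s" % (disk, short_name)
    return "/sys/block/%s" % short_name


# Cases the heuristic handles.
assert guess_sys_block_path("/dev/sda") == "/sys/block/sda"
assert guess_sys_block_path("/dev/sda1") == "/sys/block/sda/sda1"
assert guess_sys_block_path("/dev/sdb12") == "/sys/block/sdb/sdb12"

# Cases it gets wrong: the real paths would be /sys/block/mmcblk0
# and /sys/block/mmcblk0/mmcblk0p1.
assert guess_sys_block_path("/dev/mmcblk0") == "/sys/block/mmcblk/mmcblk0"
assert guess_sys_block_path("/dev/mmcblk0p1") == "/sys/block/mmcblk0p/mmcblk0p1"
```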

Robert Collins (lifeless) wrote :

I'm failing to understand the current code - I don't see how it can be even vaguely correct:

    short_name = os.path.basename(device)
    sys_path = "/sys/block/%s" % short_name

    if not os.path.exists(sys_path):
        LOG.debug("did not find entry for %s in /sys/block", short_name)
        return None

    sys_long_path = sys_path + "/" + short_name

if dev = /dev/sda

Then
sys_path = /sys/block/sda
and sys_long_path = /sys/block/sda/sda -- which doesn't ever exist.

and this is then appended to by the 'valid mappings' bit, which means we're expecting partitions to have been split out before calling into this code, but the dev token thing uses an entirely nonstandard '.' separator.

Are we perhaps meant to use '/dev/sda.1' to mean '/dev/sda1' ?
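
The reading suggested by this question can be traced with a small sketch. Both helpers below are hypothetical and only mirror the behaviour described in this comment (a '.'-separated partition token, and a partition suffix appended to sys_long_path); they are not copied from cloud-init.

```python
import os


def split_dotted(name):
    """Split 'sda.1' into ('sda', '1'); names without a dot give (name, None)."""
    base, sep, part = name.rpartition(".")
    if not sep:
        return name, None
    return base, part


def candidate_sys_path(device, partition=None):
    """Build the sysfs path the quoted code would end up checking."""
    short_name = os.path.basename(device)
    sys_path = "/sys/block/%s" % short_name
    if partition is None:
        return sys_path
    # The partition suffix is appended to sys_long_path.
    sys_long_path = sys_path + "/" + short_name
    return sys_long_path + partition


device, partition = split_dotted("/dev/sda.1")
print(candidate_sys_path(device, partition))  # /sys/block/sda/sda1
print(candidate_sys_path("/dev/sda1"))        # /sys/block/sda1, which does not exist
```

Under that assumption the dotted form lands on the real partition directory, while the plain '/dev/sda1' form produces the path that triggers the debug message in the bug title.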

Clint Byrum (clint-fewbar) wrote :

This is blocking ephemeral disk usage on hardware, which is critical to our image-based updates via the rebuild code path.

Changed in tripleo:
status: New → Triaged
importance: Undecided → Critical

Fix proposed to branch: master
Review: https://review.openstack.org/63584

Changed in tripleo:
assignee: nobody → Clint Byrum (clint-fewbar)
status: Triaged → In Progress

Robert Collins (lifeless) wrote :

Using /dev/sda.1:

Dec 21 20:35:55 overcloud-notcompute-4qf5igdly3as [CLOUDINIT] cc_mounts.py[DEBUG]: Attempting to determine the real name of ephemeral0
Dec 21 20:35:55 overcloud-notcompute-4qf5igdly3as [CLOUDINIT] cc_mounts.py[DEBUG]: Ignoring nonexistant default named mount ephemeral0
Dec 21 20:35:55 overcloud-notcompute-4qf5igdly3as [CLOUDINIT] cc_mounts.py[DEBUG]: Attempting to determine the real name of swap
Dec 21 20:35:55 overcloud-notcompute-4qf5igdly3as [CLOUDINIT] DataSourceEc2.py[DEBUG]: Unable to convert swap to a device
Dec 21 20:35:55 overcloud-notcompute-4qf5igdly3as [CLOUDINIT] cc_mounts.py[DEBUG]: Ignoring nonexistant default named mount swap
Dec 21 20:35:55 overcloud-notcompute-4qf5igdly3as [CLOUDINIT] cc_mounts.py[DEBUG]: No modifications to fstab needed.

Reviewed: https://review.openstack.org/63584
Committed: https://git.openstack.org/cgit/openstack/tripleo-image-elements/commit/?id=7d0f9758772c00267581b87018f6081ac806e690
Submitter: Jenkins
Branch: master

commit 7d0f9758772c00267581b87018f6081ac806e690
Author: Clint Byrum <email address hidden>
Date: Sat Dec 21 08:09:46 2013 -0800

    Work around broken cloud-init ephemeral disk code

    Cloud-init cannot handle ephemeral disks like '/dev/sda1'. This prevents
    the nova baremetal default from working properly. So if the ephemeral
    disk isn't already mounted, we mount it just before we need it.

    Change-Id: Ide0e5ed3eff91755aac7d8f1e9c43f723f7bf3d5
    Closes-Bug: #1263294

Changed in tripleo:
status: In Progress → Fix Released

Scott Moser (smoser) wrote :

I agree that code is very hard to understand, and yes, I wrote it.

Changed in cloud-init:
status: New → Triaged
importance: Undecided → Medium

Thanks for responding, Scott.

Can I suggest a High importance?

While the number of users impacted is low (OpenStack Nova baremetal/ironic
users who want an ephemeral partition), there is no workaround for it
for us.

Reviewed: https://review.openstack.org/69167
Committed: https://git.openstack.org/cgit/openstack/tripleo-incubator/commit/?id=fdbde78f5df519ce43fa7753e13943077b2c8584
Submitter: Jenkins
Branch: master

commit fdbde78f5df519ce43fa7753e13943077b2c8584
Author: Robert Collins <email address hidden>
Date: Sun Jan 26 17:41:26 2014 +1300

    Workaround bug 1263294 for all images we build.

    The workaround in tripleo-image-elements for bug 1263294 only takes
    effect when an element depends on some of the other got in
    'use-ephemeral'. Explicitly drag use-ephemeral in so that we always
    have the workaround.

    Change-Id: Iacaaba3b71b80292ad8d0d8ad4354f3c8860a07f
    Related-Bug: #1263294

Robert Collins (lifeless) wrote :

Reopening this - once we've configured services to use /mnt, it's a fairly fatal error when we nova rebuild the machine (preserving the state partition) but said services start up after cloud-init and before our workaround hack fixes things. I.e. our hack is insufficient.

Changed in tripleo:
status: Fix Released → Triaged
Changed in tripleo:
assignee: Clint Byrum (clint-fewbar) → Michael Kerrin (michael-kerrin-w)
status: Triaged → In Progress

James Polley (tchaypo) on 2014-09-23
Changed in tripleo:
assignee: Michael Kerrin (michael-kerrin-w) → nobody

Clint Byrum (clint-fewbar) wrote :

Bugs with workarounds can be set to 'High'

Changed in tripleo:
importance: Critical → High

Change abandoned by Michael Kerrin (<email address hidden>) on branch: master
Review: https://review.openstack.org/90993
Reason: Moving on

Ben Nemec (bnemec) wrote :

We've moved away from ephemeral partitions in TripleO, so this no longer needs to be fixed there.

Changed in tripleo:
status: In Progress → Won't Fix