Problem Bootstrapping ceph from Partitions due to stale part. table

Bug #1589309 reported by Steve Hindle
This bug affects 2 people
Affects: kolla | Status: Fix Released | Importance: Undecided | Assigned to: Unassigned

Bug Description

It appears kolla-ceph is having problems bootstrapping partitions due to stale kernel/sysfs info.
The bootstrap process looks for 'magic names' in the partition table AND CHANGES THEM. This seems
to work fine. However, the next phase (start_osds.yml) looks for the NEW partition names, yet gets the old names from /sys, which causes startup to fail.

As a potential workaround, I'm going to try modifying is_dev_matched_by_name in docker/kolla-toolbox/find_disks.py to shell out to sgdisk to read partition names. Also, it appears https://github.com/openstack/kolla/blob/stable/mitaka/ansible/roles/ceph/tasks/start_osds.yml#L21
should be with_items: "{{ osds }}" (needs the wrapping quotes).
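A minimal sketch of that workaround, reading the name straight from the GPT instead of udev's cache (the helper names here are illustrative, not the exact find_disks.py code):

```python
import re
import subprocess


def parse_sgdisk_name(sgdisk_info):
    """Pull the label out of `sgdisk -i` output.

    Matches a line like:  Partition name: 'KOLLA_CEPH_OSD_BOOTSTRAP_1_J'
    Returns '' for an unnamed partition.
    """
    match = re.search(r"Partition name: '([^']*)'", sgdisk_info)
    return match.group(1) if match else ''


def get_partition_name(disk, partnum):
    """Read the on-disk GPT name via sgdisk (requires root), bypassing
    the potentially stale udev/sysfs view."""
    out = subprocess.check_output(
        ['sgdisk', '-i', str(partnum), disk]).decode('utf-8')
    return parse_sgdisk_name(out)
```

Because sgdisk reads the raw GPT from the device, this returns the renamed label even when the kernel is still holding the old partition table.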

The output below shows a 're-run' where ceph had been deployed once, and the partition names were manually reset for the next run. Notice that since the box was NOT rebooted, the /sys info is stale:
stack@c1n7:~$ sudo sgdisk -p /dev/sda
Disk /dev/sda: 250069680 sectors, 119.2 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): CEA98805-36A2-4FF6-9357-5999AA267F48
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 250069646
Partitions will be aligned on 1-sector boundaries
Total free space is 16 sectors (8.0 KiB)

Number Start (sector) End (sector) Size Code Name
   1 34 1987 977.0 KiB EF02
   2 1988 160001988 76.3 GiB EF00
   3 160001989 170001989 4.8 GiB 8300 KOLLA_CEPH_OSD_BOOTSTRA
   4 170001990 250069630 38.2 GiB 8300 KOLLA_CEPH_OSD_BOOTSTRA
stack@c1n7:~$ sudo partprobe /dev/sda
stack@c1n7:~$ python ./test_steve.py
checking Device(u'/sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda') named
checking Device(u'/sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda/sda1') named
checking Device(u'/sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda/sda2') named
checking Device(u'/sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda/sda3') named KOLLA_CEPH_DATA_0_J
checking Device(u'/sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda/sda4') named KOLLA_CEPH_DATA_2
checking Device(u'/sys/devices/virtual/block/loop0') named
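The test script above walks udev block devices and checks their cached labels; the name check could look like this minimal sketch (the ID_PART_ENTRY_NAME property key and prefix matching are assumptions, not the exact kolla implementation):

```python
def is_dev_matched_by_name(dev, name):
    """Return True when a device's udev-cached partition label starts
    with the wanted prefix.

    `dev` is any mapping exposing udev properties; real code would pass
    a pyudev.Device here.  Because the label comes from udev's cache,
    it can be stale after an in-place sgdisk rename -- which is exactly
    what the output above shows.
    """
    return dev.get('ID_PART_ENTRY_NAME', '').startswith(name)
```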

Revision history for this message
Steve Hindle (shindle) wrote :

The following patches _seem_ to fix the problem for me - I need to rebuild the cluster from scratch and do a 'bare metal' test.

Blah - have to attach 1 patch at a time :-/

Revision history for this message
Steve Hindle (shindle) wrote :

This patch changes find_disks.py to shell out to sgdisk to get the 'real' partition names...
note, this requires ROOT access

Revision history for this message
Steve Hindle (shindle) wrote :

This patch adds 'sgdisk' to the kolla-toolbox image, and changes it to run as root.

Revision history for this message
Steve Hindle (shindle) wrote :

Note: the patch in comment #1 (https://bugs.launchpad.net/kolla/+bug/1589309/comments/1) is NOT required. I guess ansible screaming at me to wrap stuff in quote-tags is just to annoy me :-P

Revision history for this message
Jeffrey Zhang (jeffrey4l) wrote :

I have no issue with kolla ceph (at least on the mitaka branch; IIRC, we never changed
this ceph code on the master branch).

So could you show me how to reproduce this?

Revision history for this message
Swapnil Kulkarni (coolsvap-deactivatedaccount) wrote :

Please provide steps and available log if possible.

Revision history for this message
Paul Bourke (pauldbourke) wrote :

Hi Steve, can you post the following info:

* Host OS and version
* Version of udev

I have also seen issues around partition label detection possibly relating to outdated versions of udev on older distros (https://bugs.launchpad.net/kolla/+bug/1585185). Potentially the udev method of accessing disks is unreliable, though yours is the first other complaint I've seen which is why I'm hoping to get more info on the kinds of setups it's not working on.

Revision history for this message
Steve Hindle (shindle) wrote : Re: [Bug 1589309] Re: Problem Bootstrapping ceph from Partitions due to stale part. table

Hi Paul,

  It occurs with fresh builds of ubuntu 14.04:
root@c1n1:/home/kolla/kolla/src# dpkg -l | grep udev
ii libudev1:amd64 204-5ubuntu20.19
  amd64 libudev shared library
ii udev 204-5ubuntu20.19
  amd64 /dev/ and hotplug management daemon
root@c1n1:/home/kolla/kolla/src# more /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.4 LTS"


Revision history for this message
Jeffrey Zhang (jeffrey4l) wrote :

Is anyone using Ubuntu as the host OS? Have you ever seen this kind of issue?
I am using CentOS 7:

$ rpm -qa | grep udev
python-pyudev-0.15-7.el7_2.1.noarch
libgudev1-219-19.el7_2.9.x86_64

I have never hit such an issue.

Revision history for this message
Steve Hindle (shindle) wrote :

Yes, I use Ubuntu as the host OS. You will see the issue if you try
to deploy kolla twice without rebooting the nodes: deploy,
cleanup-containers, cleanup-images, cleanup-host, RESET the partition
names, and deploy again...

Another thing you can try is running sgdisk and changing the partition
names, then reading them back from /dev/disk/by-partlabel/ - notice
it shows the stale partition names? Run partprobe on the device - now
look at /dev/disk/by-partlabel again - still stale...


Revision history for this message
Jeffrey Zhang (jeffrey4l) wrote :

See my test using sgdisk: I cannot reproduce this.

root@ubuntu:~# sgdisk -p /dev/vdb
Creating new GPT entries.
Disk /dev/vdb: 41943040 sectors, 20.0 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): AB2B3575-58E9-4696-9276-DFD92A9FC2BF
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 41943006
Partitions will be aligned on 2048-sector boundaries
Total free space is 41942973 sectors (20.0 GiB)

Number Start (sector) End (sector) Size Code Name
root@ubuntu:~# sgdisk -n1:2048:204800 -c1:KOLLA_BOOTSTRAP /dev/vdb
Creating new GPT entries.
The operation has completed successfully.
root@ubuntu:~# ls -alh /dev/disk/by-partlabel/
total 0
drwxr-xr-x 2 root root 60 Jun 11 10:00 .
drwxr-xr-x 6 root root 120 Jun 11 10:00 ..
lrwxrwxrwx 1 root root 10 Jun 11 10:00 KOLLA_BOOTSTRAP -> ../../vdb1

root@ubuntu:~# sgdisk -Z /dev/vdb
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
root@ubuntu:~# ls -alh /dev/disk/by-partlabel/
ls: cannot access /dev/disk/by-partlabel/: No such file or directory

root@ubuntu:~# sgdisk -n1:20480:204800 -c1:KOLLA_BOOTSTRAP_NEW /dev/vdb
Creating new GPT entries.
The operation has completed successfully.
root@ubuntu:~# sgdisk -p /dev/vdb
Disk /dev/vdb: 41943040 sectors, 20.0 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): 26B03061-E3E4-436F-A463-8A37D7BAF939
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 41943006
Partitions will be aligned on 2048-sector boundaries
Total free space is 41758652 sectors (19.9 GiB)

Number Start (sector) End (sector) Size Code Name
   1 20480 204800 90.0 MiB 8300 KOLLA_BOOTSTRAP_NEW
root@ubuntu:~# ls -alh /dev/disk/by-partlabel/
total 0
drwxr-xr-x 2 root root 60 Jun 11 10:01 .
drwxr-xr-x 6 root root 120 Jun 11 10:01 ..
lrwxrwxrwx 1 root root 10 Jun 11 10:01 KOLLA_BOOTSTRAP_NEW -> ../../vdb1

Revision history for this message
Jeffrey Zhang (jeffrey4l) wrote :

btw, the test env is

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.2 LTS
Release: 14.04
Codename: trusty

Revision history for this message
Steve Hindle (shindle) wrote :

Umm - is /dev/vdb actually in use? I didn't see the kernel complaining
that it couldn't reload the partition table and that changes would take
effect on the next boot, etc.

This problem manifests in that situation (i.e. when the kernel refuses
to re-read the partition table because the device is in use or
whatever).


Revision history for this message
Steve Hindle (shindle) wrote :

Here's an example:
root@c1n7:~# sgdisk -i 3 /dev/sda
Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
Partition unique GUID: BCBA3F76-9E2A-4BE0-9CF8-1B05BCCCA9E3
First sector: 160001989 (at 76.3 GiB)
Last sector: 170001989 (at 81.1 GiB)
Partition size: 10000001 sectors (4.8 GiB)
Attribute flags: 0000000000000000
Partition name: 'KOLLA_CEPH_OSD_BOOTSTRAP_1_J'
root@c1n7:~# sgdisk -c 3:FOO /dev/sda
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
root@c1n7:~# sgdisk -i 3 /dev/sda
Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
Partition unique GUID: BCBA3F76-9E2A-4BE0-9CF8-1B05BCCCA9E3
First sector: 160001989 (at 76.3 GiB)
Last sector: 170001989 (at 81.1 GiB)
Partition size: 10000001 sectors (4.8 GiB)
Attribute flags: 0000000000000000
Partition name: 'FOO'
root@c1n7:~# s -alh /dev/disk/by-partlabel/
s: command not found
root@c1n7:~# ls -alh /dev/disk/by-partlabel/
total 0
drwxr-xr-x 2 root root 80 Jun 11 03:44 .
drwxr-xr-x 6 root root 120 Jun 10 02:52 ..
lrwxrwxrwx 1 root root 10 Jun 11 03:44 KOLLA_CEPH_DATA_0_J -> ../../sda3
lrwxrwxrwx 1 root root 10 Jun 11 02:39 KOLLA_CEPH_OSD_BOOTSTRAP_1 -> ../../sda4
root@c1n7:~#


Revision history for this message
Steve Hindle (shindle) wrote :

btw - running partprobe doesn't fix it either:
root@c1n7:~# ls -alh /dev/disk/by-partlabel/
total 0
drwxr-xr-x 2 root root 80 Jun 11 03:44 .
drwxr-xr-x 6 root root 120 Jun 10 02:52 ..
lrwxrwxrwx 1 root root 10 Jun 11 03:44 KOLLA_CEPH_DATA_0_J -> ../../sda3
lrwxrwxrwx 1 root root 10 Jun 11 02:39 KOLLA_CEPH_OSD_BOOTSTRAP_1 -> ../../sda4
root@c1n7:~# partprobe /dev/sda
root@c1n7:~# ls -alh /dev/disk/by-partlabel/
total 0
drwxr-xr-x 2 root root 80 Jun 11 03:44 .
drwxr-xr-x 6 root root 120 Jun 10 02:52 ..
lrwxrwxrwx 1 root root 10 Jun 11 03:44 KOLLA_CEPH_DATA_0_J -> ../../sda3
lrwxrwxrwx 1 root root 10 Jun 11 02:39 KOLLA_CEPH_OSD_BOOTSTRAP_1 -> ../../sda4


Revision history for this message
Jeffrey Zhang (jeffrey4l) wrote :

After talking with Steve Hindle on IRC, I got some clues.

Steve is trying to change a partition name on an online drive (the root disk). The label is
not updated in the kernel because the device is in use, but after running `udevadm trigger`
the new label shows up.

Even if that had not worked, I think it is a bug in udev; we (kolla) should not handle this case.
If udev still has no idea about the new label, it may cause other issues in the future.
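The `udevadm trigger` workaround could be wrapped like this (a sketch; limiting the trigger to one device via these particular flags is an assumption):

```python
import subprocess


def build_udev_trigger_cmd(device):
    """Command asking udev to re-emit a change event for one block
    device (e.g. 'sda'), so /dev/disk/by-partlabel symlinks are
    regenerated from the current partition names."""
    return ['udevadm', 'trigger', '--action=change',
            '--sysname-match={}'.format(device)]


def refresh_partition_labels(device):
    """Run the trigger; requires root on the host."""
    subprocess.check_call(build_udev_trigger_cmd(device))
```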

Changed in kolla:
status: New → Triaged
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to kolla (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/334970

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: master
Review: https://review.openstack.org/335046

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on kolla (master)

Change abandoned by Paul Bourke (<email address hidden>) on branch: master
Review: https://review.openstack.org/335046
Reason: dupe of 334970

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Paul Bourke (<email address hidden>) on branch: master
Review: https://review.openstack.org/334970
Reason: Thanks for the reviews on this all.

After discussing all the different partition schemes allowed by the existing label based approach, I've concluded this patch is not a good substitute. What Steve has in https://review.openstack.org/#/c/326609/ solves my particular problem, though we may need to revisit this if udev continues to cause problems.

Changed in kolla:
status: Triaged → Fix Released