ceph-disk-prepare command always fails; new partition table not available until reboot

Bug #1371526 reported by James Page
This bug affects 1 person
Affects          Status    Importance  Assigned to  Milestone
ceph (Ubuntu)    Invalid   Undecided   Unassigned
linux (Ubuntu)   Expired   Undecided   Unassigned

Bug Description

$ sudo ceph-disk-prepare --fs-type xfs --zap-disk /dev/vdb
Caution: invalid backup GPT header, but valid main header; regenerating
backup header from main header.

Warning! Main and backup partition tables differ! Use the 'c' and 'e' options
on the recovery & transformation menu to examine the two tables.

Warning! One or more CRCs don't match. You should repair the disk!

****************************************************************************
Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
verification and recovery are STRONGLY recommended.
****************************************************************************
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
mkfs.xfs: cannot open /dev/vdb1: Device or resource busy
ceph-disk: Error: Command '['/sbin/mkfs', '-t', 'xfs', '-f', '-i', 'size=2048', '--', '/dev/vdb1']' returned non-zero exit status 1

I can reproduce this consistently across ceph nodes; it also impacts the way we use Swift for storage.
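
For what it's worth, the "kernel is still using the old partition table" warnings appear to mean the kernel refused sgdisk's request to re-read the table. A rough way to confirm the mismatch and to request a re-read by hand (assuming the device really is unmounted; the re-read is refused with "device busy" otherwise) would be:

$ cat /proc/partitions        # the kernel's current view of vdb
$ sudo sgdisk -p /dev/vdb     # the new GPT as written on disk
$ sudo partprobe /dev/vdb     # ask the kernel to re-read the table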

ProblemType: Bug
DistroRelease: Ubuntu 14.10
Package: ceph 0.80.5-1
ProcVersionSignature: Ubuntu 3.16.0-16.22-generic 3.16.2
Uname: Linux 3.16.0-16-generic x86_64
ApportVersion: 2.14.7-0ubuntu2
Architecture: amd64
Date: Fri Sep 19 09:39:18 2014
Ec2AMI: ami-00000084
Ec2AMIManifest: FIXME
Ec2AvailabilityZone: nova
Ec2InstanceType: m1.small
Ec2Kernel: aki-00000002
Ec2Ramdisk: ari-00000002
SourcePackage: ceph
UpgradeStatus: No upgrade log present (probably fresh install)
---
ApportVersion: 2.14.7-0ubuntu2
Architecture: amd64
DistroRelease: Ubuntu 14.10
Ec2AMI: ami-00000084
Ec2AMIManifest: FIXME
Ec2AvailabilityZone: nova
Ec2InstanceType: m1.small
Ec2Kernel: aki-00000002
Ec2Ramdisk: ari-00000002
Package: linux
PackageArchitecture: amd64
ProcVersionSignature: Ubuntu 3.16.0-16.22-generic 3.16.2
Tags: utopic ec2-images
Uname: Linux 3.16.0-16-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: True

Revision history for this message
James Page (james-page) wrote :

This is what things should look like:

$ sudo ceph-disk-prepare --fs-type xfs --zap-disk /dev/vdb
Caution: invalid backup GPT header, but valid main header; regenerating
backup header from main header.

Caution! After loading partitions, the CRC doesn't check out!
Warning! Main and backup partition tables differ! Use the 'c' and 'e' options
on the recovery & transformation menu to examine the two tables.

Warning! One or more CRCs don't match. You should repair the disk!

****************************************************************************
Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
verification and recovery are STRONGLY recommended.
****************************************************************************
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
The operation has completed successfully.
The operation has completed successfully.
The operation has completed successfully.
meta-data=/dev/vdb1              isize=2048   agcount=4, agsize=1245119 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=4980475, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
The operation has completed successfully.

(that was after a reboot).

Revision history for this message
James Page (james-page) wrote :

It's probably worth noting that this is in a cloud instance which automatically formats and mounts /dev/vdb on first boot; however, we do unmount it prior to running the ceph-disk-prepare command.
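
If it helps anyone working around this: that first-boot mount of the ephemeral device comes from cloud-init's mounts module. A sketch of user-data that should keep /dev/vdb from being mounted at all (untested here, and assuming the image uses the standard ephemeral0 alias) would be:

#cloud-config
mounts:
  - [ ephemeral0, null ]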

Revision history for this message
James Page (james-page) wrote :

I poked at this with some additional volumes attached to the cloud instance:

Offending device:

ubuntu@juju-t-machine-22:~$ sudo umount /dev/vdb
ubuntu@juju-t-machine-22:~$ sudo lsof | grep vdb
jbd2/vdb- 1268 root cwd DIR 253,1 4096 2 /
jbd2/vdb- 1268 root rtd DIR 253,1 4096 2 /
jbd2/vdb- 1268 root txt unknown /proc/1268/exe

Additional device:

ubuntu@juju-t-machine-22:~$ sudo mount /dev/vdc /mnt2
ubuntu@juju-t-machine-22:~$ sudo lsof | grep vdc
jbd2/vdc- 16058 root cwd DIR 253,1 4096 2 /
jbd2/vdc- 16058 root rtd DIR 253,1 4096 2 /
jbd2/vdc- 16058 root txt unknown /proc/16058/exe
ubuntu@juju-t-machine-22:~$ sudo umount /dev/vdc
ubuntu@juju-t-machine-22:~$ sudo lsof | grep vdc

As you can see, the jbd2 process for vdb appears to hang around, which I think is what is keeping the partition table locked in the kernel and hence stale.
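
A couple of extra data points that might be worth grabbing while the device is in that state (all standard tools; the sysfs path is for this instance's layout):

$ sudo fuser -vm /dev/vdb          # anything that still has the device, or a mount on it, open
$ lsblk /dev/vdb                   # the partition layout the kernel currently exposes
$ cat /sys/block/vdb/vdb1/size     # whether the old partition node is still registered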

Revision history for this message
James Page (james-page) wrote :

sudo lshw -class disk -class storage
  *-ide
       description: IDE interface
       product: 82371SB PIIX3 IDE [Natoma/Triton II]
       vendor: Intel Corporation
       physical id: 1.1
       bus info: pci@0000:00:01.1
       version: 00
       width: 32 bits
       clock: 33MHz
       capabilities: ide bus_master
       configuration: driver=ata_piix latency=0
       resources: irq:0 ioport:1f0(size=8) ioport:3f6 ioport:170(size=8) ioport:376 ioport:c0e0(size=16)
  *-scsi:0
       description: SCSI storage controller
       product: Virtio block device
       vendor: Red Hat, Inc
       physical id: 4
       bus info: pci@0000:00:04.0
       version: 00
       width: 32 bits
       clock: 33MHz
       capabilities: scsi msix bus_master cap_list
       configuration: driver=virtio-pci latency=0
       resources: irq:11 ioport:c000(size=64) memory:febd2000-febd2fff
  *-scsi:1
       description: SCSI storage controller
       product: Virtio block device
       vendor: Red Hat, Inc
       physical id: 5
       bus info: pci@0000:00:05.0
       version: 00
       width: 32 bits
       clock: 33MHz
       capabilities: scsi msix bus_master cap_list
       configuration: driver=virtio-pci latency=0
       resources: irq:10 ioport:c040(size=64) memory:febd3000-febd3fff
  *-scsi:2
       description: SCSI storage controller
       product: Virtio block device
       vendor: Red Hat, Inc
       physical id: 7
       bus info: pci@0000:00:07.0
       version: 00
       width: 32 bits
       clock: 33MHz
       capabilities: scsi msix bus_master cap_list
       configuration: driver=virtio-pci latency=0
       resources: irq:10 ioport:1000(size=64) memory:80000000-80000fff

Revision history for this message
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1371526

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
James Page (james-page) wrote :

This only appears to happen with the device on first boot; after a reboot, mount/umount drops all jbd2 processes as I think it should.
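
As a crude first-boot stopgap (untested, and the 60-second timeout is arbitrary), something like this could wait for the lingering jbd2 thread to exit before preparing the disk:

#!/bin/sh
# Wait up to 60s after umount for the jbd2/vdb kernel thread to go away,
# then run the prepare step.
for i in $(seq 1 60); do
    pgrep -f 'jbd2/vdb' >/dev/null || break
    sleep 1
done
sudo ceph-disk-prepare --fs-type xfs --zap-disk /dev/vdb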

Revision history for this message
James Page (james-page) wrote : Dependencies.txt

apport information

tags: added: apport-collected
description: updated
Revision history for this message
James Page (james-page) wrote : ProcEnviron.txt

apport information

Revision history for this message
Stefan Bader (smb) wrote :

So, to gather more pointers, I tried a Trusty host with Utopic KVM guests, created both manually (with virt-manager, so not involving cloud-init) and with uvtool, which at least uses a vdb for cloud-init data (in some way, though the image is a read-only ISO). Both ways the jbd2 process goes away after unmount.

Revision history for this message
Stefan Bader (smb) wrote :

We would really need information about setting up a system that runs into these issues. In particular, how is the cloud-init ephemeral disk created? I still cannot reproduce this (making an ext3 fs outside the guest and putting something into it, then mounting it by label from the guest, unmounting it, and mounting it again and writing something; all works ok). So we need as much info as possible about what is done to the ephemeral disk, from the start to the point where it fails.
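
For the record, the repro attempt was roughly this sequence (image path and label are illustrative):

host$ mkfs.ext3 -L ephemeral0 scratch.img
host$ sudo mount -o loop scratch.img /mnt && echo seed | sudo tee /mnt/seed && sudo umount /mnt
host$ # attach the image to the guest as vdb, then inside the guest:
guest$ sudo mount LABEL=ephemeral0 /mnt
guest$ sudo umount /mnt
guest$ sudo mount LABEL=ephemeral0 /mnt
guest$ echo test | sudo tee /mnt/file    # all works ok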

Revision history for this message
James Page (james-page) wrote :

I can't reproduce this problem any longer, so I'm assuming that something changed to fix it, as I was seeing it 100% of the time before.

Revision history for this message
shawnggraham (shawnggraham) wrote :

I believe this is affecting live migration of Ubuntu 14.04 KVM guests from one KVM host to another. When I attempt this now, I get read/write errors on the guest and the file system goes into read-only mode. I have a test environment where I can verify this.
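
In case it helps anyone trying to reproduce the migration side of this, a typical invocation would be something like the following (guest and host names are illustrative):

$ virsh migrate --live my-guest qemu+ssh://other-kvm-host/system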

James Page (james-page)
Changed in ceph (Ubuntu):
status: New → Invalid
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired