sgdisk zap/clear doesn't wipe all GPT tables

Bug #1303903 reported by Ryan Harper
This bug affects 2 people
Affects                  Status    Importance  Assigned to  Milestone
charm-helpers (Ubuntu)   New       Undecided   Unassigned
gdisk (Ubuntu)           Invalid   Undecided   Unassigned

Bug Description

1)
root@oil-maas-node-2:~# lsb_release -rd
Description: Ubuntu 12.04.4 LTS
Release: 12.04

2)
root@oil-maas-node-2:~# apt-cache policy gdisk
gdisk:
  Installed: 0.8.1-1build1
  Candidate: 0.8.1-1build1
  Version table:
 *** 0.8.1-1build1 0
        500 http://archive.ubuntu.com/ubuntu/ precise/universe amd64 Packages
        100 /var/lib/dpkg/status

3) sgdisk --zap-all --clear --mbrtogpt should remove all partitions and GPT table signatures

4) partitions are removed, but the GPT table signature is still present as detected by fdisk and pvcreate

5) Verbose details here:

I'm reusing real disks between ceph and cinder in OpenStack installs. ceph creates GPT tables for large disks (>2TB); when we reinstall with cinder, it uses pvcreate to make a large volume. pvcreate fails due to a GPT signature remaining on the disk. The bug is in sgdisk, which is used in the ceph/cinder charms to clear the disk: while sgdisk does clean out the GPT tables, it doesn't remove the GPT signatures, and fdisk and pvcreate still detect them.

Using dd and some math to calculate where the GPT tables live, I can manually clear them, allowing fdisk and pvcreate to see a clean disk. A sketch of that arithmetic follows, and then the full workflow:
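
For reference (a minimal sketch, not part of the original report): with 512-byte logical sectors and the default 128 partition entries, the primary GPT occupies LBA 0-33 (protective MBR, header, entry array) and the backup copy sits in the last 33 sectors, so zeroing those two regions is enough; the device name below is illustrative.

# Assumes a 512-byte-sector disk and the default 128-entry partition table.
DEV=/dev/vdb                          # illustrative scratch device
SECTORS=$(blockdev --getsz "$DEV")    # size in 512-byte sectors
# Primary copy: protective MBR (LBA 0) + GPT header (LBA 1) + 32 sectors of entries.
dd if=/dev/zero of="$DEV" bs=512 count=34
# Backup copy: 32 sectors of entries + header, stored in the last 33 sectors.
dd if=/dev/zero of="$DEV" bs=512 seek=$((SECTORS - 33)) count=33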

# ceph install will run:
% ceph-disk-prepare /dev/vdb
Information: Moved requested sector from 34 to 2048 in
order to align on 2048-sector boundaries.
The operation has completed successfully.
Information: Moved requested sector from 2097153 to 2099200 in
order to align on 2048-sector boundaries.
The operation has completed successfully.
meta-data=/dev/vdb1 isize=2048 agcount=4, agsize=1245119 blks
         = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=4980475, imaxpct=25
         = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=2560, version=2
         = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
The operation has completed successfully.
% parted /dev/vdb print

Model: Virtio Block Device (virtblk)
Disk /dev/vdb: 21.5GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number Start End Size File system Name Flags
 2 1049kB 1074MB 1073MB ceph journal
 1 1075MB 21.5GB 20.4GB xfs ceph data

# here you can see fdisk detect the GPT table after a successful ceph-disk-prepare
% fdisk -l /dev/vdb

WARNING: GPT (GUID Partition Table) detected on '/dev/vdb'! The util fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/vdb: 21.5 GB, 21474836480 bytes
256 heads, 63 sectors/track, 2600 cylinders, total 41943040 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

   Device Boot Start End Blocks Id System
/dev/vdb1 1 41943039 20971519+ ee GPT

# let's use the sgdisk command in the ceph/cinder charm to wipe out the disk
% cat zapdisk.sh
#!/bin/bash -x

sgdisk --zap-all --clear --mbrtogpt $1
% bash -x zapdisk.sh /dev/vdb

+ sgdisk --zap-all --clear --mbrtogpt /dev/vdb
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
The operation has completed successfully.

# parted shows that it's all clear
% parted /dev/vdb print

Model: Virtio Block Device (virtblk)
Disk /dev/vdb: 21.5GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number Start End Size File system Name Flags

# But fdisk still sees the GPT signature.
% fdisk -l /dev/vdb
WARNING: GPT (GUID Partition Table) detected on '/dev/vdb'! The util fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/vdb: 21.5 GB, 21474836480 bytes
256 heads, 63 sectors/track, 2600 cylinders, total 41943040 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

   Device Boot Start End Blocks Id System
/dev/vdb1 1 41943039 20971519+ ee GPT

# now let's try to create a PV with the same disk; this is what cinder charm would do to initialize storage.
# note the 'Partition table signature found'
% pvcreate -d -vvv /dev/vdb

        Processing: pvcreate -d -vvv /dev/vdb
        O_DIRECT will be used
      Setting global/locking_type to 1
      Setting global/wait_for_locks to 1
      File-based locking selected.
      Setting global/locking_dir to /var/lock/lvm
      metadata/pvmetadatasize not found in config: defaulting to 255
      metadata/pvmetadatacopies not found in config: defaulting to 1
      Locking /var/lock/lvm/P_orphans WB
        _do_flock /var/lock/lvm/P_orphans:aux WB
        _do_flock /var/lock/lvm/P_orphans WB
        _undo_flock /var/lock/lvm/P_orphans:aux
        Opened /dev/vdb RO
      /dev/vdb: size is 41943040 sectors
        /dev/vdb: block size is 4096 bytes
        /dev/vdb: Skipping: Partition table signature found
        Closed /dev/vdb
        /dev/vdb: Skipping (cached)
        Matcher built with 3 dfa states
      Setting devices/ignore_suspended_devices to 0
      Setting devices/cache_dir to /etc/lvm/cache
      Setting devices/write_cache_state to 1
        Opened /dev/vdb RO
      /dev/vdb: size is 41943040 sectors
        /dev/vdb: block size is 4096 bytes
        /dev/vdb: Skipping: Partition table signature found
        Closed /dev/vdb
  Device /dev/vdb not found (or ignored by filtering).
      Unlocking /var/lock/lvm/P_orphans
        _undo_flock /var/lock/lvm/P_orphans

# Here is a script that will properly wipe the GPT tables and signatures
% cat ddwipe.sh

#!/bin/bash

# Zero both copies of the GPT: the first 1 MiB covers the protective MBR and
# the primary GPT, and the last 100 sectors cover the backup GPT.
DEV="${1}"
END="$(sudo blockdev --getsz ${DEV})"   # device size in 512-byte sectors
GPT_END=$(($END - 100))                 # 100 sectors before the end of the disk
dd if=/dev/zero of=${DEV} bs=1M count=1
# No count given: dd writes zeros from GPT_END until it hits the end of the device.
dd if=/dev/zero of=${DEV} bs=512 seek=${GPT_END}

% bash -x ddwipe.sh /dev/vdb

+ DEV=/dev/vdb
++ sudo blockdev --getsz /dev/vdb
+ END=41943040
+ GPT_END=41942940
+ dd if=/dev/zero of=/dev/vdb bs=1M count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00194759 s, 538 MB/s
+ dd if=/dev/zero of=/dev/vdb bs=512 seek=41942940
dd: writing `/dev/vdb': No space left on device
101+0 records in
100+0 records out
51200 bytes (51 kB) copied, 0.00659633 s, 7.8 MB/s

# confirm the wipe in parted...
% parted /dev/vdb print
Error: /dev/vdb: unrecognised disk label

# fdisk doesn't print the GPT warning any more
% fdisk -l /dev/vdb
Disk /dev/vdb doesn't contain a valid partition table

Disk /dev/vdb: 21.5 GB, 21474836480 bytes
16 heads, 63 sectors/track, 41610 cylinders, total 41943040 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

# and pvcreate now works correctly.
% pvcreate -d -vvv /dev/vdb

        Processing: pvcreate -d -vvv /dev/vdb
        O_DIRECT will be used
      Setting global/locking_type to 1
      Setting global/wait_for_locks to 1
      File-based locking selected.
      Setting global/locking_dir to /var/lock/lvm
      metadata/pvmetadatasize not found in config: defaulting to 255
      metadata/pvmetadatacopies not found in config: defaulting to 1
      Locking /var/lock/lvm/P_orphans WB
        _do_flock /var/lock/lvm/P_orphans:aux WB
        _do_flock /var/lock/lvm/P_orphans WB
        _undo_flock /var/lock/lvm/P_orphans:aux
        Opened /dev/vdb RO
      /dev/vdb: size is 41943040 sectors
        /dev/vdb: block size is 4096 bytes
        Closed /dev/vdb
      /dev/vdb: size is 41943040 sectors
        Opened /dev/vdb RO O_DIRECT
        /dev/vdb: block size is 4096 bytes
        Closed /dev/vdb
        Using /dev/vdb
        Opened /dev/vdb RO O_DIRECT
        /dev/vdb: block size is 4096 bytes
      /dev/vdb: No label detected
        Closed /dev/vdb
        Opened /dev/vdb RW O_EXCL O_DIRECT
        Closed /dev/vdb
      /dev/vdb: size is 41943040 sectors
        Opened /dev/vdb RO O_DIRECT
        /dev/vdb: block size is 4096 bytes
        Closed /dev/vdb
      /dev/vdb: size is 41943040 sectors
        Opened /dev/vdb RW O_DIRECT
        /dev/vdb: block size is 4096 bytes
        Closed /dev/vdb
      /dev/vdb: size is 41943040 sectors
      Setting devices/data_alignment to 0
      Device /dev/vdb queue/minimum_io_size is 512 bytes.
      Device /dev/vdb queue/optimal_io_size is 0 bytes.
      /dev/vdb: Setting PE alignment to 128 sectors.
      Device /dev/vdb alignment_offset is 0 bytes.
      /dev/vdb: Setting PE alignment offset to 0 sectors.
        Opened /dev/vdb RW O_DIRECT
        Wiping /dev/vdb at 4096 length 1
        /dev/vdb: block size is 4096 bytes
        Closed /dev/vdb
    Set up physical volume for "/dev/vdb" with 41943040 available sectors
      Scanning for labels to wipe from /dev/vdb
        Opened /dev/vdb RW O_DIRECT
        /dev/vdb: block size is 4096 bytes
        Closed /dev/vdb
    Zeroing start of device /dev/vdb
        Opened /dev/vdb RW O_DIRECT
        Wiping /dev/vdb at sector 0 length 4 sectors
        /dev/vdb: block size is 4096 bytes
        Closed /dev/vdb
      Writing physical volume data to disk "/dev/vdb"
        lvmcache: /dev/vdb: now in VG #orphans_lvm2 (#orphans_lvm2)
        Creating metadata area on /dev/vdb at sector 8 size 376 sectors
      /dev/vdb: setting pe_start=384 (orig_pe_start=384, pe_align=128, pe_align_offset=0, adjustment=0)
        Opened /dev/vdb RW O_DIRECT
        /dev/vdb: block size is 4096 bytes
        /dev/vdb: Preparing PV label header Crb9aw-5dC5-c2OG-GUbO-8Fdv-Tp0S-HaeUSB size 21474836480 with da1 (384s, 0s) mda1 (8s, 376s)
      /dev/vdb: Writing label to sector 1 with stored offset 32.
      Unlocking /var/lock/lvm/P_orphans
        _undo_flock /var/lock/lvm/P_orphans
        Closed /dev/vdb
  Physical volume "/dev/vdb" successfully created
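
As an aside, on a reasonably recent util-linux the same signatures can be cleared in one call with wipefs; a minimal sketch (not part of the original workflow, device name illustrative):

# --all erases every signature libblkid detects (GPT, protective MBR,
# filesystem superblocks); it is destructive, so check the device name twice.
wipefs --all /dev/vdb
# Running wipefs without options afterwards should list no remaining signatures.
wipefs /dev/vdb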

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: gdisk 0.8.1-1build1
ProcVersionSignature: Ubuntu 3.2.0-54.82-generic 3.2.50
Uname: Linux 3.2.0-54-generic x86_64
ApportVersion: 2.0.1-0ubuntu17.6
Architecture: amd64
Date: Mon Apr 7 16:14:51 2014
Ec2AMI: ami-0000002d
Ec2AMIManifest: FIXME
Ec2AvailabilityZone: serverstack-az-1
Ec2InstanceType: m1.medium
Ec2Kernel: aki-00000002
Ec2Ramdisk: ari-00000002
MarkForUpload: True
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: gdisk
UpgradeStatus: No upgrade log present (probably fresh install)

Chris Glass (tribaal)
tags: added: landscape
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in gdisk (Ubuntu):
status: New → Confirmed
tags: added: cloud-installer
Revision history for this message
Rod Smith (rodsmith) wrote :

This is not a bug in sgdisk; it's either a bug in the charm or an incorrect use of same. Specifically, the sgdisk command shown is:

sgdisk --zap-all --clear --mbrtogpt /dev/vdb

This command does four things, in sequence:

- It zaps all GPT and MBR data structures (--zap-all).
- It creates an empty GPT data structure (--clear).
- It OKs the conversion of any MBR data structure to GPT form (--mbrtogpt).
- It writes the resulting changes to disk. (This is implicit in most uses of sgdisk.)

The first and second of those options are both used to wipe data, but in different ways -- --zap-all zeroes out all the sectors of the disk used by the GPT data structures, whereas --clear erases the partitions but leaves the data structures intact. Using --clear after --zap-all should therefore have the same effect as using --clear alone. (There may be cases where --zap-all would be necessary if you're dealing with a damaged disk, but I'd need to study this some more to be sure.) In any event, the end result of those two commands is a GPT disk with no partitions defined, not a disk without a partition table.

The --mbrtogpt option is useless in this context. It should be used when you want to convert an MBR disk to GPT form, but as the preceding options set the disk up as GPT, --mbrtogpt does nothing.

If the goal is to completely erase all partition data, including the partition table itself, the following command should be used:

sgdisk --zap-all /dev/vdb

Adding --clear and --mbrtogpt will be useless at best, and as you've discovered, --clear adds an (empty) partition table back. Note also that parted does *NOT* show that the disk is "all clear," as described in the bug report:

# parted shows that it's all clear
% parted /dev/vdb print

Model: Virtio Block Device (virtblk)
Disk /dev/vdb: 21.5GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number Start End Size File system Name Flags

Note the line that reads "Partition Table: gpt," which indicates that a valid GPT is present on the disk. No partitions are listed because that was the effect of the --clear option to sgdisk. Naturally, fdisk also notes the GPT data structures in the protective MBR.
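
A minimal sketch of the recommended invocation plus a quick check that no partition table remains (device name illustrative, not a verbatim part of this comment):

# Destroy both the GPT and MBR data structures without recreating an empty GPT.
sgdisk --zap-all /dev/vdb
# Ask the kernel to re-read the (now absent) partition table.
partprobe /dev/vdb
# parted should now report "unrecognised disk label" rather than "gpt".
parted /dev/vdb print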

Revision history for this message
Ryan Harper (raharper) wrote : Re: [Bug 1303903] Re: sgdisk zap/clear doesn't wipe all GPT tables

On Wed, Mar 18, 2015 at 12:59 PM, Roderick Smith <email address hidden>
wrote:

> This is not a bug in sgdisk; it's either a bug in the charm or an
> incorrect use of same. Specifically, the sgdisk command shown is:
>
> sgdisk --zap-all --clear --mbrtogpt /dev/vdb
>
> This command does four things, in sequence:
>
> - It zaps all GPT and MBR data structures (--zap-all).
> - It creates an empty GPT data structure (--clear).
> - It OKs the conversion of any MBR data structure to GPT form (--mbrtogpt).
> - It writes the resulting changes to disk. (This is implicit in most uses
> of sgdisk.)
>
> The first and second of those options are both used to wipe data, but in
> different ways -- --zap-all zeroes out all the sectors of the disk used
> by the GPT data structures, whereas --clear erases the partitions but
> leaves the data structures intact. Using --clear after --zap-all should
> therefore have the same effect as using --clear alone. (There may be
> cases where --zap-all would be necessary if you're dealing with a
> damaged disk, but I'd need to study this some more to be sure.) In any
> event, the end result of those two commands is a GPT disk with no
> partitions defined, not a disk without a partition table.
>
> The --mbrtogpt option is useless in this context. It should be used when
> you want to convert an MBR disk to GPT form, but as the preceding
> options set the disk up as GPT, --mbrtogpt does nothing.
>
> If the goal is to completely erase all partition data, including the
> partition table itself, the following command should be used:
>
> sgdisk --zap-all /dev/vdb
>

Thanks for the clarification. Looking into the charm-helpers history, it appears there was some creep in the use of the command.

Originally it was as above: --zap-all DEVICE.

This bug was encountered:

https://bugs.launchpad.net/ubuntu-advantage/+bug/1257491

Which then introduced the use of --mbrtogpt.

Further errors were encountered and --clear was added, which resulted in re-writing an empty, but still present, partition table.

Now we know that it writes out an empty, but present, GPT table; the issue then manifested itself in a separate case: when this disk was re-used to create an LVM volume, the empty but present GPT table prevented LVM from using the device.

Backing up then, the question is why doesn't --zap-all work for bug 1257491?

If that can be resolved, then we can remove the --mbrtogpt and --clear altogether and ensure that nothing is present on the disk so it can be used for ceph or lvm block services.

Ryan

Revision history for this message
Rod Smith (rodsmith) wrote :

Examining the code, there are several sgdisk calls, but two are relevant for this discussion. The first is the one reproduced in this initial bug report:

sgdisk --zap-all --clear --mbrtogpt

This call does not seem to have been changed as a result of bug #1257491. Upon further review, I think I see why it's written this way: When "sgdisk --zap-all --clear" is fed an MBR disk, sgdisk wipes the disk but then refuses to create a blank GPT because it still thinks it's dealing with an MBR disk. This is a bug and I'll fix it soon. Adding --mbrtogpt works around this bug and results in a blank GPT disk.
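
A reproduction sketch of that interaction on a scratch loop device (commands and names are illustrative, not taken from this report):

# Build a small scratch image and give it an MBR (msdos) label.
truncate -s 100M /tmp/scratch.img
LOOP=$(losetup -f --show /tmp/scratch.img)
parted -s "$LOOP" mklabel msdos
# Without --mbrtogpt, the --clear step may refuse to write a fresh GPT
# because sgdisk still treats the device as MBR.
sgdisk --zap-all --clear "$LOOP"
sgdisk --print "$LOOP"
losetup -d "$LOOP"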

The call that bug #1257491 resulted in changing was elsewhere in the script, from lines 840 to 856 at http://ceph.com/git/?p=ceph.git;a=blob;f=src/ceph-disk;h=f771a68c2f9d873710cbe380d702fd0baa725da9;hb=HEAD#l840:

sgdisk --new={part} --change-name={num}:ceph journal --partition-guid={num}:{journal_uuid} --typecode={num}:{uuid} journal

It was to THIS call that --mbrtogpt was added, if I've backtracked it correctly. Again, I don't understand the context, but my guess is that sgdisk was being called on an MBR disk. By default, sgdisk will refuse to write data to an MBR disk. This is a safety feature, in case it's called carelessly on an MBR disk that should NOT be converted to GPT form; but of course if a script doesn't know whether the input will be GPT or MBR but the intent of the script is to write out a GPT disk, having sgdisk convert the data structures makes sense, so --mbrtogpt exists.

So in sum, sgdisk does have a bug, but it's not the one assumed. The charm described here was written to work around the bug, and I believe you've misinterpreted the expected output of the relevant sgdisk command.

Revision history for this message
Ryan Harper (raharper) wrote :

On Fri, Mar 20, 2015 at 1:02 PM, Roderick Smith <email address hidden>
wrote:

> Examining the code, there are several sgdisk calls, but two are relevant
> for this discussion. The first is the one reproduced in this initial bug
> report:
>
> sgdisk --zap-all --clear --mbrtogpt
>
> This call does not seem to have been changed as a result of bug
> #1257491. Upon further review, I think I see why it's written this way:
> When "sgdisk --zap-all --clear" is fed an MBR disk, sgdisk wipes the
> disk but then refuses to create a blank GPT because it still thinks it's
> dealing with an MBR disk. This is a bug and I'll fix it soon. Adding
> --mbrtogpt works around this bug and results in a blank GPT disk.
>

OK. There wasn't much context in the charm-helpers changelog for that change, but I'm assuming one of the ceph tools didn't like getting a non-blank GPT disk and appending --mbrtogpt resolved that. But that left the case where a GPT disk was fed to lvm2 pvcreate, which will refuse to work with a disk that has any partition table (MBR or GPT).

Which led to filing this bug.

>
> The call that bug #1257491 resulted in changing was elsewhere in the
> script, from lines 840 to 856 at
> http://ceph.com/git/?p=ceph.git;a=blob;f=src/ceph-
> disk;h=f771a68c2f9d873710cbe380d702fd0baa725da9;hb=HEAD#l840:
>
> sgdisk --new={part} --change-name={num}:ceph journal --partition-
> guid={num}:{journal_uuid} --typecode={num}:{uuid} journal
>
> It was to THIS call that --mbrtogpt was added, if I've backtracked it
> correctly. Again, I don't understand the context, but my guess is that
> sgdisk was being called on an MBR disk. By default, sgdisk will refuse
> to write data to an MBR disk. This is a safety feature, in case it's
> called carelessly on an MBR disk that should NOT be converted to GPT
> form; but of course if a script doesn't know whether the input will be
> GPT or MBR but the intent of the script is to write out a GPT disk,
> having sgdisk convert the data structures makes sense, so --mbrtogpt
> exists.
>

Indeed. In our use-case the physical disks on the machine get reused for different purposes from run to run, so we definitely encountered the case where sgdisk performed as requested, but ultimately we needed something to handle clearing the entire disk regardless of initial state and leaving nothing behind (no MBR, no GPT).

Is there such a call to sgdisk to do so?

>
> So in sum, sgdisk does have a bug, but it's not the one assumed. The
> charm described here was written to work around the bug, and I believe
> you've misinterpreted the expected output of the relevant sgdisk
> command.
>

OK. The end goal was to have a call to sgdisk (or something else) that would ensure that the disk looked blank/unused, such that the different tools used to initialize the disk all worked correctly.

If there is a correct call to sgdisk that handles both MBR and GPT disks and cleanly wipes the drive, then we can mark this bug as invalid (I'll leave you to open the other bug you mentioned).

Ryan
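
For illustration, a sketch of the kind of wipe helper being asked about here, combining the pieces discussed in this thread; the script name is hypothetical and this is not a confirmed recommendation from the thread:

#!/bin/bash
# wipe-disk.sh (illustrative): destroy all partition metadata on a device.
set -euo pipefail
DEV="$1"
# Remove GPT and MBR data structures without recreating an empty GPT.
sgdisk --zap-all "${DEV}"
# Belt and braces: zero the first 1 MiB to clear any other leftover metadata,
# as the ddwipe.sh script in the bug description does.
dd if=/dev/zero of="${DEV}" bs=1M count=1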

Changed in gdisk (Ubuntu):
status: Confirmed → Invalid