curtin miscalculates partition offsets when logical_block_size is 4k
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MAAS | Invalid | Undecided | Unassigned |
curtin | Fix Released | High | Unassigned |
curtin (Ubuntu) | Fix Released | Medium | Unassigned |
Bug Description
We see a very high failure rate when deploying ppc64el via the MAAS web UI, for any Ubuntu daily image release (Trusty, Wily, Xenial).
In 15 attempts, I was able to get 3 machines to a 'deployed' state. We have 4 Power8 machines and they all exhibit this behavior.
With Trusty and Wily, when the unit fails to deploy, the machines are accessible.
With Xenial, when the unit fails to deploy, the machine is not reachable via network.
In all cases, the oobm serial console shows the machine at the login prompt.
We are also observing that the IPMI version, although initially set to 2.0, reverts to 1.5 after a unit fails deployment and is released. (https:/
What is odd in the machine output is that it keeps running operations on /dev/sde when I would expect /dev/sda.
# Storage looks like this:
File systems
Name | Size | Mountpoint | File system
---|---|---|---
sda-part2 | 283.8 GB | / | ext4

Available disks and partitions
Name | Model | Serial | Boot | Size | Device type | File system | Tags
---|---|---|---|---|---|---|---
sdb | | | | 283.8 GB | Physical | | rotary
sdc | | | | 283.8 GB | Physical | | rotary
sdd | | | | 283.8 GB | Physical | | rotary
sde | | | | 283.8 GB | Physical | | rotary

Used disks and partitions
Name | Model | Serial | Boot | Device type | Used for
---|---|---|---|---|---
sda | | | | Physical | GPT partitioned with 1 partition
sda-part2 | | | | Partition | ext4 formatted filesystem mounted at /
# Network looks like this:
eth0 Physical 10.X.X.X (Auto assign)
eth1 Physical (Unconfigured)
eth2 Physical (Unconfigured)
eth3 Physical (Unconfigured)
eth4 Physical (Unconfigured)
eth5 Physical (Unconfigured)
# Machine output
Running command ['partprobe', '/dev/sde'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['mdadm', '--assemble', '--scan'] with allowed return codes [0, 1, 2] (shell=False, capture=True)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
clear_holders running on '/sys/block/
Running command ['sgdisk', '--zap-all', '/dev/sde1'] with allowed return codes [0, 1, 2, 5] (shell=False, capture=True)
clear_holders running on '/sys/block/sde', with holders '[]'
Running command ['sgdisk', '--zap-all', '/dev/sde'] with allowed return codes [0, 1, 2, 5] (shell=False, capture=True)
labeling device: '/dev/sde' with 'gpt' partition table
Running command ['sgdisk', '--clear', '/dev/sde'] with allowed return codes [0] (shell=False, capture=False)
Creating new GPT entries.
The operation has completed successfully.
Running command ['partprobe', '/dev/sde'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['blkid', '-o', 'export', '/dev/sde'] with allowed return codes [0, 2] (shell=False, capture=True)
Running command ['partprobe', '/dev/sdb'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['mdadm', '--assemble', '--scan'] with allowed return codes [0, 1, 2] (shell=False, capture=True)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
clear_holders running on '/sys/block/sdb', with holders '[]'
Running command ['sgdisk', '--zap-all', '/dev/sdb'] with allowed return codes [0, 1, 2, 5] (shell=False, capture=True)
Running command ['partprobe', '/dev/sdb'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['blkid', '-o', 'export', '/dev/sdb'] with allowed return codes [0, 2] (shell=False, capture=True)
Can't find a uuid for volume: sdb. Skipping dname.
Running command ['partprobe', '/dev/sda'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['mdadm', '--assemble', '--scan'] with allowed return codes [0, 1, 2] (shell=False, capture=True)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
clear_holders running on '/sys/block/sda', with holders '[]'
Running command ['sgdisk', '--zap-all', '/dev/sda'] with allowed return codes [0, 1, 2, 5] (shell=False, capture=True)
Running command ['partprobe', '/dev/sda'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['blkid', '-o', 'export', '/dev/sda'] with allowed return codes [0, 2] (shell=False, capture=True)
Can't find a uuid for volume: sdc. Skipping dname.
Running command ['partprobe', '/dev/sdd'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['mdadm', '--assemble', '--scan'] with allowed return codes [0, 1, 2] (shell=False, capture=True)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
clear_holders running on '/sys/block/sdd', with holders '[]'
Running command ['sgdisk', '--zap-all', '/dev/sdd'] with allowed return codes [0, 1, 2, 5] (shell=False, capture=True)
Running command ['partprobe', '/dev/sdd'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['blkid', '-o', 'export', '/dev/sdd'] with allowed return codes [0, 2] (shell=False, capture=True)
Can't find a uuid for volume: sdd. Skipping dname.
Running command ['partprobe', '/dev/sdc'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['mdadm', '--assemble', '--scan'] with allowed return codes [0, 1, 2] (shell=False, capture=True)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
clear_holders running on '/sys/block/sdc', with holders '[]'
Running command ['sgdisk', '--zap-all', '/dev/sdc'] with allowed return codes [0, 1, 2, 5] (shell=False, capture=True)
Running command ['partprobe', '/dev/sdc'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['blkid', '-o', 'export', '/dev/sdc'] with allowed return codes [0, 2] (shell=False, capture=True)
Can't find a uuid for volume: sde. Skipping dname.
Running command ['partprobe', '/dev/sde'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['partprobe', '/dev/sde'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
adding partition 'sda-part1' to disk 'sda'
Running command ['sgdisk', '--new', '1:256.0:2303.0', '--typecode=
The operation has completed successfully.
Running command ['partprobe', '/dev/sde'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['partprobe', '/dev/sde'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['dd', 'bs=512', 'if=/dev/zero', 'of=/dev/sde1'] with allowed return codes [0, 1, 2, 5] (shell=False, capture=True)
Running command ['partprobe', '/dev/sde'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['partprobe', '/dev/sde'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['blkid', '-o', 'export', '/dev/sde1'] with allowed return codes [0, 2] (shell=False, capture=True)
Running command ['partprobe', '/dev/sde'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['partprobe', '/dev/sde'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
previous partition number for 'sda-part2' found to be '1'
adding partition 'sda-part2' to disk 'sda'
Running command ['sgdisk', '--new', '2:18432:69300223', '--typecode=
Could not create partition 2 from 18432 to 69300223
Could not change partition 2's type code to 8300!
Error encountered; not saving changes.
An error occured handling 'sda-part2': ProcessExecutio
Command: ['sgdisk', '--new', '2:18432:69300223', '--typecode=
Exit code: 4
Reason: -
Stdout: ''
Stderr: ''
Traceback (most recent call last):
File "/curtin/
ret = args.func(args)
File "/curtin/
meta_
File "/curtin/
handler(
File "/curtin/
util.subp(cmd)
File "/curtin/
return _subp(*args, **kwargs)
File "/curtin/
cmd=args)
curtin.
Command: ['sgdisk', '--new', '2:18432:69300223', '--typecode=
Exit code: 4
Reason: -
Stdout: ''
Stderr: ''
Unexpected error while running command.
Command: ['sgdisk', '--new', '2:18432:69300223', '--typecode=
Exit code: 4
Reason: -
Stdout: ''
Stderr: ''
Installation failed with exception: Unexpected error while running command.
Command: ['curtin', 'block-meta', 'custom']
Exit code: 3
Reason: -
Stdout: b'Running command [\'partprobe\', \'/dev/sde\'] with allowed return codes [0, 1] (shell=False, capture=
Stderr: ''
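The two `sgdisk --new` calls in the output above use start values that look inconsistent with each other. Partition 1 ends at block 2303, yet partition 2 is asked to start at 18432, which is exactly (2303 + 1) * 8. One reading, offered here as an assumption and not as curtin's actual code, is that the end of the previous partition was taken in 512-byte units (the unit Linux sysfs uses for partition `start` and `size`) and passed to sgdisk without converting to the disk's 4096-byte logical blocks. A minimal sketch of that arithmetic follows. The function name and the sysfs values are hypothetical: the values are derived from the logged 256..2303 range, not read from the machine.

```python
# Sketch only: illustrates the unit mismatch suggested by the log.
# This is NOT curtin's implementation; names and inputs are hypothetical.

SYSFS_SECTOR = 512  # Linux sysfs reports partition start/size in 512-byte units


def next_start_in_logical_blocks(prev_start_512, prev_size_512, logical_block_size):
    """Return the first logical block after the previous partition.

    prev_start_512 and prev_size_512 are in 512-byte units, as sysfs reports them.
    """
    prev_end_bytes = (prev_start_512 + prev_size_512) * SYSFS_SECTOR
    if prev_end_bytes % logical_block_size:
        raise ValueError("previous partition does not end on a logical block boundary")
    return prev_end_bytes // logical_block_size


# Assumed sysfs values for partition 1, derived from the logged range 256..2303
# on a disk with 4096-byte logical blocks (8 sysfs units per logical block).
prev_start_512 = 256 * 8    # 2048
prev_size_512 = 2048 * 8    # 16384

unconverted = prev_start_512 + prev_size_512
converted = next_start_in_logical_blocks(prev_start_512, prev_size_512, 4096)

print(unconverted)  # 18432, the start value seen in the failing sgdisk call
print(converted)    # 2304, the block directly after partition 1
```

Under this reading, the start of partition 2 would be eight times too large while its size stays in 4k blocks, which would push the requested end (69300223) past the end of the disk and be consistent with sgdisk refusing to create the partition.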
Related branches
- Server Team CI bot: Approve (continuous-integration)
- Scott Moser: Pending requested
- Diff: 947 lines (+381/-112), 9 files modified
curtin/block/__init__.py (+15/-2)
curtin/block/mkfs.py (+17/-1)
curtin/commands/block_meta.py (+28/-10)
examples/tests/uefi_basic.yaml (+1/-1)
tests/unittests/test_block_mkfs.py (+8/-1)
tests/vmtests/__init__.py (+55/-15)
tests/vmtests/test_uefi_basic.py (+53/-0)
tools/launch (+61/-27)
tools/xkvm (+143/-55)
Changed in curtin:
importance: Undecided → High
status: New → Triaged
tags: added: sts
Changed in curtin:
status: Triaged → In Progress
Also, if the machine is in Failed Deployment, can we please get the curtin config of that node:
maas <user> node get-curtin-config <system-id>
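Since the bug title points at disks with a 4k logical block size, it may also help to record what each disk on the failing node reports. The sketch below is a suggestion and not part of MAAS or curtin: it reads the standard Linux sysfs attribute `queue/logical_block_size` for every block device. The function name and the `sys_block` parameter are made up for illustration; the parameter exists only so the function can be pointed at a different directory.

```python
# Sketch only: list the logical block size reported by each block device.
# The helper name and its parameter are hypothetical, not a MAAS/curtin API.
from pathlib import Path


def logical_block_sizes(sys_block="/sys/block"):
    """Map device name -> logical block size in bytes, read from sysfs."""
    root = Path(sys_block)
    sizes = {}
    if not root.is_dir():
        return sizes  # not a Linux host, or sysfs is not mounted
    for dev in sorted(root.iterdir()):
        attr = dev / "queue" / "logical_block_size"
        if attr.is_file():
            sizes[dev.name] = int(attr.read_text().strip())
    return sizes
```

Running `print(logical_block_sizes())` on the node (for example from the ephemeral environment, if it is reachable) would show whether sda through sde report 4096 or 512.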