Trusty Deployment w/ RAID storage config fails because Trusty images do not contain RAID kernel modules
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MAAS | Invalid | Critical | Unassigned |
maas-images | Fix Released | Medium | Scott Moser |
Bug Description
Bug Description
I have a server with 2x 2TB SATA disks that I am attempting to deploy with a software RAID 1 scheme via MAAS 1.9 RC2, installed this morning from the maas "next" PPA.
maas:
  Installed: 1.9.0~rc2+
  Candidate: 1.9.0~rc2+
  Version table:
 *** 1.9.0~rc2+
        500 http://
        100 /var/lib/
To create this schema, I deleted all LVM schema stuff from the MAAS Storage config area for my node.
Next, I selected the two disks (sda, sdb) and clicked Create RAID
I set a RAID 1 partitioned as ext4 and mounted as root.
I then attempted to deploy 15.10 from the Release stream using this schema.
This failed for some reason. It always fails when I attempt to create a RAID scheme. If I use the default LVM scheme, I can deploy this server with 15.10.
However, when I try a RAID scheme, it lays down the filesystem and the last message on the console is about generating ssh keys followed by:
cloud-init[1153]: Cloud-init v. 0.7.7 finished at Tue, 24 Nov 2015 20:12:19 +0000. Datasource DataSourceMAAS [http://
I am attaching the full dump of /var/log from the node in a failed state.
Related branches
Jeff Lane (bladernr) wrote : | #1 |
Jeff Lane (bladernr) wrote : | #2 |
Jeff Lane (bladernr) wrote : | #3 |
That second/fourth attempt also failed. So far, the ONLY way I can successfully deploy is using either the default LVM schema or the default Flat schema.
I have not attempted a more complex flat or LVM custom scheme yet; I really just wanted to set up RAID.
Blake Rouse (blake-rouse) wrote : | #4 |
Please provide the installation log from the MAAS UI for the deploying node.
Also if you could provide the output of "maas <session> node read <system_id>" for the node that would be helpful as well.
Thanks,
Blake
Changed in maas: | |
status: | New → Incomplete |
Jeff Lane (bladernr) wrote : | #5 |
Node status event - 'cloudinit' running modules for final Wed, 25 Nov. 2015 14:44:27
Node status event - 'cloudinit' config-
Node status event - 'cloudinit' running config-
Node status event - 'cloudinit' config-
Node status event - 'cloudinit' running config-
Node status event - 'cloudinit' config-phone-home ran successfully Wed, 25 Nov. 2015 14:44:26
Node status event - 'cloudinit' running config-phone-home with frequency once-per-instance Wed, 25 Nov. 2015 14:44:26
Node status event - 'cloudinit' config-
Node status event - 'cloudinit' running config-
Node status event - 'cloudinit' config-
Node status event - 'cloudinit' running config-
Node status event - 'cloudinit' running config-scripts-user with frequency once-per-instance Wed, 25 Nov. 2015 14:44:26
Node changed status - From 'Deploying' to 'Failed deployment' Wed, 25 Nov. 2015 14:44:26
Node installation - 'curtin' curtin command install Wed, 25 Nov. 2015 14:44:26
Node installation - 'curtin' configuring installed system Wed, 25 Nov. 2015 14:44:24
Node installation - 'curtin' running 'builtin' Wed, 25 Nov. 2015 14:44:24
Node installation - 'curtin' curtin command curthooks Wed, 25 Nov. 2015 14:44:24
Node installation - 'curtin' curtin command curthooks Wed, 25 Nov. 2015 14:43:33
Node installation - 'curtin' running 'builtin' Wed, 25 Nov. 2015 14:43:32
Node installation - 'curtin' configuring installed system Wed, 25 Nov. 2015 14:43:32
Node installation - 'curtin' writing install sources to disk Wed, 25 Nov. 2015 14:43:32
Node installation - 'curtin' running 'builtin' Wed, 25 Nov. 2015 14:43:32
Node installation - 'curtin' curtin command extract Wed, 25 Nov. 2015 14:43:32
Node installation - 'curtin' curtin command extract Wed, 25 Nov. 2015 14:43:18
Node installation - 'curtin' running 'builtin' Wed, 25 Nov. 2015 14:43:18
Node installation - 'curtin' writing install sources to disk Wed, 25 Nov. 2015 14:43:18
Node installation - 'curtin' configuring network Wed, 25 Nov. 2015 14:43:18
Node installation - 'curtin' running 'builtin' Wed, 25 Nov. 2015 14:43:18
Node installation - 'curtin' curtin command net-meta Wed, 25 Nov. 2015 14:43:17
Node installation - 'curtin' curtin command net-meta Wed, 25 Nov. 2015 14:43:17
Node installation - 'curtin' running 'builtin' Wed, 25 Nov. 2015 14:43:17
Node installation - 'curtin' configuring network Wed, 25 Nov. 2015 14:43:17
Node installation - 'curtin' configuring storage Wed, 25 Nov. 2015 14:43:17
Node installation - 'curtin' running 'builtin' Wed, 25 Nov. 2015 14:43:17
Node installation - 'curtin' curtin command block-meta Wed, 25 Nov. 2015 14:43:16
Node installation - 'curtin' curtin command block-meta Wed, 25 N...
Jeff Lane (bladernr) wrote : | #6 |
From maas <session> nodes list
{
    ...
    "owner": "bladernr",
    "zone": {
        "name": "Rack2"
    },
    "hostname": "supermicro.maas",
    "storage": 4000797,
    "memory": 4096,
    "status": 6,
    "routers": [],
    ...
}
Blake Rouse (blake-rouse) wrote : | #7 |
You provided the node event log. I need the node installation log which is all the way at the bottom of the page.
Jeff Lane (bladernr) wrote : | #8 |
Here is the log from a successful deployment after re-commissioning and having it reset the storage config to the default flat config:
Node status event - 'cloudinit' running modules for final Wed, 25 Nov. 2015 15:11:09
Node status event - 'cloudinit' config-
Node status event - 'cloudinit' running config-
Node status event - 'cloudinit' config-
Node status event - 'cloudinit' running config-
Node status event - 'cloudinit' config-phone-home ran successfully Wed, 25 Nov. 2015 15:11:08
Node status event - 'cloudinit' running config-phone-home with frequency once-per-instance Wed, 25 Nov. 2015 15:11:08
Node status event - 'cloudinit' config-
Node status event - 'cloudinit' running config-
Node status event - 'cloudinit' config-
Node status event - 'cloudinit' running config-
Node status event - 'cloudinit' config-scripts-user ran successfully Wed, 25 Nov. 2015 15:11:08
Node status event - 'cloudinit' running config-scripts-user with frequency once-per-instance Wed, 25 Nov. 2015 15:11:08
Node status event - 'cloudinit' config-
Node status event - 'cloudinit' running config-
Node status event - 'cloudinit' config-
Node status event - 'cloudinit' running config-
Node status event - 'cloudinit' config-
Node status event - 'cloudinit' running config-
Node status event - 'cloudinit' config-
Node status event - 'cloudinit' running config-
Node status event - 'cloudinit' config-
Node status event - 'cloudinit' running config-
Node status event - 'cloudinit' running modules for config Wed, 25 Nov. 2015 15:11:06
Node status event - 'cloudinit' config-byobu ran successfully Wed, 25 Nov. 2015 15:11:06
Node status event - 'cloudinit' running config-byobu with frequency once-per-instance Wed, 25 Nov. 2015 15:11:06
Node status event - 'cloudinit' config-runcmd ran successfully Wed, 25 Nov. 2015 15:11:06
Node status event - 'cloudinit' running config-runcmd with frequency once-per-instance Wed, 25 Nov. 2015 15:11:06
Node status event - 'cloudinit' config-
Jeff Lane (bladernr) wrote : | #9 |
{u'hwe_kernel': u'hwe-w', u'ip_addresses': [u'10.0.0.128'], u'cpu_count': 4, u'power_type': u'ipmi', u'tag_names': [], u'swap_size': None, u'owner': u'bladernr', u'macaddress_set': [{u'mac_address': u'00:30:
Jeff Lane (bladernr) wrote : | #10 |
ahhh... crap... ok. give me a few
Jeff Lane (bladernr) wrote : | #11 |
mdadm: No arrays found in config file or automatically
mdadm: No arrays found in config file or automatically
Creating new GPT entries.
The operation has completed successfully.
The operation has completed successfully.
mdadm: Unrecognised md component device - /dev/sda1
mdadm: Unrecognised md component device - /dev/sdb1
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
--2015-11-25 20:29:31-- http://
Connecting to 10.0.0.1:5248... connected.
HTTP request sent, awaiting response... 200 OK
Length: 389850048 (372M) [text/html]
Saving to: 'STDOUT'
0K ........ ........ ........ ........ ........ ........ 0% 9.60M 38s
3072K ........ ........ ........ ........ ........ ........ 1% 33.0M 25s
6144K ........ ........ ........ ........ ........ ........ 2% 39.0M 19s
9216K ........ ........ ........ ........ ........ ........ 3% 42.3M 17s
12288K ........ ........ ........ ........ ........ ........ 4% 33.1M 15s
15360K ........ ........ ........ ........ ........ ........ 4% 29.5M 15s
18432K ........ ........ ........ ........ ........ ........ 5% 28.6M 14s
21504K ........ ........ ........ ........ ........ ........ 6% 30.6M 14s
24576K ........ ........ ........ ........ ........ ........ 7% 29.3M 13s
27648K ........ ........ ........ ........ ........ ........ 8% 28.4M 13s
30720K ........ ........ ........ ........ ........ ........ 8% 30.9M 13s
33792K ........ ........ ........ ........ ........ ........ 9% 31.5M 13s
36864K ........ ........ ........ ........ ........ ........ 10% 25.7M 12s
39936K ........ ........ ........ ........ ........ ........ 11% 29.4M 12s
43008K ........ ........ ........ ........ ........ ........ 12% 27.9M 12s
46080K ........ ........ ........ ........ ........ ........ 12% 29.3M 12s
49152K ........ ........ ........ ........ ........ ........ 13% 30.6M 12s
52224K ........ ........ ........ ........ ........ ........ 14% 28.9M 12s
55296K ........ ........ ........ ........ ........ ........ 15% 29.4M 11s
58368K ........ ........ ........ ........ ........ ........ 16% 31.0M 11s
61440K ........ ........ ........ ........ ........ ........ 16% 31.4M 11s
64512K ........ ........ ........ ........ ........ ........ 17% 33.8M 11s
67584K ........ ........ ........ ........ ........ ........ 18% 27.7M 11s
70656K ........ ........ ........ ........ ........ ........ 19% 32.0M 11s
73728K ........ ........ ........ ........ ........ ........ 20% 56.9M 10s
76800K ........ ........ ........ ........ ........ ........ 20% 40.1M 10s
79872K ........ ........ ........ ........ ........ ........ 21% 39.6M 10s
82944K ........ ........ ........ ........ ........ ........ 22% 40.0M 10s
86016K ........ ........ ........ ........ ........ ........ 23% 37.5M 10s
89088K ........ ........ ........ ........ ........ ........ 24% 38.7M 9s
92160K ........ ........ ........ ........ ........ ........ 25% 46.2M 9s
95232K ........ ........ ........ ........ ........ ........ 25% 50.7M 9s
98304K ........ ........ ........ ........ ........ ........ 26% 29.9M 9s
101376K ........ ........ ........ ........ ........ ......
Jeff Lane (bladernr) wrote : | #12 |
This is on the same machine, immediately following the failure above with a recommission and using the default flat scheme:
mdadm: /dev/md/0 has been started with 2 drives.
mdadm: stopped /dev/md0
mdadm: error opening /dev/md0: No such file or directory
mdadm: stopped /dev/md0
mdadm: error opening /dev/md0: No such file or directory
--2015-11-25 20:41:27-- http://
Connecting to 10.0.0.1:5248... connected.
HTTP request sent, awaiting response... 200 OK
Length: 389850048 (372M) [text/html]
Saving to: 'STDOUT'
0K ........ ........ ........ ........ ........ ........ 0% 11.1M 33s
3072K ........ ........ ........ ........ ........ ........ 1% 30.5M 22s
6144K ........ ........ ........ ........ ........ ........ 2% 36.4M 18s
9216K ........ ........ ........ ........ ........ ........ 3% 41.9M 16s
12288K ........ ........ ........ ........ ........ ........ 4% 33.0M 15s
15360K ........ ........ ........ ........ ........ ........ 4% 32.3M 14s
18432K ........ ........ ........ ........ ........ ........ 5% 29.1M 14s
21504K ........ ........ ........ ........ ........ ........ 6% 29.9M 13s
24576K ........ ........ ........ ........ ........ ........ 7% 30.7M 13s
27648K ........ ........ ........ ........ ........ ........ 8% 30.1M 13s
30720K ........ ........ ........ ........ ........ ........ 8% 34.9M 12s
33792K ........ ........ ........ ........ ........ ........ 9% 31.0M 12s
36864K ........ ........ ........ ........ ........ ........ 10% 29.4M 12s
39936K ........ ........ ........ ........ ........ ........ 11% 29.1M 12s
43008K ........ ........ ........ ........ ........ ........ 12% 29.0M 12s
46080K ........ ........ ........ ........ ........ ........ 12% 29.0M 11s
49152K ........ ........ ........ ........ ........ ........ 13% 26.4M 11s
52224K ........ ........ ........ ........ ........ ........ 14% 29.5M 11s
55296K ........ ........ ........ ........ ........ ........ 15% 29.9M 11s
58368K ........ ........ ........ ........ ........ ........ 16% 30.8M 11s
61440K ........ ........ ........ ........ ........ ........ 16% 31.9M 11s
64512K ........ ........ ........ ........ ........ ........ 17% 33.0M 11s
67584K ........ ........ ........ ........ ........ ........ 18% 25.1M 11s
70656K ........ ........ ........ ........ ........ ........ 19% 30.5M 10s
73728K ........ ........ ........ ........ ........ ........ 20% 47.5M 10s
76800K ........ ........ ........ ........ ........ ........ 20% 45.4M 10s
79872K ........ ........ ........ ........ ........ ........ 21% 40.6M 10s
82944K ........ ........ ........ ........ ........ ........ 22% 40.3M 10s
86016K ........ ........ ........ ........ ........ ........ 23% 36.1M 9s
89088K ........ ........ ........ ........ ........ ........ 24% 35.8M 9s
92160K ........ ........ ........ ........ ........ ........ 25% 43.5M 9s
95232K ........ ........ ........ ........ ........ ........ 25% 44.9M 9s
98304K ........ ........ ........ ........ ........ ........ 26% 48.5M 9s
101376K ........ ........ ........ ........ ........ ........ 27% 39.8M 9s
104448K ........ ........ ........ ...........
Changed in maas: | |
status: | Incomplete → New |
Blake Rouse (blake-rouse) wrote : | #13 |
Looks to be an issue with curtin and apt-get installing grub and choking on /dev/md0.
Changed in maas: | |
status: | New → Triaged |
importance: | Undecided → Critical |
Changed in curtin: | |
importance: | Undecided → Critical |
Changed in maas: | |
milestone: | none → 1.9.0 |
Blake Rouse (blake-rouse) wrote : | #14 |
Since this is a curtin issue, could you please provide the curtin config for the node? Once the node has failed the deployment and is in the failed state, please provide the output of the following:
maas <session> node get-curtin-config <system-id>
Also have you tried other Ubuntu releases or only wily?
Christian Ehrhardt (paelzer) wrote : | #15 |
I've successfully run mdadm with 15.10 on raid-10, raid-5 and raid-6 as / devices - I doubt raid-1 would be any different.
Blake Rouse - can you guide Jeff to provide a full install log with "-vvv" as we currently use it in the other bug we are working on together?
That would be great.
That, together with the config Jeff already provided in comment #6, should help me recreate the issue and debug it.
Changed in curtin: | |
status: | New → Triaged |
Christian Ehrhardt (paelzer) wrote : | #16 |
I checked it once more; we have had raid level 1 tested all along.
I quickly also added Trusty just to be sure, but T/V/W are all working in my case - so we have to find the difference in your definition.
I realized the config listed above in comment #6 is a bit short; I didn't see that this morning.
Blake - could you also guide him to get and upload a yaml config of his case?
As a reference here my test yaml http://
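For readers following along, a minimal curtin storage section for a two-disk RAID 1 root might look like the sketch below. This is an illustrative assumption written for this writeup, not Jeff's actual config or Christian's test yaml; device names, sizes, and IDs are made up, and only the overall disk → partition → raid → format → mount shape follows curtin's storage config documentation.

```yaml
# Sketch: two-disk software RAID 1 root for curtin (illustrative only).
storage:
  version: 1
  config:
    - id: sda
      type: disk
      path: /dev/sda
      ptable: gpt
      wipe: superblock
      grub_device: true
    - id: sdb
      type: disk
      path: /dev/sdb
      ptable: gpt
      wipe: superblock
    - id: sda-part1
      type: partition
      device: sda
      number: 1
      size: 1998G
    - id: sdb-part1
      type: partition
      device: sdb
      number: 1
      size: 1998G
    - id: md0
      type: raid
      name: md0
      raidlevel: 1
      devices: [sda-part1, sdb-part1]
    - id: md0-fs
      type: format
      fstype: ext4
      volume: md0
    - id: md0-mount
      type: mount
      device: md0-fs
      path: /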
Jeff Lane (bladernr) wrote : | #17 |
Blake:
Machine-readable output follows:
apt_mirrors:
ubuntu_archive: http://
ubuntu_security: http://
apt_proxy: http://
debconf_selections:
maas: 'cloud-init cloud-init/
cloud-init cloud-init/
cloud-init cloud-init/
cloud-init cloud-init/
true\
true\
GBLKfyTX7Ey
webhook}
{primary: ''http://
[''http://
arches: [default]\n failsafe: {primary: ''http://
security: ''http://
[''http://
'
install:
log_file: /tmp/install.log
post_files:
- /tmp/install.log
kernel:
mapping: {}
package: linux-generic
late_commands:
maas:
- wget
- --no-proxy
- http://
- --post-data
- op=netboot_off
- -O
- /dev/null
network:
config:
- id: eth0
mac_address: 00:30:48:65:5e:0c
mtu: 1500
name: eth0
subnets:
- address: 10.0.0.128/24
dns_
gateway: 10.0.0.1
type: static
type: physical
- id: eth1
mac_address: 00:30:48:65:5e:0d
mtu: 1500
name: eth1
subnets:
- type: manual
type: physical
- address: 10.0.0.1
search:
- maas
type: nameserver
version: 1
network_commands:
builtin:
- curtin
- net-meta
- custom
partitioning_
builtin:
- curtin
- block-meta
- custom
power_state:
mode: reboot
reporting:
maas:
consumer_key: NfUA6K5QhdM3uuwkGg
endpoint: http://
token_key: GBLKfyTX7EynSfXHzf
token_secret: NGzpKe8ynYqDB9N
type: webhook
storage:
config:
- grub_device: true
id: sda
model: WDC WD2004FBYZ-0
name: sda
ptable: msdos
serial: WD-WMC6N0D4Y76V
type: disk
wipe: superblock
- id: sdb
model: WDC WD2004FBYZ-0
name: sdb
ptable: gpt
serial: WD-WMC6N0D8AS39
type: disk
wipe: superblock
- device: sda
id: sda-part1
name: sda-part1
number: 1
offset: 4194304B
size: 2000393601024B
ty...
Jeff Lane (bladernr) wrote : | #18 |
Blake, I've now tried with Trusty, Vivid and Wily from Releases using the exact same config, and it fails on every one.
Jeff Lane (bladernr) wrote : | #19 |
If you could tell me how to get logs using -vvv (Christian's suggestion) I'll do that and post them to the bug. I'm also going to re-try with raid0.
Jeff Lane (bladernr) wrote : | #20 |
Also tried RAID0 with wily and that too failed.
Machine-readable output follows:
apt_mirrors:
ubuntu_archive: http://
ubuntu_security: http://
apt_proxy: http://
debconf_selections:
maas: 'cloud-init cloud-init/
cloud-init cloud-init/
cloud-init cloud-init/
cloud-init cloud-init/
true\
true\
BXbzu9PEvBB
webhook}
{primary: ''http://
[''http://
arches: [default]\n failsafe: {primary: ''http://
security: ''http://
[''http://
'
install:
log_file: /tmp/install.log
post_files:
- /tmp/install.log
kernel:
mapping: {}
package: linux-generic
late_commands:
maas:
- wget
- --no-proxy
- http://
- --post-data
- op=netboot_off
- -O
- /dev/null
network:
config:
- id: eth0
mac_address: 00:30:48:65:5e:0c
mtu: 1500
name: eth0
subnets:
- address: 10.0.0.128/24
dns_
gateway: 10.0.0.1
type: static
type: physical
- id: eth1
mac_address: 00:30:48:65:5e:0d
mtu: 1500
name: eth1
subnets:
- type: manual
type: physical
- address: 10.0.0.1
search:
- maas
type: nameserver
version: 1
network_commands:
builtin:
- curtin
- net-meta
- custom
partitioning_
builtin:
- curtin
- block-meta
- custom
power_state:
mode: reboot
reporting:
maas:
consumer_key: fRkhnNk3kzQgUQvJ7s
endpoint: http://
token_key: BXbzu9PEvBBb4mDdLU
token_secret: 8SquwnRT8ptAaJc
type: webhook
storage:
config:
- grub_device: true
id: sda
model: WDC WD2004FBYZ-0
name: sda
ptable: msdos
serial: WD-WMC6N0D4Y76V
type: disk
wipe: superblock
- id: sdb
model: WDC WD2004FBYZ-0
name: sdb
ptable: gpt
serial: WD-WMC6N0D8AS39
type: disk
wipe: superblock
- device: sda
id: sda-part1
name: sda-part1
number: 1
offset: ...
Blake Rouse (blake-rouse) wrote : | #21 |
Jeff,
maas admin maas set-config name=curtin_verbose value=True
This will enable verbose output; you will see that the installation log contains a lot more information.
Jeff Lane (bladernr) wrote : | #22 |
Here's the log from the UI from another failed deployment after turning on verbose per your comment #21
Running command ['partprobe', '/dev/sda'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['mdadm', '--assemble', '--scan'] with allowed return codes [0, 1, 2] (shell=False, capture=False)
mdadm: /dev/md/0 has been started with 2 drives.
clear_holders running on '/sys/block/
clear_holders running on '/sys/devices/
stopping: /dev/md0
Running command ['mdadm', '--stop', '/dev/md0'] with allowed return codes [0, 1] (shell=False, capture=False)
mdadm: stopped /dev/md0
Running command ['mdadm', '--remove', '/dev/md0'] with allowed return codes [0, 1] (shell=False, capture=False)
mdadm: error opening /dev/md0: No such file or directory
Running command ['sgdisk', '--zap-all', '/dev/sda1'] with allowed return codes [0, 1, 2, 5] (shell=False, capture=True)
clear_holders running on '/sys/block/sda', with holders '[]'
Running command ['sgdisk', '--zap-all', '/dev/sda'] with allowed return codes [0, 1, 2, 5] (shell=False, capture=True)
labeling device: '/dev/sda' with 'msdos' partition table
Running command ['parted', '/dev/sda', '--script', 'mklabel', 'msdos'] with allowed return codes [0] (shell=False, capture=False)
Running command ['partprobe', '/dev/sda'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['blkid', '-o', 'export', '/dev/sda'] with allowed return codes [0, 2] (shell=False, capture=True)
Running command ['partprobe', '/dev/sdb'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['mdadm', '--assemble', '--scan'] with allowed return codes [0, 1, 2] (shell=False, capture=False)
clear_holders running on '/sys/block/
clear_holders running on '/sys/devices/
stopping: /dev/md0
Running command ['mdadm', '--stop', '/dev/md0'] with allowed return codes [0, 1] (shell=False, capture=False)
mdadm: stopped /dev/md0
Running command ['mdadm', '--remove', '/dev/md0'] with allowed return codes [0, 1] (shell=False, capture=False)
mdadm: error opening /dev/md0: No such file or directory
Running command ['sgdisk', '--zap-all', '/dev/sdb1'] with allowed return codes [0, 1, 2, 5] (shell=False, capture=True)
clear_holders running on '/sys/block/sdb', with holders '[]'
Running command ['sgdisk', '--zap-all', '/dev/sdb'] with allowed return codes [0, 1, 2, 5] (shell=False, capture=True)
labeling device: '/dev/sdb' with 'gpt' partition table
Running command ['sgdisk', '--clear', '/dev/sdb'] with allowed return codes [0] (shell=False, capture=False)
Creating new GPT entries.
The operation has completed successfully.
Running command ['partprobe', '/dev/sdb'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle...
summary: |
- Deployment always fails when creating a RAID storage config |
+ Deployment always fails when creating a custom storage config |
Jeff Lane (bladernr) wrote : Re: Deployment always fails when creating a custom storage config | #23 |
- flat-partition-failure-node-logs.tgz Edit (64.0 KiB, application/x-tar)
Ok, New data point... I got another failure with a customized flat config. The following is the current custom flat config I've just attempted:
File systems

Name | Size | Mountpoint | File system
---|---|---|---
sda-part1 | 2.0 TB | /windows | vfat
sdb-part1 | 2.0 TB | / | ext4

Used disks and partitions

Name | Device type | Used for
---|---|---
sda | Physical | MBR partitioned with 1 partition
sda-part1 | Partition | vfat formatted filesystem mounted at /windows
sdb | Physical | GPT partitioned with 1 partition
sdb-part1 | Partition | ext4 formatted filesystem mounted at /
Unfortunately, there was no install log in the web UI; I will try to recreate this and see if I can get one to appear. In the meantime, I have attached the contents of /var/log from this new failure case.
Jeff Lane (bladernr) wrote : | #24 |
I tried three more times but got no install info in the web UI from this fail case :(
Andres Rodriguez (andreserl) wrote : | #25 |
I wonder if the reason is related to creating such large partitions?
sda-part1 2.0 TB /windows vfat
sdb-part1 2.0 TB / ext4
Andres Rodriguez (andreserl) wrote : | #26 |
Or using multiple disks?
Jeff Lane (bladernr) wrote : | #27 |
Tried again with the latest 1.9 bits from maas/proposed.
Attempted a RAID0 once more after zeroing the two disks manually, then attempting to deploy. This is the install log from the maas UI:
Running command ['partprobe', '/dev/sda'] with allowed return codes [0, 1] (shell=False, capture=False)
Error: /dev/sda: unrecognised disk label
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['mdadm', '--assemble', '--scan'] with allowed return codes [0, 1, 2] (shell=False, capture=False)
mdadm: No arrays found in config file or automatically
clear_holders running on '/sys/block/sda', with holders '[]'
Running command ['sgdisk', '--zap-all', '/dev/sda'] with allowed return codes [0, 1, 2, 5] (shell=False, capture=True)
labeling device: '/dev/sda' with 'msdos' partition table
Running command ['parted', '/dev/sda', '--script', 'mklabel', 'msdos'] with allowed return codes [0] (shell=False, capture=False)
Running command ['partprobe', '/dev/sda'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['blkid', '-o', 'export', '/dev/sda'] with allowed return codes [0, 2] (shell=False, capture=True)
Can't find a uuid for volume: sda. Skipping dname.
Running command ['partprobe', '/dev/sdb'] with allowed return codes [0, 1] (shell=False, capture=False)
Error: /dev/sdb: unrecognised disk label
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['mdadm', '--assemble', '--scan'] with allowed return codes [0, 1, 2] (shell=False, capture=False)
mdadm: No arrays found in config file or automatically
clear_holders running on '/sys/block/sdb', with holders '[]'
Running command ['sgdisk', '--zap-all', '/dev/sdb'] with allowed return codes [0, 1, 2, 5] (shell=False, capture=True)
labeling device: '/dev/sdb' with 'gpt' partition table
Running command ['sgdisk', '--clear', '/dev/sdb'] with allowed return codes [0] (shell=False, capture=False)
Creating new GPT entries.
The operation has completed successfully.
Running command ['partprobe', '/dev/sdb'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['blkid', '-o', 'export', '/dev/sdb'] with allowed return codes [0, 2] (shell=False, capture=True)
Can't find a uuid for volume: sdb. Skipping dname.
Running command ['partprobe', '/dev/sda'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['partprobe', '/dev/sda'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
adding partition 'sda-part1' to disk 'sda'
Running command ['parted', '/dev/sda', '--script', 'mkpart', 'primary', '2048s', '3907020799s'] with allowed return codes [0] (shell=False, capture=False)
Running command ['partprobe', '/dev/sda'] with allowed return codes ...
Jeff Lane (bladernr) wrote : | #28 |
bah, truncated, here it is from the traceback down:
Traceback (most recent call last):
File "/curtin/
ret = args.func(args)
File "curtin/
meta_
File "curtin/
handler(
File "curtin/
util.subp(" ".join(cmd), shell=True)
File "curtin/util.py", line 99, in subp
return _subp(*args, **kwargs)
File "curtin/util.py", line 70, in _subp
cmd=args)
ProcessExecutio
Command: mdadm --create /dev/md0 --run --level=0 --raid-devices=2 /dev/sda1 /dev/sdb1
Exit code: 1
Reason: -
Stdout: ''
Stderr: ''
Unexpected error while running command.
Command: mdadm --create /dev/md0 --run --level=0 --raid-devices=2 /dev/sda1 /dev/sdb1
Exit code: 1
Reason: -
Stdout: ''
Stderr: ''
Installation failed with exception: Unexpected error while running command.
Command: ['curtin', 'block-meta', 'custom']
Exit code: 3
Reason: -
Jeff Lane (bladernr) wrote : | #29 |
I then ssh'd to the node and attempted to manually build the array using the same command curtin was using. (Note: I added -v to see if I could get some additional info from the failure.)
ubuntu@
sudo: unable to resolve host supermicro
mdadm: chunk size defaults to 512K
mdadm: /dev/sda1 appears to be part of a raid array:
level=raid0 devices=2 ctime=Thu Jan 14 19:29:50 2016
mdadm: /dev/sdb1 appears to be part of a raid array:
level=raid0 devices=2 ctime=Thu Jan 14 19:29:50 2016
mdadm: creation continuing despite oddities due to --run
mdadm: Defaulting to version 1.2 metadata
mdadm: RUN_ARRAY failed: Invalid argument
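The "RUN_ARRAY failed: Invalid argument" error is consistent with the bug title: the kernel refuses to start an array whose personality (raid0 here) it has no module for, and mdadm surfaces the EINVAL from that ioctl. The levels the running kernel actually supports are listed on the "Personalities" line of /proc/mdstat. The following is a small diagnostic sketch written for this writeup (not part of curtin or mdadm) that parses that line:

```python
def supported_raid_levels(mdstat_text):
    """Return the md personalities listed in /proc/mdstat text.

    The kernel only lists personalities whose modules are loaded;
    asking mdadm to start an array of any other level fails with
    "RUN_ARRAY failed: Invalid argument" (EINVAL), as seen above.
    """
    for line in mdstat_text.splitlines():
        if line.startswith("Personalities"):
            # e.g. "Personalities : [raid1] [raid10]"
            return {tok.strip("[]") for tok in line.split(":", 1)[1].split()}
    return set()
```

On the failing node, `"raid0" not in supported_raid_levels(open("/proc/mdstat").read())` would confirm the missing-module theory; `modprobe raid0` failing in the ephemeral environment would confirm the module is absent from the image.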
Jeff Lane (bladernr) wrote : | #30 |
Interestingly, I then shut the node down, removed the RAID0 scheme, and replaced it with a custom flat scheme. This time, not only was the deployment successful, BUT it appears that the installer assembled my OLD devices as a software RAID0 before clearing them:
Running command ['partprobe', '/dev/sda'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['mdadm', '--assemble', '--scan'] with allowed return codes [0, 1, 2] (shell=False, capture=False)
mdadm: failed to RUN_ARRAY /dev/md/0: Invalid argument
clear_holders running on '/sys/block/
clear_holders running on '/sys/devices/
stopping: /dev/md0
Running command ['mdadm', '--stop', '/dev/md0'] with allowed return codes [0, 1] (shell=False, capture=False)
mdadm: stopped /dev/md0
Running command ['mdadm', '--remove', '/dev/md0'] with allowed return codes [0, 1] (shell=False, capture=False)
mdadm: error opening /dev/md0: No such file or directory
Running command ['sgdisk', '--zap-all', '/dev/sda1'] with allowed return codes [0, 1, 2, 5] (shell=False, capture=True)
clear_holders running on '/sys/block/sda', with holders '[]'
Running command ['sgdisk', '--zap-all', '/dev/sda'] with allowed return codes [0, 1, 2, 5] (shell=False, capture=True)
labeling device: '/dev/sda' with 'msdos' partition table
Running command ['parted', '/dev/sda', '--script', 'mklabel', 'msdos'] with allowed return codes [0] (shell=False, capture=False)
Running command ['partprobe', '/dev/sda'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['blkid', '-o', 'export', '/dev/sda'] with allowed return codes [0, 2] (shell=False, capture=True)
Can't find a uuid for volume: sda. Skipping dname.
Running command ['partprobe', '/dev/sdb'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
Running command ['mdadm', '--assemble', '--scan'] with allowed return codes [0, 1, 2] (shell=False, capture=False)
mdadm: /dev/md/0 assembled from 1 drive - not enough to start the array.
clear_holders running on '/sys/block/
clear_holders running on '/sys/devices/
stopping: /dev/md0
Running command ['mdadm', '--stop', '/dev/md0'] with allowed return codes [0, 1] (shell=False, capture=False)
mdadm: stopped /dev/md0
Running command ['mdadm', '--remove', '/dev/md0'] with allowed return codes [0, 1] (shell=False, capture=False)
mdadm: error opening /dev/md0: No such file or directory
So the raid members are being created successfully, but for whatever reason, they can't be used when I actually WANT a RAID setup.
Ryan Harper (raharper) wrote : Re: [Bug 1519470] Re: Deployment always fails when creating a custom storage config | #32 |
On Thu, Jan 14, 2016 at 1:34 PM, Jeff Lane <email address hidden>
wrote:
> I then ssh'd to the node and attempted to manually build the array using
> the same command curtin was using: (Note I added -v to see if I could
> get some additional info from the failure)
>
> ubuntu@
> --raid-devices=2 /dev/sda1 /dev/sdb1
> sudo: unable to resolve host supermicro
> mdadm: chunk size defaults to 512K
> mdadm: /dev/sda1 appears to be part of a raid array:
> level=raid0 devices=2 ctime=Thu Jan 14 19:29:50 2016
> mdadm: /dev/sdb1 appears to be part of a raid array:
> level=raid0 devices=2 ctime=Thu Jan 14 19:29:50 2016
> mdadm: creation continuing despite oddities due to --run
> mdadm: Defaulting to version 1.2 metadata
> mdadm: RUN_ARRAY failed: Invalid argument
>
Manually, we really need to run:
1. mdadm --stop /dev/md0; mdadm --zero-superblock /dev/sda1; mdadm --zero-superblock /dev/sdb1
2. then re-run the create; and optionally run it without --run.
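The reset-and-recreate sequence above can be sketched as a dry-run helper. This is only an illustration: the device names are the ones from this report, and the function prints the destructive mdadm commands instead of executing them, since --zero-superblock wipes data on the named members.

```shell
# Dry-run sketch of the reset-and-recreate sequence described above.
# Echoes the mdadm commands rather than running them.
reset_and_recreate() {
    md="$1"; shift            # array device, e.g. /dev/md0
    echo "mdadm --stop $md"
    for member in "$@"; do    # member partitions, e.g. /dev/sda1 /dev/sdb1
        echo "mdadm --zero-superblock $member"
    done
    echo "mdadm --create $md --level=0 --raid-devices=$# $*"
}

reset_and_recreate /dev/md0 /dev/sda1 /dev/sdb1
```

Remove the echo wrappers only once you are sure the member devices are the ones you intend to wipe.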
Jeff Lane (bladernr) wrote : | #31 |
In any case, the flat filesystem is successful now. But software RAID is broken still.
Ryan Harper (raharper) wrote : | #33 |
Also can you confirm raid module is loaded?
On Thu, Jan 14, 2016 at 1:34 PM, Jeff Lane <email address hidden>
wrote:
> I then ssh'd to the node and attempted to manually build the array using
> the same command curtin was using: (Note I added -v to see if I could
> get some additional info from the failure)
>
> ubuntu@
> --raid-devices=2 /dev/sda1 /dev/sdb1
> sudo: unable to resolve host supermicro
> mdadm: chunk size defaults to 512K
> mdadm: /dev/sda1 appears to be part of a raid array:
> level=raid0 devices=2 ctime=Thu Jan 14 19:29:50 2016
> mdadm: /dev/sdb1 appears to be part of a raid array:
> level=raid0 devices=2 ctime=Thu Jan 14 19:29:50 2016
> mdadm: creation continuing despite oddities due to --run
> mdadm: Defaulting to version 1.2 metadata
> mdadm: RUN_ARRAY failed: Invalid argument
>
> --
> You received this bug notification because you are subscribed to curtin.
> Matching subscriptions: curtin-bugs-all
> https:/
>
> Title:
> Deployment always fails when creating a custom storage config
>
> To manage notifications about this bug go to:
> https:/
>
Jeff Lane (bladernr) wrote : Re: Deployment always fails when creating a custom storage config | #34 |
Ok, I did as you suggested and it still fails:
root@supermicro:~# mdadm --stop /dev/md0; mdadm --zero-superblock /dev/sda1; mdadm --zero-superblock /dev/sdb1
mdadm: error opening /dev/md0: No such file or directory
mdadm: Unrecognised md component device - /dev/sda1
root@supermicro:~# mdadm -v -v --create /dev/md0 --run --level=0 --raid-devices=2 /dev/sda1 /dev/sdb1
mdadm: chunk size defaults to 512K
mdadm: Defaulting to version 1.2 metadata
mdadm: RUN_ARRAY failed: Invalid argument
FWIW, this is the version of mdadm used by the ephemeral image:
root@supermicro:~# mdadm --version
mdadm - v3.2.5 - 18th May 2012
and finally, I do not believe the RAID module is actually loaded.
root@supermicro:~# lsmod
Module Size Used by
dm_crypt 24576 0
overlay 45056 1
iscsi_tcp 20480 2
libiscsi_tcp 28672 1 iscsi_tcp
libiscsi 57344 2 libiscsi_
scsi_transport_
hid_logitech_dj 20480 0
hid_generic 16384 0
i2c_algo_bit 16384 0
ttm 94208 0
drm_kms_helper 126976 0
psmouse 114688 0
drm 344064 2 ttm,drm_kms_helper
ahci 36864 0
libahci 32768 1 ahci
pata_acpi 16384 0
usbhid 53248 0
e1000e 237568 0
hid 110592 3 hid_generic,
ptp 20480 1 e1000e
pps_core 20480 1 ptp
And finally, THAT seems to be the root cause. There are no software RAID modules that I can find on the ephemeral:
root@supermicro
bcache dm-crypt.ko
Compared to my desktop running Trusty:
bladernr@
bcache dm-crypt.ko dm-multipath.ko dm-snapshot.ko linear.ko raid456.ko
dm-bio-prison.ko dm-delay.ko dm-queue-length.ko dm-switch.ko multipath.ko
dm-bufio.ko dm-flakey.ko dm-raid.ko dm-thin-pool.ko persistent-data
dm-cache-cleaner.ko dm-log.ko dm-region-hash.ko dm-verity.ko raid0.ko
dm-cache.ko dm-log-userspace.ko dm-round-robin.ko dm-zero.ko raid10.ko
dm-cache-mq.ko dm-mirror.ko dm-service-time.ko faulty.ko raid1.ko
but they DO exist on the image mounted on the tmpfs:
from the output of mount:
/dev/sdc on /media/root-ro type ext4 (ro)
root@supermicro
bcache dm-crypt.ko dm-multipath.ko dm-snapshot.ko linear.ko raid456.ko
dm-bio-prison.ko dm-delay.ko dm-queue-length.ko dm-switch.ko multipath.ko
dm-bufio.ko dm-flakey.ko dm-raid.ko dm-thin-pool.ko persistent-data
dm-cache-cleaner.ko dm-log.ko dm-region-hash.ko dm-verity.ko raid0.ko
dm-cache.ko dm-log-userspace.ko dm-round-robin.ko dm-zero.ko raid10.ko
dm-cache-mq.ko dm-mirror.ko dm-service-time.ko faulty.ko raid1.ko
But t...
Ryan Harper (raharper) wrote : Re: [Bug 1519470] Re: Deployment always fails when creating a custom storage config | #35 |
On Thu, Jan 14, 2016 at 3:27 PM, Jeff Lane <email address hidden>
wrote:
> Ok, I did as you suggested and it still fails:
> root@supermicro:~# mdadm --stop /dev/md0; mdadm --zero-superblock
> /dev/sda1; mdadm --zero-superblock /dev/sdb1
> mdadm: error opening /dev/md0: No such file or directory
> mdadm: Unrecognised md component device - /dev/sda1
> root@supermicro:~# mdadm -v -v --create /dev/md0 --run --level=0
> --raid-devices=2 /dev/sda1 /dev/sdb1
> mdadm: chunk size defaults to 512K
> mdadm: Defaulting to version 1.2 metadata
> mdadm: RUN_ARRAY failed: Invalid argument
>
> FWIW, this is the version of mdadm used by the ephemeral
> root@supermicro:~# mdadm --version
> mdadm - v3.2.5 - 18th May 2012
>
> and finally, I do not believe the RAID module is actually loaded.
>
> root@supermicro:~# lsmod
> Module Size Used by
> dm_crypt 24576 0
> overlay 45056 1
> iscsi_tcp 20480 2
> libiscsi_tcp 28672 1 iscsi_tcp
> libiscsi 57344 2 libiscsi_
> scsi_transport_
> hid_logitech_dj 20480 0
> hid_generic 16384 0
> i2c_algo_bit 16384 0
> ttm 94208 0
> drm_kms_helper 126976 0
> psmouse 114688 0
> drm 344064 2 ttm,drm_kms_helper
> ahci 36864 0
> libahci 32768 1 ahci
> pata_acpi 16384 0
> usbhid 53248 0
> e1000e 237568 0
> hid 110592 3 hid_generic,
> ptp 20480 1 e1000e
> pps_core 20480 1 ptp
>
>
> And finally, THAT seems to be the root cause. There are no software RAID
> modules that I can find on the ephemeral:
>
> root@supermicro
> bcache dm-crypt.ko
>
> Compared to my desktop running Trusty:
> bladernr@
> bcache dm-crypt.ko dm-multipath.ko
> dm-snapshot.ko linear.ko raid456.ko
> dm-bio-prison.ko dm-delay.ko dm-queue-length.ko
> dm-switch.ko multipath.ko
> dm-bufio.ko dm-flakey.ko dm-raid.ko
> dm-thin-pool.ko persistent-data
> dm-cache-cleaner.ko dm-log.ko dm-region-hash.ko
> dm-verity.ko raid0.ko
> dm-cache.ko dm-log-userspace.ko dm-round-robin.ko dm-zero.ko
> raid10.ko
> dm-cache-mq.ko dm-mirror.ko dm-service-time.ko faulty.ko
> raid1.ko
>
> but they DO exist on the image mounted on the tmpfs:
>
> from the output of mount:
> /dev/sdc on /media/root-ro type ext4 (ro)
>
> root@supermicro
> ls
> bcache dm-crypt.ko dm-multipath.ko
> dm-snapshot.ko linear.ko raid456.ko
> dm-bio-prison.ko dm-delay.ko dm-queue-length.ko
> dm-switch.ko multipath.ko
> dm-bufio.ko dm-flakey.ko dm-raid.ko
> dm-thin-pool.ko persistent-data
> dm-cache-cleaner.ko dm-log.ko dm-region-hash.ko
> dm-verity.ko raid0.ko
> ...
Jeff Lane (bladernr) wrote : | #36 |
On Thu, Jan 14, 2016 at 6:02 PM, Ryan Harper <email address hidden> wrote:
> If you reset like I suggested and then modprobe raid0, can you re-run the
> create command successfully?
No, as I said, the image that is running during deployment has no RAID
modules in /lib/modules.
> And finally, THAT seems to be the root cause. There are no software RAID
> modules that I can find on the ephemeral:
>
> root@supermicro
> bcache dm-crypt.ko
>
The ONLY contents of the md directory is for bcache and the dm-crypt
module. The raidX.ko modules are completely missing. So I think this
is an image problem, RAID is probably OK.
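A quick way to confirm this diagnosis on any booted environment is to count the raid*.ko files under the kernel's module tree. A sketch: the default path follows the /lib/modules/&lt;ver&gt;/kernel/drivers/md layout shown in the listings above, and the optional directory argument is only there for illustration.

```shell
# Report whether software-RAID modules exist under a module directory.
# With no argument, checks the running kernel's drivers/md tree.
check_raid_modules() {
    moddir="${1:-/lib/modules/$(uname -r)/kernel/drivers/md}"
    found=$(ls "$moddir"/raid*.ko 2>/dev/null | wc -l)
    if [ "$found" -gt 0 ]; then
        echo "raid modules present: $found"
    else
        echo "raid modules missing"
    fi
}
```

On the broken Trusty ephemeral this would report the modules missing; on a stock desktop install it would find raid0/raid1/raid10/raid456.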
matthew F (matthew-f1989) wrote : Re: Deployment always fails when creating a custom storage config | #37 |
Jeff, if you have resolved this problem I would be curious to know how you did it. I've been having exactly the same problem, and while I really want a RAID configuration I've been forced to go without one because I can't get past this problem.
Jeff Lane (bladernr) wrote : | #38 |
Update:
I just saw Matthew F's comment and decided to retry. First, I deployed with a RAID0 config using today's Xenial from the daily image stream. This seems to have successfully installed, AND the raid modules are present in the deployment ephemeral:
ubuntu@Y-Wing:~$ lsmod |grep raid
raid10 49152 0
raid456 98304 0
async_raid6_recov 20480 1 raid456
async_memcpy 16384 2 raid456,
async_pq 16384 2 raid456,
async_xor 16384 3 async_pq,
async_tx 16384 5 async_pq,
raid6_pq 102400 4 async_pq,
raid1 36864 0
raid0 20480 1
ubuntu@Y-Wing:~$ uname -a
Linux Y-Wing 4.3.0-7-generic #18-Ubuntu SMP Tue Jan 19 15:46:45 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
** NOTE that I am also using a custom-built version of curtin from trunk because of bug 1533846; I'm not sure whether the fix for the Xenial deployment issue has landed in the regular curtin packaging yet.
ubuntu@Y-Wing:~$ sudo mdadm --misc --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Tue Jan 26 19:14:47 2016
Raid Level : raid0
Array Size : 3906756608 (3725.77 GiB 4000.52 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Update Time : Tue Jan 26 19:14:47 2016
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Chunk Size : 512K
Name : Y-Wing:0 (local to host Y-Wing)
UUID : b5f181f4:
Events : 0
Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
1 8 17 1 active sync /dev/sdb1
Jeff Lane (bladernr) wrote : | #39 |
Update part 2:
I repeated this with the other images I have on my MAAS server from the daily stream:
Wily: Successfully installed
Trusty: Failed, there are no RAID modules available in the Trusty ephemeral
http://
Precise: Successfully installed.
Verdict then is that the issue is specific to the Trusty image (at least of the ones I tested).
Jeff Lane (bladernr) wrote : | #40 |
Matthew F:
You'll need to do the following to get an installable system:
Try Wily or Precise instead of Trusty for now.
Try Xenial. To install Xenial, for the moment, you'll need to grab the trunk for curtin:
bzr branch lp:curtin
cd <path to curtin trunk local branch>
./tools/build-deb
then copy the debs to your MAAS server and
sudo dpkg -i <debs>
then you should be able to successfully deploy Xenial images.
Note I'm using images from the Daily stream, not the Release stream to test all this. YMMV. (I think I remembered everything necessary above)
summary: |
- Deployment always fails when creating a custom storage config
+ Trusty Deployment w/ RAID storage config fails because Trusty images do not contain RAID kernel modules |
Changed in maas-images: | |
status: | New → Confirmed |
Changed in curtin: | |
status: | Triaged → Invalid |
Jeff Lane (bladernr) wrote : | #41 |
Marking curtin invalid; this is not a curtin issue. Leaving MAAS for now, but that's probably invalid as well. Added the maas-images project because this is a broken image.
Changed in maas: | |
status: | Triaged → Invalid |
no longer affects: | curtin |
Changed in maas-images: | |
importance: | Undecided → Medium |
assignee: | nobody → Scott Moser (smoser) |
status: | Confirmed → Fix Committed |
Scott Moser (smoser) wrote : | #42 |
- initramfs lists compared from old to new (5.7 KiB, text/plain)
Hi.
I'm pretty sure that the issue here is that you were installing hwe-w (or any hwe-* would have done the same).
Trusty's ephemeral image contains the 'linux-generic' package (hwe-t) installed.
So when we boot the ephemeral image with a hwe-t kernel, it boots and finds all the modules that would be available if linux-generic were installed.
However, when you boot into the ephemeral environment with any hwe-* kernel, you have only the modules available that are inside the initramfs. Generally speaking, we've attempted to make the initramfs "fat" and include anything we might need.
The change I made to maas-images was to install mdadm (and also lvm2) into the environment in which the 'boot-initrd' initramfs is generated. The result is that the mdadm and lvm2 initramfs hooks add the modules they think are necessary.
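One way to check that those hooks did their job is to list the generated initrd and look for the raid modules. A sketch, assuming initramfs-tools' lsinitramfs is available to produce the listing; the filter itself works on any saved file list.

```shell
# Count raid-related module entries in an initramfs file listing,
# e.g. one saved with:  lsinitramfs boot-initrd > listing.txt
count_raid_entries() {
    grep -cE '(raid[0-9]+|dm-raid)\.ko' "$1"
}
```

A nonzero count means the mdadm hook pulled the software-RAID modules into the initrd.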
I'm attaching the changes that each package added.
Scott Moser (smoser) wrote : | #43 |
maas daily images:
trusty 20160217.1
wily 20160217.1
should have the fixes. You can see which ephemeral image versions you have with: https:/
Please test and re-open if you find this does not solve your problem.
Changed in maas: | |
milestone: | 1.9.0 → none |
Scott Moser (smoser) wrote : | #44 |
latest precise images (20160218) should also have mdadm and lvm in their initramfs.
Jeff Lane (bladernr) wrote : | #45 |
Yes, confirmed the latest dailies resolved this. I can now deploy systems using a software RAID storage config.
tags: | removed: hwcert-server |
Changed in maas-images: | |
status: | Fix Committed → Fix Released |
OK, next I set the default schema to flat and re-commissioned and did a deployment. That too was successful. So after testing that, I "unmounted" and removed the formatting for sda-part1.
I created sdb-part1 and then selected both sda-part1 and sdb-part1 and clicked "Create RAID".
Then I created a RAID0 using those partitions, formatted as ext4 and mounted at /
That left me with this:
Used disks and partitions
Name|Model|Serial Boot Device type Used for
md0 RAID 0 ext4 formatted filesystem mounted at /
sda Physical MBR partitioned with 1 partition
sda-part1 Partition Active raid-0 device for md0
sdb Physical GPT partitioned with 1 partition
sdb-part1 Partition Active raid-0 device for md0
Not sure why it opts for GPT on the one I just created, but MBR when it automatically creates a flat system; that seems incongruous.