[2.0b5] MAAS 2.0 fails to format all bcache partitions

Bug #1587201 reported by Brad Marshall
This bug affects 2 people
Affects  Status   Importance  Assigned to  Milestone
MAAS     Invalid  Undecided   Unassigned

Bug Description

I'm trying to deploy Openstack Mitaka on Xenial using Juju 2.0 beta7 and MAAS 2.0 beta 5.

As part of this, I'm deploying ceph using bcache on an SSD. I've configured my storage nodes to have one cache set on an SSD, and 5 extra drives as separate bcache devices, currently one for swift, and 5 for ceph.

I've made the cache sets and bcache devices via the web UI, and have set up the 5 ceph devices to format as XFS and mount to our desired location, /srv/ceph/cephx.

Unfortunately, it appears that occasionally not all devices get formatted, and it seems random which ones are missed. When the systems boot, they give:

Welcome to emergPress Enter for maintenance
(or press Control-D to continue):

And when I log in and check, I can see only 4 drives mounted:

Node 1:
/dev/bcache3 838G 33M 838G 1% /srv/ceph/ceph2
/dev/bcache2 838G 33M 838G 1% /srv/ceph/ceph1
/dev/bcache4 838G 33M 838G 1% /srv/ceph/ceph3
/dev/bcache1 838G 33M 838G 1% /srv/ceph/ceph0

Node 2:
/dev/bcache3 838G 33M 838G 1% /srv/ceph/ceph2
/dev/bcache2 838G 33M 838G 1% /srv/ceph/ceph1
/dev/bcache4 838G 33M 838G 1% /srv/ceph/ceph3
/dev/bcache5 838G 33M 838G 1% /srv/ceph/ceph4

Node 3:
/dev/bcache2 838G 33M 838G 1% /srv/ceph/ceph1
/dev/bcache5 838G 33M 838G 1% /srv/ceph/ceph4
/dev/bcache4 838G 33M 838G 1% /srv/ceph/ceph3
/dev/bcache3 838G 33M 838G 1% /srv/ceph/ceph2

The fstab files look right, but when I run mount -a I get:

# mount -a
mount: wrong fs type, bad option, bad superblock on /dev/bcache1,
     missing codepage or helper program, or other error

     In some cases useful info is found in syslog - tr
     dmesg | tail or so.
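
To see which fstab entries did not get mounted on a node, the two lists can be compared mechanically. The following is only a minimal sketch, not something from the original deployment: the function name missing_mounts and the default prefix are illustrative assumptions, and it matches on mount point rather than device name because the fstab may refer to devices by UUID or label instead of /dev/bcacheN.

```python
def missing_mounts(fstab_text, mounts_text, prefix="/srv/ceph/"):
    """Return (device, mountpoint) pairs listed in fstab but absent from mounts.

    Both arguments are plain text in the usual whitespace-separated layout
    (device, mount point, ...), e.g. the contents of /etc/fstab and /proc/mounts.
    Only mount points starting with `prefix` are considered.
    """
    def entries(text):
        pairs = []
        for line in text.splitlines():
            line = line.strip()
            # Skip blank lines and comments.
            if not line or line.startswith("#"):
                continue
            fields = line.split()
            if len(fields) >= 2 and fields[1].startswith(prefix):
                pairs.append((fields[0], fields[1]))
        return pairs

    # Compare by mount point, since device naming can differ between the files.
    mounted = {mountpoint for _, mountpoint in entries(mounts_text)}
    return [(dev, mp) for dev, mp in entries(fstab_text) if mp not in mounted]
```

Called with the contents of /etc/fstab and /proc/mounts, it would return the entries that still need attention, in fstab order.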

If I manually make the filesystem on the one that was missed, it mounts fine.
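
Before re-making a filesystem by hand, it may be worth confirming that the device really was left unformatted. An XFS filesystem begins with the four magic bytes "XFSB" at offset 0 of the device, so a rough check is to read those bytes. This is a sketch under that assumption only; the helper name has_xfs_superblock is hypothetical, and reading a block device normally requires root.

```python
XFS_MAGIC = b"XFSB"  # first four bytes of an XFS primary superblock


def has_xfs_superblock(path):
    """Return True if the file or block device at `path` starts with the XFS magic."""
    with open(path, "rb") as dev:
        return dev.read(len(XFS_MAGIC)) == XFS_MAGIC
```

For example, has_xfs_superblock("/dev/bcache1") returning False on a node that dropped to the emergency prompt would be consistent with the format step having been skipped for that device.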

If I then tell it to continue after manually making the fs and mounting, I get the following for both config and final state:

  Can not apply stage config, no datasource found! Likely bad things to come!

Full error is at https://paste.ubuntu.com/16856028/.

Sometimes the deploy process will go flawlessly, so I'm fairly sure there isn't a configuration problem with the devices or the way I've set up the mounts.

$ dpkg-query -W juju-2.0
juju-2.0 2.0-beta7-0ubuntu1~16.04.2~juju1

$ dpkg-query -W maas
maas 2.0.0~beta5+bzr5026-0ubuntu1~xenial1

Please let me know if you need any further information.

Andres Rodriguez (andreserl) wrote :

Hi Brad,

Can you please provide the output of:

maas <user> node get-curtin-config <system_id>

And provide a *full* installation log from a machine that failed.

Also, please provide the cloud-init/cloud-init-output logs for the deployment (try looking under /var/log/maas/rsyslog/<machine-name>/).

Changed in maas:
status: New → Incomplete
milestone: none → 2.0.0
summary: - MAAS 2.0 fails to format all bcache partitions
+ [2.0b5] MAAS 2.0 fails to format all bcache partitions
Brad Marshall (brad-marshall) wrote :

Unfortunately I no longer have access to the setup to get the logs required, sorry. I'll see what I can do about reproducing it in a staging environment of some kind.

Andres Rodriguez (andreserl) wrote :

Hi All,

Does this continue to be an issue? It doesn't seem to be the case, so I'm marking this as Invalid. If you hit the issue again, please file a new bug or re-open this one.

Also, this could have already been fixed in a newer version of curtin!

tags: added: internal
Changed in maas:
status: Incomplete → Invalid