Fuel fails to offer mdXXX devices as install option

Bug #1617071 reported by MercSniper on 2016-08-25
This bug affects 1 person
Affects              Importance   Assigned to
Fuel for OpenStack   High         Alexander Gordeev
8.0.x                High         MOS Maintenance
Mitaka               High         Rodion Tikunov

Bug Description

Detailed bug description:
 Installing the "os" group onto an mdXXX device is impossible: Fuel does not offer the RAID device as an install target, although it detects and offers the individual member drives.
Steps to reproduce:
 Configure RAID1 with Intel BIOS on slave.
 Boot slave to Fuel 9 Master instance
 Attempt Configure Disk (compute/storage/controller)
Expected results:
 Device MDXXX is presented as an option
Actual result:
 Bare drives are presented
Reproducibility:
 Every time
Workaround:
 N/A
Impact:
 Being unable to use RAID 1 for the boot drives creates a risk that the system fails and no longer recognizes its Ceph volumes. Rebuilding Ceph volumes in turn puts a highly available system at risk.
Description of the environment:
 Operating system: Ubuntu 14.04.5
 Versions of components: Fuel 9.0
 Reference architecture: X86
 Network model: N/A
 Related projects installed: Fuel

Changed in fuel:
assignee: nobody → Fuel Sustaining (fuel-sustaining-team)
tags: added: feature
Changed in fuel:
importance: Undecided → High
status: New → Confirmed
importance: High → Medium
milestone: none → 10.0
tags: added: area-python
Alexander Gordeev (a-gordeev) wrote :

Hello,

@MercSniper, could you attach some log files?

At a minimum, both the remote nailgun-agent.log (or just agent.log) and fuel-agent.log files from the nodes are needed. You can find them under the /var/log/remote/<node-id>/bootstrap/ directory on the fuel-master node.

In addition, I'd like to request the output of the following commands from any node with RAID1 configured:

1) $ cat /proc/mdstat

and then the output from
2) $ mdadm -D <for each device from /proc/mdstat>
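
The two requests above can be scripted. The following is a minimal sketch, not part of Fuel: the helper names `md_devices` and `detail_commands` are hypothetical, and the sample text only imitates the /proc/mdstat layout. It lists every md device named in the text and builds one `mdadm -D` command per device, using the full /dev path.

```python
import re

# Sample text in /proc/mdstat format (an assumption for illustration).
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md126 : active (auto-read-only) raid1 sda[1] sdb[0]
      74235904 blocks super external:/md127/0 [2/2] [UU]

md127 : inactive sda[1](S) sdb[0](S)
      6306 blocks super external:imsm

unused devices: <none>
"""


def md_devices(mdstat_text):
    """Return the md device names listed in /proc/mdstat-style text."""
    names = []
    for line in mdstat_text.splitlines():
        # Array lines start at column 0, e.g. "md126 : active raid1 ...".
        match = re.match(r"(md\S*)\s*:", line)
        if match:
            names.append(match.group(1))
    return names


def detail_commands(mdstat_text):
    """Build one `mdadm -D` command (as an argv list) per md device."""
    return [["mdadm", "-D", "/dev/" + name] for name in md_devices(mdstat_text)]


if __name__ == "__main__":
    for argv in detail_commands(SAMPLE_MDSTAT):
        print(" ".join(argv))
```

On a real node, the sample string would be replaced by the contents of /proc/mdstat, and each argv list could be passed to `subprocess.run`.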

Right now the bug report is incomplete, therefore we can't proceed further with RCA investigation.

Alexander Gordeev (a-gordeev) wrote :

Moving to Incomplete, waiting for the information from the reporter.

@MercSniper, feel free to assign the bug back to me, once you're done with logs.

Changed in fuel:
status: Confirmed → Incomplete
importance: Medium → High
assignee: Fuel Sustaining (fuel-sustaining-team) → MercSniper (mercsniper)
Alexander Gordeev (a-gordeev) wrote :

Raised to High because it really affects deployment, and Fuel is capable of dealing with such RAID1 arrays. So a major feature may be broken, and there is no workaround so far.

MercSniper (mercsniper) wrote :

root@bootstrap:/var/log# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md126 : active (auto-read-only) raid1 sda[1] sdb[0]
      74235904 blocks super external:/md127/0 [2/2] [UU]

md127 : inactive sda[1](S) sdb[0](S)
      6306 blocks super external:imsm

unused devices: <none>
root@bootstrap:/var/log# mdadm -D md126
mdadm: cannot open md126: No such file or directory
root@bootstrap:/var/log# mdadm -D /dev/md126
/dev/md126:
      Container : /dev/md/imsm0, member 0
     Raid Level : raid1
     Array Size : 74235904 (70.80 GiB 76.02 GB)
  Used Dev Size : 74236036 (70.80 GiB 76.02 GB)
   Raid Devices : 2
  Total Devices : 2

          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 49e8c0a5:189a7490:de585028:43c52dd9
    Number Major Minor RaidDevice State
       1 8 0 0 active sync /dev/sda
       0 8 16 1 active sync /dev/sdb
root@bootstrap:/var/log# mdadm -D /dev/md127
/dev/md127:
        Version : imsm
     Raid Level : container
  Total Devices : 2

Working Devices : 2

           UUID : 60ac932c:3f119e5c:bf748c30:82c0548b
  Member Arrays : /dev/md/Volume0_0

    Number Major Minor RaidDevice

       0 8 16 - /dev/sdb
       1 8 0 - /dev/sda
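
The `mdadm -D` output above can be turned into structured data with a small parser. This is a hedged sketch in Python, not the nailgun-agent parser: the function name `parse_mdadm_detail` and the returned layout are assumptions made for illustration. It shows that the member array (md126) carries a `Container` field, while the container itself (md127) does not.

```python
# Abbreviated `mdadm -D /dev/md126` output, copied from the session above.
SAMPLE_DETAIL = """\
/dev/md126:
      Container : /dev/md/imsm0, member 0
     Raid Level : raid1
   Raid Devices : 2
           UUID : 49e8c0a5:189a7490:de585028:43c52dd9
    Number Major Minor RaidDevice State
       1 8 0 0 active sync /dev/sda
       0 8 16 1 active sync /dev/sdb
"""


def parse_mdadm_detail(text):
    """Split `mdadm -D` output into "Key : value" fields and member devices."""
    fields = {}
    components = []
    for line in text.splitlines():
        # Field lines use " : " as the separator; the UUID value contains
        # colons without surrounding spaces, so it stays in one piece.
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
            continue
        tokens = line.split()
        # Member rows start with a slot number and end with a device path.
        if tokens and tokens[0].isdigit() and tokens[-1].startswith("/dev/"):
            components.append(tokens[-1])
    return {"fields": fields, "components": components}


if __name__ == "__main__":
    detail = parse_mdadm_detail(SAMPLE_DETAIL)
    print(detail["fields"].get("Container"))
    print(detail["components"])
```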

Changed in fuel:
assignee: MercSniper (mercsniper) → Alexander Gordeev (a-gordeev)
status: Incomplete → Confirmed
Alexander Gordeev (a-gordeev) wrote :

At first glance, something went wrong in nailgun-agent: it didn't include /dev/md* in the list of suitable disks.

Confirmed.

tags: added: module-nailgun-agent
removed: module-nailgun
Changed in fuel:
assignee: Alexander Gordeev (a-gordeev) → Fuel Sustaining (fuel-sustaining-team)
Alexander Gordeev (a-gordeev) wrote :

Temporarily assigned to the Sustaining team. I will take it back later if nobody picks it up.

Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → Alexander Gordeev (a-gordeev)

Fix proposed to branch: master
Review: https://review.openstack.org/373448

Changed in fuel:
status: Confirmed → In Progress

Reviewed: https://review.openstack.org/373448
Committed: https://git.openstack.org/cgit/openstack/fuel-nailgun-agent/commit/?id=60173537665464db260fb93829eb8affced33535
Submitter: Jenkins
Branch: master

commit 60173537665464db260fb93829eb8affced33535
Author: Alexander Gordeev <email address hidden>
Date: Tue Sep 20 19:11:01 2016 +0300

    Add 'Container' field for MD parser

    The logic of filtering fake RAID MD devices relies heavily on the
    presence of the 'Container' field inside the parsed data.
    If this field is missing, it will never figure out the names of any
    fake RAID devices and their components.

    This patch adds this field to a parser.

    In addition to that, it also logs all found fake RAIDs and
    their components for the sake of easy debugging.

    Change-Id: I2066c5a0e995e542271cd308c9d83e2373787be4
    Closes-Bug: #1617071
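
The dependency described in the commit message can be illustrated with a short Python sketch. It is not the nailgun-agent code: the function name `visible_disks` and the data layout are hypothetical, and the sketch assumes that only md devices carrying a `Container` field are offered as disks. When the field is present, the member array is offered and its component drives are hidden; when the parser drops the field, only the bare drives remain.

```python
import os


def visible_disks(physical_disks, md_details):
    """Return the disk names to offer, hiding fake RAID components.

    physical_disks: names such as "sda".
    md_details: {"md126": {"fields": {...}, "components": ["/dev/sda", ...]}}
    """
    offered = []
    hidden = set()
    for name, detail in sorted(md_details.items()):
        if "Container" in detail["fields"]:
            # A member array of a fake RAID container: offer it as a disk
            # and hide the drives it is built from.
            offered.append(name)
            hidden.update(os.path.basename(dev) for dev in detail["components"])
    offered.extend(disk for disk in physical_disks if disk not in hidden)
    return sorted(offered)


if __name__ == "__main__":
    details = {
        "md126": {
            "fields": {"Container": "/dev/md/imsm0, member 0", "Raid Level": "raid1"},
            "components": ["/dev/sda", "/dev/sdb"],
        },
        "md127": {
            "fields": {"Raid Level": "container"},
            "components": ["/dev/sdb", "/dev/sda"],
        },
    }
    print(visible_disks(["sda", "sdb"], details))
```

Removing the `Container` key from the md126 entry makes the sketch return the bare drives instead, which matches the symptom reported in this bug.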

Changed in fuel:
status: In Progress → Fix Committed
Alexander Gordeev (a-gordeev) wrote :

This feature was merged into Fuel 8.0, so all versions starting from 8.0 must be fixed too.

Proper verification requires specific hardware: a node with Intel fake RAID configured at level 1.

Alexander Gordeev (a-gordeev) wrote :

Assigned to the Maintenance Team for all affected releases, which are 8.0 and 9.x.

This issue was fixed in the openstack/fuel-nailgun-agent 10.0.0rc1 release candidate.

Reviewed: https://review.openstack.org/377319
Committed: https://git.openstack.org/cgit/openstack/fuel-nailgun-agent/commit/?id=13fb4009d3f7c46222791bb9623cb05f8ba42ad8
Submitter: Jenkins
Branch: stable/mitaka

commit 13fb4009d3f7c46222791bb9623cb05f8ba42ad8
Author: Alexander Gordeev <email address hidden>
Date: Tue Sep 20 19:11:01 2016 +0300

    Add 'Container' field for MD parser

    The logic of filtering fake RAID MD devices relies heavily on the
    presence of the 'Container' field inside the parsed data.
    If this field is missing, it will never figure out the names of any
    fake RAID devices and their components.

    This patch adds this field to a parser.

    In addition to that, it also logs all found fake RAIDs and
    their components for the sake of easy debugging.

    Change-Id: I2066c5a0e995e542271cd308c9d83e2373787be4
    Closes-Bug: #1617071
    (cherry picked from commit 60173537665464db260fb93829eb8affced33535)

tags: added: on-verification

Verified on 9.2 snapshot #541.

Actual results:
root@bootstrap:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 1.8T 0 disk
`-md126 9:126 0 1.7T 0 raid1
sdb 8:16 0 1.8T 0 disk
`-md126 9:126 0 1.7T 0 raid1
loop0 7:0 0 273.3M 0 loop /lib/live/mount/rootfs/root.squashfs
root@bootstrap:~#

From Fuel UI:
Disks 1 drive, 1.7 TB total
  name md126
  model
  size 1.7 TB
  paths
  removable 0
  disk md126

tags: removed: on-verification

This issue was fixed in the openstack/fuel-nailgun-agent 10.0.0 release.
