cinder does not ignore devices already in use

Bug #1581221 reported by Francis Ginther
This bug affects 1 person
Affects                          Status        Importance  Assigned to      Milestone
Landscape Server                 Fix Released  High        Björn Tillenius  16.06
OpenStack cinder charm           Triaged       Medium      Unassigned
cinder (Juju Charms Collection)  Invalid       Medium      Unassigned

Bug Description

This is from a Landscape OSA deployment. Several of my hardware nodes have a /dev/nvme0n1 device, which is a PCIe SSD. On two of the nodes I've configured it for bcache, so /dev/nvme0n1 is in use there. On one node, however, /dev/nvme0n1 is still unused, so Landscape selects it as a usable cinder disk and passes it in the cinder block-device config option. When cinder is then deployed to one of the nodes with bcache, the config-changed hook fails:

2016-05-12 20:19:20 INFO config-changed No physical volume label read from /dev/nvme0n1
2016-05-12 20:19:20 INFO config-changed Failed to read physical volume "/dev/nvme0n1"
2016-05-12 20:19:20 INFO config-changed No physical volume label read from /dev/nvme0n1
2016-05-12 20:19:20 INFO config-changed Failed to read physical volume "/dev/nvme0n1"
2016-05-12 20:19:21 INFO config-changed Creating new GPT entries.
2016-05-12 20:19:21 INFO config-changed GPT data structures destroyed! You may now partition the disk using fdisk or
2016-05-12 20:19:21 INFO config-changed other utilities.
2016-05-12 20:19:23 INFO config-changed Creating new GPT entries.
2016-05-12 20:19:23 INFO config-changed The operation has completed successfully.
2016-05-12 20:19:23 INFO config-changed 1+0 records in
2016-05-12 20:19:23 INFO config-changed 1+0 records out
2016-05-12 20:19:23 INFO config-changed 1048576 bytes (1.0 MB) copied, 0.00217673 s, 482 MB/s
2016-05-12 20:19:23 INFO config-changed 100+0 records in
2016-05-12 20:19:23 INFO config-changed 100+0 records out
2016-05-12 20:19:23 INFO config-changed 51200 bytes (51 kB) copied, 0.00078845 s, 64.9 MB/s
2016-05-12 20:19:23 INFO config-changed Can't open /dev/nvme0n1 exclusively. Mounted filesystem?
2016-05-12 20:19:23 INFO config-changed Traceback (most recent call last):
...
2016-05-12 20:19:23 INFO config-changed subprocess.CalledProcessError: Command '['pvcreate', u'/dev/nvme0n1']' returned non-zero exit status 5
2016-05-12 20:19:23 ERROR juju.worker.uniter.operation runhook.go:107 hook "config-changed" failed: exit status 1
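
To illustrate the kind of check that would avoid this failure, here is a minimal sketch in Python. It is not the charm's code: the function name `busy_reasons` and its parameters are made up for this example. It assumes a busy device can be recognised from sysfs (a non-empty `holders` directory, or a `bcache` directory for a device registered with bcache) or from an entry in /proc/mounts. It only looks at the whole device, not at its partitions.

```python
import os


def busy_reasons(device, sysfs_root="/sys/block", mounts_path="/proc/mounts"):
    """Return a list of reasons why `device` looks busy (empty if none found).

    Hypothetical helper for illustration; only the whole device is checked.
    """
    real = os.path.realpath(device)  # follow by-id / by-uuid symlinks
    name = os.path.basename(real)    # e.g. "nvme0n1"
    reasons = []

    # Devices claimed by dm, md or bcache show up under holders/.
    holders_dir = os.path.join(sysfs_root, name, "holders")
    if os.path.isdir(holders_dir):
        holders = sorted(os.listdir(holders_dir))
        if holders:
            reasons.append("held by " + ", ".join(holders))

    # Assumption: a device registered with bcache has a bcache/ directory.
    if os.path.isdir(os.path.join(sysfs_root, name, "bcache")):
        reasons.append("registered with bcache")

    # A mounted filesystem directly on the device.
    try:
        with open(mounts_path) as mounts:
            for line in mounts:
                fields = line.split()
                if len(fields) >= 2 and os.path.realpath(fields[0]) == real:
                    reasons.append("mounted on " + fields[1])
    except OSError:
        pass  # no mounts table available; skip this check

    return reasons
```

With something like this, the hook could skip (and log) any device for which `busy_reasons("/dev/nvme0n1")` is non-empty, rather than wiping it and then failing in pvcreate.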

Francis Ginther (fginther) wrote :

Francis Ginther (fginther) wrote :

Output from "juju get cinder".

Francis Ginther (fginther) wrote :

The nvme0n1 device does not have a 'by-id' entry in the device tree but it does have a by-uuid entry:

ubuntu@omaha:/dev/disk$ ll by-dname/
total 0
drwxr-xr-x 2 root root 60 May 12 21:28 ./
drwxr-xr-x 5 root root 100 May 12 21:28 ../
lrwxrwxrwx 1 root root 13 May 12 21:28 bcache0 -> ../../bcache0
ubuntu@omaha:/dev/disk$ ll by-id/
total 0
drwxr-xr-x 2 root root 200 May 12 21:28 ./
drwxr-xr-x 5 root root 100 May 12 21:28 ../
lrwxrwxrwx 1 root root 9 May 12 21:28 ata-MB1000GCEEK_WCAW37040132 -> ../../sda
lrwxrwxrwx 1 root root 10 May 12 21:28 ata-MB1000GCEEK_WCAW37040132-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 May 12 21:28 ata-MB1000GCEEK_WCAW37040132-part2 -> ../../sda2
lrwxrwxrwx 1 root root 9 May 12 21:28 ata-MB1000GCEEK_WCAW37073471 -> ../../sdb
lrwxrwxrwx 1 root root 9 May 12 21:28 wwn-0x50014ee25eda851b -> ../../sdb
lrwxrwxrwx 1 root root 9 May 12 21:28 wwn-0x50014ee25edb5d76 -> ../../sda
lrwxrwxrwx 1 root root 10 May 12 21:28 wwn-0x50014ee25edb5d76-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 May 12 21:28 wwn-0x50014ee25edb5d76-part2 -> ../../sda2
ubuntu@omaha:/dev/disk$ ll by-uuid/
total 0
drwxr-xr-x 2 root root 120 May 12 21:28 ./
drwxr-xr-x 5 root root 100 May 12 21:28 ../
lrwxrwxrwx 1 root root 10 May 12 21:28 56a8bab5-971f-45a6-9bc0-1d6c2c9daae7 -> ../../sda2
lrwxrwxrwx 1 root root 10 May 12 21:28 5f294c60-fd32-48df-86d1-41c5f133b9f2 -> ../../sda1
lrwxrwxrwx 1 root root 13 May 12 21:28 6e032f96-81f3-4a0d-aa31-ed07e747119c -> ../../nvme0n1
lrwxrwxrwx 1 root root 13 May 12 21:28 a5844495-a15d-4b66-aff7-044fb7dce11c -> ../../bcache0

See https://pastebin.canonical.com/156389/ for better formatting.

tags: added: kanban-cross-team
tags: removed: kanban-cross-team
Francis Ginther (fginther) wrote :

This can also happen when the device starts out unused, but is listed multiple times in the block-device config option. For example:

    value: /dev/disk/by-id/wwn-0x50014ee25eda851b /dev/disk/by-id/wwn-0x50014ee2b4301798
      /dev/disk/by-id/wwn-0x50014ee2b4302108 /dev/disk/by-id/wwn-0x50014ee2b43021d3
      /dev/disk/by-id/wwn-0x50014ee2b43162b6 /dev/nvme0n1 /dev/nvme0n1 /dev/nvme0n1

Leads to:
2016-05-15 03:22:32 INFO config-changed Physical volume "/dev/nvme0n1" successfully created
2016-05-15 03:22:32 INFO juju-log [cinder] pvscan: PV /dev/nvme0n1 lvm2 [372.61 GiB]
  PV /dev/sdb lvm2 [931.51 GiB]
  Total: 2 [1.27 TiB] / in use: 0 [0 ] / in no VG: 2 [1.27 TiB]

2016-05-15 03:22:32 INFO config-changed Volume group "cinder-volumes" not found
2016-05-15 03:22:32 INFO config-changed Volume group "cinder-volumes" successfully created
2016-05-15 03:22:33 INFO config-changed Volume group "cinder-volumes" successfully extended
2016-05-15 03:22:33 INFO config-changed Physical volume '/dev/nvme0n1' is already in volume group 'cinder-volumes'
2016-05-15 03:22:33 INFO config-changed Unable to add physical volume '/dev/nvme0n1' to volume group 'cinder-volumes'.
2016-05-15 03:22:33 INFO config-changed Traceback (most recent call last):
...

See more of the juju log here: http://paste.ubuntu.com/16451905/

Full logs are attached.
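
The second failure in the log above comes from trying to add a physical volume to a volume group it already belongs to. A rough sketch of a guard against that follows, assuming the PV-to-VG mapping is read with `pvs --noheadings --separator : -o pv_name,vg_name`. The helper names `parse_pvs`, `current_pv_map` and `needs_vgextend` are hypothetical and not taken from the charm.

```python
import subprocess


def parse_pvs(output):
    """Parse 'pv_name:vg_name' lines into {pv: vg or None}."""
    mapping = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last colon: the VG name follows it.
        pv, sep, vg = line.rpartition(":")
        if not sep:
            continue  # unexpected line without a separator; ignore it
        mapping[pv] = vg or None  # empty VG field means "in no VG"
    return mapping


def current_pv_map():
    """Ask LVM for the current PV -> VG mapping (needs root and lvm2)."""
    out = subprocess.check_output(
        ["pvs", "--noheadings", "--separator", ":", "-o", "pv_name,vg_name"]
    )
    return parse_pvs(out.decode())


def needs_vgextend(device, volume_group, pv_map):
    """True only if `device` is not already a member of `volume_group`."""
    return pv_map.get(device) != volume_group
```

Calling `needs_vgextend("/dev/nvme0n1", "cinder-volumes", current_pv_map())` before each vgextend would turn the repeated entries into a no-op instead of a hook failure.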

Francis Ginther (fginther) wrote :

"juju get cinder" associated with comment #4.

Andreas Hasenack (ahasenack) wrote :

The cinder charm should be made more resilient to this, and Landscape should not give it the same device more than once.
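
As a sketch of what "not the same device more than once" could look like on either side, the block-device value can be de-duplicated before use. This is only an illustration, not the actual fix in Landscape or the charm. The function name `unique_block_devices` is made up, and it assumes entries are whitespace-separated and that two entries are the same device when they resolve to the same real path.

```python
import os


def unique_block_devices(config_value, resolve=os.path.realpath):
    """Return block-device entries in order, dropping repeated devices.

    `resolve` maps an entry to a canonical path, so a by-id symlink and
    the plain /dev name of the same disk count as one device.
    """
    seen = set()
    unique = []
    for entry in config_value.split():
        key = resolve(entry)
        if key in seen:
            continue  # same device listed again; keep the first entry only
        seen.add(key)
        unique.append(entry)
    return unique


if __name__ == "__main__":
    value = "/dev/disk/by-id/wwn-0x50014ee25eda851b /dev/nvme0n1 /dev/nvme0n1 /dev/nvme0n1"
    print(unique_block_devices(value))
```

Keeping the first spelling of each device preserves whatever the operator wrote in the config, while the canonical path is used only for comparison.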

Changed in landscape:
importance: Undecided → High
Changed in landscape:
status: New → In Progress
assignee: nobody → Björn Tillenius (bjornt)
Changed in landscape:
milestone: none → 16.06
status: In Progress → Fix Committed
James Page (james-page)
Changed in cinder (Juju Charms Collection):
status: New → Triaged
importance: Undecided → Medium
Changed in landscape:
status: Fix Committed → Fix Released
James Page (james-page)
Changed in charm-cinder:
importance: Undecided → Medium
status: New → Triaged
Changed in cinder (Juju Charms Collection):
status: Triaged → Invalid