juju stuck attaching storage to OSD

Bug #1778033 reported by Michał Ajduk
Affects         Status        Importance  Assigned to  Milestone
Canonical Juju  Fix Released  High        Ian Booth
2.3             Won't Fix     Undecided   Unassigned
2.4             Fix Released  High        Ian Booth

Bug Description

When deploying a bundle with storage support for ceph OSD, the OSD units are stuck in an infinite pending state:
ceph-osd/0 waiting allocating 3 100.80.61.4 agent initializing
ceph-osd/1 waiting allocating 4 100.80.61.10 agent initializing
ceph-osd/2 waiting allocating 5 100.80.41.8 agent initializing
ceph-osd/3 waiting allocating 6 100.80.61.6 agent initializing
ceph-osd/4 waiting allocating 7 100.80.41.2 agent initializing
ceph-osd/5 waiting allocating 8 100.80.31.1 agent initializing
ceph-osd/6 waiting allocating 9 100.80.41.6 agent initializing
ceph-osd/7 waiting allocating 10 100.80.51.2 agent initializing
ceph-osd/8 waiting allocating 11 100.80.51.4 agent initializing
ceph-osd/9 waiting allocating 12 100.80.61.1 agent initializing
ceph-osd/10* waiting allocating 13 100.80.41.1 agent initializing
ceph-osd/11 waiting allocating 14 100.80.51.3

I can see that the storage attachments are pending in the ceph-osd/0 unit log:

2018-06-20 00:54:18 DEBUG juju.worker.uniter.storage resolver.go:125 next hook op for storage-osd-devices-0: {Kind:0 Life:alive Attached:false Location:}
2018-06-20 00:54:18 DEBUG juju.worker.uniter.storage resolver.go:125 next hook op for storage-osd-devices-1: {Kind:0 Life:alive Attached:false Location:}
2018-06-20 00:54:18 DEBUG juju.worker.uniter.storage resolver.go:125 next hook op for storage-osd-devices-2: {Kind:0 Life:alive Attached:false Location:}
2018-06-20 00:54:18 DEBUG juju.worker.uniter.storage resolver.go:125 next hook op for storage-osd-journals-3: {Kind:0 Life:alive Attached:false Location:}
2018-06-20 00:54:18 DEBUG juju.worker.uniter.storage resolver.go:91 still pending [storage-osd-devices-0 storage-osd-devices-1 storage-osd-devices-2 storage-osd-journals-3]
2018-06-20 00:58:23 DEBUG juju.worker.uniter.remotestate watcher.go:455 update status timer triggered

Storage pools are created:

deployer@ic-skbrat2-infra1:~$ juju storage-pools
Name Provider Attrs
loop loop
maas maas
rootfs rootfs
rotary maas tags=rotary
ssd maas tags=ssd
tmpfs tmpfs

According to the ceph-osd/0 unit log above, storage-osd-devices-0 is not attached ({Kind:0 Life:alive Attached:false Location:}), yet juju shows osd-devices/0 as attached:

deployer@ic-skbrat2-infra1:~$ juju show-storage osd-devices/0
osd-devices/0:
  kind: block
  life: alive
  status:
    current: attached
    since: 19 Jun 2018 23:09:56Z
  persistent: false
  attachments:
    units:
      ceph-osd/0:
        machine: "3"
        life: alive

All storage devices are attached:

deployer@ic-skbrat2-infra1:~$ juju storage
[Storage]
Unit Id Type Pool Provider id Size Status Message
ceph-osd/0 osd-devices/0 block rotary volume-0 1.5TiB attached
ceph-osd/0 osd-devices/1 block rotary volume-1 1.5TiB attached
ceph-osd/0 osd-devices/2 block rotary volume-2 1.5TiB attached
ceph-osd/0 osd-journals/3 block ssd volume-3 1.5TiB attached
ceph-osd/1 osd-devices/4 block rotary volume-4 1.5TiB attached
...

Reproduce steps:
- juju create-storage-pool ssd maas tags=ssd
- juju create-storage-pool rotary maas tags=rotary
- juju deploy ./bundle.yaml

Relevant bundle part:
  ceph-osd:
    charm: ../charms/ceph-osd
    num_units: *ceph-osd-count
    constraints: tags=storage-all
    storage:
      osd-devices: rotary,3
      osd-journals: ssd,1
    bindings:
      "": *oam-space
      public: *ceph-public-space
      cluster: *ceph-cluster-space
    options:
      osd-devices: *osd-devices
      osd-journal: *osd-journal
      source: *source
      customize-failure-domain: True
      autotune: true
      aa-profile-mode: enforce
      key: *key

Software version:
~$ juju version
juju 2.3.7-xenial-amd64

~$ juju status
Model Controller Cloud/Region Version SLA
openstack foundations-maas foundations-maas 2.3.8.1 unsupported

MAAS:
2.3.3 6498-ge4db91d-0ubuntu1~16.04.1

Attached:
- output of maas node read for machine 3 hosting ceph-osd/0
- machine log and unit log from machine 3 hosting ceph-osd/0

Tags: 4010 sts
Revision history for this message
Ian Booth (wallyworld) wrote :

The symptoms seem to indicate that Juju has not recorded the block volumes as being provisioned. This is a prerequisite for running the storage-attached hooks.

Could you please attach the machine-0.log file with debug level logging (i.e. we need to see the controller logs)? This will allow inspection of the storage provisioning operations.

Revision history for this message
Gábor Mészáros (gabor.meszaros) wrote :

I've attached all the logs (Juju HA, so logs from all controllers).

Revision history for this message
Ian Booth (wallyworld) wrote :

The logs are huge :-) Nothing immediately jumps out. The issue may well be specific to this particular deployment. It would be great to get access to the MAAS so we could deploy the minimum bundle needed to reproduce the issue with an instrumented version of Juju, to capture the extra info needed to figure out what's happening. Is that possible?

Changed in juju:
status: New → Incomplete
Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

I hit the same issue just now on 2.3.8.

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for juju 2.4 because there has been no activity for 60 days.]

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for juju 2.3 because there has been no activity for 60 days.]

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for juju because there has been no activity for 60 days.]

Changed in juju:
status: Incomplete → Expired
Revision history for this message
Ante Karamatić (ivoks) wrote :

What we see from the trace is:

2018-11-20 09:55:05 DEBUG juju.worker.uniter.remotestate watcher.go:451 got storage change: [osd-devices/38 osd-devices/39] ok=true
2018-11-20 09:55:05 TRACE juju.rpc.jsoncodec codec.go:225 -> {"request-id":78,"type":"Uniter","version":8,"request":"StorageAttachmentLife","params":{"ids":[{"storage-tag":"storage-osd-devices-38","unit-tag":"unit-testing-osd-0"},{"storage-tag":"storage-osd-devices-39","unit-tag":"unit-testing-osd-0"}]}}
2018-11-20 09:55:05 TRACE juju.rpc.jsoncodec codec.go:120 <- {"request-id":78,"response":{"results":[{"life":"alive"},{"life":"alive"}]}}
2018-11-20 09:55:05 TRACE juju.rpc.jsoncodec codec.go:225 -> {"request-id":79,"type":"Uniter","version":8,"request":"WatchStorageAttachments","params":{"ids":[{"storage-tag":"storage-osd-devices-38","unit-tag":"unit-testing-osd-0"}]}}
2018-11-20 09:55:05 TRACE juju.rpc.jsoncodec codec.go:120 <- {"request-id":79,"response":{"results":[{"NotifyWatcherId":"20"}]}}
2018-11-20 09:55:05 TRACE juju.rpc.jsoncodec codec.go:225 -> {"request-id":80,"type":"Uniter","version":8,"request":"StorageAttachments","params":{"ids":[{"storage-tag":"storage-osd-devices-38","unit-tag":"unit-testing-osd-0"}]}}
2018-11-20 09:55:05 TRACE juju.rpc.jsoncodec codec.go:225 -> {"request-id":81,"type":"NotifyWatcher","version":1,"id":"20","request":"Next","params":{}}
2018-11-20 09:55:05 TRACE juju.rpc.jsoncodec codec.go:120 <- {"request-id":80,"response":{"results":[{"result":{"storage-tag":"","owner-tag":"","unit-tag":"","kind":0,"location":"","life":""},"error":{"message":"storage osd-devices/38 not provisioned","code":"not provisioned"}}]}}
2018-11-20 09:55:05 TRACE juju.rpc.jsoncodec codec.go:225 -> {"request-id":82,"type":"Uniter","version":8,"request":"WatchStorageAttachments","params":{"ids":[{"storage-tag":"storage-osd-devices-39","unit-tag":"unit-testing-osd-0"}]}}
2018-11-20 09:55:05 TRACE juju.rpc.jsoncodec codec.go:120 <- {"request-id":82,"response":{"results":[{"NotifyWatcherId":"21"}]}}
2018-11-20 09:55:05 TRACE juju.rpc.jsoncodec codec.go:225 -> {"request-id":83,"type":"Uniter","version":8,"request":"StorageAttachments","params":{"ids":[{"storage-tag":"storage-osd-devices-39","unit-tag":"unit-testing-osd-0"}]}}
2018-11-20 09:55:05 TRACE juju.rpc.jsoncodec codec.go:225 -> {"request-id":84,"type":"NotifyWatcher","version":1,"id":"21","request":"Next","params":{}}
2018-11-20 09:55:05 TRACE juju.rpc.jsoncodec codec.go:120 <- {"request-id":83,"response":{"results":[{"result":{"storage-tag":"","owner-tag":"","unit-tag":"","kind":0,"location":"","life":""},"error":{"message":"storage osd-devices/39 not provisioned","code":"not provisioned"}}]}}

Changed in juju:
status: Expired → Confirmed
Revision history for this message
Ante Karamatić (ivoks) wrote :

lsblk from the machine:

NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdf 8:80 0 1.5T 0 disk
sdd 8:48 0 1.5T 0 disk
sdb 8:16 0 1.5T 0 disk
├─sdb2 8:18 0 500M 0 part
│ └─md0 9:0 0 499.7M 0 raid1 /boot
├─sdb3 8:19 0 204.9G 0 part
│ └─md1 9:1 0 204.8G 0 raid1 /
└─sdb1 8:17 0 96M 0 part
sdi 8:128 0 1.8T 0 disk
sdg 8:96 0 1.5T 0 disk
sde 8:64 0 1.5T 0 disk
sdc 8:32 0 1.5T 0 disk
sda 8:0 0 1.5T 0 disk
├─sda2 8:2 0 500M 0 part
│ └─md0 9:0 0 499.7M 0 raid1 /boot
├─sda3 8:3 0 204.9G 0 part
│ └─md1 9:1 0 204.8G 0 raid1 /
└─sda1 8:1 0 96M 0 part /boot/efi
sdj 8:144 0 1.8T 0 disk
sdh 8:112 0 1.5T 0 disk

Revision history for this message
Felipe Reyes (freyes) wrote :

machine-115.log keeps logging a non-matching DEVPATH for the partitions sdn{1,2,3} and sdo{1,2,3}[0], which according to lsblk[1] are used to build RAID arrays, specifically:

sdn1 -> used for '/boot/efi'
sdo1 -> unused
sdn2 + sdo2 -> md0 used for '/boot'
sdn3 + sdo3 -> md1 used for '/'

The debug message "non matching DEVPATH for ..."[2] doesn't look related to the stuck state of the hooks; it simply indicates that a SCSI path couldn't be derived from the device name.

The reason the DEVPATH message is printed over and over again, giving the impression that Juju is stuck, is that it comes from the diskmanager worker, which polls the disks every 30 seconds[3]; a sketch of this loop follows the footnotes below.

The sequence of calls is:

diskmanager worker every 30 secs:
-> listBlockDevices()
-> addHardwareInfo()
-> logger.Debugf("non matching DEVPATH for %q: %q", dev.DeviceName, devpath)

[0] $ grep DEVPATH machine-115.log | awk -F' ' '{print $10}' | sort | uniq | cut -d'"' -f2
sdn1
sdn2
sdn3
sdo1
sdo2
sdo3
[1] NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdf 8:80 0 1.5T 0 disk
sdo 8:224 0 222.6G 0 disk
|-sdo3 8:227 0 204.9G 0 part
| `-md1 9:1 0 204.8G 0 raid1 /
|-sdo1 8:225 0 96M 0 part
`-sdo2 8:226 0 500M 0 part
`-md0 9:0 0 499.7M 0 raid1 /boot
sdd 8:48 0 1.5T 0 disk
sdm 8:192 0 1.5T 0 disk
sdb 8:16 0 1.5T 0 disk
sdk 8:160 0 1.5T 0 disk
sdi 8:128 0 1.5T 0 disk
sdg 8:96 0 1.5T 0 disk
sde 8:64 0 1.5T 0 disk
sdn 8:208 0 222.6G 0 disk
|-sdn3 8:211 0 204.9G 0 part
| `-md1 9:1 0 204.8G 0 raid1 /
|-sdn1 8:209 0 96M 0 part /boot/efi
`-sdn2 8:210 0 500M 0 part
`-md0 9:0 0 499.7M 0 raid1 /boot
sdc 8:32 0 1.5T 0 disk
sdl 8:176 0 1.5T 0 disk
sda 8:0 0 1.5T 0 disk
sdj 8:144 0 1.5T 0 disk
sdh 8:112 0 1.5T 0 disk
[2] https://github.com/juju/juju/blob/2.3/worker/diskmanager/lsblk.go#L228-L252
[3] https://github.com/juju/juju/blob/2.3/worker/diskmanager/diskmanager.go#L24
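
To make the mechanics concrete, here is a minimal Go sketch of that loop (an approximation of the behaviour described above and in [2]/[3], not Juju's actual code; all names are illustrative):

package main

import (
	"fmt"
	"os/exec"
	"regexp"
	"strings"
	"time"
)

// The bus address is expected inside DEVPATH as ".../X:X:X:X/block/<device>".
var devpathRE = regexp.MustCompile(`/\d+:\d+:\d+:\d+/block/[^/]+$`)

func addHardwareInfo(deviceName string) {
	out, err := exec.Command("udevadm", "info", "-q", "property", "--name", deviceName).Output()
	if err != nil {
		return // device may have disappeared between polls
	}
	for _, line := range strings.Split(string(out), "\n") {
		if !strings.HasPrefix(line, "DEVPATH=") {
			continue
		}
		devpath := strings.TrimPrefix(line, "DEVPATH=")
		if !devpathRE.MatchString(devpath) {
			// Partitions like sdn1 have a DEVPATH ending in .../block/sdn/sdn1,
			// which fails the pattern, hence the repeated message:
			fmt.Printf("non matching DEVPATH for %q: %q\n", deviceName, devpath)
		}
	}
}

func main() {
	for range time.Tick(30 * time.Second) { // diskmanager poll interval [3]
		// In Juju the device list comes from listBlockDevices().
		for _, dev := range []string{"sdn1", "sdn2", "sdn3", "sdo1", "sdo2", "sdo3"} {
			addHardwareInfo(dev)
		}
	}
}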

tags: added: sts
Revision history for this message
Felipe Reyes (freyes) wrote :

Machine 115, which hosts the unit testing-osd2/3, reports that there are still
pending storage attachments to be processed[0]. This message comes from
uniter/storage/resolver.go[1]; after it is emitted, the resolver checks whether
the install hook has already run, and if it hasn't, it keeps waiting (see the
sketch after the footnotes below).

A fresh "juju status" shows that the machine is in "allocating" state[2], so the
question here is why the agent hasn't reached "ready" state.

[0] $ grep "still pending" juju-{1,2,3}/var/log/juju/logsink.log |grep testing-osd2-3
juju-2/var/log/juju/logsink.log:dc746ae6-fac2-415c-88e8-cad7c6e2c41e: unit-testing-osd2-3 2018-11-20 14:02:28 DEBUG juju.worker.uniter.storage resolver.go:91 still pending [storage-osd-devices-46 storage-osd-devices-47]
juju-2/var/log/juju/logsink.log:dc746ae6-fac2-415c-88e8-cad7c6e2c41e: unit-testing-osd2-3 2018-11-20 14:07:27 DEBUG juju.worker.uniter.storage resolver.go:91 still pending [storage-osd-devices-46 storage-osd-devices-47]
juju-2/var/log/juju/logsink.log:dc746ae6-fac2-415c-88e8-cad7c6e2c41e: unit-testing-osd2-3 2018-11-20 14:12:12 DEBUG juju.worker.uniter.storage resolver.go:91 still pending [storage-osd-devices-46 storage-osd-devices-47]
juju-2/var/log/juju/logsink.log:dc746ae6-fac2-415c-88e8-cad7c6e2c41e: unit-testing-osd2-3 2018-11-20 14:17:19 DEBUG juju.worker.uniter.storage resolver.go:91 still pending [storage-osd-devices-46 storage-osd-devices-47]
juju-2/var/log/juju/logsink.log:dc746ae6-fac2-415c-88e8-cad7c6e2c41e: unit-testing-osd2-3 2018-11-20 14:21:26 DEBUG juju.worker.uniter.storage resolver.go:91 still pending [storage-osd-devices-46 storage-osd-devices-47]
[1] https://github.com/juju/juju/blob/2.3/worker/uniter/storage/resolver.go#L91
[2] testing-osd2/3 waiting allocating 115 100.80.51.1 agent initializing
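
A condensed Go sketch of that wait condition (a paraphrase of the logic at [1], not the actual resolver code; names are illustrative):

package main

import (
	"errors"
	"fmt"
)

// ErrWaiting tells the uniter to do nothing and wait for further events.
var ErrWaiting = errors.New("waiting for storage")

type attachment struct {
	Tag      string // e.g. "storage-osd-devices-46"
	Attached bool   // true once the attachment is known to be complete
}

// nextOp: before the install hook has run, every declared storage attachment
// must be complete, otherwise the uniter keeps waiting.
func nextOp(installHookRan bool, storage []attachment) error {
	var pending []string
	for _, s := range storage {
		if !s.Attached {
			pending = append(pending, s.Tag)
		}
	}
	if len(pending) > 0 && !installHookRan {
		fmt.Printf("still pending %v\n", pending) // the message grepped above
		return ErrWaiting
	}
	return nil
}

func main() {
	_ = nextOp(false, []attachment{
		{Tag: "storage-osd-devices-46"},
		{Tag: "storage-osd-devices-47"},
	})
}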

Revision history for this message
Ian Booth (wallyworld) wrote :

When Juju provisions a machine with storage, depending on the provider it generally gets back from the start API call information about each volume that has been provisioned, including:
- wwn
- hardware id

plus info about how that volume is/will be attached to the machine, including:
- bus address
- device link
- device name
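
For reference, a rough Go sketch of the two records just described (field names are illustrative; they are close to, but not exactly, Juju's storage types):

package main

import "fmt"

// What the provider reports about the volume itself.
type VolumeInfo struct {
	WWN        string // e.g. "0x600605b00d6be9e023586a261060047f"
	HardwareID string
}

// What the provider reports about how the volume is/will be attached.
type VolumeAttachmentInfo struct {
	BusAddress string // e.g. "0:1:0:2"
	DeviceLink string // e.g. "/dev/disk/by-id/wwn-0x600605b00d6be9e023586a261060047f"
	DeviceName string // e.g. "sda"
}

func main() {
	v := VolumeInfo{WWN: "0x600605b00d6be9e023586a261060047f"}
	fmt.Printf("volume wwn: %s\n", v.WWN)
}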

The content of the volumes and volumeattachments collections shows that MAAS is providing Juju with the wwn of each volume and no information about the attachment of that volume.

For the uniter to run the storage-attached hook, it needs to know that the storage attachment has completed; this is determined by polling the block devices using lsblk. The info from lsblk is recorded and matched with the volume/attachment info obtained from the MAAS APIs when the machine was provisioned.

The block device info from lsblk that Juju can parse includes device name, size, label, UUID, type, etc.: just the basic info needed to record that a block device exists.
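
As a rough illustration (an assumption about the shape of the parsing, not Juju's exact invocation), the pair-style output of lsblk -b -P can be read like this in Go:

package main

import (
	"fmt"
	"os/exec"
	"regexp"
	"strings"
)

// lsblk -P prints one device per line as KEY="value" pairs.
var pairRE = regexp.MustCompile(`([A-Z:]+)="([^"]*)"`)

func main() {
	// -b: sizes in bytes; -P: parseable pairs; -o: just the basics described above.
	out, err := exec.Command("lsblk", "-b", "-P",
		"-o", "NAME,SIZE,LABEL,UUID,FSTYPE,TYPE,MOUNTPOINT").Output()
	if err != nil {
		panic(err)
	}
	for _, line := range strings.Split(strings.TrimSpace(string(out)), "\n") {
		dev := map[string]string{}
		for _, m := range pairRE.FindAllStringSubmatch(line, -1) {
			dev[m[1]] = m[2]
		}
		fmt.Printf("%s: %s bytes, type %s\n", dev["NAME"], dev["SIZE"], dev["TYPE"])
	}
}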

So we're down to matching by WWN. There is a volume info record for "osd-devices/40" which has a WWN of "0x600605b00d6be9e023586a261060047f", and the lsblk output records that block device being present:

2018-11-20 11:08:05 INFO juju.worker.diskmanager diskmanager.go:67 block devices changed: [{sda [/dev/disk/by-id/scsi-3600605b00d6be9e023586a261060047f /dev/disk/by-id/wwn-0x600605b00d6be9e023586a261060047f /dev/disk/by-path/pci-0000:0b:00.0-scsi-0:2:0:0]...

Juju can't reliably parse the WWN from this, so it then runs "udevadm" for each device:

udevadm info -q property --name <deviceName>

It looks for "ID_WWN=xxxxx" in the output.
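
A minimal Go sketch of that step (simplified; the real parsing lives in worker/diskmanager/lsblk.go, and the function name here is made up):

package main

import (
	"bufio"
	"bytes"
	"fmt"
	"os/exec"
	"strings"
)

// udevProperties collects udevadm's KEY=value output into a map.
func udevProperties(deviceName string) (map[string]string, error) {
	out, err := exec.Command("udevadm", "info", "-q", "property", "--name", deviceName).Output()
	if err != nil {
		return nil, err
	}
	props := map[string]string{}
	sc := bufio.NewScanner(bytes.NewReader(out))
	for sc.Scan() {
		if k, v, ok := strings.Cut(sc.Text(), "="); ok {
			props[k] = v
		}
	}
	return props, sc.Err()
}

func main() {
	props, err := udevProperties("sdc")
	if err != nil {
		panic(err)
	}
	fmt.Println("ID_WWN:", props["ID_WWN"]) // what Juju records as the WWN
}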

So what we need to look at further is the udevadm output for, say, "osd-devices/40" and maybe a few others, plus the content of the "blockdevices" collection. We need to find out why Juju cannot match the block devices with the volume/attachment info from MAAS.

Note, udevadm is used to get the bus address of the device. The bus address is expected to be in the format:

.../X:X:X:X/block/<devicename>

whereas for this deployment the udevadm output produces:

.../1:2:0:0/block/sdo/sdo3

The extra "sdo" path element means that Juju cannot parse a bus address. That might be an issue elsewhere, but it's not the problem here, because the volume attachment data from MAAS doesn't contain a bus address we can match against anyway.

Ian Booth (wallyworld)
Changed in juju:
importance: Undecided → High
Revision history for this message
Ante Karamatić (ivoks) wrote :

Currently I can work only with osd-devices/38. It's located on testing-osd/0. Its WWN is 0x600508b1001cf7030d45af15fa4ea14d. On the machine, this corresponds to /dev/sdc:

# udevadm info -q property --name sdc
DEVLINKS=/dev/disk/by-id/wwn-0x600508b1001cf7030d45af15fa4ea14d /dev/disk/by-id/scsi-3600508b1001cf7030d45af15fa4ea14d /dev/disk/by-path/pci-0000:03:00.0-scsi-0:1:0:2
DEVNAME=/dev/sdc
DEVPATH=/devices/pci0000:00/0000:00:01.0/0000:03:00.0/host0/target0:1:0/0:1:0:2/block/sdc
DEVTYPE=disk
ID_BUS=scsi
ID_MODEL=LOGICAL_VOLUME
ID_MODEL_ENC=LOGICAL\x20VOLUME\x20\x20
ID_PATH=pci-0000:03:00.0-scsi-0:1:0:2
ID_PATH_TAG=pci-0000_03_00_0-scsi-0_1_0_2
ID_REVISION=6.06
ID_SCSI=1
ID_SCSI_SERIAL=PDNLH0BRH9630O
ID_SERIAL=3600508b1001cf7030d45af15fa4ea14d
ID_SERIAL_SHORT=600508b1001cf7030d45af15fa4ea14d
ID_TYPE=disk
ID_VENDOR=HP
ID_VENDOR_ENC=HP\x20\x20\x20\x20\x20\x20
ID_WWN=0x600508b1001cf703
ID_WWN_VENDOR_EXTENSION=0x0d45af15fa4ea14d
ID_WWN_WITH_EXTENSION=0x600508b1001cf7030d45af15fa4ea14d
MAJOR=8
MINOR=32
SUBSYSTEM=block
TAGS=:systemd:
USEC_INITIALIZED=6417369

Revision history for this message
Gábor Mészáros (gabor.meszaros) wrote :

Hello Ian,

I've collected all data you asked for, but on a different machine (115).
https://pastebin.canonical.com/p/ZvB6Rg64gm/
Note that ID_WWN is the same for all the devices (so it matches all of them), while ID_WWN_WITH_EXTENSION is the correct, unique match.

Revision history for this message
Ian Booth (wallyworld) wrote :

I looked at this pastebin which has the block device info as recorded by Juju
https://pastebin.canonical.com/p/VM4Gj5m5X4/

Around line 5488

The block device record in Juju (as determined from running udevadm) records the WWN as

0x600508b1001cf703

That is not the WWN with extension; Juju only grabs the unextended WWN from udevadm. Yet MAAS reports the extended WWN as the volume's WWN:

0x600508b1001cf7030d45af15fa4ea14d

So Juju is trying to match an extended WWN against the unextended one.

We need to either fix MAAS, or extend Juju to handle both the plain and the extended WWN. The best approach is probably to enhance Juju to handle both, so it's more robust.

Changed in juju:
milestone: none → 2.5-beta2
assignee: nobody → Ian Booth (wallyworld)
Revision history for this message
Ian Booth (wallyworld) wrote :

https://github.com/juju/juju/pull/9494

Further research indicates that where there are logical volumes on, say, a RAID controller, the controller is identified by the WWN and each volume by the vendor-extended WWN. So if there's an extended WWN, we'll use that, since that's how the volume is referenced under /dev/disk/by-id and hence what's reported by MAAS.
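
For clarity, a sketch of the resulting preference logic (a paraphrase of the fix, not the PR's exact diff; effectiveWWN is a made-up name, and props is the udevadm property map from the earlier sketch):

package main

import "fmt"

// effectiveWWN prefers the vendor-extended WWN when present and falls back to
// the plain one, matching how MAAS and /dev/disk/by-id reference the volume.
func effectiveWWN(props map[string]string) string {
	if ext := props["ID_WWN_WITH_EXTENSION"]; ext != "" {
		return ext
	}
	return props["ID_WWN"]
}

func main() {
	props := map[string]string{
		"ID_WWN":                "0x600508b1001cf703",
		"ID_WWN_WITH_EXTENSION": "0x600508b1001cf7030d45af15fa4ea14d",
	}
	fmt.Println(effectiveWWN(props)) // prints the extended WWN that MAAS reports
}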

Revision history for this message
Gábor Mészáros (gabor.meszaros) wrote :

Initial tests look promising: https://pastebin.canonical.com/p/pQHyPhHj8d/

Revision history for this message
Gábor Mészáros (gabor.meszaros) wrote :

Hello Ian,

I've got the full deployment finished, with success. Thanks for fixing the issue in such a short time!

Ian Booth (wallyworld)
Changed in juju:
status: Confirmed → Fix Committed
Ian Booth (wallyworld)
Changed in juju:
milestone: 2.5-beta2 → 2.5-beta3
Changed in juju:
status: Fix Committed → Fix Released
Revision history for this message
Varun (varunmittal91-gmail) wrote :

I am on the latest version of Juju and am still running into the same problem.

juju --version
--> 2.9.31-ubuntu-amd64

udevadm info -q property --name vdb
-->
DEVPATH=/devices/pci0000:00/0000:00:07.0/virtio4/block/vdb
DEVNAME=/dev/vdb
DEVTYPE=disk
MAJOR=252
MINOR=16
SUBSYSTEM=block
USEC_INITIALIZED=2298274
ID_SERIAL=d877dca9-00b7-4507-9
ID_PATH=pci-0000:00:07.0
ID_PATH_TAG=pci-0000_00_07_0
DEVLINKS=/dev/disk/by-path/pci-0000:00:07.0 /dev/disk/by-id/virtio-d877dca9-00b7-4507-9 /dev/disk/by-path/virtio-pci-0000:00:07.0 /dev/disk/by-dname/sdb
TAGS=:systemd:

lsblk
-->
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 61.9M 1 loop /snap/core20/1434
loop1 7:1 0 67.8M 1 loop /snap/lxd/22753
loop2 7:2 0 44.7M 1 loop /snap/snapd/15904
vda 252:0 0 14.9G 0 disk
├─vda1 252:1 0 1M 0 part
└─vda2 252:2 0 14.9G 0 part /
vdb 252:16 0 93.1G 0 disk

Storage config from machine log:
{
 storage.BlockDevice {
  DeviceName: "loop0",
  DeviceLinks: [] string(nil),
  Label: "",
  UUID: "",
  HardwareId: "",
  WWN: "",
  BusAddress: "",
  Size: 0x3d,
  FilesystemType: "squashfs",
  InUse: true,
  MountPoint: "/snap/core20/1434",
  SerialId: ""
 }, storage.BlockDevice {
  DeviceName: "loop1",
  DeviceLinks: [] string(nil),
  Label: "",
  UUID: "",
  HardwareId: "",
  WWN: "",
  BusAddress: "",
  Size: 0x43,
  FilesystemType: "squashfs",
  InUse: true,
  MountPoint: "/snap/lxd/22753",
  SerialId: ""
 }, storage.BlockDevice {
  DeviceName: "loop2",
  DeviceLinks: [] string(nil),
  Label: "",
  UUID: "",
  HardwareId: "",
  WWN: "",
  BusAddress: "",
  Size: 0x2c,
  FilesystemType: "squashfs",
  InUse: true,
  MountPoint: "/snap/snapd/15904",
  SerialId: ""
 }, storage.BlockDevice {
  DeviceName: "vda",
  DeviceLinks: [] string {
   "/dev/disk/by-dname/sda",
   "/dev/disk/by-id/virtio-3c6c2fe2-d72e-4f2d-9",
   "/dev/disk/by-path/pci-0000:00:06.0",
   "/dev/disk/by-path/virtio-pci-0000:00:06.0"
  },
  Label: "",
  UUID: "",
  HardwareId: "",
  WWN: "",
  BusAddress: "",
  Size: 0x3b9a,
  FilesystemType: "",
  InUse: true,
  MountPoint: "",
  SerialId: "3c6c2fe2-d72e-4f2d-9"
 }, storage.BlockDevice {
  DeviceName: "vda1",
  DeviceLinks: [] string {
   "/dev/disk/by-dname/sda-part1",
   "/dev/disk/by-id/virtio-3c6c2fe2-d72e-4f2d-9-part1",
   "/dev/disk/by-partuuid/61ada20b-9951-4481-b320-7ecd2f30556d",
   "/dev/disk/by-path/pci-0000:00:06.0-part1",
   "/dev/disk/by-path/virtio-pci-0000:00:06.0-part1"
  },
  Label: "",
  UUID: "",
  HardwareId: "",
  WWN: "",
  BusAddress: "",
  Size: 0x1,
  FilesystemType: "",
  InUse: false,
  MountPoint: "",
  SerialId: "3c6c2fe2-d72e-4f2d-9"
 }, storage.BlockDevice {
  DeviceName: "vda2",
  DeviceLinks: [] string {
   "/dev/disk/by-dname/sda-part2",
   "/dev/disk/by-id/virtio-3c6c2fe2-d72e-4f2d-9-part2",
   "/dev/disk/by-label/root",
   "/dev/disk/by-partuuid/9dd0d011-b1ce-43e3-a0e6-12c18b41721f",
   "/dev/disk/by-path/pci-0000:00:06.0-part2",
   "/dev/disk/by-path/virtio-pci-0000:00:06.0-part2",
   "/dev/disk/by-uuid/d197c509-a124-48d3-826e-b91630cd3f5b"
  },
  Label: "root",
  UUID: "d197c509-a124-48d3-826e-b91630cd3f5b",
  HardwareId: "",
  WWN: "",
  BusAddress: "",
  Size: 0x3b98...


Revision history for this message
Ian Booth (wallyworld) wrote (last edit ):

The root cause here will be a different issue from the WWN matching one. Could you please raise a new bug and also include the content of these collections from mongo:

- volumes
- volumeattachments
- volumeattachmentplan

It would also be good to turn on TRACE level debugging for "juju.apiserver.storagecommon".
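
For example, something like this on the controller model should do it (the exact logging-config value is a suggestion, not a requirement):

juju model-config -m controller logging-config="<root>=WARNING;juju.apiserver.storagecommon=TRACE"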

The udevadm output is also useful - it was included above but would also be needed again for any subsequent test.

Revision history for this message
Varun (varunmittal91-gmail) wrote :

Thank you Ian, doing that right away.

Just a quick question, if you don't mind: on my KVM hosts I am using the virtio driver, so all the disks get vd* device names, but MAAS adds the disks labelled as sd*.

I know each virtual drive has an ID, but could this still be the root cause?

Revision history for this message
Ian Booth (wallyworld) wrote :

It could be. The logic to reconcile the block devices that show up on the provisioned machine with the placeholders in the Juju model is complicated, due to the number of ways the match needs to be made depending on the provider. Having the data will hopefully show what additional rules need to be added.
