Re-commissioning doesn't detect storage changes

Bug #1575567 reported by Jacek Nykis on 2016-04-27
This bug affects 11 people
Affects    Importance    Assigned to
MAAS       Critical      Mike Pontillo
1.9        Critical      Mike Pontillo

Bug Description

I have a node that recently had more disks added, including a new SSD. I re-commissioned it, but the new storage devices are not showing in the MAAS UI (see attachment).

When I look at the commissioning output, the relevant hardware is showing there, for example:
         <lshw:node id="disk:1" claimed="true" class="disk" handle="SCSI:02:00:00:01">
           <lshw:description>SCSI Disk</lshw:description>
           <lshw:product>LOGICAL VOLUME</lshw:product>
           <lshw:vendor>HP</lshw:vendor>
           <lshw:physid>0.0.1</lshw:physid>
           <lshw:businfo>scsi@2:0.0.1</lshw:businfo>
           <lshw:logicalname>/dev/sdb</lshw:logicalname>
           <lshw:dev>8:16</lshw:dev>
           <lshw:version>6.00</lshw:version>
           <lshw:serial>XXX</lshw:serial>
           <lshw:size units="bytes">900151926784</lshw:size>
           <lshw:configuration>
            <lshw:setting id="ansiversion" value="5"/>
            <lshw:setting id="sectorsize" value="512"/>
           </lshw:configuration>
           <lshw:capabilities>
            <lshw:capability id="15000rpm">15000 rotations per minute</lshw:capability>
           </lshw:capabilities>
          </lshw:node>

un maas <none> <none> (no description available)
ii maas-cli 1.9.1+bzr4543-0ubuntu2~trusty1 all MAAS command line API tool
un maas-cluster-controller <none> <none> (no description available)
ii maas-common 1.9.1+bzr4543-0ubuntu2~trusty1 all MAAS server common files
un maas-dhcp <none> <none> (no description available)
ii maas-dns 1.9.1+bzr4543-0ubuntu2~trusty1 all MAAS DNS server
ii maas-proxy 1.9.1+bzr4543-0ubuntu2~trusty1 all MAAS Caching Proxy
ii maas-region-controller 1.9.1+bzr4543-0ubuntu2~trusty1 all MAAS server complete region controller
ii maas-region-controller-min 1.9.1+bzr4543-0ubuntu2~trusty1 all MAAS Server minimum region controller
ii python-django-maas 1.9.1+bzr4543-0ubuntu2~trusty1 all MAAS server Django web framework
ii python-maas-client 1.9.1+bzr4543-0ubuntu2~trusty1 all MAAS python API client
ii python-maas-provisioningserver 1.9.1+bzr4543-0ubuntu2~trusty1 all MAAS server provisioning libraries

Jacek Nykis (jacekn) wrote :

I uploaded full commissioning output here:
https://private-fileshare.canonical.com/~jacek/lp1575567.xml

Changed in maas:
importance: Undecided → Critical
status: New → Triaged
milestone: none → 1.9.3
Jacek Nykis (jacekn) wrote :

Update: after I removed the node and added it back, I was able to commission it and the hardware was detected properly. Of course, re-commissioning should still work as expected.

Blake Rouse (blake-rouse) wrote :

We do not gather the disk information from lshw. Please provide the output of the commissioning script for "block-devices.out" in the "Commissioning Output" on the node details page. The output for initial commissioning followed by the output from the re-commissioning would be best.
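One way to compare the two requested outputs is sketched below. It assumes each output is a JSON list of objects with "NAME" and "SERIAL" keys, as in the block-devices output posted later in this report. The function names and the sample serials are hypothetical, and this is not code from MAAS.

```python
import json


def index_by_serial(raw):
    """Map SERIAL -> NAME for every device that reports a serial."""
    return {d["SERIAL"]: d["NAME"] for d in json.loads(raw) if d.get("SERIAL")}


def compare_runs(first_raw, second_raw):
    """Report devices added, removed, or renamed between two outputs."""
    first = index_by_serial(first_raw)
    second = index_by_serial(second_raw)
    return {
        "added": sorted(second[s] for s in second.keys() - first.keys()),
        "removed": sorted(first[s] for s in first.keys() - second.keys()),
        "renamed": sorted(
            (first[s], second[s])
            for s in first.keys() & second.keys()
            if first[s] != second[s]
        ),
    }


# Hypothetical sample data: one disk at first, then a second disk is added
# and the kernel gives the new disk the name the old one used to have.
FIRST_RUN = json.dumps([{"NAME": "sda", "SERIAL": "disk-A"}])
SECOND_RUN = json.dumps(
    [
        {"NAME": "sda", "SERIAL": "disk-B"},
        {"NAME": "sdb", "SERIAL": "disk-A"},
    ]
)

if __name__ == "__main__":
    print(compare_runs(FIRST_RUN, SECOND_RUN))
```

Devices without a serial are ignored by this sketch, so they would need to be compared by hand.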

Changed in maas:
status: Triaged → Incomplete
Changed in maas:
milestone: 1.9.3 → 2.0.0
status: Incomplete → Triaged
Mike Pontillo (mpontillo) wrote :

I tested this on MAAS 2.0 and didn't see the issue. Checking 1.9 now...

Changed in maas:
status: Triaged → Incomplete
Mike Pontillo (mpontillo) wrote :

After looking at a packet capture, I found that on MAAS 1.9.1 the following error occurs during posting of the storage data:

POST /MAAS/metadata//2012-03-01/ HTTP/1.1
Accept-Encoding: identity
Content-Length: 1444
Connection: close
User-Agent: Python-urllib/2.7
Host: 172.16.100.10
Content-Type: multipart/form-data; boundary=fdxUFynvFVvxTcXYEYRiTjCQlzgfrss
Authorization: OAuth realm="", oauth_nonce="4c370cd98ceb4cf5933a7f1191ef5fe8", oauth_timestamp="1462832639", oauth_consumer_key="LjsQhh8TPyBhPkFHc8", oauth_signature_method="PLAINTEXT", oauth_version="1.0", oauth_token="uK7JkPEeY8wvePayhx", oauth_signature="%26ZNc6eeddjSWSFSnq5tC9q5Tgxe6t9VQf"

--fdxUFynvFVvxTcXYEYRiTjCQlzgfrss
Content-Disposition: form-data; name="status"

WORKING
--fdxUFynvFVvxTcXYEYRiTjCQlzgfrss
Content-Disposition: form-data; name="script_result"

0
--fdxUFynvFVvxTcXYEYRiTjCQlzgfrss
Content-Disposition: form-data; name="error"

finished 00-maas-07-block-devices [6/9]: 0
--fdxUFynvFVvxTcXYEYRiTjCQlzgfrss
Content-Disposition: form-data; name="op"

signal
--fdxUFynvFVvxTcXYEYRiTjCQlzgfrss
Content-Disposition: form-data; name="00-maas-07-block-devices.out"; filename="00-maas-07-block-devices.out"
Content-Type: application/octet-stream

[
 {
  "BLOCK_SIZE": "4096",
  "NAME": "sda",
  "ID_PATH": "/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi1-0-0-0",
  "PATH": "/dev/sda",
  "ROTA": "1",
  "RM": "0",
  "MODEL": "QEMU HARDDISK",
  "RO": "0",
  "SERIAL": "drive-scsi1-0-0-0",
  "SIZE": "1073741824"
 },
 {
  "BLOCK_SIZE": "4096",
  "NAME": "sdb",
  "ID_PATH": "/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0",
  "PATH": "/dev/sdb",
  "ROTA": "1",
  "RM": "0",
  "MODEL": "QEMU HARDDISK",
  "RO": "0",
  "SERIAL": "drive-scsi0-0-0",
  "SIZE": "8589934592"
 },
 {
  "BLOCK_SIZE": "4096",
  "NAME": "sdc",
  "ID_PATH": "/dev/disk/by-id/wwn-0x3000000100000001",
  "PATH": "/dev/sdc",
  "ROTA": "1",
  "RM": "0",
  "MODEL": "VIRTUAL-DISK",
  "RO": "1",
  "SERIAL": "3000000100000001",
  "SIZE": "1468006400"
 }
]

--fdxUFynvFVvxTcXYEYRiTjCQlzgfrss--
HTTP/1.1 400 BAD REQUEST
Date: Mon, 09 May 2016 22:24:00 GMT
Server: TwistedWeb/13.2.0
Content-Type: application/json
X-Frame-Options: SAMEORIGIN
Vary: Cookie
Connection: close
Transfer-Encoding: chunked

45
{"__all__": ["Block device with this Node and Name already exists."]}
0

Mike Pontillo (mpontillo) wrote :

I think I figured out why this is happening.

In the database, we have a unique constraint on the combination of node and device name for each storage device. For example, if you have {node1, sda} and you add {node1, sdb}, that's fine.

If you add a disk and the kernel then assigns different names to the devices, the new device's name can clash with a name already recorded for an old device when we go to insert it.
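The clash described above can be reproduced with a small standalone sketch. The table layout, the function names, and the serial values are assumptions made for illustration. This is neither the MAAS schema nor the fix that was committed; the two-phase rename is just one way to sidestep such a constraint.

```python
import sqlite3

# Hypothetical schema: one row per block device, unique on (node, name).
SCHEMA = """
CREATE TABLE blockdevice (
    id INTEGER PRIMARY KEY,
    node TEXT NOT NULL,
    name TEXT NOT NULL,
    serial TEXT NOT NULL,
    UNIQUE (node, name)
)
"""

BEFORE = [{"name": "sda", "serial": "OLD-DISK"}]
# After a disk is added, the kernel enumerates the new one first.
AFTER = [
    {"name": "sda", "serial": "NEW-DISK"},
    {"name": "sdb", "serial": "OLD-DISK"},
]


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def naive_update(conn, node, devices):
    """Match rows by serial and write each new name one row at a time."""
    for dev in devices:
        row = conn.execute(
            "SELECT id FROM blockdevice WHERE node = ? AND serial = ?",
            (node, dev["serial"]),
        ).fetchone()
        if row:
            conn.execute(
                "UPDATE blockdevice SET name = ? WHERE id = ?",
                (dev["name"], row[0]),
            )
        else:
            conn.execute(
                "INSERT INTO blockdevice (node, name, serial) VALUES (?, ?, ?)",
                (node, dev["name"], dev["serial"]),
            )


def two_phase_update(conn, node, devices):
    """Park old names on placeholders first so new names cannot clash."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            "UPDATE blockdevice SET name = '__stale_' || id WHERE node = ?",
            (node,),
        )
        naive_update(conn, node, devices)
        # Rows still parked on a placeholder belong to disks that are gone.
        conn.execute(
            "DELETE FROM blockdevice"
            " WHERE node = ? AND substr(name, 1, 8) = '__stale_'",
            (node,),
        )


def recorded(conn, node):
    """Return the (name, serial) pairs stored for a node, ordered by name."""
    rows = conn.execute(
        "SELECT name, serial FROM blockdevice WHERE node = ? ORDER BY name",
        (node,),
    )
    return rows.fetchall()
```

With BEFORE loaded, calling naive_update with AFTER raises sqlite3.IntegrityError, because the new disk is inserted as sda while the old row still holds that name. If the new disk had been enumerated after the old one, the same call would succeed.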

Changed in maas:
status: Incomplete → Triaged
assignee: nobody → Mike Pontillo (mpontillo)
Mike Pontillo (mpontillo) wrote :

Note: you've got a 50/50 chance of hitting this bug, depending on whether the kernel decides to insert the new device before or after your old one. ;-)

Changed in maas:
status: Triaged → Fix Committed
Trent Lloyd (lathiat) wrote :

Tested on maas/proposed 1.9.3 and the issue is resolved for me; in my case the existing device details were being updated.

Mike Pontillo (mpontillo) wrote :

Glad to hear it; thanks for confirming!

Virginie Dotta (vdotta) wrote :

Hi Mike,

I see a similar bug in MAAS 1.9.3.

The GUI does not show any disk information.

Virginie Dotta (vdotta) wrote :

Here is the output of 00-maas-07-block-devices.out:

[
 {
  "BLOCK_SIZE": "4096",
  "NAME": "sda",
  "ID_PATH": "/dev/disk/by-id/wwn-0x3000000600000001",
  "PATH": "/dev/sda",
  "ROTA": "1",
  "RM": "0",
  "MODEL": "VIRTUAL-DISK",
  "RO": "1",
  "SERIAL": "3000000600000001",
  "SIZE": "1468006400"
 },
 {
  "BLOCK_SIZE": "4096",
  "NAME": "vda",
  "PATH": "/dev/vda",
  "ROTA": "1",
  "RM": "0",
  "MODEL": "",
  "RO": "0",
  "SIZE": "128849018880"
 }
]
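In the output above, the vda entry has no "SERIAL" or "ID_PATH" key, unlike sda. A minimal sketch for spotting such entries follows. The choice of those two keys as identifying fields is an assumption, the function name is hypothetical, and nothing in this report establishes that the missing keys caused the 1.9.3 problem.

```python
import json

# Assumption: these two keys are the ones that identify a disk across runs.
IDENTITY_FIELDS = ("SERIAL", "ID_PATH")

# The two entries from the output above.
REPORTED_OUTPUT = """
[
 {"BLOCK_SIZE": "4096", "NAME": "sda",
  "ID_PATH": "/dev/disk/by-id/wwn-0x3000000600000001",
  "PATH": "/dev/sda", "ROTA": "1", "RM": "0", "MODEL": "VIRTUAL-DISK",
  "RO": "1", "SERIAL": "3000000600000001", "SIZE": "1468006400"},
 {"BLOCK_SIZE": "4096", "NAME": "vda", "PATH": "/dev/vda", "ROTA": "1",
  "RM": "0", "MODEL": "", "RO": "0", "SIZE": "128849018880"}
]
"""


def missing_identity(raw):
    """Map each device NAME to the identity fields it lacks or leaves empty."""
    return {
        dev["NAME"]: [field for field in IDENTITY_FIELDS if not dev.get(field)]
        for dev in json.loads(raw)
    }


if __name__ == "__main__":
    for name, missing in missing_identity(REPORTED_OUTPUT).items():
        print(name, "missing:", ", ".join(missing) or "nothing")
```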

Changed in maas:
status: Fix Committed → Fix Released
Aymen Frikha (aym-frikha) wrote :

I also have the same issue using MAAS version 2.1.0+bzr5480-0ubuntu1.
Some disks were added to the nodes and others were removed from them. After recommissioning, the new devices were not detected.
This is on Proliant DL 380 Gen9 hardware.

Grant Slater (firefishy) wrote :

Seeing this issue in MAAS 2.2.2-6099-g8751f91-0ubuntu1~16.04.1

"Block device with this Node and Name already exists."
