Cannot boot instance from volume snapshot

Bug #1523435 reported by Sergey Arkhipov
This bug affects 1 person
Affects: Mirantis OpenStack
Status: Confirmed
Importance: High
Assigned to: Sergey Arkhipov
Milestone: 8.0

Bug Description

Instances cannot be booted from a volume snapshot: the build fails with an error like "Build of instance 93122a65-6721-4350-8936-6eebd60afd1a aborted: Block Device Mapping is Invalid."

Environment:
    * 50 baremetal nodes
    * Ceph is used
    * Ceilometer, Murano and Sahara are not installed
    * Each node has 32142 MB RAM and 12 CPUs

Steps to reproduce (see the CLI sketch below):
    * Create an instance with a volume
    * Stop the instance
    * Create a volume snapshot from the instance's volume
    * Try to boot a new instance from the created volume snapshot
        * With a new disk created from the snapshot
        * With the "Delete on terminate" flag set

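The steps above translate roughly to the following CLI calls (a sketch only; the flavor, image ID, volume size, and instance names are placeholders and not taken from the affected environment):

nova boot --flavor m1.small --block-device source=image,id=<IMAGE_ID>,dest=volume,size=10,bootindex=0,shutdown=remove test-vm
nova stop test-vm
cinder snapshot-create --name test-snap <VOLUME_ID>    # --force True may be needed while the volume is still attached
nova boot --flavor m1.small --block-device source=snapshot,id=<SNAPSHOT_ID>,dest=volume,size=10,bootindex=0,shutdown=remove test-vm-from-snapshot

Here shutdown=remove corresponds to the "Delete on terminate" flag, and source=snapshot,dest=volume asks Nova to create a new volume from the snapshot and boot from it.
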
I have been trying to boot instance "93122a65-6721-4350-8936-6eebd60afd1a" from snapshot "b2385aa6-1ffe-4fd8-a352-bb95b17439ce".

VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "8.0"
  openstack_version: "2015.1.0-8.0"
  api: "1.0"
  build_number: "199"
  build_id: "199"
  fuel-nailgun_sha: "a65871a955dd0886b9c2e18f7a24381f10753136"
  python-fuelclient_sha: "3e7738fd3fb18a2d5f53b1ecc9706dc53b65a511"
  fuel-agent_sha: "88c993eab990ddb7b5f038490295f52c602aa58b"
  fuel-nailgun-agent_sha: "b56f832abc18aee9a8c603fd6cc2055c5f4287bc"
  astute_sha: "c8400f51b0b92254da206de55ef89d17fdf35393"
  fuel-library_sha: "33c0fa3aada734dc9e6f315197ce0e4a16f5987c"
  fuel-ostf_sha: "11afd5743a12b1006317d3ca7000d1ede77bdae2"
  fuel-createmirror_sha: "994fed9b1ed889718b61a59733275c08c2dd4c64"
  fuelmenu_sha: "d12061b1aee82f81b3d074de74ea27a6e962a686"
  shotgun_sha: "c377d163519f6d10b69a654019d6086ba5f14edc"
  network-checker_sha: "2c62cd52655ea6456ff6294fd63f18d6ea54fe38"
  fuel-upgrade_sha: "1e894e26d4e1423a9b0d66abd6a79505f4175ff6"
  fuelmain_sha: "22fe551f5525d11a1854fd87dbc8c77fae8fec08"

Revision history for this message
Sergey Arkhipov (sarkhipov) wrote :
Changed in mos:
assignee: nobody → MOS Nova (mos-nova)
milestone: none → 8.0
importance: Undecided → High
status: New → Confirmed
tags: added: mos-nova
tags: added: area-nova
removed: mos-nova
Changed in mos:
assignee: MOS Nova (mos-nova) → Sergey Nikitin (snikitin)
Changed in mos:
assignee: Sergey Nikitin (snikitin) → MOS Nova (mos-nova)
Revision history for this message
Sergey Nikitin (snikitin) wrote :

I will be away during the New Year holidays, so I am assigning this to the Nova team until January 11. I tried to reproduce it on the following environment, but the problem did not reproduce:

    * 1 controller
    * 2 computes
    * Ceph used for all storage

Changed in mos:
assignee: MOS Nova (mos-nova) → Roman Podoliaka (rpodolyaka)
Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

I tried this on ISO #364 and it works like a charm. Log of shell session: http://paste.openstack.org/show/482949/

Nova/Cinder packages used:

root@node-1:~# dpkg -l | grep nova
ii nova-api 2:12.0.0-1~u14.04+mos19 all OpenStack Compute - compute API frontend
ii nova-cert 2:12.0.0-1~u14.04+mos19 all OpenStack Compute - certificate manager
ii nova-common 2:12.0.0-1~u14.04+mos19 all OpenStack Compute - common files
ii nova-conductor 2:12.0.0-1~u14.04+mos19 all OpenStack Compute - conductor service
ii nova-consoleauth 2:12.0.0-1~u14.04+mos19 all OpenStack Compute - Console Authenticator
ii nova-consoleproxy 2:12.0.0-1~u14.04+mos19 all OpenStack Compute - NoVNC proxy
ii nova-objectstore 2:12.0.0-1~u14.04+mos19 all OpenStack Compute - object store
ii nova-scheduler 2:12.0.0-1~u14.04+mos19 all OpenStack Compute - virtual machine scheduler
ii python-nova 2:12.0.0-1~u14.04+mos19 all OpenStack Compute - libraries
ii python-novaclient 2:2.30.2-1~u14.04+mos3 all client library for OpenStack Compute API
root@node-1:~# dpkg -l | grep cinder
ii cinder-api 2:7.0.1-1~u14.04+mos7 all OpenStack block storage system - API server
ii cinder-backup 2:7.0.1-1~u14.04+mos7 all OpenStack block storage system - Backup server
ii cinder-common 2:7.0.1-1~u14.04+mos7 all OpenStack block storage system - common files
ii cinder-scheduler 2:7.0.1-1~u14.04+mos7 all OpenStack block storage system - Scheduler server
ii cinder-volume 2:7.0.1-1~u14.04+mos7 all OpenStack block storage system - Volume server
ii python-cinder 2:7.0.1-1~u14.04+mos7 all OpenStack block storage system - Python libraries
ii python-cinderclient 1:1.4.0-1~u14.04+mos2 all Python bindings to the OpenStack Volume API - Python 2.x

Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

^ Ceph was used for volumes:

root@node-1:~# grep volume_driver /etc/cinder/cinder.conf
volume_driver=cinder.volume.drivers.rbd.RBDDriver
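For context, an RBD-backed cinder.conf usually also carries backend options along the following lines (the values below are assumptions about a typical Fuel/Ceph deployment, not taken from this node):

rbd_pool = volumes
rbd_user = volumes
rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_flatten_volume_from_snapshot = false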

Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

Sergey, I also checked the logs you provided in #1: it looks like they have been stripped somehow, so the request IDs and exception tracebacks are missing. Unfortunately, InvalidBDM is the most general exception in the class hierarchy of BDM exceptions, so it is not clear what exactly went wrong.

Sergey, could you please give it another try on a recent ISO?
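In the meantime, a quick way to pull the underlying failure out of the compute logs (assuming the default MOS log location; the grep patterns are only a rough guess at what to look for) would be something like:

root@node-X:~# grep -B 20 "Block Device Mapping is Invalid" /var/log/nova/nova-compute.log
root@node-X:~# grep -E "InvalidBDM|Traceback" /var/log/nova/nova-compute.log | tail -n 50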

Changed in mos:
assignee: Roman Podoliaka (rpodolyaka) → Sergey Arkhipov (sarkhipov)
status: Confirmed → Incomplete
Revision history for this message
Sergey Arkhipov (sarkhipov) wrote :

Hi, Roman.

Unfortunately, I no longer have that environment, but I have tried another setup. The problem is reproduced.

MOS 8.0, ISO #361

Environment:
    * 50 baremetal nodes
    * Ceph *is not* used
        - Cinder LVM over iSCSI for volumes is enabled, the rest of the checkboxes are disabled
    * Ceilometer, Murano and Sahara *are* installed
    * Each node has 32142 MB RAM and 12 CPUs

So, as you can see, the environment is similar but Ceph is not used. I tried the same use case as described above and managed to reproduce the problem:
    * Failed instance is 18807ac1-c130-4c56-bc04-f7b047076e95
    * Volume snapshot is af872626-e0bb-4b00-b9f7-6c5bb0f620e1
    * Snapshot is made from volume 9e165e5c-3ba1-4048-affd-bdfee3059444

Error:
    * Message: Build of instance 18807ac1-c130-4c56-bc04-f7b047076e95 aborted: Block Device Mapping is Invalid.
    * Code: 500
    * Details:
File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1905, in _do_build_and_run_instance filter_properties) File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2025, in _build_and_run_instance 'create.error', fault=e) File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 195, in __exit__ six.reraise(self.type_, self.value, self.tb) File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1996, in _build_and_run_instance block_device_mapping) as resources: File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__ return self.gen.next() File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2143, in _build_resources reason=e.format_message())
    * Created Jan. 11, 2016, 3:48 p.m.

I cannot create a diagnostic snapshot due to https://bugs.launchpad.net/fuel/+bug/1529182, so I have gathered all the logs I could (excluding *.gz). Please find them here:
https://drive.google.com/a/mirantis.com/file/d/0B9tzODpFABxkS3h2ZmN1R0ZVQXc/view?usp=sharing

Changed in mos:
status: Incomplete → Confirmed
Revision history for this message
Sergey Arkhipov (sarkhipov) wrote :

Versions in ISO #361:

root@node-4:~# dpkg -l | grep -E "nova|cinder" | awk '{print $2,"\t",$3}'
cinder-api 2:7.0.1-1~u14.04+mos7
cinder-common 2:7.0.1-1~u14.04+mos7
cinder-scheduler 2:7.0.1-1~u14.04+mos7
nova-api 2:12.0.0-1~u14.04+mos19
nova-cert 2:12.0.0-1~u14.04+mos19
nova-common 2:12.0.0-1~u14.04+mos19
nova-conductor 2:12.0.0-1~u14.04+mos19
nova-consoleauth 2:12.0.0-1~u14.04+mos19
nova-consoleproxy 2:12.0.0-1~u14.04+mos19
nova-objectstore 2:12.0.0-1~u14.04+mos19
nova-scheduler 2:12.0.0-1~u14.04+mos19
python-cinder 2:7.0.1-1~u14.04+mos7
python-cinderclient 1:1.4.0-1~u14.04+mos2
python-nova 2:12.0.0-1~u14.04+mos19
python-novaclient 2:2.30.2-1~u14.04+mos3

Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

Looks like bug #1533197 is a more general case of this one.
