Creating volume from snapshot on real/production/multicluster installation of OpenStack is broken

Bug #1008866 reported by Jaroslav Pulchart
This bug affects 2 people
Affects                   Status        Importance  Assigned to  Milestone
Cinder                    Fix Released  High        Rongze Zhu   2012.2
OpenStack Compute (nova)  Fix Released  Medium      Rongze Zhu   2012.2

Bug Description

Hello,

I'm deploying OpenStack as a private-cloud implementation on a multi-node deployment with many nova-volume (and other) services.

At the beginning I must say that I'm very disappointed by the QA of OpenStack! OpenStack is now in its fifth major release (Essex) and its features are still very buggy and incomplete; everything looks as if it was never tested, not even unit-tested! In this state the whole of OpenStack is a "tool" for a school project on a single laptop/desktop, or looks like an alpha version 0.2. OpenStack's database is consistently broken by internal bugs in the simplest cloud tasks, like attaching volumes, detaching volumes, and stopping/starting instances... When a bug occurs, OpenStack logs the exception in the middle of its "task" (why is it not a transaction?), leaving behind inconsistent DB entries, iSCSI and LVM state, and so on. That is not the robust behaviour required for a production deployment. How is it possible that this happens in a fifth major release? Please do not implement any new features until OpenStack's current features are fully functional and (this is important) robust!!!

Now I will report one of the regressions/issues: "Creating volume from snapshot on real/production/multicluster installation of OpenStack is broken"

Error log from nova-volume service:

2012-06-04 14:44:17 DEBUG nova.utils [] Running cmd (subprocess): sudo /usr/bin/nova-rootwrap dd if=/dev/mapper/nova--volumes-snap--00000005 of=/dev/mapper/nova--volumes-vol--000000f4 count=10240 bs=1M from (pid=11034) execute /usr/lib/python2.6/site-packages/nova/utils.py:220
2012-06-04 14:44:17 DEBUG nova.utils [] Result was 1 from (pid=11034) execute /usr/lib/python2.6/site-packages/nova/utils.py:236
2012-06-04 14:44:17 ERROR nova.rpc.amqp [] Exception during message handling
2012-06-04 14:44:17 TRACE nova.rpc.amqp Traceback (most recent call last):
2012-06-04 14:44:17 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/rpc/amqp.py", line 253, in _process_data
2012-06-04 14:44:17 TRACE nova.rpc.amqp rval = node_func(context=ctxt, **node_args)
2012-06-04 14:44:17 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/volume/manager.py", line 138, in create_volume
2012-06-04 14:44:17 TRACE nova.rpc.amqp volume_ref['id'], {'status': 'error'})
2012-06-04 14:44:17 TRACE nova.rpc.amqp File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__
2012-06-04 14:44:17 TRACE nova.rpc.amqp self.gen.next()
2012-06-04 14:44:17 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/volume/manager.py", line 127, in create_volume
2012-06-04 14:44:17 TRACE nova.rpc.amqp snapshot_ref)
2012-06-04 14:44:17 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/volume/driver.py", line 158, in create_volume_from_snapshot
2012-06-04 14:44:17 TRACE nova.rpc.amqp snapshot['volume_size'])
2012-06-04 14:44:17 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/volume/driver.py", line 116, in _copy_volume
2012-06-04 14:44:17 TRACE nova.rpc.amqp run_as_root=True)
2012-06-04 14:44:17 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/utils.py", line 243, in execute
2012-06-04 14:44:17 TRACE nova.rpc.amqp cmd=' '.join(cmd))
2012-06-04 14:44:17 TRACE nova.rpc.amqp ProcessExecutionError: Unexpected error while running command.
2012-06-04 14:44:17 TRACE nova.rpc.amqp Command: sudo /opt/openstack/bin/nova-rootwrap dd if=/dev/mapper/nova--volumes-snap--00000005 of=/dev/mapper/nova--volumes-vol--000000f4 count=10240 bs=1M

Task: create an instance with a volume root: "euca-run-instances ami-0000001 -z foo -g default -t m1.small --block-device-mapping /dev/vda=snap-00000006:10:true" (the mapping format is <snapshot>:<size in GB>:<delete on terminate>).

The problem is this: OpenStack uses a local copy command ("dd") on a multi-node deployment with many volume services, but the source snapshot is on a different node, which is reachable over iSCSI and TCP/IP, not directly by "dd".
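
Reconstructed from the traceback, the failing path looks roughly like this (a simplified sketch with bodies abridged from nova/volume/driver.py):

    class VolumeDriver(object):
        def _copy_volume(self, srcstr, deststr, size_in_g):
            # dd assumes both block devices exist on the local host.
            self._execute('dd', 'if=%s' % srcstr, 'of=%s' % deststr,
                          'count=%d' % (size_in_g * 1024), 'bs=1M',
                          run_as_root=True)

        def create_volume_from_snapshot(self, volume, snapshot):
            # Fails when the snapshot's LV lives on another volume node:
            # /dev/mapper/nova--volumes-snap--... does not exist locally,
            # so dd exits non-zero and ProcessExecutionError is raised.
            self._copy_volume(self.local_path(snapshot),
                              self.local_path(volume),
                              snapshot['volume_size'])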

I hope the QA of OpenStack will improve soon,
Jaroslav

Revision history for this message
Jaroslav Pulchart (jaroslav-pulchart-4) wrote :

I'm now working on a solution and will post a patch when it is ready.

Revision history for this message
Jaroslav Pulchart (jaroslav-pulchart-4) wrote :

I have prepared a solution for this issue in my OpenStack Nova fork on GitHub at https://github.com/pulchart/nova/commit/76e59b9628153b6c71150797f7fe35804502624d

How it works (a simplified sketch follows the lists below):
- each new snapshot is exported via iSCSI (like volumes)
- the nova-volume node on which we are creating the new volume from the snapshot registers this iSCSI export on the local system (if the iscsi_ip_prefix filter passes)
- nova-volume can now use a local copy
- after that, the iSCSI snapshot is unregistered from the local system

Unfortunately, "host" and "provider_location" are missing from the snapshot DB schema, so I created some hacks:
- "host" is taken from the snapshot's volume (an LVM snapshot is on the same host as its volume); there is a new API call "snapshot_get_host"
- iSCSI discovery is used instead of the location from the database

Fixes:
- "_run_iscsiadm" from nova/volume/driver.py is synchronized with the same function from libvirt

Revision history for this message
Jaroslav Pulchart (jaroslav-pulchart-4) wrote :

I forgot to handle the "init_host" function, so snapshots are not re-registered after a server reboot. I will fix it and send an updated patch.

Revision history for this message
Jaroslav Pulchart (jaroslav-pulchart-4) wrote :

Fixed in 1008866_v2.patch
- new: snapshot iSCSI exports are initialized on nova-volume start
(https://github.com/pulchart/nova/commit/a2df1c340fe874f2d2100e3e59b89140974ac51c)

Revision history for this message
Jaroslav Pulchart (jaroslav-pulchart-4) wrote :

1008866_v3.patch:
- the "create_volume_from_snapshot" function needs to be synchronized

....
+ @utils.synchronized('create_volume_from_snapshot')
     def create_volume_from_snapshot(self, volume, snapshot):
....
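
For context, nova.utils.synchronized serializes all calls that share a lock name. A simplified, thread-only stand-in (the real helper also supports inter-process file locks):

    import functools
    import threading

    _locks = {}

    def synchronized(name):
        def wrap(f):
            @functools.wraps(f)
            def inner(*args, **kwargs):
                # One lock per name; concurrent callers with the same
                # name queue up here instead of racing on iscsiadm/LVM.
                lock = _locks.setdefault(name, threading.Lock())
                with lock:
                    return f(*args, **kwargs)
            return inner
        return wrap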

Revision history for this message
Thierry Carrez (ttx) wrote :

@Jaroslav: would you consider pushing the change to Gerrit?
See http://wiki.openstack.org/HowToContribute#If_you.27re_a_developer.2C_start_here:

Changed in nova:
importance: Undecided → Medium
status: New → Triaged
Revision history for this message
Jaroslav Pulchart (jaroslav-pulchart-4) wrote :

I have to handle one last issue with this patch before I push it into the "world". I tried to find a solution without any DB table/entry updates; unfortunately, it still does not work correctly without updating the "iscsi_targets" table.

So the current solution expects this:
1/ all volume IDs on a nova-volume hostX differ from the snapshot IDs on the same hostX
2/ every created snapshot.id < max(volume.id)
That is caused by using/mixing "iscsi_targets.volume_id" for both "volume.id" and "snapshot.id". This workaround works for my use case but can cause unexpected problems in other situations.

I have seen some blueprints about changing volume_id and snapshot_id to UUIDs, which are unique across the whole stack, so I think that migration to UUIDs will solve all the problems behind this issue.

Revision history for this message
Vish Ishaya (vishvananda) wrote :

I wonder if a simpler solution might just be to force volumes created from snapshots onto the same host as the snapshot. In other words, skip the scheduler altogether and just send the message directly to the right host if snapshot_id is set. Clearly this could lead to an overload of some hosts in the system, but it seems much more manageable in the short term. Then we have more time to work on a robust solution for migrating/creating volumes on different hosts.
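
A sketch of that shortcut (not the merged patch itself; it is paraphrased around Essex-era nova helpers such as db.queue_get_for and rpc.cast, inside the volume API layer):

    def _cast_create_volume(self, context, volume_id, snapshot_id):
        if snapshot_id:
            # Reuse the host of the snapshot's source volume so the later
            # dd copy stays purely local to that node.
            snapshot = self.db.snapshot_get(context, snapshot_id)
            src_volume = self.db.volume_get(context, snapshot['volume_id'])
            topic = self.db.queue_get_for(context, FLAGS.volume_topic,
                                          src_volume['host'])
            rpc.cast(context, topic,
                     {'method': 'create_volume',
                      'args': {'volume_id': volume_id,
                               'snapshot_id': snapshot_id}})
        else:
            # No snapshot involved: let the scheduler pick a host as usual.
            rpc.cast(context, FLAGS.scheduler_topic,
                     {'method': 'create_volume',
                      'args': {'topic': FLAGS.volume_topic,
                               'volume_id': volume_id,
                               'snapshot_id': snapshot_id}})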

Revision history for this message
Jaroslav Pulchart (jaroslav-pulchart-4) wrote :

That is a simple solution to implement, but it is not usable for cloud use cases. We have to be able to create a volume from a snapshot anywhere in the private cloud (= on any volume node in the cloud). The attached patch is in use now and it solves many problems that we had.

Question: are we building a "cloud solution" or "a simple tool for managing local LVM stuff"?
My answer is: we do NOT (!) need a simple LVM manager for a local volume node. We need a cloud solution.

I will have some time this week to update the attached patch and fix the last problem with the proposed solution. Please stay tuned :).

Revision history for this message
Vish Ishaya (vishvananda) wrote :

I understand that we want a better solution, but I think your patch is too large to backport into stable/essex and therefore should be done against the new cinder project.

The simple solution is appropriate for an essex backport. The more complex solution can be done in cinder, as it still requires some cleanup.

For example, the attach to snapshot code is essentially the same as attach to volume code and should be put into a little library of some sort. Also, I couldn't tell from the patch if it is using discovery by default, but if so it should be storing provider_location in the snapshots as well and connecting via an initialize_connection call in the same way it does for volumes.

In other words, this is a large set of changes that should be planned coherently in the cinder project.

Revision history for this message
Jaroslav Pulchart (jaroslav-pulchart-4) wrote :

> I understand that we want a better solution, but I think your patch is too large to
> backport into stable/essex and therefore should be
> done against the new cinder project.

Yes, I understand that too. Unfortunately, the proposed simple solution will not fix this regression for production environments with many nova-volume nodes. But yes, you can work around "the error/exception in the log" this way (until the volume space on that host is exhausted). In our use case we will roll back the simple fix and apply our fix to the stable/essex release in our build of OpenStack.

> The simple solution is appropriate for an essex backport.
> The more complex solution can be done in cinder, as it still requires some cleanup.

Yes, I completely agree; cleanups are needed for a future release. This patch was created in a hurry, without thinking about best practices.

> For example, the attach to snapshot code is essentially the same as
> attach to volume code and should be put into a little library of some sort.
> Also, I couldn't tell from the patch if it is using discovery by default,
> but if so it should be storing provider_location in the snapshots as
> well and connecting via an initialize_connection call in the same
> way it does for volumes.

I completely agree.

> In other words, this is a large set of changes that should be planned coherently in the cinder project.

Yes, again I completely agree, without comment, because review comments were expected ;). Please feel free to refactor this patch/code for the Cinder project. I will be a happy man with the best solution/code :). This patch was made for Essex = do it without any DB modifications, without rewriting current functions where possible, and do it immediately, because this regression was a blocking issue for our Essex-based private cloud.

Changed in cinder:
importance: Undecided → High
milestone: none → folsom-3
Changed in cinder:
assignee: nobody → ZhuRongze (zrzhit)
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/10649

Changed in nova:
assignee: nobody → ZhuRongze (zrzhit)
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/9761
Committed: http://github.com/openstack/cinder/commit/99456bd690445443ae05c0d4fe1ec43ba6090f6f
Submitter: Jenkins
Branch: master

commit 99456bd690445443ae05c0d4fe1ec43ba6090f6f
Author: ZhuRongze <email address hidden>
Date: Fri Jul 13 12:07:13 2012 +0000

    Send 'create volume from snapshot' to the proper host

    A simple solution for bug 1008866. When creating a volume from a snapshot on
    a multicluster deployment, the volume service checks whether snapshot_id is
    set. If snapshot_id is set, the create-volume call is made directly to the
    volume host where the snapshot resides instead of passing it through the
    scheduler, so the snapshot can be copied to the new volume.

    Change-Id: Ie9c1a77f62abc40e294b1d0c604cf885652728da

Changed in cinder:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/10649
Committed: http://github.com/openstack/nova/commit/6795de644b8a8a1879543101d85ba90674219c8b
Submitter: Jenkins
Branch: master

commit 6795de644b8a8a1879543101d85ba90674219c8b
Author: ZhuRongze <email address hidden>
Date: Wed Aug 1 13:23:13 2012 +0000

    Send 'create volume from snapshot' to the proper host

    A simple solution for bug 1008866. When creating a volume from a snapshot on
    a multicluster deployment, the volume service checks whether snapshot_id is
    set. If snapshot_id is set and FLAGS.snapshot_same_host is true, the
    create-volume call is made directly to the volume host where the snapshot
    resides instead of passing it through the scheduler, so the snapshot can be
    copied to the new volume. The same as review 9761.

    Change-Id: Ic182eb4563b9462704c5969d5116629442df316a
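
The FLAGS.snapshot_same_host option mentioned above gates this behaviour. A sketch of how such a boolean option is registered and checked in Folsom-era nova (the flag name comes from the commit message; the surrounding code and import paths are approximations, not the merged patch):

    from nova import flags
    from nova.openstack.common import cfg  # import path approximate for Folsom

    snapshot_same_host_opt = cfg.BoolOpt(
        'snapshot_same_host',
        default=True,
        help='Create volume from snapshot at the host where snapshot resides')

    FLAGS = flags.FLAGS
    FLAGS.register_opt(snapshot_same_host_opt)

    def _use_snapshot_host(snapshot_id):
        # True => bypass the scheduler and cast the create_volume message
        # straight to the snapshot's volume host, as sketched earlier.
        return snapshot_id is not None and FLAGS.snapshot_same_host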

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → folsom-3
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in cinder:
status: Fix Committed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/12675

Changed in cinder:
status: Fix Released → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/12675
Committed: http://github.com/openstack/cinder/commit/3eaf43a9f5a26a51a89347cffe39bfae2b12c2d6
Submitter: Jenkins
Branch: master

commit 3eaf43a9f5a26a51a89347cffe39bfae2b12c2d6
Author: Rongze Zhu <email address hidden>
Date: Sun Sep 9 15:35:26 2012 +0800

    Fix the reverted protection against bug #1008866

    Fixes bug #1047841.

    Commit 2f5360753308eb8b10581fc3c026c1b66f42ebdc (Adds new volume API
    extensions) reverted part of commit
    99456bd690445443ae05c0d4fe1ec43ba6090f6f (Send 'create volume from
    snapshot' to the proper host), so bug #1008866 reappeared. I made
    API.create_volume call _cast_create_volume in cinder/volume/api.py,
    which restores the protection against bug #1008866.

    Change-Id: I1bf0b7c5fc47da756bce95128f8fd770d14399b0

Changed in cinder:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/12854

Changed in nova:
status: Fix Released → In Progress
Thierry Carrez (ttx)
Changed in nova:
status: In Progress → Fix Released
Changed in cinder:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in cinder:
milestone: folsom-3 → 2012.2
Thierry Carrez (ttx)
Changed in nova:
milestone: folsom-3 → 2012.2