[NFS] Server resize failed when using volume from image

Bug #2002535 reported by Jean Pierre Roquesalane
This bug affects 2 people
Affects                   Status   Importance  Assigned to              Milestone
Cinder                    Invalid  Medium      Jean Pierre Roquesalane
OpenStack Compute (nova)  Invalid  Undecided   Unassigned

Bug Description

Description: When resizing an instance, an error is thrown to the user:

Error resizing server: 17a238f3-01be-490c-97c8-35f47c804624
Error resizing server

Steps to reproduce:
1. Create a bootable volume from an existing image
openstack volume create --image cirros-0.5.2-x86_64-disk --size 1 --bootable volboot
2. Create an instance attached to that volume
openstack server create --flavor m1.small --volume volboot --network public server1
3. Resize the server
openstack server resize --flavor m1.medium --wait server1
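
The three steps above can be scripted for repeated runs. The following is a minimal sketch, not part of the original report: the helper names build_repro_commands and run_repro are hypothetical, and the defaults simply mirror the names used in the commands above. It does not wait for the volume to become available before creating the server, which a real run would probably need.

```python
import shlex
import subprocess


def build_repro_commands(image="cirros-0.5.2-x86_64-disk", volume="volboot",
                         server="server1", network="public",
                         flavor="m1.small", new_flavor="m1.medium"):
    """Return the three CLI invocations from the report as argv lists."""
    return [
        # 1. Create a bootable volume from an existing image
        ["openstack", "volume", "create", "--image", image,
         "--size", "1", "--bootable", volume],
        # 2. Create an instance attached to that volume
        ["openstack", "server", "create", "--flavor", flavor,
         "--volume", volume, "--network", network, server],
        # 3. Resize the server
        ["openstack", "server", "resize", "--flavor", new_flavor,
         "--wait", server],
    ]


def run_repro(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in build_repro_commands():
        print(shlex.join(argv))
        if not dry_run:
            # Hypothetical helper: stops at the first failing command.
            subprocess.run(argv, check=True)
```

Calling run_repro() prints the commands without touching the cloud; run_repro(dry_run=False) executes them with whatever credentials are loaded in the environment.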

In the nova compute logfile, we can see an error which refers to the disk format.
Jan 10 20:29:09 e2e-os-pstorenfs105 nova-compute[3362105]: ERROR oslo_messaging.rpc.server libvirt.libvirtError: internal error: process exited while connecting to monitor: 2023-01-10T12:29:08.542757Z qemu-system-x86_64: -blockdev {"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"qcow2","file":"libvirt-1-storage","backing":null}: Image is not in qcow2 format

After a first analysis, it seems that during volume creation qemu-img converts the image from qcow2 to raw and never converts it back to qcow2. The attachment information then mismatches the actual file format, and the server resize fails.

It's an NFS environment.
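
A quick way to check whether a volume file on the NFS mount is really qcow2 is to look at its first four bytes, since a qcow2 image starts with the magic bytes "QFI" followed by 0xfb. The sketch below is only an illustration: the function name sniff_format is hypothetical, it reports everything that is not qcow2 as "raw", and it is not how cinder or nova determine the format.

```python
QCOW2_MAGIC = b"QFI\xfb"  # first four bytes of every qcow2 image


def sniff_format(path):
    """Return 'qcow2' if the file starts with the qcow2 magic, else 'raw'.

    Simplification: any file without the qcow2 magic is reported as raw,
    so other formats (vmdk, vdi, ...) are not distinguished.
    """
    with open(path, "rb") as f:
        header = f.read(4)
    return "qcow2" if header == QCOW2_MAGIC else "raw"
```

If this returns "raw" for a volume whose connection info says qcow2, QEMU would be expected to reject the disk with the "Image is not in qcow2 format" error shown above.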

Changed in cinder:
assignee: nobody → Jean Pierre Roquesalane (jproque15130)
tags: added: cache image nfs nova qcow2
Changed in cinder:
importance: Undecided → Medium
Jean Pierre Roquesalane (jproque15130) wrote :

Below are the actions executed by cinder at the time of the failure:

Jan 13 02:47:52 e2e-os-pstorenfs105 cinder-volume[369491]: INFO cinder.volume.manager [req-387efd2d-a695-4f9f-850e-aa55cb0c9890 req-5dd551a8-1c4c-404f-b232-ef9ca400335d admin None] Terminate volume connection completed successfully.
Jan 13 02:47:52 e2e-os-pstorenfs105 cinder-volume[369491]: DEBUG cinder.volume.manager [req-387efd2d-a695-4f9f-850e-aa55cb0c9890 req-5dd551a8-1c4c-404f-b232-ef9ca400335d admin None] Deleting attachment 09b84254-c092-4cfc-9af9-c08a6f770efb. {{(pid=369491) attachment_delete /opt/stack/new/cinder/cinder/volume/manager.py:4998}}
Jan 13 02:47:53 e2e-os-pstorenfs105 cinder-volume[369491]: DEBUG cinder.volume.drivers.nfs [req-387efd2d-a695-4f9f-850e-aa55cb0c9890 req-771ac965-6761-482d-bd2c-e2ec2c2b1a39 admin None] Initializing connection to volume 00d0f8b9-98e4-4d00-b697-24097efd1a9f. Connector: {'platform': 'x86_64', 'os_type': 'linux', 'ip': '10.228.225.105', 'host': 'e2e-os-pstorenfs105', 'multipath': True, 'initiator': 'iqn.2005-03.org.open-iscsi:37fab0269180', 'do_local_attach': False, 'uuid': '0801a08f-93fd-4be3-8a20-b3607ffd0c14', 'system uuid': '422a6a69-7f46-aab6-c409-74221f8028af', 'nvme_native_multipath': False, 'mountpoint': '/dev/vda'} {{(pid=369491) initialize_connection /opt/stack/new/cinder/cinder/volume/drivers/nfs.py:138}}
Jan 13 02:47:53 e2e-os-pstorenfs105 cinder-volume[369491]: DEBUG oslo_concurrency.processutils [req-387efd2d-a695-4f9f-850e-aa55cb0c9890 req-771ac965-6761-482d-bd2c-e2ec2c2b1a39 admin None] Running cmd (subprocess): /usr/bin/python3.8 -m oslo_concurrency.prlimit --as=1073741824 --cpu=60 -- sudo cinder-rootwrap /etc/cinder/rootwrap.conf env LC_ALL=C qemu-img info --output=json --force-share /opt/stack/data/cinder/mnt/0f31b785b44925eea1ad3ce3b8eff927/volume-00d0f8b9-98e4-4d00-b697-24097efd1a9f {{(pid=369491) execute /usr/local/lib/python3.8/dist-packages/oslo_concurrency/processutils.py:384}}
Jan 13 02:47:54 e2e-os-pstorenfs105 cinder-volume[369491]: DEBUG oslo_concurrency.processutils [req-387efd2d-a695-4f9f-850e-aa55cb0c9890 req-771ac965-6761-482d-bd2c-e2ec2c2b1a39 admin None] CMD "/usr/bin/python3.8 -m oslo_concurrency.prlimit --as=1073741824 --cpu=60 -- sudo cinder-rootwrap /etc/cinder/rootwrap.conf env LC_ALL=C qemu-img info --output=json --force-share /opt/stack/data/cinder/mnt/0f31b785b44925eea1ad3ce3b8eff927/volume-00d0f8b9-98e4-4d00-b697-24097efd1a9f" returned: 0 in 0.560s {{(pid=369491) execute /usr/local/lib/python3.8/dist-packages/oslo_concurrency/processutils.py:422}}
Jan 13 02:47:54 e2e-os-pstorenfs105 cinder-volume[369491]: DEBUG cinder.volume.drivers.nfs [req-387efd2d-a695-4f9f-850e-aa55cb0c9890 req-771ac965-6761-482d-bd2c-e2ec2c2b1a39 admin None] NfsDriver: conn_info: {'driver_volume_type': 'nfs', 'data': {'export': '172.16.20.10:/openstack-nfs1', 'name': 'volume-00d0f8b9-98e4-4d00-b697-24097efd1a9f', 'options': None, 'format': 'raw'}, 'mount_point_base': '/opt/stack/data/cinder/mnt'} {{(pid=369491) initialize_connection /opt/stack/new/cinder/cinder/volume/drivers/nfs.py:164}}
Jan 13 02:47:54 e2e-os-pstorenfs105 cinder-volume[369491]: DEBUG cinder.volume.manager [req-387efd2d-a695-4f9f-850e-aa55cb0c989...
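
The format that cinder hands to nova is visible in the "NfsDriver: conn_info" debug line above. A minimal sketch for pulling it out of a log line follows; the regular expression and the function name conn_info_format are assumptions based only on the line layout shown in this report, so they may need adjusting for other releases.

```python
import ast
import re

# Captures the Python-literal dict printed between "conn_info: " and the
# trailing "{{(pid=...) ...}}" suffix of the oslo.log debug line.
CONN_INFO_RE = re.compile(r"NfsDriver: conn_info: (\{.*\}) \{\{")


def conn_info_format(log_line):
    """Return the 'format' value from an NfsDriver conn_info log line.

    Returns None when the line does not contain a conn_info dict.
    """
    match = CONN_INFO_RE.search(log_line)
    if match is None:
        return None
    # The dict is logged with Python repr syntax (single quotes, None),
    # so literal_eval is used instead of a JSON parser.
    conn_info = ast.literal_eval(match.group(1))
    return conn_info.get("data", {}).get("format")
```

For the conn_info line logged at 02:47:54 above, this yields 'raw'.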


Jean Pierre Roquesalane (jproque15130) wrote :

I think it's even more broken than I thought.

Actually, when I create a volume from an image with or without image volume cache enabled, the file format ends up as raw.

After the volume below has been created on the nfs backend:
stack@e2e-os-pstorenfs105:~$ qemu-img info volume-ba7960d9-d169-4b27-90a1-9b7aa1dc3452
image: volume-ba7960d9-d169-4b27-90a1-9b7aa1dc3452
file format: raw
virtual size: 1 GiB (1073741824 bytes)
disk size: 16.4 MiB

as we can see in the log from cinder:
DEBUG oslo_concurrency.processutils [None req-a0ba2364-c743-405c-a698-3ca6b6248e54 admin None] CMD "sudo cinder-rootwrap /etc/cinder/rootwrap.conf qemu-img convert -O raw -f qcow2 /opt/stack/data/cinder/conversion/image_fetch_a087ca5c-af22-4e80-a126-649ac96212bd_iqnhlpm6e2e-os-pstorenfs105@powerstorenfs1 /opt/stack/data/cinder/mnt/0f31b785b44925eea1ad3ce3b8eff927/volume-ba7960d9-d169-4b27-90a1-9b7aa1dc3452" returned: 0 in 0.530s {{(pid=1840449) execute /usr/local/lib/python3.8/dist-packages/oslo_concurrency/processutils.py:422}}
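
The same check can be done programmatically with the JSON output of qemu-img info, which is the form cinder itself requests in the logs above. This is a hedged sketch: the helper names are hypothetical, and SAMPLE_OUTPUT is an illustrative document built from the values printed above, not captured output.

```python
import json
import subprocess


def qemu_img_info(path):
    """Run 'qemu-img info --output=json' on path and return the parsed dict."""
    result = subprocess.run(
        ["qemu-img", "info", "--output=json", "--force-share", path],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def summarize(info):
    """Return (format, virtual size in bytes) from a parsed info dict."""
    return info["format"], info["virtual-size"]


# Illustrative sample only, reconstructed from the human-readable output above.
SAMPLE_OUTPUT = """
{
    "filename": "volume-ba7960d9-d169-4b27-90a1-9b7aa1dc3452",
    "format": "raw",
    "virtual-size": 1073741824
}
"""

if __name__ == "__main__":
    print(summarize(json.loads(SAMPLE_OUTPUT)))
```

On a host with qemu-img installed, summarize(qemu_img_info(path)) gives the same pair for a real volume file.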

summary: - Server resize failed when image volume cache enabled
+ [NFS] Server resize failed when image volume cache enabled
Jean Pierre Roquesalane (jproque15130) wrote :

When a volume is created from an image, the volume admin metadata still records it as a qcow2 volume, even though cinder converted the data from qcow2 to raw:

mysql> select * from volume_admin_metadata where volume_id='cb45a18f-3577-4908-9a38-1eba2e5fa954';
+---------------------+------------+------------+---------+----+--------------------------------------+--------+-------+
| created_at | updated_at | deleted_at | deleted | id | volume_id | key | value |
+---------------------+------------+------------+---------+----+--------------------------------------+--------+-------+
| 2023-06-27 14:18:59 | NULL | NULL | 0 | 12 | cb45a18f-3577-4908-9a38-1eba2e5fa954 | format | qcow2 |
+---------------------+------------+------------+---------+----+--------------------------------------+--------+-------+
1 row in set (0.01 sec)

When launching an instance from this volume, and as a result of the above, the attachment and connection info for this volume reflect a qcow2 volume.

stack@ubuntu2204-devstack:~/cinder/cinder/volume/drivers$ openstack volume attachment --os-volume-api-version 3.27 show cb7d407e-8f05-427d-81c2-3fe700843c4f
+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ID | cb7d407e-8f05-427d-81c2-3fe700843c4f |
| Volume ID | cb45a18f-3577-4908-9a38-1eba2e5fa954 |
| Instance ID | 76dfd939-85d4-491c-803e-1566e226552d |
| Status | attached ...
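
The inconsistency described here, a format recorded in the admin metadata that differs from the format of the file on the backend, can be expressed as a small check. The sketch below is illustrative only: the FormatCheck class is hypothetical, and its two inputs are assumed to come from the volume_admin_metadata row and from qemu-img info respectively.

```python
from dataclasses import dataclass


@dataclass
class FormatCheck:
    """Pairs the recorded format of a volume with the format found on disk."""
    volume_id: str
    recorded: str  # 'format' value in volume_admin_metadata
    on_disk: str   # 'file format' reported by qemu-img info

    @property
    def consistent(self):
        return self.recorded == self.on_disk

    def describe(self):
        if self.consistent:
            return f"{self.volume_id}: format is consistently {self.on_disk}"
        return (f"{self.volume_id}: metadata says {self.recorded} "
                f"but the file is {self.on_disk}")


if __name__ == "__main__":
    check = FormatCheck("cb45a18f-3577-4908-9a38-1eba2e5fa954", "qcow2", "raw")
    print(check.describe())
```

A volume in the state reported here would fail this check, which matches the qcow2 driver being selected for a raw file in the nova error.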

summary: - [NFS] Server resize failed when image volume cache enabled
+ [NFS] Server resize failed when using volume from image
Jean Pierre Roquesalane (jproque15130) wrote :

In a more readable format:
https://paste.opendev.org/show/820474/

Changed in cinder:
status: New → In Progress
Saravanan Manickam (msaravan) wrote :

I tried to reproduce the issue and test the changes on NetApp volumes (as requested by Rajat), but I didn't hit the issue described in this bug. My volume is a normal one, not a LUKS volume. I pulled the devstack code 2 days ago (30-Sept) and ran the reproduction steps twice before applying the changes. The volumes created from an image are already in raw format, and the resize operation always goes through. Please share the steps if I missed anything.

The snippets and tables are captured here (more readable):

https://paste.opendev.org/show/bZ2NcRBRShf1rkWMsCOT/

Jean Pierre Roquesalane (jproque15130) wrote :

I tried to reproduce the issue with the current code and was unable to do so.

All resize operations with devstack stable/master run fine, so I'll abandon this patch. Thanks for the update.

Changed in nova:
status: New → Invalid
Changed in cinder:
status: In Progress → Invalid
OpenStack Infra (hudson-openstack) wrote : Change abandoned on cinder (master)

Change abandoned by "Jean Pierre Roquesalane <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/cinder/+/887081
Reason: Original issue is no longer reproducible.
