[feature request] Add support of Cinder iSCSI/FC storage backend

Bug #1905042 reported by Vladimir Grevtsev
This bug affects 3 people
Affects: OpenStack Glance Charm
Status: Fix Released
Importance: Wishlist
Assigned to: Hemanth Nakkina

Bug Description

Hi there,

Currently, our Glance charm can use only Ceph as a storage backend, whereas upstream Glance supports several other backends:

https://docs.openstack.org/glance/latest/configuration/configuring.html#configuring-glance-storage-backends

"A comma separated list of enabled glance stores. Some available options for this option are (filesystem, http, rbd, swift, cinder, vmware)"

The expectation here is to be able to remove Ceph from the deployment entirely (we have a customer with an FC-backed external storage array and almost no local drives, so we can't deploy Ceph).

At the moment, the two charms already have a relation:

relations:
  - - 'cinder:image-service'
    - 'glance:image-service'
  - - 'cinder:cinder-volume-service'
    - 'glance:cinder-volume-service'

But it does nothing except write "default_store=cinder" to glance.conf, so it is not functional.

Changed in charm-glance:
status: New → Triaged
importance: Undecided → Wishlist
Nobuto Murata (nobuto) wrote :

The work required to use Cinder (iSCSI based) as a backend of Glance:

1. enable cinder backend per relation[1]
2. generate a unique initiator name
   printf "InitiatorName=%s\n" "$(iscsi-iname -p iqn.1993-08.org.debian:01)" | sudo tee /etc/iscsi/initiatorname.iscsi
3. restart open-iscsi service

Please note that Glance will attach and mount a Cinder volume in the Glance unit, so to use iSCSI the unit has to be a bare metal or KVM machine rather than an LXD container.

[1] For example:
# git diff --no-index /etc/glance/glance-api.conf{.orig,}
diff --git a/etc/glance/glance-api.conf.orig b/etc/glance/glance-api.conf
index e85b4c1..1682794 100644
--- a/etc/glance/glance-api.conf.orig
+++ b/etc/glance/glance-api.conf
@@ -21,12 +21,12 @@ db_enforce_mysql_charset = False

 image_size_cap = 1099511627776

-enabled_backends = local:file
+enabled_backends = cinder-lvm:cinder

 [glance_store]

-default_backend = local
+default_backend = cinder-lvm

 filesystem_store_datadir = /var/lib/glance/images/
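
The two-line change in the diff above can be sketched in Python. This is only an illustration, not the charm's template code: the function name `switch_to_cinder_backend` is hypothetical, and the standard-library `configparser` is assumed to be able to parse the file (it drops comments and rejects duplicate keys, which a real glance-api.conf may contain).

```python
import configparser
import io


def switch_to_cinder_backend(conf_text: str, store_name: str = "cinder-lvm") -> str:
    """Return conf_text with a Cinder store enabled and set as the default.

    Hypothetical helper: it mirrors the diff above and nothing more.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)

    # enabled_backends sits in [DEFAULT]; the value is "<store name>:<store type>".
    parser["DEFAULT"]["enabled_backends"] = f"{store_name}:cinder"

    # default_backend sits in [glance_store] and refers to the store name.
    if not parser.has_section("glance_store"):
        parser.add_section("glance_store")
    parser["glance_store"]["default_backend"] = store_name

    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

Passing the "before" side of the diff through this function yields the "after" values for `enabled_backends` and `default_backend` and leaves the other options untouched.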

Nobuto Murata (nobuto)
tags: added: good-first-bug
tags: added: sts
Changed in charm-glance:
assignee: nobody → Hemanth Nakkina (hemanth-n)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-glance (master)
Changed in charm-glance:
status: Triaged → In Progress
Narinder Gupta (narindergupta) wrote : Re: [feature request] Add support of Cinder storage backend

It looks like the patch is not complete: when we create a volume or upload an image, we get the following errors. https://pastebin.canonical.com/p/Z4VMXRKBt4/

I was testing the patch on my orangebox and hit the following issue while uploading an image file using glance. https://pastebin.canonical.com/p/ZCBmyY8G5Z/

Narinder Gupta (narindergupta) wrote :

Here is the rendered glance-api.conf file. https://pastebin.canonical.com/p/3HN49rHPQC/

Hemanth Nakkina (hemanth-n) wrote :

@narindergupta

Thanks for verifying the patch and reporting the problems.

According to https://pastebin.canonical.com/p/Z4VMXRKBt4/, the glance service is not running (or the haproxy in front of the glance service is not):

https://172.27.100.122:9292 "GET / HTTP/1.1" 503 381
RESP: [503] Connection: close Content-Length: 381 Content-Type: text/html; charset=iso-8859-1 Date: Fri, 12 Nov 2021 18:00:10 GMT Server: Apache/2.4.41 (Ubuntu)
RESP BODY: Omitted, Content-Type is set to text/html; charset=iso-8859-1. Only application/json responses have their bodies logged.
Request returned failure status: 503

* Can you check the logs in /var/log/haproxy, /var/log/apache2/ and /var/log/glance for any errors and paste them?
* Which OpenStack release is used for the testing? Can you upload the bundle?

Narinder Gupta (narindergupta) wrote :

This is the bundle I am running, with a local glance charm that includes your patch. https://paste.ubuntu.com/p/T6gndT66vM/

This is the error I am getting in syslog:

Nov 15 14:24:02 juju-a994a5-2-lxd-1 glance-api[1917872]: ERROR: 'NoneType' object has no attribute 'user_id'
Nov 15 14:24:03 juju-a994a5-2-lxd-1 systemd[1]: glance-api.service: Main process exited, code=exited, status=99/n/a
Nov 15 14:24:03 juju-a994a5-2-lxd-1 systemd[1]: glance-api.service: Failed with result 'exit-code'.
Nov 15 14:24:03 juju-a994a5-2-lxd-1 systemd[1]: glance-api.service: Scheduled restart job, restart counter is at 94183.
Nov 15 14:24:03 juju-a994a5-2-lxd-1 systemd[1]: Stopped OpenStack Image Service API.
Nov 15 14:24:03 juju-a994a5-2-lxd-1 systemd[1]: Started OpenStack Image Service API.
Nov 15 14:24:05 juju-a994a5-2-lxd-1 glance-api[1917894]: ERROR: 'NoneType' object has no attribute 'user_id'
Nov 15 14:24:05 juju-a994a5-2-lxd-1 systemd[1]: glance-api.service: Main process exited, code=exited, status=99/n/a
Nov 15 14:24:05 juju-a994a5-2-lxd-1 systemd[1]: glance-api.service: Failed with result 'exit-code'.
Nov 15 14:24:05 juju-a994a5-2-lxd-1 systemd[1]: glance-api.service: Scheduled restart job, restart counter is at 94184.
Nov 15 14:24:05 juju-a994a5-2-lxd-1 systemd[1]: Stopped OpenStack Image Service API.
Nov 15 14:24:05 juju-a994a5-2-lxd-1 systemd[1]: Started OpenStack Image Service API.
Nov 15 14:24:07 juju-a994a5-2-lxd-1 glance-api[1917916]: ERROR: 'NoneType' object has no attribute 'user_id'
Nov 15 14:24:08 juju-a994a5-2-lxd-1 systemd[1]: glance-api.service: Main process exited, code=exited, status=99/n/a
Nov 15 14:24:08 juju-a994a5-2-lxd-1 systemd[1]: glance-api.service: Failed with result 'exit-code'.
Nov 15 14:24:08 juju-a994a5-2-lxd-1 systemd[1]: glance-api.service: Scheduled restart job, restart counter is at 94185.
Nov 15 14:24:08 juju-a994a5-2-lxd-1 systemd[1]: Stopped OpenStack Image Service API.
Nov 15 14:24:08 juju-a994a5-2-lxd-1 systemd[1]: Started OpenStack Image Service API.

Narinder Gupta (narindergupta) wrote :

error.log:[Mon Nov 15 14:22:04.923955 2021] [proxy:error] [pid 1300620:tid 139974485681920] (111)Connection refused: AH00957: HTTP: attempt to connect to 127.0.0.1:9272 (localhost) failed
error.log:[Mon Nov 15 14:22:04.923995 2021] [proxy_http:error] [pid 1300620:tid 139974485681920] [client 172.27.100.122:50350] AH01114: HTTP: failed to make connection to backend: localhost
error.log:[Mon Nov 15 14:22:04.937603 2021] [proxy:error] [pid 1300619:tid 139974569608960] (111)Connection refused: AH00957: HTTP: attempt to connect to 127.0.0.1:9272 (localhost) failed
error.log:[Mon Nov 15 14:22:04.937646 2021] [proxy_http:error] [pid 1300619:tid 139974569608960] [client 172.27.100.122:50354] AH01114: HTTP: failed to make connection to backend: localhost

Hemanth Nakkina (hemanth-n) wrote :

@narindergupta

The Glance unit should be running on bare metal or KVM instead of LXD to use iSCSI (see comment #1).
Could you please update the bundle accordingly and let us know how it works?

I will add a status warning message to the glance unit if it's running on LXD and connected to cinder-volume-service.
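
A rough sketch of such a check, in Python. The function name, the message text and the way the virtualization type is obtained are all hypothetical; it is assumed here that the caller passes in the string printed by `systemd-detect-virt --container`, which is expected to be "lxc" inside an LXD container.

```python
from typing import Optional

# Assumption: `systemd-detect-virt --container` prints "lxc" in an LXD container.
LXD_VIRT_TYPE = "lxc"


def iscsi_warning(virt_type: str, cinder_related: bool) -> Optional[str]:
    """Return a status warning, or None when no warning is needed.

    virt_type: container type reported for the unit (hypothetical input).
    cinder_related: True when the cinder-volume-service relation is joined.
    """
    if cinder_related and virt_type == LXD_VIRT_TYPE:
        return ("glance is related to cinder-volume-service but runs in an "
                "LXD container; iSCSI needs bare metal or KVM")
    return None
```

The check is kept as a pure function so that it can be exercised without a running unit; how the charm actually surfaces the message is not covered here.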

Narinder Gupta (narindergupta) wrote :

I can try running glance on KVM or bare metal, but just to mention: I am not using iSCSI here. Ceph is the default backend for cinder, so I am trying to use the __DEFAULT__ volume type, which is backed by Ceph.

Nobuto Murata (nobuto) wrote :

> I can try running glance on KVM or bare metal, but just to mention: I am not using iSCSI here. Ceph is the default backend for cinder, so I am trying to use the __DEFAULT__ volume type, which is backed by Ceph.

Ceph(rbd) is something different from iSCSI from a Glance-Cinder integration point of view. Long story short, it's not going to work with the current config the charm writes nor with the patch proposed. Ceph(rbd) should be used with a direct relation from Glance to ceph-mon. The data copy can leverage copy-on-write from the Glance pool to Cinder pool as long as the image format is raw.

What's your motivation to use Ceph(rbd) through Cinder over Glance-rbd backend?

summary: - [feature request] Add support of Cinder storage backend
+ [feature request] Add support of Cinder iSCSI/FC storage backend
Hemanth Nakkina (hemanth-n) wrote :

@narindergupta

I agree with Nobuto (comment #10) and would like to understand the use case.

I also want to point out a couple of things from the bundle and logs you provided:
1. ceph:rbd is also configured as a backend in glance-api, because glance and ceph-mon are related in the bundle.

charm-glance configures the default backend store based on which relations exist, in the following order of precedence: ceph, swift, s3, cinder, local.
In this case, ceph is configured as the default backend, so image creation should talk to the ceph store to create the volume.

[glance_store]
default_backend = ceph

If you don't want this, remove the following relation:
juju remove-relation ceph-mon:client glance:ceph

2. The glance-api service is restarting continuously with the following error.
   (Releases earlier than Xena should not have this issue.)

Nov 15 14:24:02 juju-a994a5-2-lxd-1 glance-api[1917872]: ERROR: 'NoneType' object has no attribute 'user_id'

Since the glance-api service is not up, image creation fails.
This can be solved either by enhancing the charm-glance code to introduce new config options for the cinder_store user credentials, or by updating glance_store to only warn when it cannot connect to cinder. I raised a bug on glance_store [1] for the latter.

Since you have only one cinder volume type defined, you can work around this issue with:
juju config glance --reset cinder-volume-types

[1] https://bugs.launchpad.net/glance-store/+bug/1951081
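
The backend precedence described in point 1 can be sketched as follows. This is an illustration only, not the charm's code: the names `PRECEDENCE` and `pick_default_backend` are hypothetical, and it is assumed that the local file store is always available as a fallback.

```python
# Order taken from the comment above: ceph, swift, s3, cinder, local.
PRECEDENCE = ("ceph", "swift", "s3", "cinder", "local")


def pick_default_backend(related_stores):
    """Return the highest-precedence store among the related ones.

    related_stores: iterable of store names whose relation exists.
    Assumption: "local" needs no relation and is always available.
    """
    available = set(related_stores) | {"local"}
    for store in PRECEDENCE:
        if store in available:
            return store
```

With both ceph and cinder related, as in the bundle discussed here, the sketch picks ceph, which matches the `default_backend = ceph` seen in the rendered configuration. With only cinder related it picks cinder.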

Narinder Gupta (narindergupta) wrote :

While running this test with other iSCSI devices in pre-production, we found the following issues when trying to create multiple instances at the same time. With up to 3 instances it works fine, but with more than 3 instances we hit the following errors.

Error: Failed to perform requested operation on instance "ubuntu-10", the instance has an error status: Please try again later [Error: Build of instance 2fe8c36b-a0b7-4b71-aa80-126d8c2c1d4b aborted: Volume 7f86ba6d-e60b-43d9-82f0-7ec2bbc6331c did not finish being created even after we waited 188 seconds or 61 attempts. And its status is downloading.].
Solution:
Setting the following values in nova.conf solves the problem:
block_device_allocate_retries = 600
block_device_allocate_retries_interval = 30
image_volume_cache_enabled = true

Error: Build of instance f1e61230-c5cf-41cd-bb8d-95e1fc81a332 aborted: Volume d4a32ec1-f5c4-4a37-a4bc-b103dc486313 did not finish being created even after we waited 30 seconds or 2 attempts. And its status is error.
Solution:
Setting the following values in the respective glance_store section of glance.conf solves the problem:
cinder_http_retries = 30
cinder_state_transition_timeout = 3000

Narinder Gupta (narindergupta) wrote :

While running this test with other iSCSI devices in production, we found the following issues when trying to create multiple instances at the same time. With up to 3 instances it works fine, but with more than 3 instances we hit the following errors.
Error: Failed to perform requested operation on instance "ubuntu-10", the instance has an error status: Please try again later [Error: Build of instance 2fe8c36b-a0b7-4b71-aa80-126d8c2c1d4b aborted: Volume 7f86ba6d-e60b-43d9-82f0-7ec2bbc6331c did not finish being created even after we waited 188 seconds or 61 attempts. And its status is downloading.].
Solution:
Setting the following values in nova.conf solves the problem:
block_device_allocate_retries = 600
block_device_allocate_retries_interval = 30
juju config nova-compute config-flags="block_device_allocate_retries=600,block_device_allocate_retries_interval=30"
Error: Build of instance f1e61230-c5cf-41cd-bb8d-95e1fc81a332 aborted: Volume d4a32ec1-f5c4-4a37-a4bc-b103dc486313 did not finish being created even after we waited 30 seconds or 2 attempts. And its status is error.
Solution:
Setting the following values in the respective glance_store section of glance.conf solves the problem:
cinder_http_retries = 30
cinder_state_transition_timeout = 3000

OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-glance (master)

Reviewed: https://review.opendev.org/c/openstack/charm-glance/+/814882
Committed: https://opendev.org/openstack/charm-glance/commit/c2f877a7d420363be6d9f2718d0ecbcb424ecd84
Submitter: "Zuul (22348)"
Branch: master

commit c2f877a7d420363be6d9f2718d0ecbcb424ecd84
Author: Hemanth Nakkina <email address hidden>
Date: Thu Oct 21 15:11:12 2021 +0530

    Add support for cinder storage backend

    Create new glance_api.conf template from Ussuri release to
    use default_backend and enabled_backends configuration
    parameters instead of deprecated stores, default_store
    parameters.
    Add new config option cinder-volume-types to specify the
    volume types in cinder that can be used to store glance
    images.
    Add logic to update cinder in glance-api configurations
    if cinder-volume-service relation is joined.

    Also add two flags, cinder_http_retries and
    cinder_state_transition_timeout

    Closes-Bug: #1905042
    Change-Id: Ife649defc9b765b433d7973ab31778f9cb1efdd9

Changed in charm-glance:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-glance (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/charm-glance/+/835013

OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-glance (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/charm-glance/+/835013
Committed: https://opendev.org/openstack/charm-glance/commit/51baf6899a834ca39d534e350e47b7e06bbfbcf0
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 51baf6899a834ca39d534e350e47b7e06bbfbcf0
Author: Hemanth Nakkina <email address hidden>
Date: Thu Oct 21 15:11:12 2021 +0530

    Add support for cinder storage backend

    Create new glance_api.conf template from Ussuri release to
    use default_backend and enabled_backends configuration
    parameters instead of deprecated stores, default_store
    parameters.
    Add new config option cinder-volume-types to specify the
    volume types in cinder that can be used to store glance
    images.
    Add logic to update cinder in glance-api configurations
    if cinder-volume-service relation is joined.

    Also add two flags, cinder_http_retries and
    cinder_state_transition_timeout

    Conflicts:
        tests/bundles/focal-wallaby.yaml
        tests/bundles/focal-xena.yaml
        tests/bundles/impish-xena.yaml
        tests/bundles/jammy-yoga.yaml
        tests/tests.yaml

    To resolve the conflicts, updated tests.yaml and focal
    xena/wallaby bundles with proper channels.
    Removed impish-xena, jammy-yoga bundles.

    func-test-pr: https://github.com/openstack-charmers/zaza-openstack-tests/pull/735

    Closes-Bug: #1905042
    Change-Id: Ife649defc9b765b433d7973ab31778f9cb1efdd9
    (cherry picked from commit c2f877a7d420363be6d9f2718d0ecbcb424ecd84)

tags: added: in-stable-xena
Changed in charm-glance:
milestone: none → 22.04
tags: added: seg
removed: sts
Nobuto Murata (nobuto) wrote :

@Hemanth,

> cinder-state-transition-timeout:
> type: int
> default: 30

Do you happen to know why the default value in the charm is shorter than the upstream default of 300?

https://docs.openstack.org/glance/latest/configuration/configuring.html
> cinder_state_transition_timeout
>
> Optional. Default: 300

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to charm-glance (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/charm-glance/+/838810

Hemanth Nakkina (hemanth-n) wrote :

@nobuto

It was a mistake. I have submitted a new patch to change the value to the OpenStack default.

OpenStack Infra (hudson-openstack) wrote : Related fix merged to charm-glance (master)

Reviewed: https://review.opendev.org/c/openstack/charm-glance/+/838810
Committed: https://opendev.org/openstack/charm-glance/commit/df4ef60bec7ec4b457eb8f14a9a659e83af897db
Submitter: "Zuul (22348)"
Branch: master

commit df4ef60bec7ec4b457eb8f14a9a659e83af897db
Author: Hemanth Nakkina <email address hidden>
Date: Thu Apr 21 09:58:34 2022 +0530

    Change cinder-state-transition-timeout default value to 300

    The defaults for cinder-state-transition-timeout in openstack
    glance is 300. Change the charm configuration as well to set
    the same value.

    Related-Bug: #1905042
    Change-Id: I8add26e9fc23dffac75ded29673d50c7fcc48a6f

Changed in charm-glance:
status: Fix Committed → Fix Released