Newton > Ocata upgrade. secret_uuid empty

Bug #1687581 reported by Tadas Ustinavičius on 2017-05-02
This bug affects 1 person
Affects                    Importance   Assigned to
OpenStack Compute (nova)   High         Matt Riedemann
Ocata                      High         Matt Riedemann

Bug Description

Hello,
I've run into an issue with Nova:
After upgrading from Newton to Ocata via openstack-ansible, I was unable to start any of the servers.

openstack server list |grep a6200dff-2010-44a2-92ca-3af303ec3577

| a6200dff-2010-44a2-92ca-3af303ec3577 | Win10VDI-3-ephemeral | SHUTOFF | internal=10.0.0.17|

openstack server start a6200dff-2010-44a2-92ca-3af303ec3577
openstack server list |grep a6200dff-2010-44a2-92ca-3af303ec3577

| a6200dff-2010-44a2-92ca-3af303ec3577 | Win10VDI-3-ephemeral | SHUTOFF | internal=10.0.0.17|

The server is still in SHUTOFF state.
nova.log contains the following entries:
http://paste.ubuntu.com/24465172/

After digging further into the problem, I found that the value at https://github.com/openstack/nova/blob/stable/ocata/nova/virt/libvirt/volume/net.py#L65 is None.
This led me to read the _set_auth_config_rbd description, so I reconfigured my cinder nodes and added the `rbd_secret_uuid` line (it was not present in my Newton cinder configuration, but everything seemed to work fine back then).
After the reconfiguration I was able to create and start new machines, but machines created under Newton still failed. After some more debugging I realized that the old machines still had no 'secret_uuid' in their connection_info dictionary:

{u'driver_volume_type': u'rbd', u'connector': {u'wwpns': [u'21000024ff54b106', u'21000024ff54b107'], u'wwnns': [u'20000024ff54b106', u'20000024ff54b107'], u'ip': u'172.31.105.203', u'initiator': u'iqn.1993-08.org.debian:01:b7f25b95aff0', u'platform': u'x86_64', u'host': u'ostack-ibm3', u'do_local_attach': False, u'os_type': u'linux2', u'multipath': False}, u'serial': u'87c5d950-c212-4e9c-9d13-6a583fed9e62', u'data': {u'secret_type': u'ceph', u'name': u'cinder-volumes/volume-87c5d950-c212-4e9c-9d13-6a583fed9e62', u'encrypted': False, u'cluster_name': u'ceph', u'secret_uuid': None, u'qos_specs': None, u'hosts': [u'172.31.104.1', u'172.31.104.2', u'172.31.104.3'], u'volume_id': u'87c5d950-c212-4e9c-9d13-6a583fed9e62', u'auth_enabled': True, u'access_mode': u'rw', u'auth_username': u'ostack', u'ports': [u'6789', u'6789', u'6789']}}

It seems that Ocata requires secret_uuid to be present in the database. Machines from the Newton release did not contain that entry:

Newton:

| {"driver_volume_type": "rbd", "connector": {"wwpns": ["21000024ff54b106", "21000024ff54b107"], "wwnns": ["20000024ff54b106", "20000024ff54b107"], "ip": "172.31.105.203", "initiator": "iqn.1993-08.org.debian:01:b7f25b95aff0", "platform": "x86_64", "host": "ostack-ibm3", "do_local_attach": false, "os_type": "linux2", "multipath": false}, "serial": "87326e73-1e9a-4589-9140-985950e93068", "data": {"secret_type": "ceph", "name": "cinder-volumes/volume-87326e73-1e9a-4589-9140-985950e93068", "encrypted": false, "cluster_name": "ceph", "secret_uuid": null, "qos_specs": null, "hosts": ["172.31.104.1", "172.31.104.2", "172.31.104.3"], "volume_id": "87326e73-1e9a-4589-9140-985950e93068", "auth_enabled": true, "access_mode": "rw", "auth_username": "ostack", "ports": ["6789", "6789", "6789"]}}
Ocata:
| {"driver_volume_type": "rbd", "connector": {"wwpns": ["21000024ff54b106", "21000024ff54b107"], "wwnns": ["20000024ff54b106", "20000024ff54b107"], "ip": "172.31.105.203", "initiator": "iqn.1993-08.org.debian:01:b7f25b95aff0", "platform": "x86_64", "host": "ostack-ibm3", "do_local_attach": false, "os_type": "linux2", "multipath": false}, "serial": "565ba4ab-2f8a-40fd-ae82-9346414ceb15", "data": {"secret_type": "ceph", "name": "cinder-volumes/volume-565ba4ab-2f8a-40fd-ae82-9346414ceb15", "encrypted": false, "cluster_name": "ceph", "secret_uuid": "a11833c5-9403-4423-8a26-111222333444", "qos_specs": null, "hosts": ["172.31.104.1", "172.31.104.2", "172.31.104.3"], "volume_id": "565ba4ab-2f8a-40fd-ae82-9346414ceb15", "auth_enabled": true, "access_mode": "rw", "auth_username": "ostack", "ports": ["6789", "6789", "6789"]}}
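The difference between the two rows above can also be checked programmatically. The following is a minimal sketch, not part of Nova: the function name rbd_secret_missing is hypothetical, and the sample rows are trimmed copies of the connection_info values quoted above. It only flags RBD entries where auth is enabled but secret_uuid is empty.

```python
import json


def rbd_secret_missing(connection_info_json):
    """Return True for an RBD connection_info with auth enabled but no secret_uuid.

    Hypothetical helper for illustration; it is not a Nova API.
    """
    info = json.loads(connection_info_json)
    if info.get("driver_volume_type") != "rbd":
        return False
    data = info.get("data", {})
    return bool(data.get("auth_enabled")) and not data.get("secret_uuid")


# Trimmed versions of the Newton and Ocata rows quoted in this report.
newton_row = json.dumps({
    "driver_volume_type": "rbd",
    "data": {"auth_enabled": True, "auth_username": "ostack", "secret_uuid": None},
})
ocata_row = json.dumps({
    "driver_volume_type": "rbd",
    "data": {
        "auth_enabled": True,
        "auth_username": "ostack",
        "secret_uuid": "a11833c5-9403-4423-8a26-111222333444",
    },
})

print(rbd_secret_missing(newton_row))  # True
print(rbd_secret_missing(ocata_row))   # False
```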

So, is there any way to refresh the database to fix this, or should I update the block_device_mapping table by hand?
I'm not sure whether this is a bug; perhaps a warning about this should be added to the upgrade documentation?
Thank you.
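For anyone weighing the by-hand option, here is a rough sketch of what the change to a single connection_info value would look like. It is an illustration only: with_secret_uuid is a hypothetical helper, it works on the JSON string and does not touch the database, and the UUID passed in is assumed to be the one configured as rbd_secret_uuid. Existing non-empty values are left alone.

```python
import json


def with_secret_uuid(connection_info_json, secret_uuid):
    """Return connection_info JSON with an empty RBD secret_uuid filled in.

    Hypothetical helper for illustration; entries that are not RBD, have
    auth disabled, or already carry a secret_uuid are returned unchanged.
    """
    info = json.loads(connection_info_json)
    data = info.get("data", {})
    if (info.get("driver_volume_type") == "rbd"
            and data.get("auth_enabled")
            and not data.get("secret_uuid")):
        data["secret_uuid"] = secret_uuid
    return json.dumps(info)


# Trimmed version of the Newton row quoted above.
old_row = json.dumps({
    "driver_volume_type": "rbd",
    "data": {"auth_enabled": True, "auth_username": "ostack", "secret_uuid": None},
})
new_row = with_secret_uuid(old_row, "a11833c5-9403-4423-8a26-111222333444")
print(json.loads(new_row)["data"]["secret_uuid"])
```

Any real update of the table would need a database backup first and the affected instances stopped; that part is deliberately left out of the sketch.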

summary: - Newton > Ocata upgrace. secret_uuid empty
+ Newton > Ocata upgrade. secret_uuid empty
Tadas Ustinavičius (tadas-u) wrote :

The following patch should fix this issue:

http://paste.ubuntu.com/24497744/

Sean Dague (sdague) on 2017-06-08
Changed in nova:
importance: Undecided → High
tags: added: ceph
Matt Riedemann (mriedem) wrote :

Likely caused by this change in Ocata: https://review.openstack.org/#/c/389399/

Changed in nova:
status: New → Triaged

Fix proposed to branch: master
Review: https://review.openstack.org/472266

Changed in nova:
assignee: nobody → Matt Riedemann (mriedem)
status: Triaged → In Progress
Sean Dague (sdague) on 2017-06-09
Changed in nova:
milestone: none → pike-3

Reviewed: https://review.openstack.org/472266
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=f2d27f6a8afb62815fb6a885bd4f8ae4ed287fd3
Submitter: Jenkins
Branch: master

commit f2d27f6a8afb62815fb6a885bd4f8ae4ed287fd3
Author: Matt Riedemann <email address hidden>
Date: Thu Jun 8 09:35:42 2017 -0400

    libvirt: handle missing rbd_secret_uuid from old connection info

    Change Idcbada705c1d38ac5fd7c600141c2de7020eae25 in Ocata
    started preferring Cinder connection info for getting RBD auth
    values since Nova needs to be using the same settings as Cinder
    for volume auth.

    However, that introduced a problem for guest connections made
    before that change, where the secret_uuid might not have been
    configured on the Cinder side and that's what is stored in the
    block_device_mappings.connection_info column and is what we're
    checking in _set_auth_config_rbd. Before Ocata this wasn't a
    problem because we'd use the Nova configuration values for the
    rbd_secret_uuid if set. But since Ocata it is a problem since
    we don't consult nova.conf if auth was enabled, but not completely
    configured, on the Cinder side.

    So this adds a fallback check to set the secret_uuid from
    nova.conf if it wasn't set in the connection_info via Cinder
    originally. A note is also added to caution about removing
    any fallback mechanism on the nova side - something we'd
    need to consider before we could likely drop this code.

    Co-Authored-By: Tadas Ustinavičius <email address hidden>

    Change-Id: I6fc7108817fcd9df4a342c9dabbf14ab7911d06a
    Closes-Bug: #1687581
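The fallback described in the commit message can be pictured roughly as follows. This is a simplified sketch under stated assumptions, not the code from the commit: secret_uuid_with_fallback and its parameters are hypothetical names, the dictionary stands in for the "data" part of connection_info, and nova_conf_secret_uuid stands in for the rbd_secret_uuid value from nova.conf. Only the auth-enabled path mentioned in the commit message is modelled.

```python
def secret_uuid_with_fallback(data, nova_conf_secret_uuid=None):
    """Prefer the secret_uuid from Cinder's connection info, else nova.conf.

    Hypothetical illustration of the fallback; returns None when auth is
    not enabled in the connection info, since that path is not modelled.
    """
    if not data.get("auth_enabled"):
        return None
    secret_uuid = data.get("secret_uuid")
    if secret_uuid is None:
        # Old connection info (created before Cinder had the value set):
        # fall back to the value configured on the Nova side.
        secret_uuid = nova_conf_secret_uuid
    return secret_uuid


old_data = {"auth_enabled": True, "auth_username": "ostack", "secret_uuid": None}
print(secret_uuid_with_fallback(old_data, "a11833c5-9403-4423-8a26-111222333444"))
```

When Cinder already supplies a secret_uuid, that value is returned and the nova.conf value is ignored, which matches the preference for Cinder's settings described above.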

Changed in nova:
status: In Progress → Fix Released

Reviewed: https://review.openstack.org/472687
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=fb4184f1e690901378a155573368a55ff9a8a779
Submitter: Jenkins
Branch: stable/ocata

commit fb4184f1e690901378a155573368a55ff9a8a779
Author: Matt Riedemann <email address hidden>
Date: Thu Jun 8 09:35:42 2017 -0400

    libvirt: handle missing rbd_secret_uuid from old connection info

    Change Idcbada705c1d38ac5fd7c600141c2de7020eae25 in Ocata
    started preferring Cinder connection info for getting RBD auth
    values since Nova needs to be using the same settings as Cinder
    for volume auth.

    However, that introduced a problem for guest connections made
    before that change, where the secret_uuid might not have been
    configured on the Cinder side and that's what is stored in the
    block_device_mappings.connection_info column and is what we're
    checking in _set_auth_config_rbd. Before Ocata this wasn't a
    problem because we'd use the Nova configuration values for the
    rbd_secret_uuid if set. But since Ocata it is a problem since
    we don't consult nova.conf if auth was enabled, but not completely
    configured, on the Cinder side.

    So this adds a fallback check to set the secret_uuid from
    nova.conf if it wasn't set in the connection_info via Cinder
    originally. A note is also added to caution about removing
    any fallback mechanism on the nova side - something we'd
    need to consider before we could likely drop this code.

    Co-Authored-By: Tadas Ustinavičius <email address hidden>

    NOTE(mriedem): The unit test is modified slightly to not
    pass an instance to the disconnect_volume method as that
    was only available starting in Pike: b66b7d4f9d

    Change-Id: I6fc7108817fcd9df4a342c9dabbf14ab7911d06a
    Closes-Bug: #1687581
    (cherry picked from commit f2d27f6a8afb62815fb6a885bd4f8ae4ed287fd3)

This issue was fixed in the openstack/nova 15.0.6 release.

This issue was fixed in the openstack/nova 16.0.0.0b3 development milestone.
