DetachedInstanceError: Parent instance <VolumeAttachment at > is not bound to a Session

Bug #1834845 reported by Patrick Oberdorf on 2019-07-01
This bug affects 3 people
Affects                  Importance  Assigned to
Cinder                   Undecided   Gorka Eguileor
Ubuntu Cloud Archive     (status tracked in Stein)
  Stein                  High        James Page
  Train                  High        James Page
cinder (Ubuntu)          (status tracked in Eoan)
  Disco                  High        James Page
  Eoan                   High        James Page

Bug Description

Hey there.

We upgraded from Rocky to Stein and are facing a strange error. Our cinder-ceph volume service gets marked as down after a few seconds. We looked into this and found an error in the logs:

Error starting thread.: sqlalchemy.orm.exc.DetachedInstanceError: Parent instance <VolumeAttachment at 0x7f04ba5ed3c8> is not bound to a Session; lazy load operation of attribute 'volume' cannot proceed (Background on this error at: http://sqlalche.me/e/bhk3)
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service Traceback (most recent call last):
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/oslo_service/service.py", line 796, in run_service
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service service.start()
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/cinder/service.py", line 222, in start
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service service_id=Service.service_id)
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/cinder/volume/manager.py", line 445, in init_host
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service self._init_host(added_to_cluster, **kwargs)
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/cinder/volume/manager.py", line 523, in _init_host
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service volumes = self._get_my_volumes(ctxt)
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/cinder/volume/manager.py", line 3007, in _get_my_volumes
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service limit, offset)
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/cinder/volume/manager.py", line 3003, in _get_my_resources
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service offset=offset)
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/cinder/objects/volume.py", line 617, in get_all
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service volumes, expected_attrs=expected_attrs)
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/oslo_versionedobjects/base.py", line 1133, in obj_make_list
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service **extra_args)
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/cinder/objects/volume.py", line 290, in _from_db_object
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service db_volume.get('volume_attachment'))
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/oslo_versionedobjects/base.py", line 1133, in obj_make_list
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service **extra_args)
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/cinder/objects/volume_attachment.py", line 102, in _from_db_object
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service if 'volume' in expected_attrs and hasattr(db_attachment, 'volume'):
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/sqlalchemy/orm/attributes.py", line 242, in __get__
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service return self.impl.get(instance_state(instance), dict_)
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/sqlalchemy/orm/attributes.py", line 601, in get
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service value = self.callable_(state, passive)
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/sqlalchemy/orm/strategies.py", line 596, in _load_for_state
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service (orm_util.state_str(state), self.key)
2019-07-01 10:29:22.947 16745 ERROR oslo_service.service sqlalchemy.orm.exc.DetachedInstanceError: Parent instance <VolumeAttachment at 0x7f04ba5ed3c8> is not bound to a Session; lazy load operation of attribute 'volume' cannot proceed (Background on this error at: http://sqlalche.me/e/bhk3)

It looks like the service gets marked down because of this error. This issue is definitely linked to https://bugs.launchpad.net/cinder/+bug/1812913, but the fix released for that bug does not resolve it.
If we put a try/except block around this code, the service stays up, but I am not sure whether something else will break.
From our perspective this is a high-priority bug, because we are not able to use one of our Ceph backends.

ii cinder-api 2:14.0.0-0ubuntu1~cloud0 all Cinder storage service - API server
ii cinder-common 2:14.0.0-0ubuntu1~cloud0 all Cinder storage service - common files
ii cinder-scheduler 2:14.0.0-0ubuntu1~cloud0 all Cinder storage service - Scheduler server
ii cinder-volume 2:14.0.0-0ubuntu1~cloud0 all Cinder storage service - Volume server
ii python3-cinder 2:14.0.0-0ubuntu1~cloud0 all Cinder Python 3 libraries

Sean McGinnis (sean-mcginnis) wrote :

Gorka, could you take a look at this?

Changed in cinder:
assignee: nobody → Gorka Eguileor (gorka)
Patrick Oberdorf (obi12341) wrote :

_Dirty_ workaround for this

The attachment "really_dirty_workaround.patch" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch

Fix proposed to branch: master
Review: https://review.opendev.org/668646

Changed in cinder:
status: New → In Progress
Patrick Oberdorf (obi12341) wrote :

I can confirm the commit is working :)

Changed in cinder:
assignee: Gorka Eguileor (gorka) → Eric Harney (eharney)
Eric Harney (eharney) on 2019-07-02
Changed in cinder:
assignee: Eric Harney (eharney) → Gorka Eguileor (gorka)
Changed in cinder:
assignee: Gorka Eguileor (gorka) → Eric Harney (eharney)
Eric Harney (eharney) on 2019-07-03
Changed in cinder:
assignee: Eric Harney (eharney) → Gorka Eguileor (gorka)
Changed in cinder:
assignee: Gorka Eguileor (gorka) → Eric Harney (eharney)
Eric Harney (eharney) on 2019-07-09
Changed in cinder:
assignee: Eric Harney (eharney) → Gorka Eguileor (gorka)
Matt Riedemann (mriedem) wrote :

Note that bug 1626499 and bug 1640920 seem like older versions of a similar thing but on different parts of the data model.

Reviewed: https://review.opendev.org/668646
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=2e73bede80cc2acdb3527f06bc5c5f9c1a8463a7
Submitter: Zuul
Branch: master

commit 2e73bede80cc2acdb3527f06bc5c5f9c1a8463a7
Author: Gorka Eguileor <email address hidden>
Date: Tue Jul 2 11:06:17 2019 +0200

    Fix DetachedInstanceError for VolumeAttachment

    Patch I253123d5451b32f0e3143916e41aaa1af75561c2 fixed the
    DetachedInstanceError for VolumeAttachment OVOs but only partially, as
    apparently it was dependent on the SQLAlchemy version due to the use of
    "hasattr".

    This patch replaces "hasattr" with a check on the object's dictionary,
    which will never trigger a Lazy Load.

    Closes-Bug: #1834845
    Change-Id: Iac785eef9be4b9cdb5c739ee0a87949805282867
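
The difference between the two checks described in the commit message can be sketched without SQLAlchemy. This is a simulation under stated assumptions, not the actual Cinder patch: `LazyAttribute`, `FakeAttachment`, and the local `DetachedInstanceError` class are hypothetical stand-ins for a lazily loaded relationship, the `VolumeAttachment` model, and `sqlalchemy.orm.exc.DetachedInstanceError`. In Python 3, `hasattr` only swallows `AttributeError`, so any other exception raised while fetching the attribute propagates to the caller.

```python
class DetachedInstanceError(Exception):
    """Stand-in for sqlalchemy.orm.exc.DetachedInstanceError (assumption)."""


class LazyAttribute:
    """Hypothetical descriptor imitating a lazily loaded relationship."""

    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        # A value that was already loaded lives in the instance dictionary.
        if self.name in instance.__dict__:
            return instance.__dict__[self.name]
        # Otherwise a load would be needed, which fails without a session.
        raise DetachedInstanceError(
            "lazy load operation of attribute %r cannot proceed" % self.name)

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value


class FakeAttachment:
    """Hypothetical model with one lazily loaded attribute."""
    volume = LazyAttribute('volume')


detached = FakeAttachment()

# hasattr() calls the descriptor, so the simulated lazy load is attempted.
try:
    hasattr(detached, 'volume')
    hasattr_outcome = 'no error'
except DetachedInstanceError:
    hasattr_outcome = 'DetachedInstanceError'

# Looking in the instance dictionary never calls the descriptor.
dict_outcome = 'volume' in vars(detached)

loaded = FakeAttachment()
loaded.volume = 'already loaded volume'
loaded_outcome = 'volume' in vars(loaded)

print(hasattr_outcome, dict_outcome, loaded_outcome)
```

The dictionary check only reports attributes that are already present on the instance, which is why it cannot trigger a load in this sketch.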

Changed in cinder:
status: In Progress → Fix Released
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in cinder (Ubuntu):
status: New → Confirmed
tags: added: canonical-bootstack
James Page (james-page) on 2019-08-07
Changed in cinder (Ubuntu Eoan):
status: Confirmed → Triaged
Changed in cinder (Ubuntu Disco):
status: New → Triaged
importance: Undecided → High
Changed in cinder (Ubuntu Eoan):
importance: Undecided → High
James Page (james-page) on 2019-08-07
Changed in cinder (Ubuntu Eoan):
status: Triaged → In Progress
assignee: nobody → James Page (james-page)
James Page (james-page) wrote :

Ubuntu SRU information

[Impact]
After upgrading from Ubuntu with OpenStack Rocky to Ubuntu with OpenStack Stein, the cinder-volume service fails to restart, resulting in loss of the ability to create and manage volumes.

[Test Case]
Deploy OpenStack Rocky
Create volume, attach to instance
Upgrade to OpenStack Stein
openstack volume create --size 5 test-volume fails.

[Regression Potential]
The patch for this fix is fairly defensive in approach and fixes an issue with an associated failure fixed in an earlier commit; low regression potential.

James Page (james-page) wrote :

I've uploaded fixes for eoan and disco; disco-proposed currently contains 14.0.1-0ubuntu1, which will need to clear first.

Changed in cinder (Ubuntu Disco):
status: Triaged → In Progress
assignee: nobody → James Page (james-page)
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cinder - 2:15.0.0~b1~git2019080709.0a0d55d8a-0ubuntu1

---------------
cinder (2:15.0.0~b1~git2019080709.0a0d55d8a-0ubuntu1) eoan; urgency=medium

  * New upstream snapshot for OpenStack Train (LP: #1834845).

 -- James Page <email address hidden> Wed, 07 Aug 2019 13:13:56 +0100

Changed in cinder (Ubuntu Eoan):
status: In Progress → Fix Released

Reviewed: https://review.opendev.org/675081
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=44a15be6a71ebdaf0e5d1663fc31011ff43ef37b
Submitter: Zuul
Branch: stable/stein

commit 44a15be6a71ebdaf0e5d1663fc31011ff43ef37b
Author: Gorka Eguileor <email address hidden>
Date: Tue Jul 2 11:06:17 2019 +0200

    Fix DetachedInstanceError for VolumeAttachment

    Patch I253123d5451b32f0e3143916e41aaa1af75561c2 fixed the
    DetachedInstanceError for VolumeAttachment OVOs but only partially, as
    apparently it was dependent on the SQLAlchemy version due to the use of
    "hasattr".

    This patch replaces "hasattr" with a check on the object's dictionary,
    which will never trigger a Lazy Load.

    Closes-Bug: #1834845
    Change-Id: Iac785eef9be4b9cdb5c739ee0a87949805282867
    (cherry picked from commit 2e73bede80cc2acdb3527f06bc5c5f9c1a8463a7)
