Unable to create a volume backup within 1 min after restarting the cinder backup service

Bug #2059416 reported by Anton Kurbatov
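+---------+-------------+------------+-------------+-----------+
| Affects | Status      | Importance | Assigned to | Milestone |
+---------+-------------+------------+-------------+-----------+
| Cinder  | In Progress | Undecided  | Unassigned  |           |
+---------+-------------+------------+-------------+-----------+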

Bug Description

I've encountered odd behavior after restarting the cinder-backup process.
Attempts to create a volume backup shortly after the restart end in the error status, even though the cinder-backup service itself reports no errors.

1) Restart the cinder-backup process (around 11:19:20).
2) Wait for 20-30 seconds, then attempt to create a volume backup (around 11:19:40).

[root@ak-devstack0 ~]# openstack volume backup create vol-test --name bcp1
[root@ak-devstack0 ~]# openstack volume backup show fe8b4eea-bd86-4b16-aabb-f0be400f3bd0 -c status -c created_at -c updated_at
+------------+----------------------------+
| Field      | Value                      |
+------------+----------------------------+
| created_at | 2024-03-28T11:19:43.000000 |
| status     | error                      |
| updated_at | 2024-03-28T11:19:43.000000 |
+------------+----------------------------+
[root@ak-devstack0 ~]#

If I wait a little more than 1 minute, subsequent backups are created successfully.

In the cinder-scheduler logs, I found the following lines:

Mar 28 11:19:26 ak-devstack0 cinder-scheduler[134179]: DEBUG cinder.scheduler.host_manager [None req-98a5047b-45e1-47db-a2a6-6681ef3c6a80 None None] Received backup service update from ak-devstack0: {'backend_state': False, 'driver_name': 'cinder.backup.drivers.nfs.NFSBackupDriver', 'availability_zone': 'nova'} {{(pid=134179) update_service_capabilities /opt/stack/cinder/cinder/scheduler/host_manager.py:597}}
Mar 28 11:19:43 ak-devstack0 cinder-scheduler[134179]: ERROR cinder.scheduler.manager [None req-960bd56a-974f-4d28-a5e1-458e9257ec46 demo None] Service not found for creating backup.: cinder.exception.ServiceNotFound: Service cinder-backup could not be found.
Mar 28 11:20:43 ak-devstack0 cinder-scheduler[134179]: DEBUG cinder.scheduler.host_manager [None req-fc2aa722-3509-4294-acaa-c22a84312590 None None] Received backup service update from ak-devstack0: {'backend_state': True, 'driver_name': 'cinder.backup.drivers.nfs.NFSBackupDriver', 'availability_zone': 'nova'} {{(pid=134179) update_service_capabilities /opt/stack/cinder/cinder/scheduler/host_manager.py:597}}
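
The three log lines suggest this sequence: the scheduler records 'backend_state': False at 11:19:26, rejects the backup request at 11:19:43, and only sees 'backend_state': True at 11:20:43. Below is a minimal sketch of that timeline. It assumes the scheduler only considers backup services whose last reported backend_state is true. pick_backup_host, state_at, try_backup and the local ServiceNotFound class are hypothetical stand-ins for illustration, not Cinder's actual code.

```python
from datetime import datetime


class ServiceNotFound(Exception):
    """Hypothetical stand-in for cinder.exception.ServiceNotFound."""


def pick_backup_host(states):
    """Return the first host whose last reported backend_state is true."""
    for host, capabilities in states.items():
        if capabilities["backend_state"]:
            return host
    raise ServiceNotFound("Service cinder-backup could not be found.")


# Capability reports taken from the scheduler log above (time, payload).
updates = [
    ("11:19:26", {"backend_state": False}),
    ("11:20:43", {"backend_state": True}),
]


def state_at(time_str):
    """Replay every report received up to time_str (HH:MM:SS)."""
    states = {}
    for stamp, capabilities in updates:
        if stamp <= time_str:
            states["ak-devstack0"] = capabilities
    return states


def try_backup(time_str):
    """Return the chosen host, or None if no usable service is known."""
    try:
        return pick_backup_host(state_at(time_str))
    except ServiceNotFound:
        return None


# Length of the window between the two capability reports.
fmt = "%H:%M:%S"
window = datetime.strptime("11:20:43", fmt) - datetime.strptime("11:19:26", fmt)
window_seconds = int(window.total_seconds())
```

Under these assumptions, try_backup("11:19:43") finds no usable service, while a request after 11:20:43 succeeds. The gap between the two reports is 77 seconds, consistent with backups working again after a little more than 1 minute.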

I've reviewed the code [1] and noticed that the cinder-backup logs contain neither "Backup driver was successfully initialized" nor "Failed to initialize driver."
Additionally, the init_loop.start method catches the LoopingCallDone exception itself, so the 'except loopingcall.LoopingCallDone:' branch in setup_backup_backend is never entered.

[1] https://opendev.org/openstack/cinder/src/commit/54856da91045299537fdb69edf43fb61aba79cc6/cinder/backup/manager.py#L166
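
To illustrate the point about LoopingCallDone, here is a minimal sketch. It uses a simplified, synchronous stand-in for a looping call, not the real loopingcall module. The FakeLoopingCall class, setup_driver and setup_backend are hypothetical names. The only assumption carried over from the description above is that the loop consumes the LoopingCallDone sentinel itself.

```python
class LoopingCallDone(Exception):
    """Stand-in for the sentinel a looping-call task raises to stop the loop."""

    def __init__(self, retvalue=True):
        super().__init__()
        self.retvalue = retvalue


class FakeLoopingCall:
    """Simplified, synchronous stand-in for a fixed-interval looping call."""

    def __init__(self, func):
        self.func = func

    def start(self):
        while True:
            try:
                self.func()
            except LoopingCallDone as done:
                # The loop swallows the sentinel; the caller never sees it.
                return done.retvalue


def setup_driver():
    # Pretend the driver check passed, then ask the loop to stop.
    raise LoopingCallDone()


def setup_backend(log):
    try:
        FakeLoopingCall(setup_driver).start()
    except LoopingCallDone:
        # Unreachable: start() has already consumed the exception.
        log.append("Backup driver was successfully initialized.")
    except Exception:
        log.append("Failed to initialize driver.")
    return log


messages = setup_backend([])
```

In this sketch messages stays empty: neither message is recorded, which matches the absence of both lines in the logs.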

OpenStack Infra (hudson-openstack) wrote: Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/cinder/+/914641

Changed in cinder:
status: New → In Progress