Instance creation randomly fails on rbd image import and then instance is assigned a second IP

Bug #1851310 reported by Yael Perez
This bug affects 1 person
Affects: Fuel for OpenStack
Status: Fix Released
Importance: High
Assigned to: Oleksiy Molchanov

Bug Description

Steps to reproduce: Create a batch of instances

Expected Result: All instances are created without an issue

Actual Result: When the customer (Adobe) creates a batch of instances, an instance occasionally ends up with two floating IPs. In the most recent attempt, 2 of 60 instances had a second IP. The logs show that when this happens the rbd image import fails with a 'File exists' error. The sequence appears to be: the instance is first scheduled to one compute node, its networking is created, the rbd image import fails, and the instance is then rescheduled to another compute node, where the rbd import succeeds and a second IP is assigned. The customer states that the same image was used for every instance, and that when an instance is targeted at a specific compute node (one where a failure occurred) it spins up without a problem. The customer is running Ceph 12.2.11 and confirmed that this is the case across the environment.

Error:
18:27:18.999289 7f0abffff700 -1 librbd::image::CreateRequest: 0x55e8900ab3e0 handle_create_image: error writing header: (17) File exists\nrbd: image creation failed\n\rImporting image: 0% complete...failed.\nrbd: import failed: (17) File exists\n'
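
For triage, the error code in the message above can be matched programmatically. The following is a minimal sketch, not part of nova or Ceph tooling: the function names and the regular expression are hypothetical and assume the rbd output keeps the "rbd: import failed: (N) message" shape shown in this report. The code 17 corresponds to EEXIST on Linux.

```python
import errno
import re

# Hypothetical pattern, assuming the "rbd: import failed: (N) ..." shape
# seen in the log excerpt of this report.
RBD_IMPORT_FAILED_RE = re.compile(r"rbd: import failed: \((\d+)\)")


def rbd_import_errno(output):
    """Return the numeric error code of a failed rbd import, or None."""
    match = RBD_IMPORT_FAILED_RE.search(output)
    return int(match.group(1)) if match else None


def is_file_exists_failure(output):
    """True if the rbd import failed because the image already exists."""
    return rbd_import_errno(output) == errno.EEXIST


sample = "Importing image: 0% complete...failed.\nrbd: import failed: (17) File exists\n"
print(is_file_exists_failure(sample))
```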

Workaround: Recreate the failed instance

Environment: MOS 9.2, OS Mitaka, Neutron+OVS

Changed in fuel:
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Oleksiy Molchanov (omolchanov)
milestone: none → 9.x-updates
Changed in fuel:
milestone: 9.x-updates → 9.2-mu-15
Oleksiy Molchanov (omolchanov) wrote :
Changed in fuel:
status: Confirmed → In Progress
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to openstack/nova (9.0/mitaka)

Reviewed: https://review.fuel-infra.org/41521
Submitter: Pkgs Jenkins <email address hidden>
Branch: 9.0/mitaka

Commit: 6b6a602a0dbfa4ab33b6cfe2a4a7e481182a87a7
Author: Brent Tang <email address hidden>
Date: Mon Nov 11 15:54:24 2019

Instance obj_clone leaves metadata as changed

When performing an obj_clone on an Instance object, it relies on the
base obj_clone's deep copy method which just goes through all of the
fields and duplicates them on the clone. However, the Instance object
has internal tracking attributes for metadata and system_metadata that
keep track of which keys have changed. Since these are not copied
by the deep copy, the metadata shows up as changed on the cloned
Instance (if it had metadata set originally). A save on the cloned
object then updates the metadata in the db, and if the clone holds
stale information, that save wipes out other changes.

One such scenario occurs during build_instance, where the
Claim constructor saves off a clone of the Instance object
so that it has the current copy. If build_instance fails,
the claim uses this copy to set the instance's host to None.
Since the clone reports system_metadata as changed, the save
also tries to update system_metadata in the db, and because
the clone's copy is stale, it wipes out changes made since.

In the case of the issue being fixed here, the value wiped
out of system_metadata is the network_allocated=True
attribute. With it wiped out, reschedules of that
build_instance fail in cases where the IP address has already
been assigned to the instance.

Change-Id: Ie49a90993fb9643c04d75402dbaaa933c6b5222e
Closes-Bug: #1851310
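
The mechanism described in the commit message can be illustrated with a toy model. This is a sketch under stated assumptions, not nova code: ToyInstance, naive_clone, fixed_clone, simulate, and the dict standing in for the database are all hypothetical, and fixed_clone shows only one plausible shape of a fix (copying the change-tracking state along with the fields), not necessarily what the merged patch does.

```python
import copy

FIELDS = ("host", "system_metadata")


class ToyInstance:
    """Hypothetical stand-in for an object that tracks metadata changes."""

    def __init__(self):
        self.host = None
        self.system_metadata = {}
        # Internal tracking attribute: not one of FIELDS.
        self._orig_system_metadata = {}

    def reset_changes(self):
        self._orig_system_metadata = dict(self.system_metadata)

    def what_changed(self):
        changed = set()
        if self.system_metadata != self._orig_system_metadata:
            changed.add("system_metadata")
        return changed

    def naive_clone(self):
        # Deep-copies declared fields only, so tracking state is lost.
        clone = ToyInstance()
        for name in FIELDS:
            setattr(clone, name, copy.deepcopy(getattr(self, name)))
        return clone

    def fixed_clone(self):
        # Assumed fix: carry the tracking state over to the clone as well.
        clone = self.naive_clone()
        clone._orig_system_metadata = copy.deepcopy(self._orig_system_metadata)
        return clone

    def save(self, db):
        # Simplification: host is always written; metadata only if changed.
        db["host"] = self.host
        if "system_metadata" in self.what_changed():
            db["system_metadata"] = dict(self.system_metadata)
        self.reset_changes()


def simulate(clone_method):
    db = {"host": "compute-1", "system_metadata": {"image_id": "abc"}}
    inst = ToyInstance()
    inst.host = db["host"]
    inst.system_metadata = dict(db["system_metadata"])
    inst.reset_changes()

    # The claim saves off a clone of the instance.
    claim_copy = getattr(inst, clone_method)()

    # The build proceeds and records that networking was allocated.
    inst.system_metadata["network_allocated"] = "True"
    inst.save(db)

    # The build fails; the claim clears the host using its stale clone.
    claim_copy.host = None
    claim_copy.save(db)
    return db


print(simulate("naive_clone")["system_metadata"])
print(simulate("fixed_clone")["system_metadata"])
```

With naive_clone, the stale clone reports system_metadata as changed and its save drops network_allocated from the toy db; with fixed_clone, the clone reports no metadata change and the key survives.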

Changed in fuel:
status: In Progress → Fix Committed
Pavel Glazov (pglazovv)
Changed in fuel:
status: Fix Committed → Fix Released