Datastore Install In Prepare (vs. Prebaked) Fails
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack DBaaS (Trove) | Fix Released | Undecided | Viswa Vutharkar |
Bug Description
note that in https:/
if the "rm -rf" above is not present, then the guest fails to initialize due to an issue with the rsync of the data-dir never returning successfully (because the guest is convinced it needs to migrate existing data).
2014-01-30 07:04:41.031 DEBUG trove.openstack
^^ will spin forever.
the issue is that if the package is installed in prepare() rather than pre-baked into the image, the same "rm -rf" needs to happen to avoid the problem described above, but no such command is ever issued.
either the rsync logic needs to be fixed, the same "rm -rf" needs to be done in the guest, or some intelligence needs to be added to the guest to understand the difference between an empty/dummy data-dir vs. a legitimate one that needs to be migrated.
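The third option (teaching the guest to tell an empty/dummy data-dir from a legitimate one) could look roughly like the sketch below. This is not Trove code: `needs_migration` and the `discardable_patterns` argument are hypothetical names, and which file patterns count as "dummy" for a given datastore is an assumption each datastore would have to supply.

```python
import fnmatch
import os


def needs_migration(data_dir, discardable_patterns=()):
    """Return True only if data_dir holds files worth copying to the volume.

    Hypothetical helper: a missing dir, an empty dir, or a dir that contains
    only files matching discardable_patterns is treated as a dummy data-dir.
    """
    if not os.path.isdir(data_dir):
        return False
    for _root, _dirs, files in os.walk(data_dir):
        for name in files:
            # Any file that is not declared discardable means real data.
            if not any(fnmatch.fnmatch(name, p) for p in discardable_patterns):
                return True
    return False
```

With a check like this, the guest would only start the rsync when the function returns True, and would otherwise mount the volume directly.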
tl;dr: unless mongodb is pre-baked into the image, mongodb will not work. the same issue exists for cassandra, and likely for others.
summary: |
- Datastore Install On Prepare Fails for Mongo/Cassandra + Datastore Install In Prepare (vs. Prebaked) Fails |
Changed in trove: | |
milestone: | none → icehouse-3 |
Changed in trove: | |
assignee: | nobody → Viswa Vutharkar (vvutharkar) |
Changed in trove: | |
milestone: | icehouse-3 → icehouse-rc1 |
Changed in trove: | |
milestone: | icehouse-rc1 → none |
status: | In Progress → Fix Released |
I have done some reading around this https://github.com/openstack/diskimage-builder sources.list.d/ ------- ------- -
install.d elements run in 'chroot' mode on the host machine where the image is being built (the redstack machine). It makes no sense to delete this dir at this stage, because I have observed one of two things happen, depending on the version of mongo:
mongo 2.0.4
------------------
By default the apt-get install step above installs mongo version 2.0.4
The install process also configures the 'upstart'/init.d scripts to start the mongod process automatically at boot time.
Given the above, when the image is provisioned and the VM comes up, the mongodb process starts at boot time and, not finding the /var/lib/mongodb/ dir, creates it again and populates it with sparse files (about 3GB total size).
The trove guest agent install (rsync) etc. comes later, as part of the execution of the first-boot.d element. So when the manager comes up, it detects the presence of the /var/lib/mongodb/ dir and triggers the 'migration' process anyway (mount the volume, move the files via rsync, remount /var/lib/mongodb on the volume, etc.).
So I am not sure what exactly is accomplished by removing the directory at the install.d element stage.
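For reference, the 'migration' sequence mentioned above (mount the volume, rsync the files, remount on the data dir) can be sketched as an ordered list of commands. This is an illustration only, not the guest agent's actual code: the function name, the device path, and the temporary mount point are assumptions, and the commands are built but not executed.

```python
def migration_commands(device, data_dir, tmp_mount="/mnt/volume"):
    """Build the command sequence for moving data_dir onto a volume.

    Hypothetical sketch: device and tmp_mount are placeholders, and the
    commands are returned as argv lists rather than run.
    """
    # A trailing slash makes rsync copy the directory's contents,
    # not the directory itself.
    src = data_dir.rstrip("/") + "/"
    return [
        ["mount", device, tmp_mount],                 # mount the volume temporarily
        ["rsync", "-a", "--sparse", src, tmp_mount],  # copy the existing files
        ["umount", tmp_mount],                        # release the temporary mount
        ["mount", device, data_dir],                  # remount over the data dir
    ]


for cmd in migration_commands("/dev/vdb", "/var/lib/mongodb"):
    print(" ".join(cmd))
```

The rsync step is the one that has to copy the roughly 3GB of preallocated files, which is why it dominates the time spent in this sequence.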
mongo 2.4.9
------------------
I forced the installation of mongo 2.4.9 by adding a couple more steps before the apt-get install and including mongodb.org debian repos in the etc/apt/
This install (of version 2.4.9) via apt-get also configures the 'upstart'/init.d scripts to start the mongod process automatically at boot time.
Given the above, when the image is provisioned and the VM comes up, the mongodb process starts at boot time. But, possibly due to non-resilient upstart scripts bundled with this version, mongo 2.4.9 fails to start if the /var/lib/mongodb/ dir is not present. So I changed the rm statement to "rm -rf /var/lib/mongodb/*" instead. That doesn't help either, because of the logic described in the 2.0.4 case: mongodb starts up and creates files in the dir (3.5GB worth of data), and first-boot.d only runs later, when the guest agent starts, detects the dir, and triggers migration.
I also noticed that the rsync triggered to migrate the data into the mounted volume took about 10 mins, but this might be environmental (cinder network speed) at our company.
So again, I am not sure what exactly is accomplished by removing the directory at the install.d element stage.
Proper solution
-------
The proper solution is to tackle the removal of the /var/lib/mongodb/* contents in the guestagent manager::prepare() method. Even there, why do the rsync at all if that data is discardable (which is why you would consider rm -rf in the first place)? Just delete the content, skip the rsync, and mount /var/lib/mongodb on the empty formatted volume.
But there could be a sensitive datastore that puts some critical data there without which it cannot start up. In that case you may want to migrate the data rather than delete it. That is one more reason to tackle this in a datastore-specific manager, once, in the prepare() phase.
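A rough sketch of what that could look like follows. It is not the actual Trove manager code: `prepare_data_dir`, `clear_dir_contents`, and the `mount_volume` / `migrate` callables are hypothetical names, and the real mount and rsync steps are left to whatever the guest agent already provides.

```python
import os
import shutil


def clear_dir_contents(data_dir):
    """Delete everything inside data_dir but keep the directory itself."""
    for entry in os.listdir(data_dir):
        path = os.path.join(data_dir, entry)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
        else:
            os.remove(path)


def prepare_data_dir(data_dir, mount_volume, migrate=None):
    """Hypothetical one-time step for a datastore manager's prepare().

    mount_volume: callable that mounts the formatted volume on data_dir.
    migrate: optional callable(data_dir) for datastores whose existing
             files must survive; when None, the contents are discarded.
    """
    os.makedirs(data_dir, exist_ok=True)
    if migrate is not None:
        migrate(data_dir)             # sensitive datastore: keep the data
    else:
        clear_dir_contents(data_dir)  # discardable data: no rsync needed
    mount_volume(data_dir)
```

Under this assumption, a mongodb manager would call `prepare_data_dir` without a `migrate` callable, while a datastore that needs its initial files would pass one in.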
My assumption is that the prepare() ...