MongoDB Charm

mongodb functional tests have race condition due to fixed sleep time

Bug #1518468 reported by Ryan Beisner on 2015-11-20

This bug affects 3 people

Affects		Status	Importance	Assigned to	Milestone
	MongoDB Charm	Fix Released	Undecided	Mario Splivalo

Bug Description

The mongodb amulet tests have been variably passing/failing in automation since I began observing them not long ago. This is due to the arbitrary one-time sleep period, apparently intended to allow enough time for the charm and its deployed services to quiesce.

A sleep is not reliable, as the wait time is not always long enough for the charm and replica to settle before tests commence. The amount of time necessary to wait will vary, depending on test rig load, internet speed, and other factors, and is not reliably predictable.

Increasing the sleep duration isn't an ideal solution: it only defers the potential for a race, and it leaves test rigs idle and blocked unnecessarily when they are not under increased load.

Checking that the underlying service is ready, in a polling interval loop with a timeout of several minutes, before testing, would be ideal.

Alternatively, or in addition, the charm could implement extended status and advertise when it is settled and ready, with tests running only once that condition is met.
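
The polling approach suggested above could look roughly like the following. This is a minimal sketch, not code from the charm or its tests: the helper name `wait_until` and its parameters are hypothetical, and the readiness check itself is left to the caller.

```python
import time


def wait_until(check, timeout=300.0, interval=5.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True or `timeout` seconds elapse.

    Hypothetical helper for illustration; `check` is any zero-argument
    callable that reports whether the service is ready.
    Returns True if the check passed, False if the timeout expired.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        # Never sleep past the deadline.
        sleep(min(interval, max(0.0, deadline - clock())))
```

A test would then call something like `assert wait_until(service_is_ready, timeout=600)` in place of a fixed `time.sleep()`, where `service_is_ready` is whatever readiness probe suits the deployment. The test proceeds as soon as the service is ready and fails only after the full timeout.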

Related branches

lp:charms/trusty/mongodb

Ryan Beisner (1chb1n) on 2015-11-23

description:	updated
summary:	- mongodb functional tests are racey due to fixed sleep time
	+ mongodb functional tests have race condition due to fixed sleep time

Ryan Beisner (1chb1n) on 2015-11-23

description:	updated

Ryan Beisner (1chb1n) wrote on 2015-11-23:

FYI, here are two recent proposals whose automated tests are failing in various race windows. In a handful of spot checks of those results, the failures are caused by tests starting before the deployment has settled.

https://code.launchpad.net/~evarlast/charms/trusty/mongodb/fix-dump-actions/+merge/277191

https://code.launchpad.net/~tvansteenburgh/charms/trusty/mongodb/use-charm-benchmark-lib/+merge/278044

Matt Bruzek (mbruzek) wrote on 2015-12-16:

Ryan, I merged 277191 because the change replaced some arbitrary sleep() calls with Amulet wait() methods. There is still room for improvement in these functional tests. I highly encourage anyone who knows the mongodb service well to contribute tests that replace all sleep() calls with reliable ways to determine when it is safe to run tests, eliminating the race condition: for example, running a command on the unit to tell when it is ready, or using mongo libraries to determine the ready state.
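
One way to determine the ready state is to inspect the document returned by MongoDB's `replSetGetStatus` command, which lists each member with a `stateStr` field. The sketch below is illustrative only: the function name `replica_set_is_settled` is hypothetical, and the definition of "settled" (exactly one PRIMARY, all other members SECONDARY) is an assumption about what the tests need.

```python
def replica_set_is_settled(status, expected_members):
    """Return True if a replSetGetStatus document looks settled.

    Hypothetical helper. "Settled" is assumed to mean: the expected
    number of members is present, exactly one is PRIMARY, and every
    other member is SECONDARY.
    """
    members = status.get("members", [])
    if len(members) != expected_members:
        return False
    states = [m.get("stateStr") for m in members]
    if states.count("PRIMARY") != 1:
        return False
    return all(s in ("PRIMARY", "SECONDARY") for s in states)
```

With pymongo, the status document could be fetched with `MongoClient(host).admin.command("replSetGetStatus")` and passed to this function inside a polling loop, so the tests wait until the check returns True.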

Mario Splivalo (mariosplivalo) wrote on 2016-05-17:

I've reviewed the amulet failures for the first merge proposal that Ryan Beisner linked in comment #1 (the second URL is no longer available). The failure comes from 03_deploy_replicaset.py: sometimes, when the replicaset is formed, the mongodb/0 unit starts after mongodb/1, so the replicaset ends up with two PRIMARY members.

The code for building the replicaset was written before Juju had leader election, so there was no safe way to determine which unit should initialize the replicaset. The charm therefore has the unit with the lowest number initialize it, which, as shown, sometimes fails.

I will fix this in the charm, and then amulet tests should not fail any more.

This is the reason for the failure of mongodb when used with landscape, as explained in this bug:
https://bugs.launchpad.net/charms/+source/mongodb/+bug/1467742

Changed in mongodb (Juju Charms Collection):
status:	New → Confirmed
assignee:	nobody → Mario Splivalo (mariosplivalo)

Mario Splivalo (mariosplivalo) on 2016-05-17

Changed in mongodb (Juju Charms Collection):
status:	Confirmed → In Progress

Mario Splivalo (mariosplivalo) on 2017-04-07

affects:

mongodb (Juju Charms Collection) → mongodb-charm

Mario Splivalo (mariosplivalo) wrote on 2018-03-05:

This is now fixed: the mongodb charm code uses leader election to choose the unit that initializes the replicaset. See https://code.launchpad.net/~mariosplivalo/mongodb-charm/+git/mongodb-charm/+merge/340136
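
For illustration, a leader-gated initialization check might be sketched as below. This is not the code from the merge proposal: the function names and the `already_initiated` flag are hypothetical. It relies on Juju's `is-leader` hook tool, which is only available inside a hook context, so the command runner is injectable for testing.

```python
import json
import subprocess


def is_leader(run=subprocess.check_output):
    """Ask Juju whether this unit is the application leader.

    Uses the `is-leader` hook tool; `run` is injectable so the
    function can be exercised outside a hook context.
    """
    out = run(["is-leader", "--format=json"])
    return json.loads(out.decode("utf-8")) is True


def should_initiate_replicaset(already_initiated,
                               run=subprocess.check_output):
    """Hypothetical gate: only the leader initiates, and only once."""
    return (not already_initiated) and is_leader(run)
```

Because Juju guarantees at most one leader per application at a time, gating on leadership avoids the start-order race in which two units each decide to initialize the replicaset.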

Changed in mongodb-charm:
status:	In Progress → Fix Released
