MAAS

check_commissioning fails test: test_check_with_commissioning_but_not_expired_node

Bug #1354511 reported by Jeroen T. Vermeulen on 2014-08-08

This bug affects 1 person

Affects		Status	Importance	Assigned to	Milestone
	MAAS	Fix Released	Critical	Jeroen T. Vermeulen
	1.6	Fix Released	Undecided	Jeroen T. Vermeulen	MAAS 1.6.0

Bug Description

This test fails consistently for me in trunk, and Julian mentioned seeing it fail as well: src/maasserver/tests/test_api_commissioning.py:TestCommissioningTimeout.test_check_with_commissioning_but_not_expired_node

A node is created in Commissioning state, then the check_commissioning API call is invoked, and the node is immediately marked Failed. Julian has reported seeing a similar test failure.

Looking at the code, it seems obvious: check_commissioning compares the node's last-update timestamp, which is based on the database's now() transaction clock, to Python's datetime.now() minus one hour — instead of datetime.utcnow() as you'd expect. Julian and I observed the failures in large positive timezone offsets, so datetime.now() would be well over an hour ahead of the database now().

What still puzzles me is that the same test consistently passes for me in 1.6 or earlier, and consistently fails in trunk. The check_commissioning code hasn't changed, and neither has the TimestampedModel code that sets the update timestamp. The test is unchanged as well. Could we be faking the clock for testing, but in a different way than before? Or has there been a change in some dependency that our dev systems don't take from an installed package?

(On a sidenote, can we really count on the node's latest update timestamp being an accurate representation of when it started commissioning? How do we know there are no periodic updates to refresh the timestamp before the timeout can elapse?)

I'm treating this as a potential production problem, not as a spurious test failure, until we know better.

Related branches

lp:~jtv/maas/bug-1354511

Merged into lp:~maas-committers/maas/trunk at revision 2668

Blake Rouse (community): Approve on 2014-08-08

lp:~jtv/maas/1.6-bug-1354511

Merged into lp:maas/1.6 at revision 2543

Jeroen T. Vermeulen (community): Approve on 2014-08-08

Superseded for merging into lp:~maas-committers/maas/trunk

MAAS Maintainers: Pending requested 2014-08-08

lp:~jtv/maas/1.5-bug-1354511

Merged into lp:maas/1.5 at revision 2296

Jeroen T. Vermeulen (community): Approve on 2014-08-08

lp:~jtv/maas/1.2-bug-1354511

Revision history for this message

Jeroen T. Vermeulen (jtv) wrote on 2014-08-08:

I scanned the diff of the “src” directory between 1.6 and trunk for “[Tt]ime” and “[Cc]lock” but found nothing that looked like it might cause this. :-|

Revision history for this message

Jeroen T. Vermeulen (jtv) wrote on 2014-08-08:

Found it! The database cluster in my trunk branch was still on UTC+1. Setting it up afresh made the test pass.

This does reveal a bug in the API code, however: it shouldn't be mixing its client-side clock with the database clock. It'd be great if it could just issue the SQL expression “now() - <interval>” instead of computing the value itself.

Jeroen T. Vermeulen (jtv) on 2014-08-08

Changed in maas:
milestone:	none → 1.6.1

Jeroen T. Vermeulen (jtv) on 2014-08-08

Changed in maas:
assignee:	nobody → Jeroen T. Vermeulen (jtv)
status:	Triaged → In Progress

MAAS Lander (maas-lander) on 2014-08-08

Changed in maas:
status:	In Progress → Fix Committed

Revision history for this message

Jeroen T. Vermeulen (jtv) wrote on 2014-08-08:

The backport to 1.2 is nontrivial. Not sure it's worth the trouble — the conditions for the failure aren't all that common.

Julian Edwards (julian-edwards) on 2014-08-11

Changed in maas:
milestone:	1.6.1 → none

Julian Edwards (julian-edwards) on 2014-08-20

Changed in maas:
status:	Fix Committed → Fix Released

Report a bug

This report contains Public information

Everyone can see this information.

You are

Subscribing...

Edit bug mail

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.