Comment 2 for bug 1774252

Matthew Booth (mbooth-9) wrote :

jichenjc,

It would be avoided, apart from a brief race: confirm can only proceed once the periodic task has run for the first time.

I filed this separately because I think this code should be more defensive. We've got code called in an unknown number of ways which will fail unless some other code has run before it, but we don't do anything to ensure that the other code has run first.

Firstly, we should obviously fix the bug which is causing the periodic to fail today. Once we've done that, we should either:

* Move the initial population of ResourceTracker to init_host so that the compute host won't start processing jobs until it has been successfully initialized.

* Make ResourceTracker do something defensive and non-faily if we try to do stuff with it before it has been initialized.

I suspect that the former would be better. Anyway, this is a separate issue from the immediate bug, which is why I created two bugs.
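
To make the two options concrete, here is a rough sketch of what each could look like. It is a toy model, not nova's actual code: the class bodies, release_claim, TrackerNotInitialized and the vCPU bookkeeping are all hypothetical names made up for illustration, and only the idea of populating the tracker from init_host comes from the suggestion above.

```python
import threading


class TrackerNotInitialized(Exception):
    """Hypothetical error: the tracker was used before its first population."""


class ResourceTracker:
    """Toy stand-in for illustration; not nova's real ResourceTracker."""

    def __init__(self):
        self._lock = threading.Lock()
        self._used_vcpus = None  # None means "never populated"

    def update_available_resource(self, used_vcpus_by_node):
        # Populate (or refresh) per-node usage. In the real service this is
        # what the periodic task would normally do.
        with self._lock:
            self._used_vcpus = dict(used_vcpus_by_node)

    def release_claim(self, nodename, vcpus):
        # Second option: be defensive. If nothing has populated the tracker
        # yet, raise a clear error that callers can handle or retry, instead
        # of failing obscurely on missing state.
        with self._lock:
            if self._used_vcpus is None:
                raise TrackerNotInitialized(
                    "resource tracker has not been populated yet")
            self._used_vcpus[nodename] -= vcpus
            return self._used_vcpus[nodename]


class ComputeManager:
    """Toy stand-in for the compute service manager."""

    def __init__(self, tracker):
        self.tracker = tracker

    def init_host(self, used_vcpus_by_node):
        # First option: populate the tracker during host initialization, so
        # no job can be processed before the tracker has valid state.
        self.tracker.update_available_resource(used_vcpus_by_node)
```

With the first option the defensive check should never fire in practice, since init_host runs before any job is accepted. Keeping it anyway costs little and covers callers that reach the tracker by some other path.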