Comment 10 for bug 1827664

John A Meinel (jameinel) wrote :

I think I worked out why it is triggering every 3s. Specifically the delay line is:
 delay = time.Duration(float64(delay) * math.Pow(engine.config.BackoffFactor, float64(info.startAttempts-1)))

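Note that the exponent is driven by "startAttempts". However, startAttempts is reset to 0 whenever gotStarted is called, so a worker that starts successfully and then fails soon afterwards never accumulates attempts, and the backoff stays at the base delay: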
func (engine *Engine) gotStarted(name string, worker worker.Worker, resourceLog []resourceAccess) {
...
 default:
  // It's fine to use this worker; update info and copy back.
  engine.config.Logger.Debugf("%q manifold worker started", name)
  info.worker = worker
  info.starting = false
  info.startCount++
  // Reset the start attempts after a successful start.
  info.startAttempts = 0
...
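
To make the effect concrete, here is a minimal standalone sketch of the arithmetic, not the engine code itself. It assumes a base delay of 3s (matching the observed interval), a backoff factor of 2.0 (an illustrative value, not taken from the real config), and that startAttempts is incremented once per start request. The helper names backoffDelay and simulate are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// backoffDelay mirrors the delay expression quoted above.
func backoffDelay(base time.Duration, factor float64, startAttempts int) time.Duration {
	return time.Duration(float64(base) * math.Pow(factor, float64(startAttempts-1)))
}

// simulate models a worker that starts successfully and then fails, over
// and over. If resetOnStart is true, startAttempts is zeroed after each
// successful start, as gotStarted does. (Simplified model: assumes the
// counter is bumped once per start request.)
func simulate(resetOnStart bool, cycles int) []time.Duration {
	base := 3 * time.Second // assumed base delay
	factor := 2.0           // assumed backoff factor, for illustration only
	startAttempts := 0
	delays := make([]time.Duration, 0, cycles)
	for i := 0; i < cycles; i++ {
		startAttempts++
		delays = append(delays, backoffDelay(base, factor, startAttempts))
		if resetOnStart {
			startAttempts = 0
		}
	}
	return delays
}

func main() {
	fmt.Println("reset on start:", simulate(true, 4))
	fmt.Println("no reset:      ", simulate(false, 4))
}
```

With the reset, every cycle computes the delay with an exponent of 0, so the delay never moves off the base value. Without it, the delay grows by the backoff factor on each cycle.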

So we probably need to track failing workers slightly differently, treating a failure shortly after a successful start the same way as a failure at start.
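
One possible shape for that, purely as a sketch: only reset the counter once the worker has stayed up for some minimum time. Everything here is hypothetical (the workerInfo type, the recordStart/recordExit methods, and the minimum-uptime threshold are invented for illustration and are not existing engine fields or config).

```go
package main

import (
	"fmt"
	"time"
)

// workerInfo is a hypothetical stand-in for the engine's per-worker state.
type workerInfo struct {
	startAttempts int
	startedAt     time.Time
}

// recordStart counts the attempt and notes when the worker came up.
// Unlike gotStarted above, it does not reset startAttempts.
func (info *workerInfo) recordStart(now time.Time) {
	info.startAttempts++
	info.startedAt = now
}

// recordExit resets the counter only if the worker ran for at least
// minUptime, so a quick failure after start still counts toward backoff.
func (info *workerInfo) recordExit(now time.Time, minUptime time.Duration) {
	if now.Sub(info.startedAt) >= minUptime {
		info.startAttempts = 0
	}
}

// attemptsAfterRuns replays a series of runs (uptime in seconds each) and
// returns the resulting startAttempts value.
func attemptsAfterRuns(uptimeSeconds []int, minUptimeSeconds int) int {
	info := &workerInfo{}
	now := time.Unix(0, 0)
	minUptime := time.Duration(minUptimeSeconds) * time.Second
	for _, up := range uptimeSeconds {
		info.recordStart(now)
		now = now.Add(time.Duration(up) * time.Second)
		info.recordExit(now, minUptime)
	}
	return info.startAttempts
}

func main() {
	// Three runs that each die after 1s: attempts keep accumulating.
	fmt.Println(attemptsAfterRuns([]int{1, 1, 1}, 10)) // prints 3
	// A run that survives 30s clears the counter.
	fmt.Println(attemptsAfterRuns([]int{1, 1, 30}, 10)) // prints 0
}
```

The threshold would need tuning; too small and a crash-looping worker still resets its backoff, too large and a worker that fails for an unrelated reason much later is penalised for old failures.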