Comment 4 for bug 1964811

OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/c/starlingx/config/+/833889
Committed: https://opendev.org/starlingx/config/commit/1bafdbde4b42b58f80292a5c69221b55f44b957f
Submitter: "Zuul (22348)"
Branch: master

commit 1bafdbde4b42b58f80292a5c69221b55f44b957f
Author: Kyle MacLeod <email address hidden>
Date: Tue Mar 15 13:20:18 2022 -0400

    Extend cert-mon network_max_retry default value

    We are running into cases where slow hardware means that subclouds can
    take a long time to complete their startup. In such cases, cert-mon
    is giving up on the initial audit after 15m, leaving the subcloud in
    an out-of-sync state until the next daily cert-mon audit run.

    The network_max_retry setting works with network_retry_interval (in
    seconds) to determine the total time spent retrying a subcloud before
    giving up:

      network_max_retry * (network_retry_interval/60) = time in minutes

    So we are increasing the overall retry time from 15m to 90m:

      30 retries * (180s / 60) minutes per retry = 90m

    Test Plan:

    PASS:
    - install using new values, verify that the default is now changed
    - verify retry limit is reached
    - verify case where daily audit is initiated while the subcloud is in the
      reattempt state

    Partial-Bug: 1964811

    Signed-off-by: Kyle MacLeod <email address hidden>
    Change-Id: I706334eb9b665a4303a971255fa66fc2f27f3976
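
The retry arithmetic described in the commit message can be checked with a short sketch. The helper name total_retry_minutes is hypothetical and is not taken from the cert-mon code; only the two parameter names and the values 30 and 180 come from the commit message.

```python
def total_retry_minutes(network_max_retry: int, network_retry_interval: int) -> float:
    """Total time spent retrying, in minutes.

    network_max_retry: number of retry attempts before giving up.
    network_retry_interval: delay between attempts, in seconds.
    (Hypothetical helper, for illustration only.)
    """
    return network_max_retry * (network_retry_interval / 60)


# Values quoted in the commit message: 30 retries, 180 s apart.
print(total_retry_minutes(30, 180))  # 90.0
```

This only models the upper bound on the retry window; it assumes each attempt itself takes negligible time compared with the interval.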