service_down_time shouldn't be allowed to be less than report_interval

Bug #1255685 reported by Soren Hansen
Affects                   Status        Importance  Assigned to  Milestone
Cinder                    Fix Released  Low         Liyingjun
OpenStack Compute (nova)  Fix Released  Low         Liyingjun

Bug Description

If service_down_time is less than report_interval, services will routinely be considered down, because they report in too rarely.

We should err out somehow if someone tries to do this. (Not that anyone would. *ahem*)
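
To make the failure mode concrete, here is a minimal sketch of a heartbeat-based liveness check. It is not the actual nova or cinder code: `is_up` and its parameters are hypothetical names, and it assumes a service counts as up when its last report is at most service_down_time seconds old.

```python
from datetime import datetime, timedelta


def is_up(last_report: datetime, now: datetime, service_down_time: int) -> bool:
    """Return True if the last heartbeat is recent enough (hypothetical helper)."""
    return (now - last_report) <= timedelta(seconds=service_down_time)


# Misconfigured: the service reports every 60 s but is declared down after 30 s.
report_interval = 60
service_down_time = 30

last_report = datetime(2013, 11, 23, 12, 0, 0)
# Sample the check every 10 s across one full reporting period.
samples = [
    is_up(last_report, last_report + timedelta(seconds=s), service_down_time)
    for s in range(0, report_interval, 10)
]
print(samples)  # [True, True, True, True, False, False]
```

With these values the service looks down for roughly the second half of every reporting period, even though it is healthy.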

Revision history for this message
Matt Riedemann (mriedem) wrote :

Makes sense to me. Looks like nova.service.Service.basic_config_check would be a good place to check this. Would you completely keep the service from being created though? Or maybe log a warning and override the values to use the defaults?

Changed in nova:
status: New → Confirmed
Revision history for this message
Matt Riedemann (mriedem) wrote :

Looks like if basic_config_check fails, it's a system exit for the service, so that's probably good enough for this.
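
For comparison, the fail-fast option discussed above could look roughly like the sketch below. It only illustrates the idea and is not nova's implementation: `check_report_interval` is a hypothetical helper, the standalone `basic_config_check` here merely borrows the name, and treating equal values as an error is an assumption.

```python
import sys
from typing import Optional


def check_report_interval(report_interval: int,
                          service_down_time: int) -> Optional[str]:
    """Return an error message if the settings conflict, else None."""
    if service_down_time <= report_interval:
        return (
            "service_down_time (%d) must be greater than report_interval (%d)"
            % (service_down_time, report_interval)
        )
    return None


def basic_config_check(report_interval: int, service_down_time: int) -> None:
    """Exit the process before the service starts if the config is invalid."""
    error = check_report_interval(report_interval, service_down_time)
    if error is not None:
        print(error, file=sys.stderr)
        sys.exit(1)
```

This variant refuses to start the service at all, instead of logging a warning and overriding the value.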

Changed in nova:
importance: Undecided → Low
Revision history for this message
Liyingjun (liyingjun) wrote :

I think this is also needed by cinder and neutron.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/60760

Changed in cinder:
assignee: nobody → Liyingjun (liyingjun)
status: New → In Progress
Changed in cinder:
importance: Undecided → Low
Liyingjun (liyingjun)
Changed in nova:
assignee: nobody → Liyingjun (liyingjun)
Changed in nova:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/60760
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=18b14203d8b30c9792d8c819f7993961b2ec8fc5
Submitter: Jenkins
Branch: master

commit 18b14203d8b30c9792d8c819f7993961b2ec8fc5
Author: liyingjun <email address hidden>
Date: Sat Nov 23 22:06:19 2013 +0800

    Make sure report_interval is less than service_down_time

    If service_down_time is less than report_interval, services will
    routinely be considered down, because they report in too rarely.
    Add a check for service_down_time vs. report_interval: if
    report_interval is larger than service_down_time, automatically
    change service_down_time to report_interval * 2.5.

    DocImpact

    Closes bug #1255685

    Change-Id: I9b291669ea201321b03a48d2a132810f3bace2dc
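
The adjustment described in this commit message can be sketched as follows. This is an illustration based only on the description above, not the merged code: `adjust_service_down_time` is a hypothetical helper, and treating equal values as a misconfiguration is an assumption (the message only mentions report_interval being larger).

```python
import logging

LOG = logging.getLogger(__name__)


def adjust_service_down_time(report_interval: int, service_down_time: int) -> int:
    """Return a service_down_time that is safely above report_interval.

    Hypothetical helper that mirrors the behaviour described in the
    commit message; it does not touch any real configuration object.
    """
    if service_down_time <= report_interval:
        new_value = int(report_interval * 2.5)
        LOG.warning(
            "report_interval (%s) should be less than service_down_time (%s); "
            "setting service_down_time to %s",
            report_interval, service_down_time, new_value,
        )
        return new_value
    return service_down_time


print(adjust_service_down_time(10, 60))  # 60, already valid, left unchanged
print(adjust_service_down_time(60, 30))  # 150, raised to 60 * 2.5
```

The 2.5 factor leaves room for a service to miss two consecutive reports before it is considered down.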

Changed in cinder:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in cinder:
milestone: none → icehouse-2
status: Fix Committed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/69288

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/69288
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=80096b6fb62cec4056efd1aba070623a789571dc
Submitter: Jenkins
Branch: master

commit 80096b6fb62cec4056efd1aba070623a789571dc
Author: Zhiteng Huang <email address hidden>
Date: Thu Jan 9 14:54:22 2014 +0800

    Make sure report_interval is less than service_down_time

    Services that inherit from the service.py Service class register
    themselves in the DB and then update their stats periodically (every
    report_interval seconds). Consumers of this information, such as the
    scheduler or the 'os-service' API extension, consider a service
    'up' (active) if its last update is no older than
    'service_down_time'.

    The problem is if 'report_interval' was configured/provided greater
    than 'service_down_time' by mistake, services would then be always
    considered in 'down' state, which can result in unsuccessful placement
    of volume create request for example. This is what Bug #1255685 is
    about.

    In previous fix: https://review.openstack.org/#/c/60760/, a
    configuration check helper function basic_config_check() was added
    *wrongly* to WSGIService class instead of Service class. This patch
    moves the configuration check helper function and the check to the
    right place to make sure 'report_interval' is less than
    'service_down_time'.

    Closes-bug #1255685

    Change-Id: I14bd8c54e5ce20719844f437808ad98a011820de

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/60748
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=94e2cbd6dea374b7e355b1272de9ba6e1d9f7b0a
Submitter: Jenkins
Branch: master

commit 94e2cbd6dea374b7e355b1272de9ba6e1d9f7b0a
Author: liyingjun <email address hidden>
Date: Sat Nov 23 20:17:42 2013 +0800

    Make sure report_interval is less than service_down_time

    If service_down_time is less than report_interval, services will
    routinely be considered down, because they report in too rarely.
    Add a check for service_down_time vs. report_interval: if
    report_interval is larger than service_down_time, automatically
    change service_down_time to report_interval * 2.5.

    DocImpact

    Closes bug #1255685

    Change-Id: I1be42e1826142b8f3f2c39f3734bef713a12a693

Changed in nova:
status: In Progress → Fix Committed
Changed in nova:
milestone: none → icehouse-3
Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: icehouse-3 → 2014.1
Thierry Carrez (ttx)
Changed in cinder:
milestone: icehouse-2 → 2014.1