Ironic Conductor performance trends down with uptime

Bug #1672457 reported by Justin Kilpatrick on 2017-03-13
This bug affects 3 people
Affects: Ironic
Status: Fix Released
Importance: Medium
Assigned to: Sam Betts

Bug Description

https://snapshot.raintank.io/dashboard/snapshot/S7oMf3rpbynrPK7U313Xk1WRZMrFg4ZB

https://snapshot.raintank.io/dashboard/snapshot/WULvdWAGoqvwc5OOIlm8TH4kDAa3Uq2s?panelId=72&fullscreen

https://snapshot.raintank.io/dashboard/snapshot/63GIebBJWAkAyH5iI286v6BXqtm4l31B

These were collected on two different clouds: one performing many ironic operations constantly as a benchmark, the other simply left alone over the weekend. The resets line up with restarts of the ironic conductor service. These are stock TripleO deployments with Newton; nothing special is required to reproduce.

If you zoom in, it seems to be related to periodic conductor tasks such as updating node status.

https://snapshot.raintank.io/dashboard/snapshot/Sp2wuk2M5adTpqfXMJenMXcSlCav2PiZ

This dashboard shows a very slight upward trend in average introspection times as the CPU load trends up. It is about 30 seconds, so you'd be hard pressed to call it a serious performance issue at this point, but if the trend continues over, let's say, a couple of months, it would be concerning.

http://elk.browbeatproject.org:8080/goto/71754b3cba15049fe7a89a5de30fcf2d

Credit to rook for the second and third graphs.

Michael Turek (mjturek) wrote :

Moving to confirmed as there are multiple sources.

Changed in ironic:
status: New → Confirmed
importance: Undecided → Medium
description: updated
Derek Higgins (derekh) on 2017-03-23
Changed in ironic:
assignee: nobody → Derek Higgins (derekh)
Derek Higgins (derekh) wrote :

The problem here is the way drivers/base.py::BareDriver inherits from and then adds to drivers/base.py::BaseDriver

BaseDriver has a number of static member lists, two of which are appended to in __init__() of BareDriver. Because they are static, every time a new instance of BareDriver is created it appends an additional entry to each list:
        self.core_interfaces.append('network')
        self.standard_interfaces.append('storage')
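
The shared-list behaviour described here can be reproduced in isolation. The following is a minimal sketch, not the ironic code: the class names mirror the ones above, but the initial list contents are hypothetical placeholders chosen for illustration.

```python
class BaseDriver:
    # Class-level ("static") lists: a single list object shared by the
    # class and every instance. Contents here are illustrative only.
    core_interfaces = ['deploy', 'power']
    standard_interfaces = ['boot', 'console']


class BareDriver(BaseDriver):
    def __init__(self):
        # self.core_interfaces resolves to the class attribute, so
        # append() mutates the shared list in place.
        self.core_interfaces.append('network')
        self.standard_interfaces.append('storage')


# Creating instances grows the shared lists by one entry each time.
for _ in range(3):
    BareDriver()

network_count = BaseDriver.core_interfaces.count('network')
storage_count = BaseDriver.standard_interfaces.count('storage')
print(network_count, storage_count)
```

Each instantiation adds one more 'network' and one more 'storage' entry, so any loop over these lists gets longer the longer the process runs.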

These lists are then iterated over as part of acquiring a lock, each time one is needed for a node.
See common/driver_factory.py:_attach_interfaces_to_driver:

        for iface in driver_or_hw_type.all_interfaces:
            impl = getattr(driver_or_hw_type, iface, None)
            setattr(bare_driver, iface, impl)

Over time this loop takes more time and CPU resources as the lists grow. This can be seen by logging the value of driver_or_hw_type.all_interfaces:

2017-03-23 22:33:56.727 28332 INFO ironic.common.driver_factory [req-0751c94d-5894-4237-b8ae-cc1be6b3a064 - - - - -] ['power', 'deploy', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'network', 'console', 'management', 'boot', 'inspect', 'raid', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 
'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage', 'storage...


Fix proposed to branch: master
Review: https://review.openstack.org/449577

Changed in ironic:
status: Confirmed → In Progress
Changed in ironic:
assignee: Derek Higgins (derekh) → Sam Betts (sambetts)

Reviewed: https://review.openstack.org/449577
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=338651eae5b7c416f04970b9d60f09dc2dab8adb
Submitter: Jenkins
Branch: master

commit 338651eae5b7c416f04970b9d60f09dc2dab8adb
Author: Derek Higgins <email address hidden>
Date: Fri Mar 24 14:03:56 2017 +0000

    Copy and append to static lists

    core_interfaces and standard_interfaces are both static members of BaseDriver
    we need to take a copy of them before appending to them.

    Change-Id: Ic6edc5e49a25849c7871dbc9e6e1d5a5eb229e57
    Closes-Bug: #1672457
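
The idea in the commit message (copy the class-level lists before appending) might look roughly like the sketch below. This is an illustration under assumptions, not the merged patch: the initial list contents are hypothetical, and the actual change is the one in the review linked above.

```python
class BaseDriver:
    # Class-level lists shared by all instances (illustrative contents).
    core_interfaces = ['deploy', 'power']
    standard_interfaces = ['boot', 'console']


class BareDriver(BaseDriver):
    def __init__(self):
        # Take a copy first, then append. Assigning to self.<name>
        # creates a per-instance attribute, so the class-level lists
        # are never mutated.
        self.core_interfaces = list(self.core_interfaces)
        self.core_interfaces.append('network')
        self.standard_interfaces = list(self.standard_interfaces)
        self.standard_interfaces.append('storage')


drivers = [BareDriver() for _ in range(3)]
print(BaseDriver.core_interfaces)
print(drivers[-1].core_interfaces)
```

With the copy in place, the lists on BaseDriver keep their original length no matter how many BareDriver instances are created.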

Changed in ironic:
status: In Progress → Fix Released

Reviewed: https://review.openstack.org/451441
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=7f1639e77efb32be280f56983a22485f56e24718
Submitter: Jenkins
Branch: stable/ocata

commit 7f1639e77efb32be280f56983a22485f56e24718
Author: Derek Higgins <email address hidden>
Date: Fri Mar 24 14:03:56 2017 +0000

    Copy and append to static lists

    core_interfaces and standard_interfaces are both static members of BaseDriver
    we need to take a copy of them before appending to them.

    Change-Id: Ic6edc5e49a25849c7871dbc9e6e1d5a5eb229e57
    Closes-Bug: #1672457
    (cherry picked from commit 338651eae5b7c416f04970b9d60f09dc2dab8adb)

tags: added: in-stable-ocata

Reviewed: https://review.openstack.org/451459
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=000ade88a4e366dcc4c606c7c53977065fc2de49
Submitter: Jenkins
Branch: stable/newton

commit 000ade88a4e366dcc4c606c7c53977065fc2de49
Author: Derek Higgins <email address hidden>
Date: Fri Mar 24 14:03:56 2017 +0000

    Copy and append to static lists

    core_interfaces and standard_interfaces are both static members of BaseDriver
    we need to take a copy of them before appending to them.

    Conflicts:
     ironic/drivers/base.py

    Change-Id: Ic6edc5e49a25849c7871dbc9e6e1d5a5eb229e57
    Closes-Bug: #1672457
    (cherry picked from commit 338651eae5b7c416f04970b9d60f09dc2dab8adb)

tags: added: in-stable-newton

This issue was fixed in the openstack/ironic 8.0.0 release.

This issue was fixed in the openstack/ironic 7.0.1 release.

This issue was fixed in the openstack/ironic 6.2.3 release.
