OpenStack services caching misconfigured

Bug #1657727 reported by Vladimir Kuklin
This bug affects 2 people
Affects             Status         Importance  Assigned to      Milestone
Fuel for OpenStack  Fix Committed  Critical    Vladimir Kuklin
Mitaka              Fix Released   Critical    Vladimir Kuklin
Newton              Fix Released   Critical    Vladimir Kuklin
Ocata               Fix Committed  Critical    Vladimir Kuklin

Bug Description

While fixing OpenStack performance degradation issues, we added a sharded list of memcached servers to the configs of all services. As a result, services degrade significantly whenever services or servers are restarting or unavailable: https://review.openstack.org/#/q/Id1034e22d79c3ea6b25575d9bcf8e8750a02365d

Instead, we should have simply configured local caching for the majority of services and left only keystone memcached sharded, which provides a better balance between performance and stability.
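
To illustrate the trade-off described above, here is a minimal sketch. It is not Fuel code. It assumes a simplified client that hashes each cache key onto one entry of the server list and pays a fixed timeout whenever the chosen server is down; real memcached clients usually also mark dead servers and retry them later. The addresses, the timeout value, and the helper names are hypothetical.

```python
import hashlib


def pick_server(key, servers):
    """Map a key to one server by hashing (simplified client-side sharding)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]


def total_delay(keys, servers, down, timeout=3.0):
    """Sum the timeouts paid for keys whose shard is unavailable."""
    return sum(timeout for key in keys if pick_server(key, servers) in down)


# Hypothetical three-controller deployment; addresses are illustrative only.
sharded = ["192.0.2.1:11211", "192.0.2.2:11211", "192.0.2.3:11211"]
local = ["127.0.0.1:11211"]
down = {"192.0.2.2:11211"}  # one remote controller is restarting

keys = ["token-%d" % i for i in range(3000)]
sharded_delay = total_delay(keys, sharded, down)
local_delay = total_delay(keys, local, down)
print("sharded list:", sharded_delay, "s; local only:", local_delay, "s")
```

With the sharded list, every key that hashes to the unavailable controller pays the timeout. With a local-only list, the loss of a remote controller does not affect lookups at all, at the cost of each node keeping its own separate cache.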

Vladimir Kuklin (vkuklin) wrote :

This bug is similar to https://bugs.launchpad.net/mos/+bug/1621541 but for other components

Changed in fuel:
status: Confirmed → In Progress
Dmitry Pyzhov (dpyzhov) wrote :
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/422684
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=a529033fdcb36ccea8cf0cc76339816ed31418c7
Submitter: Jenkins
Branch: master

commit a529033fdcb36ccea8cf0cc76339816ed31418c7
Author: Vladimir Kuklin <email address hidden>
Date: Thu Jan 19 17:35:35 2017 +0300

    Set memcached server to local one for non-keystone services

    We misconfigured local cache for services with change
    https://review.openstack.org/#/q/Id1034e22d79c3ea6b25575d9bcf8e8750a02365d
    Thus, it becomes extremely slow when a controller is down.

    With this commit we revert things back to normal with local memcached
    for all openstack services leaving keystone memcached shared for tokens
    (this was thoroughly tested previously)

    Change-Id: I8f6bbf77d27f3d8976985241deb8a948984862f5
    Closes-bug: #1657727

Changed in fuel:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/mitaka)

Reviewed: https://review.openstack.org/422731
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=96e369880c441bdcf9ab30f196b4aabaafb9b921
Submitter: Jenkins
Branch: stable/mitaka

commit 96e369880c441bdcf9ab30f196b4aabaafb9b921
Author: Vladimir Kuklin <email address hidden>
Date: Thu Jan 19 17:35:35 2017 +0300

    Set memcached server to local one for non-keystone services

    We misconfigured local cache for services with change
    https://review.openstack.org/#/q/Id1034e22d79c3ea6b25575d9bcf8e8750a02365d
    Thus, it becomes extremely slow when a controller is down.

    With this commit we revert things back to normal with local memcached
    for all openstack services leaving keystone memcached shared for tokens
    (this was thoroughly tested previously)

    Change-Id: I8f6bbf77d27f3d8976985241deb8a948984862f5
    Closes-bug: #1657727

Vladimir Khlyunev (vkhlyunev) wrote :

Looks fixed: swarm runs on snapshots 801 and 804 do not contain this failure.

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/7.0)

Reviewed: https://review.openstack.org/427733
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=812b15bccf2d49714bcc098f38a4dde13d37890e
Submitter: Jenkins
Branch: stable/7.0

commit 812b15bccf2d49714bcc098f38a4dde13d37890e
Author: Alex Schultz <email address hidden>
Date: Wed Jul 6 16:15:15 2016 -0600

    Use memcache for keystone_authtoken

    This allows keystone auth tokens to be cached in the local memcached
    instance. The commit adds a new global variable, local_memcached_server,
    and forces all capable services to use it, which should speed up
    operations in those services.
    The initial work was started in LP#1597512, but this commit also
    includes the improvements from #1657727.

    Closes-Bug: #1597512
    Closes-bug: #1657727

    Change-Id: I6004a8366ddc639feb1aed55b6dfbaf626f82839

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/fuel-library 11.0.0.0rc1

This issue was fixed in the openstack/fuel-library 11.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/450870

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/newton)

Reviewed: https://review.openstack.org/450870
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=41d4a11ee73337de2e9a5cf8fe983c6f0a114bc6
Submitter: Jenkins
Branch: stable/newton

commit 41d4a11ee73337de2e9a5cf8fe983c6f0a114bc6
Author: Vladimir Kuklin <email address hidden>
Date: Thu Jan 19 17:35:35 2017 +0300

    Set memcached server to local one for non-keystone services

    We misconfigured local cache for services with change
    https://review.openstack.org/#/q/Id1034e22d79c3ea6b25575d9bcf8e8750a02365d
    Thus, it becomes extremely slow when a controller is down.

    With this commit we revert things back to normal with local memcached
    for all openstack services leaving keystone memcached shared for tokens
    (this was thoroughly tested previously)

    This commit https://github.com/openstack/fuel-library/commit/a529033fdcb36ccea8cf0cc76339816ed31418c7
    pointed all non-keystone services to local memcached for keystone auth tokens, however it also
    pointed cache/memcache_servers in nova to local memcached. This led to regression in Nova.

    Revert setting local memcached server for swift proxy

    Switch back to using all available memcached servers, because of
    failures during swift testing.

    Change-Id: I8f6bbf77d27f3d8976985241deb8a948984862f5
    Closes-bug: #1657727
    Closes-Bug: #1576218
    Closes-Bug: 1666837
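
The layout this commit message describes for Nova can be sketched as follows. This is an illustration only, not the fuel-library Puppet code. The option names `keystone_authtoken/memcached_servers` and `cache/memcache_servers` are the ones used by keystonemiddleware and oslo.cache. The helper function and the addresses are hypothetical, and pointing Nova's own cache back at the full server list is an assumption drawn from the regression note above.

```python
import configparser
import io


def render_nova_conf(local_server, all_servers):
    """Render the two memcached-related options as an INI fragment."""
    conf = configparser.ConfigParser()
    # Auth token cache: use only the memcached instance on the local node.
    conf["keystone_authtoken"] = {"memcached_servers": local_server}
    # Nova's own cache: keep the full server list (assumed from the
    # regression note; pointing it at the local instance broke Nova).
    conf["cache"] = {"memcache_servers": ",".join(all_servers)}
    out = io.StringIO()
    conf.write(out)
    return out.getvalue()


# Illustrative addresses only.
print(render_nova_conf("127.0.0.1:11211",
                       ["192.0.2.1:11211", "192.0.2.2:11211"]))
```

Keeping the two options separate is the point of the follow-up fix: the token cache and the service's own cache have different failure characteristics, so they do not have to share one server list.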

Vladimir Khlyunev (vkhlyunev) wrote :

Verified by several swarm runs, including both RC1 and RC2
