Memcached pool <pool>, thread <thread>: Marked host <memcached_host> dead until <time>

Bug #1515350 reported by Leontii Istomin
This bug affects 2 people
Affects: Fuel for OpenStack
Status: Won't Fix
Importance: Medium
Assigned to: Artem Roma

Bug Description

During the create_and_list_routers Rally scenario we hit an error:
from rally.log: http://paste.openstack.org/show/478561/
from haproxy, filtered by status 401 and time: http://paste.openstack.org/show/478563/
from neutron-server: http://paste.openstack.org/show/478564/
The token was still valid: http://paste.openstack.org/show/478565/
But Keystone could not connect to one of the memcached servers: http://paste.openstack.org/show/478574/

The failure occurred on the 11th at 08:36:13. I ran the following command on each controller at 18:07:03, about 9.5 hours after the test:
DATE=`date`; MEMCACHED_RUN_TIME=$(ps -p `ps aux | grep memcached | grep -v grep | awk '{print $2}'` -o etime | grep -v ELAPSED | sed s/" "//g); SERVER_RUN_TIME=`uptime | awk '{print $1}' | sed s/" "//g`; echo Current time $DATE; echo Memcached is live $MEMCACHED_RUN_TIME; echo Server is live $SERVER_RUN_TIME
Results of the command: http://paste.openstack.org/show/478595/
This shows that memcached was not restarted, yet it was unreachable. We need to investigate why.
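
The "not restarted" conclusion comes from comparing the memcached process's elapsed time with the time since the failure. A minimal sketch of that comparison in POSIX shell follows. The function names etime_to_seconds and running_longer_than are hypothetical helpers, not part of any tool used in this report, and the 34250-second window is simply 18:07:03 minus 08:36:13.

```shell
#!/bin/sh
# Hypothetical helper: convert a ps "etime" value ([[dd-]hh:]mm:ss) to seconds.
# awk treats "08" as decimal 8, which avoids octal surprises in shell arithmetic.
etime_to_seconds() {
    printf '%s\n' "$1" | awk '
    {
        s = $0
        gsub(/[ \t]/, "", s)            # ps pads etime with spaces
        days = 0
        if (index(s, "-")) { split(s, d, "-"); days = d[1]; s = d[2] }
        n = split(s, t, ":")
        if (n == 3)      secs = t[1] * 3600 + t[2] * 60 + t[3]
        else if (n == 2) secs = t[1] * 60 + t[2]
        else             exit 1         # unrecognized format
        print days * 86400 + secs
    }'
}

# Hypothetical helper: exit 0 if the elapsed time exceeds the window (seconds),
# 1 if it does not, 2 if the etime value could not be parsed.
running_longer_than() {
    elapsed=$(etime_to_seconds "$1") || return 2
    [ "$elapsed" -gt "$2" ]
}

# Example on a controller (assumes a single memcached process):
#   etime=$(ps -o etime= -p "$(pgrep -o -x memcached)")
#   running_longer_than "$etime" 34250 && echo "memcached not restarted since the failure"
```

If the check succeeds, the process predates the failure, so a restart cannot explain why the host was marked dead.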

Cluster configuration:
Baremetal,Ubuntu,IBP,HA,Neutron-vlan,DVR,Ceph-all,Nova-debug,Nova-quotas,7.0-301-mu1
Controllers: 3, Computes: 178, Computes+Ceph: 20

api: '1.0'
astute_sha: 6c5b73f93e24cc781c809db9159927655ced5012
auth_required: true
build_id: '301'
build_number: '301'
feature_groups:
- mirantis
fuel-agent_sha: 50e90af6e3d560e9085ff71d2950cfbcca91af67
fuel-library_sha: 5d50055aeca1dd0dc53b43825dc4c8f7780be9dd
fuel-nailgun-agent_sha: d7027952870a35db8dc52f185bb1158cdd3d1ebd
fuel-ostf_sha: 2cd967dccd66cfc3a0abd6af9f31e5b4d150a11c
fuelmain_sha: a65d453215edb0284a2e4761be7a156bb5627677
nailgun_sha: 4162b0c15adb425b37608c787944d1983f543aa8
openstack_version: 2015.1.0-7.0
production: docker
python-fuelclient_sha: 486bde57cda1badb68f915f66c61b544108606f3
release: '7.0'

Diagnostic Snapshot: http://mos-scale-share.mirantis.com/fuel-snapshot-2015-11-11_12-04-03.tar.xz

Tags: area-mos scale
description: updated
Artem Roma (aroma-x)
Changed in fuel:
status: New → Confirmed
importance: Undecided → Medium
assignee: nobody → MOS Keystone (mos-keystone)
milestone: none → 7.0-updates
tags: added: area-mos
Alexander Makarov (amakarov) wrote :

This is clearly not a Keystone bug.
The cause of the problem is a network failure.
Please read the description and assign the proper team.

Changed in fuel:
assignee: MOS Keystone (mos-keystone) → Artem Roma (aroma-x)
Vitaly Sedelnik (vsedelnik) wrote :

Won't Fix for 7.0-updates because of Medium importance

Changed in fuel:
status: Confirmed → Won't Fix