Ceilometer API requests periodically fail with 'Cannot connect to proxy.', error(104, 'Connection reset by peer')

Bug #1533680 reported by Vitaly Gusev
This bug affects 2 people
Affects             Status        Importance  Assigned to       Milestone
Fuel for OpenStack  Fix Released  High        Ivan Berezovskiy
8.0.x               Fix Released  High        Ivan Berezovskiy
Mitaka              Fix Released  High        Ivan Berezovskiy

Bug Description

See the bug https://bugs.launchpad.net/fuel/+bug/1532250
Scenario:
1. Create new environment
2. Choose Neutron, VLAN
3. Choose Ceph for volumes and images, Ceph for ephemeral and Rados GW for objects
4. Add 3 controllers
5. Add 1 compute
6. Add 3 mongo nodes
7. Verify networks
8. Deploy cluster
9. Verify networks
10. Run OSTF

Multiple Ceilometer OSTF tests failed. In ostf.log we can see the following errors:
ConnectionError: HTTPSConnectionPool(host='10.109.8.3', port=8777): Max retries exceeded with url: /v2/alarms/7fe57a9c-c4f0-4cd7-b252-8dad0ad5f2f2/state (Caused by ProxyError('Cannot connect to proxy.', error(104, 'Connection reset by peer')))
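To compare many such failures across ostf.log, the error line above can be broken into its fields. The following is only a minimal sketch: the regular expression is written against the single line quoted above, and the names `ERROR_RE` and `parse_connection_error` are hypothetical helpers, not part of OSTF. On Linux, errno 104 is ECONNRESET.

```python
import re

# Pattern written against the ConnectionError line quoted in this report.
# It is an assumption that other failures in ostf.log share this layout.
ERROR_RE = re.compile(
    r"HTTPS?ConnectionPool\(host='(?P<host>[^']+)', port=(?P<port>\d+)\): "
    r"Max retries exceeded with url: (?P<url>\S+) "
    r"\(Caused by (?P<cause>\w+)\('(?P<reason>[^']*)', "
    r"error\((?P<errno>\d+), '(?P<strerror>[^']*)'\)\)\)"
)


def parse_connection_error(line):
    """Return the fields of a ConnectionError log line as a dict, or None."""
    match = ERROR_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields so they can be compared or grouped easily.
    fields["port"] = int(fields["port"])
    fields["errno"] = int(fields["errno"])
    return fields
```

Calling `parse_connection_error` on the line above yields the VIP host, port 8777, the alarm URL, the `ProxyError` cause and errno 104, which makes it easier to check whether every failure is a reset on the same endpoint.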

At certain times, requests do not reach HAProxy.
We added the following options to the ceilometer HAProxy config (/etc/haproxy/conf.d/140-ceilometer.cfg):
  option httplog
  option httpclose
and everything works fine.

I propose adding these options to the ceilometer HAProxy config by default.
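The manual change described above can be sketched as a small idempotent helper. This is an illustration only: `ensure_options` is a hypothetical function, it assumes the file holds a single HAProxy section (as a conf.d snippet would), and it says nothing about how Fuel itself generates the file.

```python
def ensure_options(config_text, options=("option httplog", "option httpclose")):
    """Append any missing option lines to a single-section HAProxy snippet.

    Existing lines are kept unchanged; options already present are skipped.
    """
    lines = config_text.splitlines()
    # Normalise whitespace so "  option   httplog" counts as present.
    present = {" ".join(line.split()) for line in lines}

    # Reuse the indentation of the last indented line, defaulting to two spaces.
    indent = "  "
    for line in reversed(lines):
        if line.strip() and line[0] in " \t":
            indent = line[: len(line) - len(line.lstrip())]
            break

    for option in options:
        if option not in present:
            lines.append(indent + option)
    return "\n".join(lines) + "\n"
```

Running it twice on the same text gives the same result, so it is safe to apply repeatedly. HAProxy still has to be reloaded for the change to take effect.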

Changed in fuel:
assignee: nobody → Ivan Berezovskiy (iberezovskiy)
status: New → Confirmed
Ivan Berezovskiy (iberezovskiy) wrote :

According to the HAProxy docs (https://cbonte.github.io/haproxy-dconv/configuration-1.5.html#option%20httpclose), using "option http-server-close" or "option forceclose" is strongly recommended instead of "option httpclose". So I suggest using "option forceclose":
https://cbonte.github.io/haproxy-dconv/configuration-1.5.html#option%20forceclose

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/266924

Changed in fuel:
status: Confirmed → In Progress
Timur Nurlygayanov (tnurlygayanov) wrote :

Why did we create a separate bug report for this issue?

tags: added: ceilometer haproxy
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/8.0)

Fix proposed to branch: stable/8.0
Review: https://review.openstack.org/267569

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/266924
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=7221bd5ca604bb86f902e14535d648b9763e91ce
Submitter: Jenkins
Branch: master

commit 7221bd5ca604bb86f902e14535d648b9763e91ce
Author: iberezovskiy <email address hidden>
Date: Wed Jan 13 16:40:28 2016 +0300

    Enable active connection closing for ceilometer-api

    Enable active connection closing after response
    is transferred for ceilometer API requests.
    Enable httplog option for ceilometer-api as well.

    Change-Id: I3d835275f8b223370bdffb6217577f6af13c1985
    Closes-bug: #1533680

Changed in fuel:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/8.0)

Reviewed: https://review.openstack.org/267569
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=d68344e65b8de7ee7fb11d0c702cbd628bd9377d
Submitter: Jenkins
Branch: stable/8.0

commit d68344e65b8de7ee7fb11d0c702cbd628bd9377d
Author: iberezovskiy <email address hidden>
Date: Wed Jan 13 16:40:28 2016 +0300

    Enable active connection closing for ceilometer-api

    Enable active connection closing after response
    is transferred for ceilometer API requests.
    Enable httplog option for ceilometer-api as well.

    Change-Id: I3d835275f8b223370bdffb6217577f6af13c1985
    Closes-bug: #1533680

Vitaly Gusev (vgusev) wrote :

Verified on ISO:
VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "8.0"
  api: "1.0"
  build_number: "466"
  build_id: "466"
  fuel-nailgun_sha: "f81311bbd6fee2665e3f96dcac55f72889b2f38c"
  python-fuelclient_sha: "4f234669cfe88a9406f4e438b1e1f74f1ef484a5"
  fuel-agent_sha: "6823f1d4005a634b8436109ab741a2194e2d32e0"
  fuel-nailgun-agent_sha: "b2bb466fd5bd92da614cdbd819d6999c510ebfb1"
  astute_sha: "b81577a5b7857c4be8748492bae1dec2fa89b446"
  fuel-library_sha: "fe03d887361eb80232e9914eae5b8d54304df781"
  fuel-ostf_sha: "ab5fd151fc6c1aa0b35bc2023631b1f4836ecd61"
  fuel-mirror_sha: "b62f3cce5321fd570c6589bc2684eab994c3f3f2"
  fuelmenu_sha: "fac143f4dfa75785758e72afbdc029693e94ff2b"
  shotgun_sha: "63645dea384a37dde5c01d4f8905566978e5d906"
  network-checker_sha: "9f0ba4577915ce1e77f5dc9c639a5ef66ca45896"
  fuel-upgrade_sha: "616a7490ec7199f69759e97e42f9b97dfc87e85b"
  fuelmain_sha: "727f7076f04cb0caccc9f305b149a2b5b5c2af3a"

Sofiia Andriichenko (sandriichenko) wrote :

The environment had been running for more than 24 hours.

Failing tests:

Ceilometer test to list meters, alarms, resources and events
Ceilometer test to check alarm state and get Nova notifications
Ceilometer test to check events and traits
Ceilometer test to check notifications from Glance
Ceilometer test to check notifications from Keystone
Ceilometer test to check notifications from Neutron
Ceilometer test to check events from Cinder
Ceilometer test to create, check and list samples
Ceilometer test to create, update, check and delete alarm

Common error:
Can not set proxy for Health Check. Make sure that network configuration for controllers is correct

ceilometer-collector logs:

ceilometer.dispatcher.database [-] Failed to connect to db, purpose event re-try later: 'NoneType' object has no attribute 'find'
2016-03-18 16:23:16.231 4746 ERROR ceilometer.dispatcher.database Traceback (most recent call last):
2016-03-18 16:23:16.231 4746 ERROR ceilometer.dispatcher.database File "/usr/lib/python2.7/dist-packages/ceilometer/dispatcher/database.py", line 51, in _get_db_conn
2016-03-18 16:23:16.231 4746 ERROR ceilometer.dispatcher.database return storage.get_connection_from_config(self.conf, purpose)
2016-03-18 16:23:16.231 4746 ERROR ceilometer.dispatcher.database File "/usr/lib/python2.7/dist-packages/ceilometer/storage/__init__.py", line 113, in get_connection_from_config
2016-03-18 16:23:16.231 4746 ERROR ceilometer.dispatcher.database return _inner()
2016-03-18 16:23:16.231 4746 ERROR ceilometer.dispatcher.database File "/usr/lib/python2.7/dist-packages/retrying.py", line 49, in wrapped_f
2016-03-18 16:23:16.231 4746 ERROR ceilometer.dispatcher.database return Retrying(*dargs, **dkw).call(f, *args, **kw)
2016-03-18 16:23:16.231 4746 ERROR ceilometer.dispatcher.database File "/usr/lib/python2.7/dist-packages/retrying.py", line 212, in call
2016-03-18 16:23:16.231 4746 ERROR ceilometer.dispatcher.database raise attempt.get()
2016-03-18 16:23:16.231 4746 ERROR ceilometer.dispatcher.database File "/usr/lib/python2.7/dist-packages/retrying.py", line 247, in get
2016-03-18 16:23:16.231 4746 ERROR ceilometer.dispatcher.database six.reraise(self.value[0], self.value[1], self.value[2])
2016-03-18 16:23:16.231 4746 ERROR ceilometer.dispatcher.database File "/usr/lib/python2.7/dist-packages/retrying.py", line 200, in call
2016-03-18 16:23:16.231 4746 ERROR ceilometer.dispatcher.database attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
2016-03-18 16:23:16.231 4746 ERROR ceilometer.dispatcher.database File "/usr/lib/python2.7/dist-packages/ceilometer/storage

ceilometer [-] AttributeError: 'CollectorService' object has no attribute 'sample_listener'
2016-03-18 16:17:25.869 4500 ERROR ceilometer Traceback (most recent call last):
2016-03-18 16:17:25.869 4500 ERROR ceilometer File "/usr/bin/ceilometer-collector", line 10, in <module>
2016-03-18 16:17:25.869 4500 ERROR ceilometer sys.exit(main())
2016-03-18 16:17:25.869 4500 ERROR ceilometer File "/usr/lib/python2.7/dist-packages/ceilometer/cmd/collector.py", line 29, in main
2016-03-18 16:17:25.869 4500 ERROR ceilometer workers=CONF.collector.workers).wait()
2016-03-18 16:17:25.869 4500 ERROR ceilomete...


Sofiia Andriichenko (sandriichenko) wrote :

http://srv41-bud.infra.mirantis.net/fuelweb-iso/fuel-9.0-70-2016-03-16_15-05-59.iso.torrent

2016-03-21 11:46:15 DEBUG 6243 (manager) Initializing snapshot manager
2016-03-21 11:46:15 DEBUG 6243 (manager) Making report
2016-03-21 11:46:15 DEBUG 6243 (manager) Gathering report for: {'command': ['cat /etc/fuel_build_id', 'cat /etc/fuel_build_number', 'cat /etc/fuel_release', 'cat /etc/fuel_openstack_version', "rpm -qa | \\\negrep 'fuel|astute|network-checker|shotgun' | \\\nwhile read package; do\n echo\n echo $package\n rpm -q --changelog $package | head -2\ndone\n"], 'host': {}, 'type': 'command'}
2016-03-21 11:46:15 DEBUG 6243 (driver) Initializing driver Command: host={}
2016-03-21 11:46:15 DEBUG 6243 (driver) Running local command: cat /etc/fuel_build_id
2016-03-21 11:46:15 DEBUG 6243 (utils) Trying to execute command: cat /etc/fuel_build_id
2016-03-21 11:46:15 DEBUG 6243 (driver) Running local command: cat /etc/fuel_build_number
2016-03-21 11:46:15 DEBUG 6243 (utils) Trying to execute command: cat /etc/fuel_build_number
2016-03-21 11:46:15 DEBUG 6243 (driver) Running local command: cat /etc/fuel_release
2016-03-21 11:46:15 DEBUG 6243 (utils) Trying to execute command: cat /etc/fuel_release
2016-03-21 11:46:15 DEBUG 6243 (driver) Running local command: cat /etc/fuel_openstack_version
2016-03-21 11:46:15 DEBUG 6243 (utils) Trying to execute command: cat /etc/fuel_openstack_version
2016-03-21 11:46:15 DEBUG 6243 (driver) Running local command: rpm -qa | \
egrep 'fuel|astute|network-checker|shotgun' | \
while read package; do
  echo
  echo $package
  rpm -q --changelog $package | head -2
done

2016-03-21 11:46:15 DEBUG 6243 (utils) Trying to execute command: rpm -qa | \
egrep 'fuel|astute|network-checker|shotgun' | \
while read package; do
  echo
  echo $package
  rpm -q --changelog $package | head -2
done

+---------------------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Host | Reporter | Report |
+---------------------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| nailgun.test.domain.local | cat /etc/fuel_build_id | 70 |
| | | |
| nailgun.test.domain.local | cat /etc/fuel_build_number ...

Ivan Berezovskiy (iberezovskiy) wrote :

Sofiia Andriichenko, the problem you've described is a completely different one. From your logs: ceilometer.dispatcher.database [-] Failed to connect to db, purpose event re-try later

This is a database connection problem and has nothing in common with HAProxy.

So I am closing this bug as Fix Committed (as it was before). Please open a new bug if needed.
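The distinction drawn here can be sketched as a rough triage helper for log excerpts. The two signature strings are taken from the logs quoted in this report; the function name `classify_failure` and its labels are hypothetical, and a real triage would need more signatures than these.

```python
# Signature strings copied from the log excerpts in this bug report.
SIGNATURES = (
    ("Cannot connect to proxy", "haproxy"),
    ("Failed to connect to db", "database"),
)


def classify_failure(log_text):
    """Return a rough label for a log excerpt based on known signatures."""
    for needle, label in SIGNATURES:
        if needle in log_text:
            return label
    return "unknown"
```

An excerpt containing the ProxyError from the original description is labelled "haproxy", while the ceilometer-collector traceback is labelled "database" and so belongs in a separate bug.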

tags: added: on-verification
Changed in fuel:
status: Fix Committed → Fix Released
tags: removed: on-verification