rsyslogd restart causes different services to hang

Bug #1363102 reported by Eugene Nikanorov
This bug affects 4 people
Affects             Status         Importance  Assigned to         Milestone
Mirantis OpenStack  Fix Released   Critical    Kostiantyn Danylov
5.0.x               Fix Committed  Critical    Meg McRoberts
5.1.x               Fix Released   Critical    Kostiantyn Danylov
6.0.x               Fix Released   Critical    Kostiantyn Danylov

Bug Description

Restarting rsyslogd on Ubuntu causes neutron-server, nova-api, nova-conductor, glance-api, and possibly other Python services to start consuming 100% of CPU.

Investigation with strace shows that the processes are stuck trying to send data on a closed socket.

Steps to reproduce:
 sudo rsyslogd restart

This should immediately cause several processes to spin.

Changed in mos:
importance: Undecided → High
Changed in mos:
assignee: nobody → Kostiantyn Danylov (kdanylov)
Changed in mos:
milestone: none → 5.1
tags: added: mos-linux
Eugene Nikanorov (enikanorov) wrote :
Kostiantyn Danylov (kdanylov) wrote :

This is caused by some issue with the syslog Unix domain socket.

> ps aux | grep glance-api
....
glance 49165 42.0 0.2 139304 44020 ? R Aug27 1000:18 /usr/bin/python /usr/bin/glance-api
....

> strace -p 49165
....
poll([{fd=3, events=POLLOUT|POLLERR|POLLHUP}, {fd=7, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 60000) = 1 ([{fd=3, revents=POLLOUT}])
sendto(3, "<151>glance-api 2014-08-28 03:01"..., 258, 0, NULL, 0) = -1 ENOTCONN (Transport endpoint is not connected)
....

The socket on fd=3 is no longer connected, but glance-api constantly tries to send data over it.
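The pattern in the trace (the fd is reported writable, then the send fails) can be sketched with a connected Unix datagram socket whose receiver goes away. This is a minimal illustration under assumptions, not the code path of glance-api: the socket path `log.sock` is a hypothetical stand-in for the syslog socket, and `send_after_peer_gone` is a made-up helper name. The exact errno values depend on the platform.

```python
import errno
import os
import select
import socket
import tempfile


def send_after_peer_gone(attempts=3):
    """Return (writable, errno_name) for each send made after the receiver is gone."""
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "log.sock")  # hypothetical stand-in for the syslog socket

    server = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    server.bind(path)
    client = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    client.connect(path)

    # Simulate the daemon restart: the receiving socket disappears.
    server.close()
    os.unlink(path)

    results = []
    try:
        for _ in range(attempts):
            # Like poll() in the trace: ask whether the fd is writable.
            _, writable, _ = select.select([], [client], [], 0)
            try:
                client.send(b"<151>test message")
                results.append((bool(writable), None))
            except OSError as exc:
                # Record the symbolic errno name of the failed send.
                results.append((bool(writable), errno.errorcode.get(exc.errno)))
    finally:
        client.close()
        os.rmdir(tmpdir)
    return results


if __name__ == "__main__":
    for writable, err in send_after_peer_gone():
        print("writable=%s error=%s" % (writable, err))
```

On Linux the sends are expected to fail with ECONNREFUSED or ENOTCONN while the socket still counts as writable, which is the combination that lets a wait-then-send loop run without ever blocking.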

> ls -l /proc/49165/fd/3
lrwx------ 1 glance glance 64 Aug 28 08:39 /proc/49165/fd/3 -> socket:[380252]

This is some kind of socket.

> netstat -nap | grep 49165
EMPTY

My bet is that this is a Unix domain socket.

Kostiantyn Danylov (kdanylov) wrote :

Restarting the service fixes the problem.

There is an old issue similar to this one: http://bugs.python.org/issue15179
That patch is already incorporated in our code.

Kostiantyn Danylov (kdanylov) wrote :

After rsyslog is restarted, services start consuming CPU on their next log call, so this may not happen immediately but only after some time (e.g. on restart).
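One way to picture why the spin begins on the next log call: a send loop that waits for the socket to become writable and retries after every error never finishes once the peer is gone, because the socket stays writable and each send fails at once. The sketch below retries only on would-block errors and lets anything else, such as ENOTCONN, propagate to the caller. It is an illustration only, not the eventlet code and not the patch referenced in this report; `send_once_writable`, `RETRYABLE`, and `max_attempts` are hypothetical names.

```python
import errno
import select
import socket

# Only these errors mean "try again later"; anything else is a real failure.
RETRYABLE = {errno.EAGAIN, errno.EWOULDBLOCK}


def send_once_writable(sock, data, timeout=1.0, max_attempts=5):
    """Wait until sock is writable, then send; retry only on would-block errors."""
    for _ in range(max_attempts):
        # Block (up to timeout) until the socket is reported writable.
        select.select([], [sock], [], timeout)
        try:
            return sock.send(data)
        except OSError as exc:
            if exc.errno not in RETRYABLE:
                # A disconnected socket (e.g. ENOTCONN) is not retryable:
                # surface the error instead of looping on it.
                raise
    raise TimeoutError("socket stayed busy after %d attempts" % max_attempts)
```

With a loop of this shape, a caller such as a logging handler gets an exception it can handle (for example by reconnecting or dropping the message) rather than a busy loop. The bound on attempts is an extra safety net chosen for this sketch.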

Roman Podoliaka (rpodolyaka) wrote :
OSCI Robot (oscirobot) wrote :

Package python-eventlet has been built from changeset: http://gerrit.mirantis.com/21476
DEB Repository URL: http://osci-obs.vm.mirantis.net:82/ubuntu-fuel-5.1-stable-21476/ubuntu
You can build an ISO with this package:
make iso EXTRA_DEB_REPOS="http://osci-obs.vm.mirantis.net:82/ubuntu-fuel-5.1-stable-21476/ubuntu /"

OSCI Robot (oscirobot) wrote :

Package python-eventlet has been built from changeset: http://gerrit.mirantis.com/21477
RPM Repository URL: http://osci-obs.vm.mirantis.net:82/centos-fuel-5.1-stable-21477/centos
You can build an ISO with this package:
make iso EXTRA_RPM_REPOS="osci-testing,http://osci-obs.vm.mirantis.net:82/centos-fuel-5.1-stable-21477/centos"

Roman Vyalov (r0mikiam)
Changed in mos:
status: New → In Progress
Changed in mos:
importance: High → Critical
OSCI Robot (oscirobot) wrote :

Package python-eventlet has been built from changeset: http://gerrit.mirantis.com/21476
DEB Repository URL: http://osci-obs.vm.mirantis.net:82/ubuntu-fuel-5.1-stable/ubuntu
You can build an ISO with this package:
make iso EXTRA_DEB_REPOS="http://osci-obs.vm.mirantis.net:82/ubuntu-fuel-5.1-stable/ubuntu /"

OSCI Robot (oscirobot) wrote :

Package python-eventlet has been built from changeset: http://gerrit.mirantis.com/21477
RPM Repository URL: http://osci-obs.vm.mirantis.net:82/centos-fuel-5.1-stable/centos
You can build an ISO with this package:
make iso EXTRA_RPM_REPOS="osci-testing,http://osci-obs.vm.mirantis.net:82/centos-fuel-5.1-stable/centos"

Changed in mos:
status: In Progress → Fix Committed
OSCI Robot (oscirobot) wrote :

Package python-eventlet has been built from changeset: http://gerrit.mirantis.com/21676
RPM Repository URL: http://osci-obs.vm.mirantis.net:82/centos-fuel-5.0.2-stable-21676/centos
You can build an ISO with this package:
make iso EXTRA_RPM_REPOS="osci-testing,http://osci-obs.vm.mirantis.net:82/centos-fuel-5.0.2-stable-21676/centos"

OSCI Robot (oscirobot) wrote :

Package python-eventlet has been built from changeset: http://gerrit.mirantis.com/21677
DEB Repository URL: http://osci-obs.vm.mirantis.net:82/ubuntu-fuel-5.0.2-stable-21677/ubuntu
You can build an ISO with this package:
make iso EXTRA_DEB_REPOS="http://osci-obs.vm.mirantis.net:82/ubuntu-fuel-5.0.2-stable-21677/ubuntu /"

Irina Povolotskaya (ipovolotskaya) wrote :

Meg, this summary is for you:

This bug may prevent an OpenStack installation from working.

Some OpenStack processes start using 100% of CPU and stop processing requests.
In extreme cases, the service will stop functioning (together with the whole OpenStack).

The bug is triggered by restarting the rsyslogd daemon and is caused by an error in the code.

It can be fixed by updating the python-eventlet package and restarting the OpenStack services.

tags: added: docs
OSCI Robot (oscirobot) wrote :

Package python-eventlet has been built from changeset: http://gerrit.mirantis.com/21676
RPM Repository URL: http://osci-obs.vm.mirantis.net:82/centos-fuel-5.0.2-stable/centos
You can build an ISO with this package:
make iso EXTRA_RPM_REPOS="osci-testing,http://osci-obs.vm.mirantis.net:82/centos-fuel-5.0.2-stable/centos"

OSCI Robot (oscirobot) wrote :

Package python-eventlet has been built from changeset: http://gerrit.mirantis.com/21677
DEB Repository URL: http://osci-obs.vm.mirantis.net:82/ubuntu-fuel-5.0.2-stable/ubuntu
You can build an ISO with this package:
make iso EXTRA_DEB_REPOS="http://osci-obs.vm.mirantis.net:82/ubuntu-fuel-5.0.2-stable/ubuntu /"

tags: added: release-notes
Roman Podoliaka (rpodolyaka) wrote :
tags: added: in progress
Andrey Sledzinskiy (asledzinskiy) wrote :

Verified on:
{

    "build_id": "2014-12-03_01-07-36",
    "ostf_sha": "64cb59c681658a7a55cc2c09d079072a41beb346",
    "build_number": "48",
    "auth_required": true,
    "api": "1.0",
    "nailgun_sha": "500e36d08a45dbb389bf2bd97673d9bff48ee84d",
    "production": "docker",
    "fuelmain_sha": "7626c5aeedcde77ad22fc081c25768944697d404",
    "astute_sha": "ef8aa0fd0e3ce20709612906f1f0551b5682a6ce",
    "feature_groups": [
        "mirantis"
    ],
    "release": "5.1.1",
    "release_versions": {
        "2014.1.3-5.1.1": {
            "VERSION": {
                "build_id": "2014-12-03_01-07-36",
                "ostf_sha": "64cb59c681658a7a55cc2c09d079072a41beb346",
                "build_number": "48",
                "api": "1.0",
                "nailgun_sha": "500e36d08a45dbb389bf2bd97673d9bff48ee84d",
                "production": "docker",
                "fuelmain_sha": "7626c5aeedcde77ad22fc081c25768944697d404",
                "astute_sha": "ef8aa0fd0e3ce20709612906f1f0551b5682a6ce",
                "feature_groups": [
                    "mirantis"
                ],
                "release": "5.1.1",
                "fuellib_sha": "a3043477337b4a0a8fd166dc83d6cd5d504f5da8"
            }

tags: removed: in progress
Anastasia Palkina (apalkina) wrote :

Verified on ISO #56

"build_id": "2014-12-18_01-32-01", "ostf_sha": "a9afb68710d809570460c29d6c3293219d3624d4", "build_number": "56", "auth_required": true, "api": "1.0", "nailgun_sha": "5f91157daa6798ff522ca9f6d34e7e135f150a90", "production": "docker", "fuelmain_sha": "45caacadb878abfbd9d60e134d72229698b469c9", "astute_sha": "16b252d93be6aaa73030b8100cf8c5ca6a970a91", "feature_groups": ["mirantis"], "release": "6.0", "release_versions": {"2014.2-6.0": {"VERSION": {"build_id": "2014-12-18_01-32-01", "ostf_sha": "a9afb68710d809570460c29d6c3293219d3624d4", "build_number": "56", "api": "1.0", "nailgun_sha": "5f91157daa6798ff522ca9f6d34e7e135f150a90", "production": "docker", "fuelmain_sha": "45caacadb878abfbd9d60e134d72229698b469c9", "astute_sha": "16b252d93be6aaa73030b8100cf8c5ca6a970a91", "feature_groups": ["mirantis"], "release": "6.0", "fuellib_sha": "73332192a257ea02c40a39885c502ad1ebdf3eda"}}}, "fuellib_sha": "73332192a257ea02c40a39885c502ad1ebdf3eda"
