[heat] Incorrect run of heat-engine processes

Bug #1520610 reported by Anastasia Kuznetsova on 2015-11-27
This bug affects 1 person
Affects              Importance  Assigned to
Fuel for OpenStack   High        Ivan Berezovskiy
7.0.x                High        Timur Nurlygayanov

Bug Description

Steps to reproduce:
1. Deploy FUEL/MOS 8.0 environment in any configuration
2. Try to get stack list

Observed result:
The command is not executed.
The following errors appear in the heat-engine log:

2015-11-27 08:32:54.423 8539 INFO oslo.messaging._drivers.impl_rabbit [-] Connecting to AMQP server on 192.168.0.3:5673
2015-11-27 08:32:54.440 8539 INFO oslo.messaging._drivers.impl_rabbit [-] Connected to AMQP server on 192.168.0.3:5673
2015-11-27 08:32:55.187 31950 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on 127.0.0.1:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 2 seconds.
2015-11-27 08:32:55.705 31951 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on 127.0.0.1:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 2 seconds.

It seems that heat-engine initially connected to the AMQP server successfully, but later started sending requests to an invalid address.
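To see which process talks to which AMQP endpoint, the log lines can be grouped by PID. The following is only a minimal sketch, assuming the log layout shown in the excerpt above; the function name `endpoints_by_pid` and the regular expressions are illustrative and not part of heat or oslo.messaging.

```python
import re
from collections import defaultdict

# Layout assumed from the excerpt above:
# "<date> <time> <pid> <LEVEL> <logger> [-] <message>"
LINE_RE = re.compile(
    r"^\S+ \S+ (?P<pid>\d+) (?P<level>[A-Z]+) \S+ \[-\] (?P<msg>.*)$"
)
ENDPOINT_RE = re.compile(r"AMQP server on (?P<endpoint>[\d.]+:\d+)")


def endpoints_by_pid(log_text):
    """Map each PID to the set of (level, endpoint) pairs it logged."""
    seen = defaultdict(set)
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a line in the assumed layout
        endpoint = ENDPOINT_RE.search(match.group("msg"))
        if endpoint:
            seen[int(match.group("pid"))].add(
                (match.group("level"), endpoint.group("endpoint"))
            )
    return dict(seen)


# The four log lines quoted in this report.
SAMPLE_LOG = """\
2015-11-27 08:32:54.423 8539 INFO oslo.messaging._drivers.impl_rabbit [-] Connecting to AMQP server on 192.168.0.3:5673
2015-11-27 08:32:54.440 8539 INFO oslo.messaging._drivers.impl_rabbit [-] Connected to AMQP server on 192.168.0.3:5673
2015-11-27 08:32:55.187 31950 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on 127.0.0.1:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 2 seconds.
2015-11-27 08:32:55.705 31951 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on 127.0.0.1:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 2 seconds.
"""

for pid, pairs in sorted(endpoints_by_pid(SAMPLE_LOG).items()):
    print(pid, sorted(pairs))
```

On this excerpt, PID 8539 is the only one that reaches 192.168.0.3:5673, while PIDs 31950 and 31951 are the ones retrying 127.0.0.1:5672, so the successful and failing connections come from different processes.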

I found a very strange set of heat-engine processes on the controller:
root@node-1:~# ps -aux | grep heat-engine
heat 3281 1.1 0.9 278860 69544 ? S Nov23 68:42 /usr/bin/python /usr/bin/heat-engine
heat 3411 0.0 1.5 405028 107568 ? S Nov23 4:13 /usr/bin/python /usr/bin/heat-engine
heat 3412 0.0 1.4 401484 103772 ? S Nov23 4:02 /usr/bin/python /usr/bin/heat-engine
root 3568 0.0 0.0 8864 648 pts/23 S+ 08:31 0:00 grep --color=auto heat-engine
heat 31950 0.3 1.1 294940 83264 ? S Nov23 19:51 /usr/bin/python /usr/bin/heat-engine --config-file=/etc/heat/heat.conf --log-file=/var/log/heat/heat-engine.log
heat 31951 0.3 1.1 294964 83256 ? S Nov23 19:50 /usr/bin/python /usr/bin/heat-engine --config-file=/etc/heat/heat.conf --log-file=/var/log/heat/heat-engine.log

A few heat-engine processes run without a config file, so they use the default AMQP server IP and port. These processes were started by pcs.

It is not clear why there are two more heat-engine processes.
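The check for processes started without a config file can be sketched as follows. This is an illustration only: it assumes the `ps aux` column layout shown above (command starting at the eleventh field), and the helper name `engines_without_config` is hypothetical.

```python
def engines_without_config(ps_aux_output):
    """Return PIDs of heat-engine processes started without --config-file."""
    flagged = []
    for line in ps_aux_output.splitlines():
        fields = line.split()
        # ps aux: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND...
        if len(fields) < 11:
            continue
        command = fields[10:]
        if command[0] == "grep":
            continue  # skip the grep process itself
        if not any(arg.endswith("heat-engine") for arg in command):
            continue
        if not any(arg.startswith("--config-file") for arg in command):
            flagged.append(int(fields[1]))
    return flagged


# The ps output quoted in this report.
SAMPLE_PS = """\
heat 3281 1.1 0.9 278860 69544 ? S Nov23 68:42 /usr/bin/python /usr/bin/heat-engine
heat 3411 0.0 1.5 405028 107568 ? S Nov23 4:13 /usr/bin/python /usr/bin/heat-engine
heat 3412 0.0 1.4 401484 103772 ? S Nov23 4:02 /usr/bin/python /usr/bin/heat-engine
root 3568 0.0 0.0 8864 648 pts/23 S+ 08:31 0:00 grep --color=auto heat-engine
heat 31950 0.3 1.1 294940 83264 ? S Nov23 19:51 /usr/bin/python /usr/bin/heat-engine --config-file=/etc/heat/heat.conf --log-file=/var/log/heat/heat-engine.log
heat 31951 0.3 1.1 294964 83256 ? S Nov23 19:50 /usr/bin/python /usr/bin/heat-engine --config-file=/etc/heat/heat.conf --log-file=/var/log/heat/heat-engine.log
"""

print(engines_without_config(SAMPLE_PS))
```

For the listing above this flags PIDs 3281, 3411 and 3412.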

summary: - [heat] Incorrect run of heat-engine process
+ [heat] Incorrect run of heat-engine processes
description: updated
Changed in fuel:
assignee: nobody → MOS Puppet Team (mos-puppet)
milestone: none → 8.0
importance: Undecided → High
Changed in fuel:
assignee: MOS Puppet Team (mos-puppet) → Ivan Berezovskiy (iberezovskiy)
Changed in fuel:
status: New → Confirmed
Dmitry Pyzhov (dpyzhov) on 2015-12-01
tags: added: area-mos
tags: added: blocker-for-qa
tags: added: heat

Fix proposed to branch: master
Review: https://review.openstack.org/255808

Changed in fuel:
status: Confirmed → In Progress

Reviewed: https://review.openstack.org/255808
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=5f1683b098ba3f6b6043af42e5d0a5d36f4312c1
Submitter: Jenkins
Branch: master

commit 5f1683b098ba3f6b6043af42e5d0a5d36f4312c1
Author: iberezovskiy <email address hidden>
Date: Thu Dec 10 14:19:37 2015 +0300

    Install heat-docker package only after heat-engine

    Heat-engine package is heat-docker dependency,
    so if installation of heat-docker package is performed before
    heat-engine installation it leads to autostart of heat-engine package
    (in that moment override file isn't exist yet, because it will be
    triggered directly before evaluation of heat-engine package resource).
    As a result, the heat-engine service is run without any configuration.

    To fix that we need to set heat-engine package installation strictly
    before heat-docker package.

    Change-Id: I5420e64e3ab6b2ca0305f5f41eb722e3ead42b25
    Closes-bug: #1520610
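
The ordering constraint described in the commit message can be pictured as a small dependency graph. The sketch below is not the Puppet change itself; it only models the intended order with Python's standard `graphlib` module (Python 3.9+), and the resource names are hypothetical labels.

```python
from graphlib import TopologicalSorter

# Hypothetical labels for the resources named in the commit message.
# Each key maps to the set of resources that must be applied before it.
dependencies = {
    "heat-engine override file": set(),
    "heat-engine package": {"heat-engine override file"},
    "heat-docker package": {"heat-engine package"},
}

# static_order() yields every node after all of its predecessors.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

With the heat-docker package depending on the heat-engine package, the override file is always in place before anything can pull in and autostart heat-engine.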

Changed in fuel:
status: In Progress → Fix Committed

It was reproduced on MOS 7.0 in a customer environment.

tags: added: customer-found

I am trying to reproduce this on a cluster with 2 controllers and 1 compute node, on 8.0 ISO #429.

node-1 and node-2 are the controllers.

NODE-1:
root@node-1:~# initctl list | grep heat-engine
heat-engine stop/waiting
root@node-1:~# ps ax | grep heat-engine
 4828 ? S 0:06 /usr/bin/python /usr/bin/heat-engine --config-file=/etc/heat/heat.conf
 5444 ? S 0:00 /usr/bin/python /usr/bin/heat-engine --config-file=/etc/heat/heat.conf
 5445 ? S 0:00 /usr/bin/python /usr/bin/heat-engine --config-file=/etc/heat/heat.conf

NODE-2:
root@node-2:~/test/mos-integration-tests/mos_tests/heat# initctl list | grep heat-engine
heat-engine start/running, process 21535
root@node-2:~/test/mos-integration-tests/mos_tests/heat# ps ax | grep heat-engine
22090 ? Rs 0:00 /usr/bin/python /usr/bin/heat-engine --config-file=/etc/heat/heat.conf --log-file=/var/log/heat/heat-engine.log
22102 pts/27 S+ 0:00 grep --color=auto heat-engine
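
The mismatch visible on node-1 (upstart reports stop/waiting while heat-engine processes are running) can be detected with a small comparison. This is a rough sketch, assuming the `initctl list` and `ps ax` output formats shown above; all function names are hypothetical.

```python
def upstart_running(initctl_output, job="heat-engine"):
    """True if `initctl list` output reports the job as start/running."""
    for line in initctl_output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == job:
            return parts[1].rstrip(",") == "start/running"
    return False


def engine_pids(ps_ax_output):
    """PIDs of heat-engine processes in `ps ax` output, excluding grep."""
    pids = []
    for line in ps_ax_output.splitlines():
        fields = line.split()
        # ps ax: PID TTY STAT TIME COMMAND...
        if len(fields) < 5:
            continue
        command = fields[4:]
        if command[0] == "grep":
            continue
        if any(arg.endswith("/heat-engine") for arg in command):
            pids.append(int(fields[0]))
    return pids


def not_tracked_by_upstart(initctl_output, ps_ax_output):
    """heat-engine PIDs that run while upstart considers the job stopped."""
    if upstart_running(initctl_output):
        return []
    return engine_pids(ps_ax_output)


# Output quoted above for node-1 and node-2.
NODE1_INITCTL = "heat-engine stop/waiting"
NODE1_PS = """\
 4828 ? S 0:06 /usr/bin/python /usr/bin/heat-engine --config-file=/etc/heat/heat.conf
 5444 ? S 0:00 /usr/bin/python /usr/bin/heat-engine --config-file=/etc/heat/heat.conf
 5445 ? S 0:00 /usr/bin/python /usr/bin/heat-engine --config-file=/etc/heat/heat.conf
"""
NODE2_INITCTL = "heat-engine start/running, process 21535"
NODE2_PS = """\
22090 ? Rs 0:00 /usr/bin/python /usr/bin/heat-engine --config-file=/etc/heat/heat.conf --log-file=/var/log/heat/heat-engine.log
22102 pts/27 S+ 0:00 grep --color=auto heat-engine
"""

print(not_tracked_by_upstart(NODE1_INITCTL, NODE1_PS))
print(not_tracked_by_upstart(NODE2_INITCTL, NODE2_PS))
```

On node-1 all three processes are reported as running outside upstart's view; on node-2 the list is empty because upstart reports the job as running.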

####
Stop heat engine on NODE-1 only:

Node-1:
root@node-1:~# crm resource stop clone_p_heat-engine

root@node-1:~# ps ax | grep heat-engine
 7248 pts/27 S+ 0:00 grep --color=auto heat-engine
root@node-1:~# initctl list | grep heat-engine
heat-engine stop/waiting

Heat-engine was stopped on node-1.

Node-2:
root@node-2:~/test/mos-integration-tests/mos_tests/heat# initctl list | grep heat-engine
heat-engine start/running, process 29938
root@node-2:~/test/mos-integration-tests/mos_tests/heat# ps ax | grep heat-engine
30213 ? Rs 0:00 /usr/bin/python /usr/bin/heat-engine --config-file=/etc/heat/heat.conf --log-file=/var/log/heat/heat-engine.log

Heat-engine is still running on node-2!

Start heat-engine again. Ensure that it is running.

Then stop heat-engine on NODE-2 only:

root@node-2:~/test/mos-integration-tests/mos_tests/heat# crm resource stop clone_p_heat-engine

root@node-2:~/test/mos-integration-tests/mos_tests/heat# initctl list | grep heat-engine
heat-engine start/running, process 6412
root@node-2:~/test/mos-integration-tests/mos_tests/heat# ps ax | grep heat-engine
 7329 ? Rs 0:00 /usr/bin/python /usr/bin/heat-engine --config-file=/etc/heat/heat.conf --log-file=/var/log/heat/heat-engine.log
Heat-engine isn't stopped on node-2!

Node-1:
root@node-1:~# ps ax | grep heat-engine
16682 pts/27 S+ 0:00 grep --color=auto heat-engine
root@node-1:~# initctl list | grep heat-engine
heat-engine stop/waiting

Heat-engine is running on NODE-2 and stopped on NODE-1.

Stop heat-engine using service:

root@node-2:~/test/mos-integration-tests/mos_tests/heat# service heat-engine stop
heat-engine stop/waiting
root@node-2:~/test/mos-integration-tests/mos_tests/heat# ps ax | grep heat-engine
 7838 pts/27 S+ 0:00 grep --color=auto heat-engine
root@node-2:~/test/mos-integration-tests/mos_tests/heat# initctl list | grep heat-engine
heat-engine stop/waiting

The service is stopped.

Start heat-engine using crm on node-1.
Try to stop the service using service stop:
root@node-1:~# service heat-engine stop
stop: Unknown instance:

root@node-1:~# ps ax | grep heat-engine
15754 ? S 0:08 /usr/bin/python /usr/bin/heat-engine --config-file=/etc/heat/heat.conf
15891 ? S 0:01 /usr/bin/python /u...


Tatyanka (tatyana-leontovich) wrote :

VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "8.0"
  api: "1.0"
  build_number: "466"
  build_id: "466"
  fuel-nailgun_sha: "f81311bbd6fee2665e3f96dcac55f72889b2f38c"
  python-fuelclient_sha: "4f234669cfe88a9406f4e438b1e1f74f1ef484a5"
  fuel-agent_sha: "6823f1d4005a634b8436109ab741a2194e2d32e0"
  fuel-nailgun-agent_sha: "b2bb466fd5bd92da614cdbd819d6999c510ebfb1"
  astute_sha: "b81577a5b7857c4be8748492bae1dec2fa89b446"
  fuel-library_sha: "fe03d887361eb80232e9914eae5b8d54304df781"
  fuel-ostf_sha: "ab5fd151fc6c1aa0b35bc2023631b1f4836ecd61"
  fuel-mirror_sha: "b62f3cce5321fd570c6589bc2684eab994c3f3f2"
  fuelmenu_sha: "fac143f4dfa75785758e72afbdc029693e94ff2b"
  shotgun_sha: "63645dea384a37dde5c01d4f8905566978e5d906"
  network-checker_sha: "9f0ba4577915ce1e77f5dc9c639a5ef66ca45896"
  fuel-upgrade_sha: "616a7490ec7199f69759e97e42f9b97dfc87e85b"
  fuelmain_sha: "727f7076f04cb0caccc9f305b149a2b5b5c2af3a"

Changed in fuel:
status: Fix Committed → Fix Released
Denis Puchkin (dpuchkin) wrote :

Cannot reproduce this bug in MOS 7.0.

Timur, can you provide logs and more info about the customer environment?
