Instance's Task State never changes from "scheduling" after sending SIGHUP to nova-compute

Bug #1303615 reported by Mitsuru Kanabuchi
This bug affects 4 people
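+--------------------------+--------------+------------+---------------+-----------+
| Affects                  | Status       | Importance | Assigned to   | Milestone |
+--------------------------+--------------+------------+---------------+-----------+
| OpenStack Compute (nova) | Fix Released | High       | Ankit Agrawal |           |
+--------------------------+--------------+------------+---------------+-----------+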

Bug Description

[Issue]

I tried to reload nova.conf by sending SIGHUP to nova-compute.
As I understand it, nova-compute reloads nova.conf on receiving SIGHUP when it has been started as a daemon.

Reloading the config succeeds.
However, booting a new instance does not work correctly after SIGHUP has been sent.
The new instance's Task State never changes from "scheduling".

$ nova list
+--------------------------------------+------+--------+------------+-------------+----------+
| ID                                   | Name | Status | Task State | Power State | Networks |
+--------------------------------------+------+--------+------------+-------------+----------+
| 9aa89186-28fa-44ee-8e97-84bd0de23f91 | vm   | BUILD  | scheduling | NOSTATE     |          |
+--------------------------------------+------+--------+------------+-------------+----------+
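
For readers unfamiliar with the pattern, here is a minimal Python sketch of the reload-on-SIGHUP behaviour described above. It is not nova's implementation: `ReloadableService`, `load_config`, `install` and `poll` are hypothetical names chosen for illustration, and the sketch only re-reads a config when the signal arrives.

```python
import signal


class ReloadableService:
    """Toy service that re-reads its config after SIGHUP (hypothetical, not nova code)."""

    def __init__(self, load_config):
        # load_config is an assumed callable that returns the current settings.
        self._load_config = load_config
        self.config = load_config()
        self.reload_count = 0
        self._reload_requested = False

    def _on_sighup(self, signum, frame):
        # Keep the handler tiny: only record that a reload was requested.
        self._reload_requested = True

    def install(self):
        # Register the handler; must be called from the main thread on a POSIX system.
        signal.signal(signal.SIGHUP, self._on_sighup)

    def poll(self):
        """Apply a pending reload, if any. Returns True when a reload happened."""
        if not self._reload_requested:
            return False
        self._reload_requested = False
        self.config = self._load_config()
        self.reload_count += 1
        return True
```

In a real daemon, `install()` would be called once at startup and `poll()` from the main loop, so the actual reload runs outside the signal handler.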

[How to reproduce]

nova's commit id: 33fc957a5aeb9d310cbff3ac22c7a3c97a794f72

1) Start nova-compute as a daemon.

$ sudo cat /etc/init/nova-compute.conf
description "nova-compute"
author "openstack"

start on (local-filesystems and net-device-up IFACE!=lo)
stop on runlevel [016]

exec su -s /bin/sh -c "exec /usr/local/bin/nova-compute --config-file /etc/nova/nova.conf --log-file /home/devstack/log/nova-compute.log > /dev/null 2>&1" devstack

$ sudo service nova-compute start
nova-compute start/running, process 10521

2) Send SIGHUP to nova-compute's PID.

$ ps aux|grep nova-compute
root 10521 0.0 0.0 4052 1548 ? Ss 15:42 0:00 su -s /bin/sh -c exec /usr/local/bin/nova-compute --config-file /etc/nova/nova.conf --log-file /home/devstack/log/nova-compute.log > /dev/null 2>&1 devstack
devstack 10523 19.5 1.5 220744 32796 ? Ssl 15:42 0:00 /usr/bin/python /usr/local/bin/nova-compute --config-file /etc/nova/nova.conf --log-file /home/devstack/log/nova-compute.log
$ sudo kill -SIGHUP 10523

3) Check nova-compute.log to verify that the reload succeeded.

$ cat /home/devstack/log/nova-compute.log
    :
2014-04-07 15:46:18.791 INFO nova.openstack.common.service [-] Caught SIGHUP, exiting
2014-04-07 15:46:18.811 DEBUG nova.openstack.common.service [-] Full set of CONF: from (pid=10523) _wait_for_exit_or_signal /opt/stack/nova/nova/openstack/common/service.py:167
    :

4) Boot a new instance and check its Task State repeatedly.

$ nova boot --flavor 1 --image dee24998-10f7-42a3-8cd7-d46d185281ca vm
+--------------------------------------+----------------------------------------------------------------+
| Property                             | Value                                                          |
+--------------------------------------+----------------------------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                                         |
| OS-EXT-AZ:availability_zone          | nova                                                           |
    :

5) The Task State never changes from "scheduling".

$ nova list
+--------------------------------------+------+--------+------------+-------------+----------+
| ID                                   | Name | Status | Task State | Power State | Networks |
+--------------------------------------+------+--------+------------+-------------+----------+
| 9aa89186-28fa-44ee-8e97-84bd0de23f91 | vm   | BUILD  | scheduling | NOSTATE     |          |
+--------------------------------------+------+--------+------------+-------------+----------+
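
The repeated check in steps 4 and 5 could be automated with a small polling helper. The following is a sketch under stated assumptions: `get_task_state` is a hypothetical callable that returns the instance's current Task State as a string (for example by wrapping whatever client you use to query the instance); it is not part of nova or its CLI.

```python
import time


def wait_for_task_state_change(get_task_state, stuck_state="scheduling",
                               attempts=10, interval=1.0):
    """Poll until the Task State differs from stuck_state.

    get_task_state is an assumed, caller-supplied function (hypothetical).
    Returns the new state, or None if the state never changed within
    the given number of attempts.
    """
    for _ in range(attempts):
        state = get_task_state()
        if state != stuck_state:
            return state
        time.sleep(interval)
    return None
```

With the bug present, such a helper would be expected to return `None` for the instance above, because the Task State stays at "scheduling".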

Tags: compute
Tracy Jones (tjones-i)
tags: added: compute
Changed in nova:
assignee: nobody → Ankit Agrawal (ankitagrawal)
Tushar Patil (tpatil)
Changed in nova:
status: New → Confirmed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/98076

Changed in nova:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Ankit Agrawal (<email address hidden>) on branch: master
Review: https://review.openstack.org/98076
Reason: Abandoning this patch as it has been taken care of by the following patches:
https://review.openstack.org/#/c/104887

https://review.openstack.org/#/c/104099

Abhijeet Malawade (abhijeet-malawade) wrote :

This issue still occurs, and the cause is in oslo.messaging. After SIGHUP is sent to nova-compute, the service is stopped, but the connections opened by its listeners are not closed. When the service restarts, it creates new listeners, while the old listener objects are not garbage collected.

The oslo.messaging patch that fixes this issue is still under review: https://review.openstack.org/#/c/103186/
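
The leak described in this comment can be illustrated with a toy model. This is only a sketch, not oslo.messaging code: `Listener`, `Service`, `registry`, and `close_on_stop` are hypothetical names, and the shared `registry` list stands in for whatever keeps the old listener objects referenced after the service stops.

```python
class Listener:
    """Stand-in for a message listener holding an open connection (hypothetical)."""

    def __init__(self, registry):
        self.open = True
        # The registry keeps a reference, so the object is never garbage collected.
        registry.append(self)

    def close(self):
        self.open = False


class Service:
    """Toy service that creates one listener per start (hypothetical)."""

    def __init__(self, registry, close_on_stop):
        self.registry = registry
        self.close_on_stop = close_on_stop
        self.listener = None

    def start(self):
        self.listener = Listener(self.registry)

    def stop(self):
        # The leaky variant drops its reference without closing the connection.
        if self.close_on_stop and self.listener is not None:
            self.listener.close()
        self.listener = None

    def restart(self):
        # In this sketch, this is what a SIGHUP triggers.
        self.stop()
        self.start()


def open_listeners(registry):
    """Count listeners whose connection is still open."""
    return sum(1 for listener in registry if listener.open)
```

In this model, a service created with `close_on_stop=False` accumulates one extra open listener per restart, while one created with `close_on_stop=True` always has exactly one.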

Sean Dague (sdague)
Changed in nova:
importance: Undecided → High
Changed in nova:
status: In Progress → Confirmed
Tushar Patil (tpatil) wrote :

Patch https://review.openstack.org/#/c/103186/ has been merged, and I no longer see this issue.

Changed in nova:
status: Confirmed → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → kilo-rc1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: kilo-rc1 → 2015.1.0