I guess I have hit this issue on a real, running cluster.
I did some investigation to find the difference between instances that lose their network interfaces after a hard reboot and those that don't.
When I say 'lose their network interfaces', I mean the network device is no longer present in the guest or in the libvirt configuration.
On a bugged instance:
We can see that the Networks column is empty in 'nova list', but the interface is still attached:
# nova list --name XXXXXXXXXXXX
+--------------------------------------+---------------------+--------+------------+-------------+----------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+---------------------+--------+------------+-------------+----------+
| 940be621-97a9-409a-8c31-7c0c9c8afbfd | XXXXXX | ACTIVE | - | Running | |
+--------------------------------------+---------------------+--------+------------+-------------+----------+
# nova interface-list 940be621-97a9-409a-8c31-7c0c9c8afbfd
+------------+--------------------------------------+--------------------------------------+---------------------------------------+-------------------+
| Port State | Port ID | Net ID | IP addresses | MAC Addr |
+------------+--------------------------------------+--------------------------------------+---------------------------------------+-------------------+
| ACTIVE | cdfb0e9f-d0cd-4ca7-9954-0fce25680905 | 07487477-8cbd-4d2a-b549-713a964ddb51 | XXXXXXXXXXXXXXXXX | fa:16:3e:fc:78:a3 |
+------------+--------------------------------------+--------------------------------------+---------------------------------------+-------------------+
So I looked into the db, and it seems that something has emptied the "instance_info_caches.network_info" column for some of my instances:
mysql> select * from instance_info_caches where id = 174;
+---------------------+---------------------+------------+-----+--------------+--------------------------------------+---------+
| created_at | updated_at | deleted_at | id | network_info | instance_uuid | deleted |
+---------------------+---------------------+------------+-----+--------------+--------------------------------------+---------+
| 2014-09-22 13:59:11 | 2014-10-20 05:39:48 | NULL | 174 | [] | 940be621-97a9-409a-8c31-7c0c9c8afbfd | 0 |
+---------------------+---------------------+------------+-----+--------------+--------------------------------------+---------+
1 row in set (0.00 sec)
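To spot other instances in the same state, the same check can be applied to every cache row. Below is a minimal sketch, not something I ran against the cluster: it assumes the rows have already been fetched as (instance_uuid, network_info) pairs, and that network_info holds a JSON list as in the row above. The function name and the second sample row are made up for illustration.

```python
import json


def find_empty_network_caches(rows):
    """Return the instance UUIDs whose cached network_info is empty.

    rows: iterable of (instance_uuid, network_info) pairs, where
    network_info is the raw text of instance_info_caches.network_info
    (assumed here to be a JSON list).
    """
    empty = []
    for instance_uuid, network_info in rows:
        # Treat NULL or an empty string the same way as an empty JSON list.
        vifs = json.loads(network_info) if network_info else []
        if not vifs:
            empty.append(instance_uuid)
    return empty


rows = [
    # The bugged instance from the query above.
    ("940be621-97a9-409a-8c31-7c0c9c8afbfd", "[]"),
    # Hypothetical healthy row, for contrast only.
    ("00000000-0000-0000-0000-000000000001", '[{"id": "example-port"}]'),
]
print(find_empty_network_caches(rows))
```

Any UUID returned this way could then be compared against 'nova interface-list' to confirm that the port is still attached while the cache is empty. The sketch does not filter on the "deleted" column, so rows for deleted instances would need to be excluded beforehand.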