Tried to run Tempest suites for MOS 10.0, but failed to deploy the usual Tempest environment with Ceph, Sahara, DVR, and Ironic.
Nodes are stuck in the "discover" state, and several errors can be seen on the master node.
Snapshots used for the deployment: 1069, 1080.
A diagnostic snapshot is attached.
[root@nailgun ~]# fuel nodes
id | status | name | cluster | ip | mac | roles | pending_roles | online | group_id
---+----------+---------------------------+---------+------------+-------------------+-------+-------------------+--------+---------
3 | discover | slave-03_controller_mongo | 1 | 10.109.8.6 | 64:e0:47:90:d5:02 | | controller, mongo | 1 | 1
1 | discover | slave-01_controller_mongo | 1 | 10.109.8.4 | 64:ff:91:87:ed:0c | | controller, mongo | 1 | 1
2 | discover | slave-02_controller_mongo | 1 | 10.109.8.5 | 64:66:e4:f2:39:b3 | | controller, mongo | 1 | 1
4 | discover | slave-05_compute_cinder | 1 | 10.109.8.8 | 64:e8:dd:c9:82:cb | | cinder, compute | 1 | 1
6 | discover | slave-04_compute_cinder | 1 | 10.109.8.7 | 64:2a:9a:56:74:b1 | | cinder, compute | 1 | 1
5 | discover | slave-06_ironic | 1 | 10.109.8.9 | 64:bf:34:ac:84:bf | | ironic | 1 | 1
[root@nailgun ~]# grep -R ERROR /var/log/*
/var/log/anaconda/syslog:13:50:32,609 CRIT firewalld: 2016-12-08 13:50:32 FATAL ERROR: No IPv4 and IPv6 firewall.
/var/log/anaconda/syslog:13:50:32,609 ERR firewalld: 2016-12-08 13:50:32 ERROR: Raising SystemExit in run_server
/var/log/anaconda/journal.log:Dec 08 13:52:00 nailgun.test.domain.local dracut[3048]: -rw-r--r-- 1 root root 191 Mar 6 2015 usr/lib/kbd/consolefonts/ERRORS
/var/log/anaconda/journal.log:Dec 08 13:52:42 nailgun.test.domain.local dracut[13123]: -rw-r--r-- 1 root root 191 Mar 6 2015 usr/lib/kbd/consolefonts/ERRORS
/var/log/anaconda/journal.log:Dec 08 13:53:14 nailgun.test.domain.local dracut[25563]: -rw-r--r-- 1 root root 191 Mar 6 2015 usr/lib/kbd/consolefonts/ERRORS
/var/log/fuel-bootstrap-image-build.log:modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/3.10.0-327.36.3.el7.x86_64/modules.dep.bin'
/var/log/mcollective.log:E, [2016-12-08T14:21:16.863080 #20910] ERROR -- : rabbitmq.rb:50:in `on_hbread_fail' Heartbeat read failed from 'stomp://mcollective@10.109.8.2:61613': {"ticker_interval"=>29.5, "read_fail_count"=>0, "lock_fail"=>true, "lock_fail_count"=>1}
/var/log/mcollective.log:E, [2016-12-08T16:51:55.697225 #20910] ERROR -- : rabbitmq.rb:50:in `on_hbread_fail' Heartbeat read failed from 'stomp://mcollective@10.109.8.2:61613': {"ticker_interval"=>29.5, "read_fail_count"=>0, "lock_fail"=>true, "lock_fail_count"=>1}
/var/log/nailgun/api.log:2016-12-08 14:24:03.913 ERROR [7fd2752d3880] (logger) Response code '500 Internal Server Error' for PUT /api/clusters/1/network_configuration/neutron/verify/ from 10.109.8.1:41906
/var/log/nailgun/api.log:2016-12-08 14:24:24.371 ERROR [7fd2752d3880] (logger) Response code '500 Internal Server Error' for PUT /api/clusters/1/network_configuration/neutron/verify/ from 10.109.8.1:41906
/var/log/nailgun/app.log:2016-12-08 14:24:03.907 ERROR [7fd2752d3880] (base) Unexpected exception occured
/var/log/nailgun/app.log:2016-12-08 14:24:24.366 ERROR [7fd2752d3880] (base) Unexpected exception occured
/var/log/remote/10.109.8.7/bootstrap/mcollective.log:2016-12-08T14:21:50.960975+00:00 err: 14:21:50.655407 #1404] ERROR -- : rabbitmq.rb:50:in `on_hbread_fail' Heartbeat read failed from 'stomp://mcollective@10.109.8.2:61613': {"ticker_interval"=>29.5, "read_fail_count"=>0, "lock_fail"=>true, "lock_fail_count"=>1}
/var/log/remote/10.109.8.9/bootstrap/mcollective.log:2016-12-08T16:52:22.876456+00:00 err: 16:52:22.751793 #1399] ERROR -- : rabbitmq.rb:50:in `on_hbread_fail' Heartbeat read failed from 'stomp://mcollective@10.109.8.2:61613': {"ticker_interval"=>29.5, "read_fail_count"=>0, "lock_fail"=>true, "lock_fail_count"=>1}
/var/log/remote/10.109.8.4/bootstrap/mcollective.log:2016-12-08T14:21:39.087455+00:00 err: 14:21:38.871774 #1363] ERROR -- : rabbitmq.rb:50:in `on_hbread_fail' Heartbeat read failed from 'stomp://mcollective@10.109.8.2:61613': {"ticker_interval"=>29.5, "read_fail_count"=>0, "lock_fail"=>true, "lock_fail_count"=>1}
This bug happens due to an incorrect test setup: the node interfaces (except for the admin one) are not assigned to any networks. Here is a snippet of the node-1 networking config; a quick way to spot such unassigned interfaces is sketched after the snippet.
- assigned_networks:
  - id: 1
    name: fuelweb_admin
  bus_info: '0000:00:03.0'
  current_speed: null
  driver: virtio_net
  id: 1
  interface_properties:
    disable_offloading: false
    dpdk:
      available: false
      enabled: false
    mtu: null
    numa_node: 0
    pci_id: 1af4:0001
    sriov:
      available: false
      enabled: false
  ...
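For comparison, on a correctly prepared environment every non-admin interface should carry the networks the cluster needs (public, management, storage, private). Below is a minimal sketch of how the misconfiguration can be spotted in a node interfaces config dumped to YAML; the file name interfaces_1.yaml and the download step (e.g. via the Fuel CLI or API) are assumptions for illustration only, not part of this report.

# Minimal sketch: list assigned networks per NIC and flag interfaces
# that have none. The input file name is hypothetical.
import yaml

with open('interfaces_1.yaml') as f:
    nics = yaml.safe_load(f)

for nic in nics:
    label = nic.get('name') or 'id {}'.format(nic.get('id'))
    nets = [net.get('name') for net in nic.get('assigned_networks') or []]
    if nets:
        print('{}: {}'.format(label, ', '.join(nets)))
    else:
        print('{}: no networks assigned'.format(label))

In the broken setup above, every interface except the admin one would be reported with no networks assigned, which is why deployment never starts and the nodes stay in "discover". The fix is on the test-setup side: assign the environment networks to the node interfaces before starting the deployment.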