Time sometimes synced in wrong order after revert of snapshots
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Fuel for OpenStack |
Invalid
|
Medium
|
Tatyanka |
Bug Description
{"build_id": "2014-05-
After investigation of issue we found the next problem. Time on nodes was change from 12:30 on 9:26. It is wrong behavior. May be caused by reverting of snapshot
Steps to Reproduce:
1. Deploy one node cluster using KVM
2. Delete it
3. Deploy neutron gre cluser with 4 nodes (1controller 2 computes 1 cinder+ mongo) with murano/
Expected
Deployment pass , cluster ready, ostf pass
Actual Result:
Controller nodes go offline during installation of openstack
node shows offline on ui, but marked with error in fuel nodes
id | status | name | cluster | ip | mac | roles | pending_roles | online
---|---
5 | deploying | Untitled (3c:b2) | 2 | 10.108.0.6 | e6:b9:c3:32:6b:48 | cinder, mongo | | True
4 | ready | Untitled (8a:3d) | 2 | 10.108.0.5 | 6a:bf:99:49:e3:44 | compute | | True
3 | ready | Untitled (8e:5c) | 2 | 10.108.0.4 | c6:c2:fe:83:78:4b | compute | | True
2 | error | Untitled (cd:66) | 2 | 10.108.0.203 | 64:d1:8c:f1:cd:66 | controller | | False
mco ping
4 time=54.02 ms
3 time=56.93 ms
5 time=58.93 ms
1 time=62.11 ms
master time=106.54 ms
node in error state is node -2, but hostname on this node says that it is node-1
arp -a on master
[root@nailgun log]# arp -a
? (172.17.0.11) at de:94:a6:ae:17:03 [ether] on docker0
node-4.
? (172.17.0.10) at d6:ee:16:fc:ad:07 [ether] on docker0
? (172.17.0.8) at f6:7a:3d:10:f2:46 [ether] on docker0
? (172.17.0.4) at fa:3c:04:24:47:d0 [ether] on docker0
node-3.
node-5.
node-2.
? (10.108.0.1) at 52:54:00:2f:85:46 [ether] on eth0
? (172.17.0.3) at aa:f1:99:a8:59:c9 [ether] on docker0
? (172.17.0.7) at 3a:18:a7:aa:a0:b0 [ether] on docker0
03903be110e3 (172.17.0.2) at 0a:4a:38:9a:fc:3a [ether] on docker0
? (172.17.0.9) at da:70:23:b1:75:b8 [ether] on docker0
ip addr on node-2
ip addr show
1: lo: <LOOPBACK,
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,
link/ether 64:c6:f9:17:f6:87 brd ff:ff:ff:ff:ff:ff
inet 10.108.0.3/24 brd 10.108.0.255 scope global eth0
inet6 fe80::66c6:
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,
link/ether 64:0e:11:ea:81:55 brd ff:ff:ff:ff:ff:ff
inet 10.108.1.2/24 brd 10.108.1.255 scope global eth1
inet6 fe80::660e:
valid_lft forever preferred_lft forever
4: eth2: <BROADCAST,
link/ether 64:b9:0b:f7:02:79 brd ff:ff:ff:ff:ff:ff
inet 10.108.2.2/24 brd 10.108.2.255 scope global eth2
inet6 fe80::66b9:
valid_lft forever preferred_lft forever
5: eth3: <BROADCAST,
link/ether 64:0e:8f:31:f9:75 brd ff:ff:ff:ff:ff:ff
inet6 fe80::660e:
valid_lft forever preferred_lft forever
6: eth4: <BROADCAST,
link/ether 64:52:46:7b:19:2d brd ff:ff:ff:ff:ff:ff
inet 10.108.4.2/24 brd 10.108.4.255 scope global eth4
inet6 fe80::6652:
valid_lft forever preferred_lft forever
7: eth3.103@eth3: <BROADCAST,
link/ether 64:0e:8f:31:f9:75 brd ff:ff:ff:ff:ff:ff
inet6 fe80::660e:
valid_lft forever preferred_lft forever
Info: ADanin - ip conflict here take place (for case remove 1 cluster - deploy new one)
Changed in fuel: | |
assignee: | nobody → Fuel Python Team (fuel-python) |
Changed in fuel: | |
assignee: | Fuel Python Team (fuel-python) → Tatyana (tatyana-leontovich) |
summary: |
- Controller goes offline on deploying phase If deletion of cluser take - place previously + Time sometimes sync in wrong order after revert of snapshots |
summary: |
- Time sometimes sync in wrong order after revert of snapshots + Time sometimes synced in wrong order after revert of snapshots |
description: | updated |
Changed in fuel: | |
importance: | High → Medium |
milestone: | 5.0 → 5.1 |
Changed in fuel: | |
status: | New → Confirmed |
Node-1 was not removed because of MCollective temporary failed to communicate to node-1. 06T09:33: 27.707065+ 00:00 debug: W, [2014-05- 06T12:37: 40.491489 #2248] WARN -- : runner.rb:122:in `receive' PLMC9: Expired Message: MCollective: :MsgTTLExpired: message f301a3ed51e459e d91557e71165bce ea from uid=0@6de00bbd14dc created at 1399368802 is 11058 seconds old, TTL is 5400
MCO log from node-1:
2014-05-
First timestamp in logs is a master-node time, and second timestamp is a slave-node time.
There was a timeshift occured on a master-node from 2014-05-06T12:30:05 to 2014-05- 06T09:26: 17: 06T12:30: 05.014187+ 00:00 debug: D, [2014-05- 06T12:29: 57.876365 #2248] DEBUG -- : rabbitmq.rb:66:in `on_hbfire' Publishing heartbeat 06T09:26: 17.335250+ 00:00 debug: D, [2014-05- 06T12:30: 26.377906 #2248] DEBUG -- : rabbitmq.rb:64:in `on_hbfire' Received heartbeat
2014-05-
2014-05-
and after approx 15 minutes node-1 made its own timeshift: 06T09:39: 58.047350+ 00:00 debug: D, [2014-05- 06T12:44: 12.494154 #2248] DEBUG -- : rabbitmq.rb:64:in `on_hbfire' Received heartbeat 06T09:40: 28.077188+ 00:00 debug: D, [2014-05- 06T09:40: 24.296672 #2248] DEBUG -- : rabbitmq.rb:64:in `on_hbfire' Received heartbeat
2014-05-
2014-05-