Bug #1490778 “multinode deploy results in intermittent authentic...” : Bugs : kolla

Steven Dake (sdake) wrote on 2015-09-01:

#1

[sdake@minime-03 ~]$ docker exec keystone tail -20 /var/log/keystone/keystone.log
2015-08-31 23:38:52.842 18 INFO keystone.common.wsgi [-] POST http://192.168.1.148:35357/v3/auth/tokens
2015-08-31 23:38:55.874 12 INFO keystone.common.wsgi [-] POST http://broked.selfip.net:5000/v3/auth/tokens
2015-08-31 23:38:58.858 14 INFO keystone.common.wsgi [-] POST http://broked.selfip.net:5000/v3/auth/tokens
2015-08-31 23:38:58.930 20 INFO keystone.common.wsgi [-] GET http://192.168.1.148:35357/
2015-08-31 23:38:58.991 18 WARNING keystone.middleware.core [-] RBAC: Invalid token
2015-08-31 23:38:58.991 18 WARNING keystone.common.wsgi [-] The request you have made requires authentication.
2015-08-31 23:38:58.995 19 INFO keystone.common.wsgi [-] POST http://192.168.1.148:35357/v3/auth/tokens
2015-08-31 23:38:59.088 21 INFO keystone.common.wsgi [-] POST http://192.168.1.148:35357/v3/auth/tokens
2015-08-31 23:39:02.065 12 INFO keystone.common.wsgi [-] POST http://broked.selfip.net:5000/v3/auth/tokens
2015-08-31 23:39:02.143 19 INFO keystone.common.wsgi [-] GET http://192.168.1.148:35357/v3/auth/tokens
2015-08-31 23:39:02.145 19 WARNING keystone.common.wsgi [-] Could not find token: 783549cfdaf3470cbbc7867af5091552
2015-08-31 23:39:04.926 15 INFO keystone.common.wsgi [-] POST http://broked.selfip.net:5000/v3/auth/tokens
2015-08-31 23:39:05.071 17 INFO keystone.common.wsgi [-] GET http://192.168.1.148:35357/
2015-08-31 23:39:05.184 18 WARNING keystone.middleware.core [-] RBAC: Invalid token
2015-08-31 23:39:05.184 18 WARNING keystone.common.wsgi [-] The request you have made requires authentication.
2015-08-31 23:39:05.188 20 INFO keystone.common.wsgi [-] POST http://192.168.1.148:35357/v3/auth/tokens
2015-08-31 23:43:27.198 13 INFO keystone.common.wsgi [-] POST http://broked.selfip.net:5000/v3/auth/tokens
2015-08-31 23:43:27.286 18 INFO keystone.common.wsgi [-] POST http://192.168.1.148:35357/v3/auth/tokens
2015-08-31 23:43:27.392 21 INFO keystone.common.wsgi [-] GET http://192.168.1.148:35357/
2015-08-31 23:43:27.460 17 INFO keystone.common.wsgi [-] POST http://192.168.1.148:35357/v3/auth/tokens
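
To see how the warnings in the excerpt above are spread across keystone worker processes, the log can be tallied by the process id in the third column. The following is a minimal sketch, assuming the line layout seen in this excerpt (date, time, pid, level, logger, bracketed request context, message). The function name `count_warnings_by_worker` is made up for illustration and is not part of keystone.

```python
import re
from collections import Counter

# Layout assumed from the excerpt above:
# date, time, worker pid, level, logger, "[-]", message.
LINE_RE = re.compile(
    r"^(?P<date>\S+) (?P<time>\S+) (?P<pid>\d+) (?P<level>[A-Z]+) "
    r"(?P<logger>\S+) \[[^\]]*\] (?P<message>.*)$"
)


def count_warnings_by_worker(lines):
    """Return a Counter mapping worker pid -> number of WARNING lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "WARNING":
            counts[int(match.group("pid"))] += 1
    return counts


# Four lines copied from the keystone.log excerpt above.
sample = [
    "2015-08-31 23:38:58.930 20 INFO keystone.common.wsgi [-] GET http://192.168.1.148:35357/",
    "2015-08-31 23:38:58.991 18 WARNING keystone.middleware.core [-] RBAC: Invalid token",
    "2015-08-31 23:38:58.991 18 WARNING keystone.common.wsgi [-] The request you have made requires authentication.",
    "2015-08-31 23:39:02.145 19 WARNING keystone.common.wsgi [-] Could not find token: 783549cfdaf3470cbbc7867af5091552",
]

print(dict(count_warnings_by_worker(sample)))  # {18: 2, 19: 1}
```

Lines that do not match the assumed layout are skipped rather than raising, so the same function can be fed a whole log file.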

Changed in kolla:
assignee:	nobody → Steven Dake (sdake)
importance:	Undecided → Critical
status:	New → Triaged
milestone:	none → liberty-3
summary:	- multinode deploy results in authentication failures
	+ multinode deploy results in intermittent authentication failures
description:	updated

Steven Dake (sdake) wrote on 2015-09-01:

#2

[sdake@MINIME-ONE ~]$ docker logs keepalived
Starting Healthcheck child process, pid=11
Initializing ipvs 2.6
Starting VRRP child process, pid=12
Registering Kernel netlink reflector
Registering Kernel netlink command channel
Registering Kernel netlink reflector
Registering Kernel netlink command channel
Registering gratuitous ARP shared channel
Opening file '/etc/keepalived/keepalived.conf'.
Opening file '/etc/keepalived/keepalived.conf'.
VRRP Error : Priority not valid !
Configuration is using : 5473 Bytes
must be between 1 & 255. reconfigure !
Using default value : 100

Configuration is using : 62138 Bytes
------< Global definitions >------
Router ID = minime-one
Smtp server connection timeout = 30
Email notification from = root@minime-one
VRRP IPv4 mcast group = 224.0.0.18
VRRP IPv6 mcast group = 224.0.0.18
SNMP Trap disabled
------< VRRP Topology >------
VRRP Instance = Floating
   Want State = MASTER
   Runing on device = em1
   Virtual Router ID = 51
   Priority = 100
   Advert interval = 1sec
   Tracked scripts = 1
     check_alive weight 0
   Virtual IP = 1
     192.168.1.148/32 dev em1 scope global
------< VRRP Scripts >------
VRRP Script = check_alive
   Command = /check_alive.sh
   Interval = 2 sec
   Timeout = 0 sec
   Weight = 0
   Rise = 10
   Fall = 2
   Status = INIT
Using LinkWatch kernel netlink reflector...
------< Global definitions >------
Router ID = minime-one
Smtp server connection timeout = 30
Email notification from = root@minime-one
VRRP IPv4 mcast group = 224.0.0.18
VRRP IPv6 mcast group = 224.0.0.18
SNMP Trap disabled
------< SSL definitions >------
Using autogen SSL context
Using LinkWatch kernel netlink reflector...
VRRP_Instance(Floating) Now in FAULT state
VRRP_Script(check_alive) succeeded
Kernel is reporting: interface em1 UP
VRRP_Instance(Floating) Transition to MASTER STATE
VRRP_Instance(Floating) Entering MASTER STATE
Netlink: filter function error
Netlink: filter function error
Netlink: filter function error
Netlink: filter function error
Netlink: filter function error
Netlink: filter function error
<repeated>

Steven Dake (sdake) wrote on 2015-09-01:

#3

The internet suggested sending a SIGHUP to keepalived to resolve the netlink filter function error, because keepalived does not support hot plugging of interfaces. I stopped and then restarted keepalived on all nodes and received this on the master node:

Using autogen SSL context
Using LinkWatch kernel netlink reflector...
VRRP_Script(check_alive) succeeded
VRRP_Instance(Floating) Transition to MASTER STATE
VRRP_Instance(Floating) Entering MASTER STATE
VRRP_Instance(Floating) Received lower prio advert, forcing new election
VRRP_Instance(Floating) Received lower prio advert, forcing new election
VRRP_Instance(Floating) Received lower prio advert, forcing new election
VRRP_Instance(Floating) Received lower prio advert, forcing new election

Steven Dake (sdake) wrote on 2015-09-01:

#4

Looks like other folks have this same problem.

https://ask.openstack.org/en/question/62769/keystone-loses-token-on-a-ha-setup-with-galera/

Steven Dake (sdake) wrote on 2015-09-01:

#5

https://bugzilla.redhat.com/show_bug.cgi?id=1176966

Steven Dake (sdake) wrote on 2015-09-01:

#6

inc0 confirmed this problem exists for him.

Changed in kolla:
status:	Triaged → Confirmed

Steven Dake (sdake) wrote on 2015-09-01:

#7

Deployed Ubuntu from source packaging; same result.

Running keystone endpoint list repeatedly works.
Running glance image-list repeatedly does not work.
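
One way to put a number on "in repetition" is to run the same command many times and count the non-zero exits. This is only a sketch: the helper name `count_failures` is hypothetical, and the demo uses the Python interpreter as a stand-in command so it runs anywhere. On a deployed node the command list would be something like `["glance", "image-list"]`, with credentials already present in the environment.

```python
import subprocess
import sys


def count_failures(command, attempts):
    """Run `command` `attempts` times and return how many runs exited non-zero."""
    failures = 0
    for _ in range(attempts):
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failures += 1
    return failures


# Stand-in commands: one that always succeeds and one that always fails.
always_ok = [sys.executable, "-c", "raise SystemExit(0)"]
always_bad = [sys.executable, "-c", "raise SystemExit(1)"]

print(count_failures(always_ok, 5))   # 0
print(count_failures(always_bad, 5))  # 5
```

An intermittent problem like the one described here would show up as a failure count somewhere between zero and the number of attempts.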

OpenStack Infra (hudson-openstack) wrote on 2015-09-01: Fix proposed to kolla (master)

#8

Fix proposed to branch: master
Review: https://review.openstack.org/219261

Changed in kolla:
status:	Confirmed → In Progress

Sam Yaple (s8m) wrote on 2015-09-02:

#9

In this case, the failures were caused by incorrect time on the servers. I suggest we close this as invalid or use it as a docs reference.
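
A toy model can show why a wrong clock on one server might produce failures that come and go. It is not keystone's validation code, and the thread does not say exactly which check failed. The node names, the one-hour token lifetime and the two-hour offset are all assumptions chosen for illustration. Each node compares the token's expiry against its own clock, so the same token is accepted or rejected depending on which node handles the request.

```python
from datetime import datetime, timedelta


def token_accepted(expires_at, node_clock):
    """Simplified check: a node accepts a token only if, by its own clock, it has not expired."""
    return node_clock < expires_at


def validate_on_each_node(expires_at, true_time, clock_offsets):
    """Return {node: accepted?} given each node's clock offset from the true time."""
    return {
        node: token_accepted(expires_at, true_time + offset)
        for node, offset in clock_offsets.items()
    }


true_time = datetime(2015, 8, 31, 23, 38, 58)
expires_at = true_time + timedelta(hours=1)  # hypothetical token lifetime

# Hypothetical nodes: node-a is in sync, node-b runs two hours fast.
clock_offsets = {
    "node-a": timedelta(0),
    "node-b": timedelta(hours=2),
}

results = validate_on_each_node(expires_at, true_time, clock_offsets)
print(results)  # {'node-a': True, 'node-b': False}
```

With requests spread across both nodes behind a virtual IP, roughly every other validation would fail in this model, which is at least consistent with the intermittent "Invalid token" warnings reported above.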

Steven Dake (sdake) on 2015-09-02

Changed in kolla:
status:	In Progress → Won't Fix
status:	Won't Fix → Invalid

OpenStack Infra (hudson-openstack) wrote on 2015-09-02: Change abandoned on kolla (master)

#10

Change abandoned by Steven Dake (<email address hidden>) on branch: master
Review: https://review.openstack.org/219261
