Authorization Failed: Bad Gateway (HTTP 502) when executing 'fuel node'
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Fuel for OpenStack | In Progress | Medium | Bartłomiej Piotrowski | |
7.0.x | Won't Fix | Medium | Fuel Sustaining | |
8.0.x | Won't Fix | Medium | Fuel Sustaining | |
Mitaka | Won't Fix | Medium | Fuel Sustaining | |
Bug Description
Detailed bug description:
When using the fuel CLI, commands intermittently fail to authenticate. For example, "fuel node" returns the following traceback:
Traceback (most recent call last):
File "/usr/bin/fuel", line 10, in <module>
sys.
File "/usr/lib/
return func(*args, **kwargs)
File "/usr/lib/
parser.parse()
File "/usr/lib/
actions[
File "/usr/lib/
method(params)
File "/usr/lib/
node_collection = NodeCollection.
File "/usr/lib/
return cls(Node.get_all())
File "/usr/lib/
return map(cls.
File "/usr/lib/
return cls.connection.
File "/usr/lib/
resp = self.get_
File "/usr/lib/
return self.session.
File "/usr/lib/
self._session = self._make_
File "/usr/lib/
session.
File "/usr/lib/
'X-Auth-Token': self.auth_token}
File "/usr/lib/
if not self.keystone_
File "/usr/lib/
self.
File "/usr/lib/
tenant_
File "/usr/lib/
self.
File "/usr/lib/
return func(*args, **kwargs)
File "/usr/lib/
resp = self.get_
File "/usr/lib/
_("
keystoneclient.
At the time of the above error, the keystone logs (/var/log/
2016-04-20 09:06:00.785 11406 ERROR keystone.
The actual FATAL error is also reported in the postgres logs (/var/lib/
....
FATAL: remaining connection slots are reserved for non-replication superuser connections
FATAL: remaining connection slots are reserved for non-replication superuser connections
FATAL: remaining connection slots are reserved for non-replication superuser connections
FATAL: remaining connection slots are reserved for non-replication superuser connections
FATAL: remaining connection slots are reserved for non-replication superuser connections
FATAL: remaining connection slots are reserved for non-replication superuser connections
FATAL: remaining connection slots are reserved for non-replication superuser connections
FATAL: remaining connection slots are reserved for non-replication superuser connections
FATAL: remaining connection slots are reserved for non-replication superuser connections
FATAL: remaining connection slots are reserved for non-replication superuser connections
FATAL: remaining connection slots are reserved for non-replication superuser connections
FATAL: remaining connection slots are reserved for non-replication superuser connections
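PostgreSQL emits this FATAL message when a non-superuser role tries to connect and the only free slots left are those held back by superuser_reserved_connections. The arithmetic can be sketched as follows. This is a minimal illustration, assuming the PostgreSQL default of superuser_reserved_connections = 3 (the actual value on this host is not shown in this report) and using the max_connections = 100 and the pg_stat_activity count of 98 quoted further down in this description. The function name is made up for the example.

```python
def free_slots_for_regular_roles(max_connections, reserved_for_superuser, in_use):
    """Return how many more connections a non-superuser role (e.g. keystone) can open."""
    return max(0, max_connections - reserved_for_superuser - in_use)


# max_connections = 100 is taken from postgresql.conf as quoted in this report.
MAX_CONNECTIONS = 100
# Assumption: PostgreSQL's default superuser_reserved_connections, not confirmed here.
RESERVED_FOR_SUPERUSER = 3

# pg_stat_activity reported 98 sessions; one of them is the psql session
# (run as the postgres superuser) that executed the count itself.
in_use_by_others = 98 - 1

print(free_slots_for_regular_roles(MAX_CONNECTIONS, RESERVED_FOR_SUPERUSER, in_use_by_others))
```

Under these assumptions no slot is left for keystone, nailgun or ostf, which would be consistent with the intermittent authentication failures.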
OSTF is also not working: the UI shows "OSTF server is not available." although the docker container with the OSTF services is in fact running:
[root@fuel ~]# dockerctl shell ostf ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 53984 3632 ? Ss Apr17 0:00 /usr/sbin/init
dbus 34 0.0 0.0 26428 1444 ? Ss Apr17 0:00 /bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-
root 62 0.1 0.0 4336 668 ? Ss Apr17 5:55 /usr/sbin/acpid
root 487 0.0 0.0 418432 90656 ? Ssl Apr17 0:02 /usr/bin/python /usr/bin/
root 564 0.0 0.0 19764 1232 ? Rs+ 10:10 0:00 ps aux
Every time the OSTF page is refreshed in the Fuel WebUI, new FATAL messages appear in the postgres logs.
Postgres is reaching its configured limit of connections:
[root@fuel ~]# dockerctl shell postgres
[root@fuel /]# grep max_connections /var/lib/
max_connections = 100 # (change requires restart)
# Note: Increasing max_connections costs ~400 bytes of shared memory per
# max_locks_
[root@fuel ~]# dockerctl shell postgres sudo -u postgres psql -c "SELECT count(*) FROM pg_stat_activity;"
count
-------
98
(1 row)
The majority of the existing connections are from keystone:
[root@fuel ~]# dockerctl shell postgres sudo -u postgres psql -c "SELECT datname,
datname | count
----------+-------
keystone | 83
postgres | 1
nailgun | 13
ostf | 1
(4 rows)
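To compare such breakdowns over time, the psql output can be turned into numbers. The following is a rough sketch that only parses the aligned table printed above; it does not reconstruct the query, which is truncated in this report. The helper name parse_psql_counts is hypothetical.

```python
def parse_psql_counts(text):
    """Parse psql's aligned 'datname | count' output into a dict of ints."""
    counts = {}
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        # Data rows have exactly two columns and a numeric second column;
        # the header, the separator and the "(N rows)" footer are skipped.
        if len(parts) == 2 and parts[1].isdigit():
            counts[parts[0]] = int(parts[1])
    return counts


# Output copied from this report.
PSQL_OUTPUT = """\
 datname  | count
----------+-------
 keystone |    83
 postgres |     1
 nailgun  |    13
 ostf     |     1
(4 rows)
"""

counts = parse_psql_counts(PSQL_OUTPUT)
total = sum(counts.values())

# Print databases ordered by number of sessions, with their share of the total.
for name, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<10}{n:>4}{100.0 * n / total:>7.1f}%")
```

The parsed total is 98, matching the count(*) result above, and keystone accounts for roughly 85% of all sessions.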
Steps to reproduce:
On the Fuel Master, execute "fuel node" multiple times or, alternatively, open the OSTF page in the Fuel UI.
Expected results:
The 'fuel node' command should always work, and the OSTF tests should be displayed.
Actual result:
Some executions of "fuel node" fail with the traceback pasted above, and the OSTF page shows an error.
Workaround:
Restart postgresql:
dockerctl shell postgres service postgresql restart
Once postgres is restarted, the number of active connections looks much better:
[root@fuel ~]# dockerctl shell postgres sudo -u postgres psql -c "SELECT datname,
datname | count
----------+-------
keystone | 5
postgres | 1
nailgun | 4
(3 rows)
The OSTF UI then opens fine; however, the number of sessions from keystone increases with every execution of "fuel node".
Impact:
High: the health check is not possible, so the QA tests cannot be run.
Description of the environment:
Fuel is running on a physical host.
One environment (18 nodes in total: 9 physical, 9 VMs) is already deployed. The VMs were created using the reduced footprint feature [1].
An advanced network template is also used for the deployed environment.
Version of components:
VERSION:
feature_groups:
- mirantis
production: "docker"
release: "8.0"
api: "1.0"
build_number: "570"
build_id: "570"
fuel-nailgun_sha: "558ca91a854cf2
python-
fuel-agent_sha: "658be72c4b42d3
fuel-
astute_sha: "b81577a5b7857c
fuel-library_sha: "c2a335b5b725f1
fuel-ostf_sha: "3bc76a63a9e7d1
fuel-mirror_sha: "fb45b80d7bee58
fuelmenu_sha: "78ffc73065a967
shotgun_sha: "63645dea384a37
network-
fuel-upgrade_sha: "616a7490ec7199
fuelmain_sha: "d605bcbabf3153
Network model:
Neutron with VXLANs
Additional Information:
A diagnostic snapshot can be provided privately to an engineer working on this bug, since it might contain customer-specific data.
[1] reduced footprint feature: https:/
Changed in fuel: | |
assignee: | nobody → MOS Keystone (mos-keystone) |
tags: | added: customer-found |
Changed in fuel: | |
assignee: | Bartłomiej Piotrowski (bpiotrowski) → Maksim Malchuk (mmalchuk) |
Changed in fuel: | |
assignee: | Maksim Malchuk (mmalchuk) → Bartłomiej Piotrowski (bpiotrowski) |
no longer affects: | fuel/newton |
Changed in fuel: | |
milestone: | 10.0 → 10.1 |
Well yes, it seems that keystone needs more connections to postgresql. Please increase the number of allowed connections.
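One way to pick a new limit is to estimate the worst case per service and add headroom. The sketch below is only an illustration under stated assumptions: it assumes keystone keeps one SQLAlchemy connection pool per worker process with the QueuePool defaults (pool_size=5, max_overflow=10), which is not confirmed for this Fuel Master. The worker count of 8 is hypothetical, the nailgun and ostf figures are the session counts observed in this report, and all function names are made up for the example.

```python
import math


def worst_case_pool_connections(workers, pool_size, max_overflow):
    """Upper bound on connections opened by a service with one pool per worker."""
    return workers * (pool_size + max_overflow)


def suggested_max_connections(peaks, reserved_for_superuser=3, headroom_percent=20):
    """Sum per-service peaks, add headroom, then add the superuser-reserved slots."""
    total = sum(peaks.values())
    return math.ceil(total * (100 + headroom_percent) / 100) + reserved_for_superuser


# Hypothetical: the number of keystone worker processes is not stated in this report.
KEYSTONE_WORKERS = 8

peaks = {
    # Assumed SQLAlchemy QueuePool defaults; the real keystone settings may differ.
    "keystone": worst_case_pool_connections(KEYSTONE_WORKERS, pool_size=5, max_overflow=10),
    # Session counts observed in pg_stat_activity in this report.
    "nailgun": 13,
    "ostf": 1,
}

print(suggested_max_connections(peaks))
```

The resulting number would go into max_connections in postgresql.conf, which, as the comment in that file notes, requires a postgres restart. If the keystone sessions keep growing without bound, a higher limit only postpones the failure.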