master: Standalone002 job fails, keystone container didn't start because could not bind to address 192.168.24.1:35357

Bug #1820576 reported by Sagi (Sergey) Shnaidman
Affects | Status | Importance | Assigned to | Milestone
tripleo | Fix Released | Medium | Bogdan Dobrelya |

Bug Description

Keystone container doesn't start because of "could not bind to address" error:

http://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-centos-7-scenario002-standalone-master/19c2af2/logs/undercloud/var/log/extra/services.txt.gz

Mar 18 07:26:54 standalone.localdomain podman[34241]: (98)Address already in use: AH00072: make_sock: could not bind to address 192.168.24.1:35357
Mar 18 07:26:54 standalone.localdomain podman[34241]: no listening sockets available, shutting down
Mar 18 07:26:54 standalone.localdomain podman[34241]: AH00015: Unable to open logs
Mar 18 07:26:54 standalone.localdomain systemd[1]: tripleo_keystone.service: main process exited, code=exited, status=1/FAILURE
Mar 18 07:26:54 standalone.localdomain podman[34353]: 0465e2bdb28e702c21e0fc1ea55213a040c817ea121f0f7358451e1c16328f4d
Mar 18 07:26:54 standalone.localdomain systemd[1]: Unit tripleo_keystone.service entered failed state.
Mar 18 07:26:54 standalone.localdomain systemd[1]: tripleo_keystone.service failed.
Mar 18 07:26:55 standalone.localdomain systemd[1]: tripleo_keystone.service holdoff time over, scheduling restart.
Mar 18 07:26:55 standalone.localdomain systemd[1]: Stopped keystone container.
Mar 18 07:26:55 standalone.localdomain systemd[1]: start request repeated too quickly for tripleo_keystone.service
Mar 18 07:26:55 standalone.localdomain systemd[1]: Failed to start keystone container.
Mar 18 07:26:55 standalone.localdomain systemd[1]: Unit tripleo_keystone.service entered failed state.
Mar 18 07:26:55 standalone.localdomain systemd[1]: tripleo_keystone.service failed.

http://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-centos-7-scenario002-standalone-master/19c2af2/logs/undercloud/var/log/extra/failed_containers.log.txt.gz

0465e2bdb28e 192.168.24.1:8787/tripleomaster/centos-binary-keystone:0cd6b282920214d2995d1203aa0ad7d33c825b10_b0217950-updated-20190318070141 dumb-init --singl... 2 minutes ago Exited (1) 2 minutes ago keystone

http://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-centos-7-scenario002-standalone-master/19c2af2/logs/undercloud/var/log/extra/podman/containers/keystone/stdout.log.txt.gz

(98)Address already in use: AH00072: make_sock: could not bind to address 192.168.24.1:35357
no listening sockets available, shutting down
AH00015: Unable to open logs

summary: - Standalone002 job fails, keystone container didn't start because could
- not bind to address 192.168.24.1:35357
+ master: Standalone002 job fails, keystone container didn't start because
+ could not bind to address 192.168.24.1:35357
Revision history for this message
Sagi (Sergey) Shnaidman (sshnaidm) wrote :

It didn't happen in the next run; it looks like an occasional failure.

tags: removed: promotion-blocker
Revision history for this message
Marios Andreou (marios-b) wrote :
Revision history for this message
Marios Andreou (marios-b) wrote :

Ah, but regarding comment #2: it looks like the link in the description is from the 18th, and the latest green runs right now are from the 17th.

Revision history for this message
Sagi (Sergey) Shnaidman (sshnaidm) wrote :

It's for the periodic job in master. I'm leaving it open for a few days to keep an eye on it and make sure it doesn't happen again.

wes hayutin (weshayutin)
tags: added: alert
Revision history for this message
Marios Andreou (marios-b) wrote :

Looks like this failed exactly once, with the error in the description, on the 18th; otherwise it has been green today (last 3 runs).

https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-7-scenario002-standalone-master

I propose we remove the alert tag and mark this invalid. Please change it back if you disagree.

tags: added: ci
removed: alert
Changed in tripleo:
status: Triaged → Invalid
Revision history for this message
Marios Andreou (marios-b) wrote :
Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

We have net.ipv4.ip_local_port_range = 32768 60999, which includes 35357, so the kernel can hand that port out as an ephemeral port.
We should exclude 35357 from the ephemeral ports.
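
To make the overlap concrete, here is a minimal Python sketch that checks whether a port falls inside the range quoted above. The function name `in_ephemeral_range` is hypothetical, and the range is passed in as a literal string rather than read from the host.

```python
def in_ephemeral_range(port: int, port_range: str) -> bool:
    """Return True if port lies within a 'low high' range string,
    the format used by net.ipv4.ip_local_port_range."""
    low, high = (int(value) for value in port_range.split())
    return low <= port <= high


# Range quoted in this comment; 35357 is the keystone admin port.
print(in_ephemeral_range(35357, "32768 60999"))  # True
print(in_ephemeral_range(5000, "32768 60999"))   # False
```

On a real host the range string could instead be read from /proc/sys/net/ipv4/ip_local_port_range.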

Changed in tripleo:
status: Invalid → Triaged
Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

Reminded by https://bugs.launchpad.net/tripleo/+bug/1623818/comments/2
How come we have lost those reserved_ports in tripleo?

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

RabbitMQ - 41055 (we changed it to 25672 for tripleo), 61613 (STOMP, not used in tripleo?)
Keystone - 35357 - let's add this to reservations
Qpidd - 49000-49001 - let's add this to reservations
libvirt - 49152-49261 - let's add this to reservations
ovs capwap - 58881-58882 (legacy and probably not needed any more)
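
As an illustration only, the sketch below renders a set of candidate ports in the comma-separated form that the net.ipv4.ip_local_reserved_ports sysctl accepts (single ports as N, spans as N-M). The helper name `format_reserved_ports` and the selection of candidates are assumptions made for this example, not the merged change.

```python
from typing import Iterable, Tuple


def format_reserved_ports(ranges: Iterable[Tuple[int, int]]) -> str:
    """Render (first, last) port pairs as a comma-separated list,
    writing single ports as 'N' and spans as 'N-M'."""
    parts = []
    for first, last in sorted(ranges):
        if first > last:
            raise ValueError(f"invalid range: {first}-{last}")
        parts.append(str(first) if first == last else f"{first}-{last}")
    return ",".join(parts)


# Candidates taken from the list in this comment (hypothetical selection).
candidates = [(35357, 35357), (49000, 49001), (49152, 49261)]
print("net.ipv4.ip_local_reserved_ports = " + format_reserved_ports(candidates))
```

Ports listed in that sysctl are skipped by automatic port assignment, while a service can still bind to them explicitly.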

Changed in tripleo:
status: Triaged → In Progress
assignee: Sagi (Sergey) Shnaidman (sshnaidm) → Bogdan Dobrelya (bogdando)
Revision history for this message
Bogdan Dobrelya (bogdando) wrote :
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-heat-templates (master)

Change abandoned by Bogdan Dobrelya (<email address hidden>) on branch: master
Review: https://review.openstack.org/644545

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

Lowering to medium, as this is a rare race condition that occurs when an ephemeral port matches the value reserved for the keystone admin endpoint.
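
The failure mode can be reproduced in miniature with plain sockets. This is a simplified sketch, not the exact sequence from the CI job: a hypothetical "holder" socket stands in for whatever unrelated process was handed the port by the kernel, and `try_bind` (also a made-up helper) stands in for the service that expects the port to be free.

```python
import errno
import socket


def try_bind(address: str, port: int) -> int:
    """Try to bind a TCP socket; return 0 on success or the errno on failure."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((address, port))
        except OSError as exc:
            return exc.errno
        return 0


# Stand-in for an unrelated process that was handed a kernel-chosen port.
holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
holder.bind(("127.0.0.1", 0))  # port 0 asks the kernel to pick a free port
taken_port = holder.getsockname()[1]

# Stand-in for the service that expects this port to be free.
result = try_bind("127.0.0.1", taken_port)
print(errno.errorcode.get(result, "bound successfully"))

holder.close()
```

On Linux the second bind is expected to fail with EADDRINUSE, which corresponds to the "(98)Address already in use" message in the keystone log.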

Changed in tripleo:
status: In Progress → Triaged
assignee: Bogdan Dobrelya (bogdando) → nobody
importance: Critical → Medium
tags: added: tech-debt
removed: ci
Changed in tripleo:
milestone: stein-rc1 → train-1
Changed in tripleo:
milestone: train-1 → train-2
Changed in tripleo:
milestone: train-2 → train-3
Changed in tripleo:
milestone: train-3 → ussuri-1
Changed in tripleo:
status: Triaged → Fix Released
Revision history for this message
Sandeep Yadav (sandeepyadav93) wrote :

Hello,

In comment #12 (https://bugs.launchpad.net/tripleo/+bug/1820576/comments/12) it is mentioned that this is a race condition that occasionally occurs when an ephemeral port matches the value reserved for the keystone admin endpoint.

Reopening this bug, as we hit this issue again in the gate pipeline:

http://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_e8f/720052/1/gate/tripleo-ci-centos-7-undercloud-containers/e8fb859/logs/undercloud/var/log/extra/podman/containers/keystone/stdout.log

~~~
[Tue Apr 21 02:58:10.292378 2020] [so:warn] [pid 8] AH01574: module wsgi_module is already loaded, skipping
(98)Address already in use: AH00072: make_sock: could not bind to address 192.168.24.1:35357
no listening sockets available, shutting down
AH00015: Unable to open logs
~~~

Changed in tripleo:
status: Fix Released → Triaged
milestone: ussuri-1 → ussuri-rc1
wes hayutin (weshayutin)
Changed in tripleo:
milestone: ussuri-rc1 → ussuri-rc3
wes hayutin (weshayutin)
tags: added: promotion-blocker
wes hayutin (weshayutin)
Changed in tripleo:
milestone: ussuri-rc3 → victoria-1
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.opendev.org/731157

Changed in tripleo:
assignee: nobody → Bogdan Dobrelya (bogdando)
status: Triaged → In Progress
Changed in tripleo:
assignee: Bogdan Dobrelya (bogdando) → wes hayutin (weshayutin)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-heat-templates (master)

Change abandoned by Bogdan Dobrelya (bogdando) (<email address hidden>) on branch: master
Review: https://review.opendev.org/731157
Reason: back to https://review.opendev.org/#/c/644545

Changed in tripleo:
assignee: wes hayutin (weshayutin) → Bogdan Dobrelya (bogdando)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/644545
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=ffd31df7d3b6eb415bae76291b4adaaab4da8dd0
Submitter: Zuul
Branch: master

commit ffd31df7d3b6eb415bae76291b4adaaab4da8dd0
Author: Bogdan Dobrelya <email address hidden>
Date: Tue Mar 19 12:16:31 2019 +0100

    Add reserved ports for some services

    Exclude ports from the ephemeral pool ranges that can be shared by
    the following services:
    * Keystone - 35357
    * Qpidd/matahari - 49000
    * Clustercheck - 49000-49001 (xinetd)
    * Swift Proxy and Ironic PXE that rely on xinetd - 49001

    Closes-Bug: #1820576

    Change-Id: I71308a65bea5f59d755b766165dabf5d3e646ee1
    Signed-off-by: Bogdan Dobrelya <email address hidden>

Changed in tripleo:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/733087

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/733088

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/ussuri)

Reviewed: https://review.opendev.org/733087
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=b225e6b48fe0f2409f0827c01d3c3c48cc40a133
Submitter: Zuul
Branch: stable/ussuri

commit b225e6b48fe0f2409f0827c01d3c3c48cc40a133
Author: Bogdan Dobrelya <email address hidden>
Date: Tue Mar 19 12:16:31 2019 +0100

    Add reserved ports for some services

    Exclude ports from the ephemeral pool ranges that can be shared by
    the following services:
    * Keystone - 35357
    * Qpidd/matahari - 49000
    * Clustercheck - 49000-49001 (xinetd)
    * Swift Proxy and Ironic PXE that rely on xinetd - 49001

    Closes-Bug: #1820576

    Change-Id: I71308a65bea5f59d755b766165dabf5d3e646ee1
    Signed-off-by: Bogdan Dobrelya <email address hidden>
    (cherry picked from commit ffd31df7d3b6eb415bae76291b4adaaab4da8dd0)

tags: added: in-stable-ussuri
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/train)

Reviewed: https://review.opendev.org/733088
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=3d16a7009dccbcecd96a2d22a16f37b840c9294a
Submitter: Zuul
Branch: stable/train

commit 3d16a7009dccbcecd96a2d22a16f37b840c9294a
Author: Bogdan Dobrelya <email address hidden>
Date: Tue Mar 19 12:16:31 2019 +0100

    Add reserved ports for some services

    Exclude ports from the ephemeral pool ranges that can be shared by
    the following services:
    * Keystone - 35357
    * Qpidd/matahari - 49000
    * Clustercheck - 49000-49001 (xinetd)
    * Swift Proxy and Ironic PXE that rely on xinetd - 49001

    Closes-Bug: #1820576

    Change-Id: I71308a65bea5f59d755b766165dabf5d3e646ee1
    Signed-off-by: Bogdan Dobrelya <email address hidden>
    (cherry picked from commit ffd31df7d3b6eb415bae76291b4adaaab4da8dd0)

tags: added: in-stable-train
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 11.4.0

This issue was fixed in the openstack/tripleo-heat-templates 11.4.0 release.
