tripleo_nova_migration_target.service fails to start: port 2022 is already taken

Bug #1816523 reported by Cédric Jeanneret
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Cédric Jeanneret

Bug Description

Hello,

While doing a routine check of a standalone deployment against master, I found this issue:

● tripleo_nova_migration_target.service loaded failed failed nova_migration_target container
● tripleo_nova_migration_target_healthcheck.service loaded failed failed nova_migration_target healthcheck

After some investigation, it appears that this container (nova_migration_target) tries to start an sshd service listening on port 2022:
++ cat /run_command
+ CMD='/usr/sbin/sshd -D -p 2022'
+ ARGS=
+ [[ ! -n '' ]]
+ . kolla_extend_start
++ [[ ! -d /var/log/kolla/nova ]]
+++ stat -c %a /var/log/kolla/nova
++ [[ 2755 != \7\5\5 ]]
++ chmod 755 /var/log/kolla/nova
++ . /usr/local/bin/kolla_nova_extend_start
+++ [[ ! -d /var/lib/nova/instances ]]
+ echo 'Running command: '\''/usr/sbin/sshd -D -p 2022'\'''
Running command: '/usr/sbin/sshd -D -p 2022'
+ exec /usr/sbin/sshd -D -p 2022
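Since the container runs sshd on port 2022, one quick way to see whether that port can still be bound on the host is a small probe like the one below. This is only a sketch: the helper name `port_is_free` is made up for illustration, and it assumes the container shares the host network namespace (which the conflict reported here suggests).

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to (host, port).

    Hypothetical helper, not part of TripleO. Any OSError is treated as
    "not free": that covers "address already in use", but also
    permission errors on privileged ports.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


if __name__ == "__main__":
    # 2022 is the port taken from the '/usr/sbin/sshd -D -p 2022' command.
    print("port 2022 free:", port_is_free(2022))
```

The probe releases the port again as soon as the `with` block exits, so it does not itself block the container.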

The issue is that this port is already taken by the host sshd:
[root@undercloud ~]# cat /etc/ssh/sshd_config
# File is managed by Puppet
Port 22
Port 2022

This prevents the container from starting.
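The conflict can also be spotted from the configuration alone. The following is a minimal sketch that extracts the `Port` directives from an sshd_config text and compares them with the container port. The function name `sshd_ports` is an assumption made for illustration, and the parser is deliberately simplified: it ignores ports given via `ListenAddress` and does not apply the default of 22 when no `Port` line is present.

```python
def sshd_ports(config_text):
    """Return the ports named by 'Port' directives in an sshd_config text.

    Simplified, hypothetical parser: keyword matching is case-insensitive,
    comment and blank lines are skipped, everything else is ignored.
    """
    ports = []
    for line in config_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        parts = stripped.split()
        if len(parts) >= 2 and parts[0].lower() == "port" and parts[1].isdigit():
            ports.append(int(parts[1]))
    return ports


# Host configuration as shown in the report.
HOST_CONFIG = """# File is managed by Puppet
Port 22
Port 2022
"""

# Port used by the container command '/usr/sbin/sshd -D -p 2022'.
CONTAINER_PORT = 2022

conflict = CONTAINER_PORT in sshd_ports(HOST_CONFIG)
```

With the configuration from this report, `conflict` ends up true, because the host sshd already lists the port the container wants.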

Could anyone have a look? I'm not sure what the intended state is...

Thanks!

Revision history for this message
Cédric Jeanneret (cjeanner) wrote :

Note: we apparently hit this in CI as well, but it isn't detected, so the job doesn't fail:
http://logs.openstack.org/68/637568/1/check/tripleo-ci-centos-7-standalone/bbbca84/logs/undercloud/var/log/extra/podman/podman_allinfo.log.txt.gz

I guess that's pretty bad?

Revision history for this message
Cédric Jeanneret (cjeanner) wrote :

The root cause is probably a mistake made while flattening the nova service:
https://github.com/openstack/tripleo-heat-templates/blob/master/deployment/nova/nova-migration-target-container-puppet.yaml#L113-L115

This should not list MigrationSshPort, as doing so configures the host sshd service to listen on that port as well.

tags: added: promotion-blocker
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-quickstart-extras (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/637729

Revision history for this message
Cédric Jeanneret (cjeanner) wrote :

Working on the proper fix for the sshd port.

Changed in tripleo:
assignee: nobody → Cédric Jeanneret (cjeanner)
Changed in tripleo:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.openstack.org/637791

Revision history for this message
Cédric Jeanneret (cjeanner) wrote :

So, not a real duplicate - a complement to the other LP entry :).

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/637791
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=acebe25936efe31f1eefb0dae338769c5c14739b
Submitter: Zuul
Branch: master

commit acebe25936efe31f1eefb0dae338769c5c14739b
Author: Cédric Jeanneret <email address hidden>
Date: Tue Feb 19 09:26:14 2019 +0100

    Correct sshd configuration within nova-migration-target

    The flattening introduced an error with sshd config, where the
    host was listening on port 2022, preventing the nova_migration_target
    container to start, since it wants to start an sshd service on port
    2022.

    Closes-Bug: 1816523

    Change-Id: I3a7ba82cf978cf6c056dba2d623fc94183650474

Changed in tripleo:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-quickstart-extras (master)

Reviewed: https://review.openstack.org/637729
Committed: https://git.openstack.org/cgit/openstack/tripleo-quickstart-extras/commit/?id=74983421a13329451c8a9ae6b545650e900a0187
Submitter: Zuul
Branch: master

commit 74983421a13329451c8a9ae6b545650e900a0187
Author: Chandan Kumar <email address hidden>
Date: Tue Feb 19 14:06:26 2019 +0530

    Raise an error if a service or container is failed

    Sometimes container or service does not start, and this
    doesn't make the CI fail. Until now, the failed containers
    are listed in the /var/log/extras/ tree, but it's not
    checked on a regular basis.

    This patch intends to make a hard failure in case either
    a service or a container doesn't start as expected.

    Co-Authored-By: Cédric Jeanneret <email address hidden>
    Related-Bug: #1816523
    Change-Id: I001e2f27d2b562bb0be87c8eaadcf3622e530498
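The idea described in this commit message, turning a failed unit into a hard error, could be sketched roughly as follows. This is not the code from the tripleo-quickstart-extras patch. The function names are hypothetical, and the input is assumed to be a systemd unit listing in the same column layout as the two lines quoted in the bug description (UNIT, LOAD, ACTIVE, SUB, DESCRIPTION).

```python
def failed_units(listing):
    """Return the unit names whose ACTIVE column reads 'failed'.

    Hypothetical helper: expects lines such as
    '● foo.service loaded failed failed some description'.
    """
    failed = []
    for line in listing.splitlines():
        # Drop the leading status bullet, then split into columns:
        # UNIT LOAD ACTIVE SUB DESCRIPTION...
        fields = line.replace("●", " ").split()
        if len(fields) >= 4 and fields[2] == "failed":
            failed.append(fields[0])
    return failed


def assert_no_failed_units(listing):
    """Raise RuntimeError if the listing contains any failed unit."""
    failed = failed_units(listing)
    if failed:
        raise RuntimeError("failed units: " + ", ".join(failed))
```

Fed the two lines from the bug description, `assert_no_failed_units` would raise and name both the service and its healthcheck unit, which is the kind of hard failure the commit message asks for.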

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 10.4.0

This issue was fixed in the openstack/tripleo-heat-templates 10.4.0 release.
