[train] Jobs are failing with Could not establish a connection to the Zaqar websocket

Bug #1893621 reported by Arx Cruz
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Harald Jensås
Milestone: none listed

Bug Description

https://logserver.rdoproject.org/openstack-periodic-integration-stable2-centos7/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_2comp-featureset020-train/2df4c84/logs/undercloud/home/zuul/overcloud_import_nodes.log.txt.gz

2020-08-30 13:05:03 | + openstack overcloud node import instackenv.json
2020-08-30 13:05:05 | Could not establish a connection to the Zaqar websocket. The command was sent but the answer could not be read.
2020-08-30 13:05:05 | Exception occured while running the command
2020-08-30 13:05:05 | Traceback (most recent call last):
2020-08-30 13:05:05 | File "/usr/lib/python2.7/site-packages/tripleoclient/command.py", line 32, in run
2020-08-30 13:05:05 | super(Command, self).run(parsed_args)
2020-08-30 13:05:05 | File "/usr/lib/python2.7/site-packages/osc_lib/command/command.py", line 41, in run
2020-08-30 13:05:05 | return super(Command, self).run(parsed_args)
2020-08-30 13:05:05 | File "/usr/lib/python2.7/site-packages/cliff/command.py", line 185, in run
2020-08-30 13:05:05 | return_code = self.take_action(parsed_args) or 0
2020-08-30 13:05:05 | File "/usr/lib/python2.7/site-packages/tripleoclient/v1/overcloud_node.py", line 424, in take_action
2020-08-30 13:05:05 | instance_boot_option=parsed_args.instance_boot_option
2020-08-30 13:05:05 | File "/usr/lib/python2.7/site-packages/tripleoclient/workflows/baremetal.py", line 62, in register_or_update
2020-08-30 13:05:05 | with tripleoclients.messaging_websocket() as ws:
2020-08-30 13:05:05 | File "/usr/lib/python2.7/site-packages/tripleoclient/plugin.py", line 223, in messaging_websocket
2020-08-30 13:05:05 | cacert=self._instance.cacert)
2020-08-30 13:05:05 | File "/usr/lib/python2.7/site-packages/tripleoclient/plugin.py", line 89, in __init__
2020-08-30 13:05:05 | sslopt=OS_CACERT)
2020-08-30 13:05:05 | File "/usr/lib/python2.7/site-packages/websocket/_core.py", line 511, in create_connection
2020-08-30 13:05:05 | websock.connect(url, **options)
2020-08-30 13:05:05 | File "/usr/lib/python2.7/site-packages/websocket/_core.py", line 220, in connect
2020-08-30 13:05:05 | options.pop('socket', None))
2020-08-30 13:05:05 | File "/usr/lib/python2.7/site-packages/websocket/_http.py", line 120, in connect
2020-08-30 13:05:05 | sock = _open_socket(addrinfo_list, options.sockopt, options.timeout)
2020-08-30 13:05:05 | File "/usr/lib/python2.7/site-packages/websocket/_http.py", line 190, in _open_socket
2020-08-30 13:05:05 | raise err
2020-08-30 13:05:05 | error: [Errno 111] Connection refused
2020-08-30 13:05:05 | [Errno 111] Connection refused

This is different from https://bugs.launchpad.net/tripleo/+bug/1878540: this failure is a refused connection, while that earlier bug was related to certificate issues.
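
To make the distinction concrete, here is a minimal diagnostic sketch (not part of tripleoclient; the helper name `probe_port` is hypothetical). It checks at the plain TCP level whether anything is listening on a port. A "refused" result matches the `[Errno 111]` seen above and means the failure occurs before any TLS or certificate handling takes place.

```python
import errno
import socket


def probe_port(host, port, timeout=2.0):
    """Classify a TCP connect attempt as 'open', 'refused', or 'error: <name>'.

    Hypothetical helper for illustration only; it does no TLS or
    websocket handshake, just a bare TCP connect.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # connect_ex returns an errno value instead of raising for
        # failures reported by the underlying connect() call.
        code = sock.connect_ex((host, port))
    finally:
        sock.close()
    if code == 0:
        return "open"
    if code == errno.ECONNREFUSED:
        return "refused"
    return "error: " + errno.errorcode.get(code, str(code))
```

Running it against the undercloud address with the port the client is using and with the port haproxy is actually bound to would show which side of the mismatch is unreachable. The address and ports have to be taken from the deployment itself.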

Arx Cruz (arxcruz)
Changed in tripleo:
importance: Undecided → Critical
summary: - train featureset020 failed with Could not establish a connection to the Zaqar websocket
+ [train] Jobs are failing with Could not establish a connection to the Zaqar websocket
Arx Cruz (arxcruz) wrote :

According to ramishra on IRC, this is related to https://review.opendev.org/#/c/748673/, which needs to be backported to train.

Arx Cruz (arxcruz) wrote :

Sorry, it seems the port was wrong. The patch was merged in the train branch and it is breaking the promotion.

Changed in tripleo:
assignee: nobody → Harald Jensås (harald-jensas)
status: Triaged → In Progress
wes hayutin (weshayutin) wrote :
OpenStack Infra (hudson-openstack) wrote : Fix merged to puppet-tripleo (master)

Reviewed: https://review.opendev.org/748673
Committed: https://git.openstack.org/cgit/openstack/puppet-tripleo/commit/?id=4a52809a9676f3058863882b529f88c5f078c9ee
Submitter: Zuul
Branch: master

commit 4a52809a9676f3058863882b529f88c5f078c9ee
Author: Harald Jensås <email address hidden>
Date: Fri Aug 28 15:30:11 2020 +0200

    Fix tls - zaqar wss default port

    In https://review.opendev.org/746671 we updated the
    endpoint map. It seems this broke the undercloud as
    haproxy still listen on port 9000.

    Update the default zaqar wss port in puppet as well.

    Also set the request header Host to use the service
    port so that handshake does not fail on port missmatch.

    Change-Id: I0c9dd4e9457c7cfa8cd54b5eaa167bf4e8e6bcc1
    Closes-Bug: #1893621
    Related-Bug: #1832168
    Related-RHBZ: #1868910
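
As a rough illustration of why the port in the Host header matters for the handshake, the sketch below derives the Host value a websocket client would typically send for a given URL: the port is included whenever it differs from the scheme default. This is not the puppet-tripleo change itself; the function name `handshake_host_header` and the example URLs are hypothetical, and IPv6 literals are not handled.

```python
from urllib.parse import urlsplit

# Default ports for the websocket schemes (RFC 6455).
DEFAULT_PORTS = {"ws": 80, "wss": 443}


def handshake_host_header(url):
    """Return the Host header value for a ws:// or wss:// URL.

    Hypothetical helper for illustration; assumes a hostname or
    IPv4 address in the URL.
    """
    parts = urlsplit(url)
    default = DEFAULT_PORTS[parts.scheme]
    port = parts.port or default
    if port == default:
        # Default port: the Host header carries the host only.
        return parts.hostname
    # Non-default port: the port is part of the Host header.
    return "%s:%d" % (parts.hostname, port)


print(handshake_host_header("wss://192.168.24.2:9000/"))
```

If a proxy forwards the request to a backend that expects a different port in Host than the one the client connected to, the two values disagree, which is the kind of mismatch the commit message describes.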

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to puppet-tripleo (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/749161

OpenStack Infra (hudson-openstack) wrote : Fix proposed to puppet-tripleo (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/749162

OpenStack Infra (hudson-openstack) wrote : Fix merged to puppet-tripleo (stable/ussuri)

Reviewed: https://review.opendev.org/749161
Committed: https://git.openstack.org/cgit/openstack/puppet-tripleo/commit/?id=e8dfd034c51500a8fa362a30a60615ae8efca8c0
Submitter: Zuul
Branch: stable/ussuri

commit e8dfd034c51500a8fa362a30a60615ae8efca8c0
Author: Harald Jensås <email address hidden>
Date: Fri Aug 28 15:30:11 2020 +0200

    Fix tls - zaqar wss default port

    In https://review.opendev.org/746671 we updated the
    endpoint map. It seems this broke the undercloud as
    haproxy still listen on port 9000.

    Update the default zaqar wss port in puppet as well.

    Also set the request header Host to use the service
    port so that handshake does not fail on port missmatch.

    Change-Id: I0c9dd4e9457c7cfa8cd54b5eaa167bf4e8e6bcc1
    Closes-Bug: #1893621
    Related-Bug: #1832168
    Related-RHBZ: #1868910
    (cherry picked from commit 4a52809a9676f3058863882b529f88c5f078c9ee)

tags: added: in-stable-ussuri
OpenStack Infra (hudson-openstack) wrote : Fix merged to puppet-tripleo (stable/train)

Reviewed: https://review.opendev.org/749162
Committed: https://git.openstack.org/cgit/openstack/puppet-tripleo/commit/?id=72ad205fa85377663b72d9dc7d297452098a41cb
Submitter: Zuul
Branch: stable/train

commit 72ad205fa85377663b72d9dc7d297452098a41cb
Author: Harald Jensås <email address hidden>
Date: Fri Aug 28 15:30:11 2020 +0200

    Fix tls - zaqar wss default port

    In https://review.opendev.org/746671 we updated the
    endpoint map. It seems this broke the undercloud as
    haproxy still listen on port 9000.

    Update the default zaqar wss port in puppet as well.

    Also set the request header Host to use the service
    port so that handshake does not fail on port missmatch.

    Change-Id: I0c9dd4e9457c7cfa8cd54b5eaa167bf4e8e6bcc1
    Closes-Bug: #1893621
    Related-Bug: #1832168
    Related-RHBZ: #1868910
    (cherry picked from commit 4a52809a9676f3058863882b529f88c5f078c9ee)

tags: added: in-stable-train
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/puppet-tripleo 11.5.0

This issue was fixed in the openstack/puppet-tripleo 11.5.0 release.
