[Sahara] SSH connection timeout is small for big cluster

Bug #1545049 reported by Evgeny Sikachev on 2016-02-12
This bug affects 1 person
Affects              Status   Importance   Assigned to   Milestone
Mirantis OpenStack            Medium       MOS Sahara
10.0.x                        Medium       MOS Sahara
9.x                           Medium       MOS Sahara

Bug Description

ENVIRONMENT: MOS 8.0, 529(RC1)

STEPS TO REPRODUCE:
1. Create an environment with Sahara
2. Deploy the environment
3. Register an image for vanilla 2
4. Create a node group template with a proxy gateway
5. Create a 500-node cluster with 25 proxy nodes

EXPECTED RESULT:
Cluster created

ACTUAL RESULT:
SSH connection failed with timeout

WORKAROUND:
Add the following to the [DEFAULT] section of /etc/sahara/sahara.conf:
ssh_timeout_common = 600
ssh_timeout_files = 600
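
A minimal sketch of applying the workaround from a script, assuming sahara.conf parses as a plain INI file. The helper name set_ssh_timeouts is hypothetical and not part of Sahara; only the two option names and the value 600 come from this report. Python's configparser drops comments and collapses repeated keys when it rewrites a file, so back up the original before trying this.

```python
import configparser


def set_ssh_timeouts(path, seconds=600):
    """Set both SSH timeout options in the [DEFAULT] section of an INI file.

    Hypothetical helper for illustration; it rewrites the whole file.
    """
    # interpolation=None leaves '%' characters in existing values untouched;
    # strict=False tolerates duplicate keys (the last one wins).
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep option names exactly as written
    parser.read(path)

    # "DEFAULT" is configparser's built-in default section.
    parser["DEFAULT"]["ssh_timeout_common"] = str(seconds)
    parser["DEFAULT"]["ssh_timeout_files"] = str(seconds)

    with open(path, "w") as fh:
        parser.write(fh)


if __name__ == "__main__":
    # Path taken from the workaround above.
    set_ssh_timeouts("/etc/sahara/sahara.conf", seconds=600)
```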

Roman Podoliaka (rpodolyaka) wrote :

This does not sound really important to me and is kind of expected: the larger the cluster is, the more time you need to provision it. We just can't know the right timeout value in advance, as it also depends on hardware, networking, etc., not only on the cluster size.

I suggest we try to do a better job in 9.0, maybe by adjusting the timeout depending on the cluster size.

Please raise the importance if you disagree.
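
To illustrate the idea of a timeout that depends on cluster size, here is a rough sketch. It is not Sahara code: the function name, the base value, the step size, and the cap are all assumptions chosen for the example, and as noted above the right value also depends on hardware and networking.

```python
def scaled_ssh_timeout(node_count, base_timeout=300, nodes_per_step=100,
                       step_seconds=60, max_timeout=1800):
    """Return an SSH timeout in seconds that grows with cluster size.

    All defaults are illustrative assumptions, not values used by Sahara.
    """
    if node_count < 0:
        raise ValueError("node_count must be non-negative")
    # Add one fixed increment for every full group of nodes_per_step nodes.
    extra_steps = node_count // nodes_per_step
    # Cap the result so a very large cluster cannot wait indefinitely.
    return min(base_timeout + extra_steps * step_seconds, max_timeout)


if __name__ == "__main__":
    for size in (10, 500, 10000):
        print(size, scaled_ssh_timeout(size))
```

A linear step with a cap is only one possible choice; a real implementation would need to be tuned against measured provisioning times.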

Changed in mos:
importance: Undecided → Medium
status: New → Confirmed
status: Confirmed → Won't Fix

Nikita Konovalov (nkonovalov) wrote :

I think we can list it as a known issue in the release notes and move it to Won't Fix.

So far we have not seen requests to deploy clusters of this scale (beyond our own testing).
Anyway, if such requests appear, there is a workaround.

And of course we need to find a proper solution for the next release.

tags: added: release-notes
tags: added: 9.0

Dina Belova (dbelova) wrote :

Added the move-to-10.0 tag because the bug was moved from 9.0 to 10.0.

tags: added: move-to-10.0

Related fix proposed to branch: master
Change author: Evgeny Konstantinov <email address hidden>
Review: https://review.fuel-infra.org/22313

Reviewed: https://review.fuel-infra.org/22313
Submitter: Evgeny Konstantinov <email address hidden>
Branch: master

Commit: db7a986e1dc6f00b609b4a8450cd7b3ee580328f
Author: Evgeny Konstantinov <email address hidden>
Date: Wed Jun 22 10:24:42 2016

Add Sahara known issues to relnotes 9.0

Change-Id: Iec248f62a37027fac6d163040727b175565abc08
Related-Bug: #1536259
Related-Bug: #1545049

tags: added: release-notes-done
removed: release-notes

Dina Belova (dbelova) wrote :

For 10.0, or whatever release consumes Newton, this should be fixed upstream.

tags: added: 10.0-reviewed