[Ambari] 'service ambari-server start' never finishes

Bug #1579187 reported by Peter Nordquist
This bug affects 2 people
Affects   Status         Importance   Assigned to       Milestone
Sahara    Fix Released   High         Vitalii Gridnev
Mitaka    Fix Released   High         Vitalii Gridnev

Bug Description

In my Sahara deployment (Mitaka), the Ambari server seems to take just long enough to start that it hits a deadlock. I raised ssh_timeout_common from 300 to 900 and 'service ambari-server start' still never finishes. The odd part is that once the SSH session times out, the server finishes starting properly.

My current theory is that the start script flushes to stdout, and the output does not appear to be captured on the Sahara side. The script would then wait for room in the OS stdout buffer, and because nothing reads from it, the flush blocks. I changed the line at [0] to 'service ambari-server start >/dev/null', which seemed to fix my installation. The change feels like a hack, though, so maybe the system should be piping the output somehow?

The related change in Ambari is at [1] and [2]. I'm not sure a change in Ambari caused this, though, since I was having intermittent issues with it before the Mitaka upgrade.

[0]: https://github.com/openstack/sahara/blob/master/sahara/plugins/ambari/deploy.py#L58
[1]: https://issues.apache.org/jira/browse/AMBARI-6447
[2]: https://reviews.apache.org/r/23383/
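
The theory above (a writer blocking on an output buffer that nobody drains) can be reproduced locally. The following is a minimal sketch, not the Sahara remote-execution code: it uses a local pipe instead of an SSH channel, and the helper name finishes_in_time, the 1 MiB payload, and the timeouts are assumptions made for illustration only.

```python
import subprocess
import sys

# Child process that writes about 1 MiB to stdout, which is more than a
# typical OS pipe buffer holds (64 KiB by default on Linux).
CHILD = "import sys; sys.stdout.write('x' * (1 << 20)); sys.stdout.flush()"


def finishes_in_time(stdout_target, timeout=5.0):
    """Run the child with the given stdout target.

    Returns True if the child exits before the timeout, False if it had
    to be killed because it was still running (blocked on its write).
    """
    proc = subprocess.Popen([sys.executable, "-c", CHILD], stdout=stdout_target)
    try:
        proc.wait(timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()
        return False
    finally:
        # Close our end of the pipe if one was created.
        if proc.stdout is not None:
            proc.stdout.close()


if __name__ == "__main__":
    # Output is discarded, so the child never waits on a full buffer.
    print("stdout -> /dev/null:", finishes_in_time(subprocess.DEVNULL))
    # Output goes to a pipe that is never read, so the child stalls.
    print("stdout -> unread pipe:", finishes_in_time(subprocess.PIPE, timeout=2.0))
```

If the same mechanism applies to the remote command, either discarding the output (as with '>/dev/null') or actively reading it on the caller's side should let the start script run to completion.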

Vitalii Gridnev (vgridnev) wrote :

Which image did you use?

Vitalii Gridnev (vgridnev) wrote :

Actually, I have seen this kind of behavior on CentOS 7 images. Moving to Confirmed anyway.

Changed in sahara:
status: New → Confirmed
importance: Undecided → Medium
milestone: none → newton-1
tags: added: plugin.ambari
Peter Nordquist (pnnl-plnordquist) wrote :

Yeah, I was using the CentOS 7 images built by the Sahara image elements repo from its stable/mitaka branch. It might only be an issue on CentOS 7, since there's some special glue there for systemd to execute old init.d scripts, or something along those lines.

Changed in sahara:
importance: Medium → High
Changed in sahara:
assignee: nobody → Vitaly Gridnev (vgridnev)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to sahara (master)

Fix proposed to branch: master
Review: https://review.openstack.org/315024

Changed in sahara:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to sahara (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/324416

Changed in sahara:
milestone: newton-1 → newton-2
no longer affects: sahara/mitaka
OpenStack Infra (hudson-openstack) wrote : Fix merged to sahara (master)

Reviewed: https://review.openstack.org/315024
Committed: https://git.openstack.org/cgit/openstack/sahara/commit/?id=6329a0aa736383f095d61527583e65b904e90b8e
Submitter: Jenkins
Branch: master

commit 6329a0aa736383f095d61527583e65b904e90b8e
Author: Vitaly Gridnev <email address hidden>
Date: Wed May 11 16:07:47 2016 +0300

    workaround to fix ambari start on centos7

    Change-Id: I1a0383861db50ae5136e8dff9de4119b566606f1
    Closes-bug: 1579187

Changed in sahara:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to sahara (stable/mitaka)

Reviewed: https://review.openstack.org/324416
Committed: https://git.openstack.org/cgit/openstack/sahara/commit/?id=f4ee3d0e1694c0c5f5cc4fb46080fc40e5af163e
Submitter: Jenkins
Branch: stable/mitaka

commit f4ee3d0e1694c0c5f5cc4fb46080fc40e5af163e
Author: Vitaly Gridnev <email address hidden>
Date: Wed May 11 16:07:47 2016 +0300

    workaround to fix ambari start on centos7

    Change-Id: I1a0383861db50ae5136e8dff9de4119b566606f1
    Closes-bug: 1579187

Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/sahara 5.0.0.0b2

This issue was fixed in the openstack/sahara 5.0.0.0b2 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/sahara 4.1.0

This issue was fixed in the openstack/sahara 4.1.0 release.
