Nailgun tests fail intermittently with message "[: -ne: unary operator expected"

Bug #1393566 reported by Julia Varigina
This bug affects 1 person
Affects: Fuel for OpenStack
Status: Fix Committed
Importance: High
Assigned to: Roman Prykhodchenko

Bug Description

Steps to reproduce:
1. Run tests as described in 8.1.1.2. Setup for Nailgun Unit Tests (http://docs.mirantis.com/fuel-dev/develop/nailgun/development/env.html#setup-for-nailgun-unit-tests)

Expected result:
Tests pass

Actual result:
The cli and webui tests fail *intermittently* with the error message:
./run_tests.sh: line 307: [: -ne: unary operator expected (webui)
./run_tests.sh: line 359: [: -ne: unary operator expected (cli)
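
For context: the shell prints this message when a variable used unquoted inside [ ... ] expands to nothing, so that a test such as [ $x -ne 0 ] collapses to [ -ne 0 ]. Lines 307 and 359 themselves are not quoted in this report, so the following is only a minimal sketch of the mechanism, assuming that is what happens there; the variable name "result" and both function names are hypothetical.

```shell
#!/bin/bash
# Hypothetical names; the real lines 307/359 are not shown in this report.

# Unquoted: with an empty value the test collapses to "[ -ne 0 ]",
# which is a syntax error for the test builtin.
check_unquoted() {
  local result=$1
  if [ $result -ne 0 ]; then
    echo "failed"
  else
    echo "ok"
  fi
}

# Quoted, with a default: an empty value is treated as a failure (1).
check_quoted() {
  local result=${1:-1}
  if [ "$result" -ne 0 ]; then
    echo "failed"
  else
    echo "ok"
  fi
}
```

Called with an empty argument, check_unquoted writes the error to stderr and then falls through to the else branch, so a missing value is silently treated as success; check_quoted reports a failure instead.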

Root cause:
In run_tests.sh:

function run_server {
  ....
  # wait for server availability
  which nc > /dev/null
  if [ $? -eq 0 ]; then
    for i in {1..50}; do # <<<< BUG. It does not wait until server starts, but returns after limited number of attempts to access the server
      local http_code=`curl -s -w %{http_code} -o /dev/null -I http://0.0.0.0:$SERVER_PORT/`
      if [ http_code = 200 ]; then break; fi
      sleep 0.1
    done
  else
    sleep 5
  fi
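
A side note on the excerpt as quoted: the inner test compares the literal word http_code with 200, which can never match. If the script really reads this way (rather than this being a transcription slip in the report), the loop never breaks early and always runs all 50 iterations. A minimal sketch of the difference, with a hard-coded value standing in for the curl output:

```shell
http_code=200   # stand-in for the value curl would return

# Literal string "http_code" compared with "200": never equal.
if [ http_code = 200 ]; then literal=match; else literal=no-match; fi

# Expanded and quoted variable: compares the actual value.
if [ "$http_code" = 200 ]; then expanded=match; else expanded=no-match; fi

echo "literal: $literal, expanded: $expanded"
```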

Changed in fuel:
assignee: nobody → Fuel QA Team (fuel-qa)
importance: Undecided → High
Sebastian Kalinowski (prmtl) wrote :

Nailgun start was just too slow.

Changed in fuel:
milestone: none → 6.0
status: New → Confirmed
assignee: Fuel QA Team (fuel-qa) → Fuel Python Team (fuel-python)
Dima Shulyak (dshulyak) wrote :

So what are our options here?
We can't wait forever, and 50 intervals were enough for CI and probably for every developer environment.

Julia, can you provide any specific details about the environment in which you tried to run the tests?

Changed in fuel:
status: Confirmed → Incomplete
Dima Shulyak (dshulyak) wrote :

Oh, actually there might be problems with permissions on /etc/nailgun/api.log

Julia, can you try running "manage.py run" and show the output?

Julia Varigina (jvarigina) wrote :

I was able to make the test pass by increasing the number of attempts:
for i in {1..50}; -> for i in {1..500};
Obviously this was a quick fix.

Is there any option to request the status of the service (something like start_in_progress, started, start_failed) instead of waiting for a fixed time interval?
Or at least to fail the test with an error message such as: "Cannot access service .... Wait interval ... sec. expired. Do this and that"?
This would be more informative than "./run_tests.sh: line 307: [: -ne: unary operator expected"

Sebastian Kalinowski (prmtl) wrote :

IMHO an error message and an env var that allows setting the waiting time/number of retries will be enough.

summary: - Nailgun tests fail intermittently
+ Nailgun tests fail intermittently with message "[: -ne: unary operator
+ expected"
Roman Prykhodchenko (romcheg) wrote :

The script should definitely do a better job of handling cases where Nailgun fails to start. I hit that bug at some point in the past.

Changed in fuel:
assignee: Fuel Python Team (fuel-python) → Roman Prykhodchenko (romcheg)
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-web (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/137702

Changed in fuel:
status: Incomplete → Confirmed
Changed in fuel:
status: Confirmed → In Progress
Roman Prykhodchenko (romcheg) wrote :

The patch that fixes this bug cannot be merged until this other bug is fixed: https://bugs.launchpad.net/fuel/+bug/1398390. Working on it.

Roman Prykhodchenko (romcheg) wrote :

The dependency was merged. Now the patch for this issue can be merged but still requires a second +2.

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-web (master)

Reviewed: https://review.openstack.org/137702
Committed: https://git.openstack.org/cgit/stackforge/fuel-web/commit/?id=87db70a4b9f2b7d48539319df3e82a0f6531f801
Submitter: Jenkins
Branch: master

commit 87db70a4b9f2b7d48539319df3e82a0f6531f801
Author: Roman Prykhodchenko <email address hidden>
Date: Thu Nov 27 13:28:14 2014 +0100

    Allow specifying maximum wait-time for Nailgun

    Some tests require Nailgun to be running. For that a server is
    started and is waited for 5 seconds. For some systems Nailgun
    requires more time to get up and running.

    This patch allows users to specify maximum time of waiting Nailgun
    to start by setting NAILGUN_START_WAIT_TIME environment variable which
    defaults to 5 seconds. Introduced hooks stop tests if Nailgun failed to
    start before maximum wait-time passes.

    DocImpact
    Closes-bug: 1393566
    Change-Id: I39d4b312510c34c190ad680c48f1e5792858b140
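
The behaviour described in this commit message could look roughly like the sketch below. It is not the merged patch: the function names wait_for_server and probe_http, the probe-command argument and the one-second polling interval are assumptions made for illustration. Only NAILGUN_START_WAIT_TIME and its 5-second default come from the commit message.

```shell
#!/bin/bash
# Sketch only; not the code from review 137702.
# NAILGUN_START_WAIT_TIME and its default of 5 come from the commit message;
# everything else (names, probe argument, 1-second polling) is hypothetical.

wait_for_server() {
  local max_wait=${NAILGUN_START_WAIT_TIME:-5}
  local waited=0

  # "$@" is the probe command; it must succeed once the server is up.
  until "$@"; do
    if [ "$waited" -ge "$max_wait" ]; then
      echo "Nailgun did not start within ${max_wait} seconds." \
           "Set NAILGUN_START_WAIT_TIME to wait longer." >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Example probe, modelled on the curl call quoted in the bug description.
probe_http() {
  local code
  code=$(curl -s -w '%{http_code}' -o /dev/null -I "http://0.0.0.0:${SERVER_PORT}/")
  [ "$code" = "200" ]
}
```

A caller could then stop the test run early with something like "wait_for_server probe_http || exit 1", which gives the explicit failure message requested earlier in this thread instead of a later "unary operator expected" error.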

Changed in fuel:
status: In Progress → Fix Committed