glance-control exits with 0 when glance-<server> script is not found

Bug #817032 reported by Jason Kölker
This bug affects 1 person
Affects: Glance
Status: Fix Released
Importance: Low
Assigned to: Eoghan Glynn
Milestone: (none listed)

Bug Description

If a glance-<server> script does not exist in ./bin, glance-control still exits with a 0 status:

$ ls -la bin/glance-scrubber
ls: cannot access bin/glance-scrubber: No such file or directory

$ ./bin/glance-control scrubber start etc/glance-scrubber.conf --pid-file=/tmp/glance.pid
Unable to increase file descriptor limit. Running as non-root?
Starting glance-scrubber with /home/jkoelker/glance-2011.3/etc/glance-scrubber.conf

$ echo $?
0

Jason Kölker (jason-koelker) wrote :

    def launch(ini_file, pid_file):
        args = [server, ini_file]
        print 'Starting %s with %s' % (server, ini_file)

        pid = os.fork()
        if pid == 0:
            os.setsid()
            with open(os.devnull, 'r+b') as nullfile:
                for desc in (0, 1, 2):  # close stdio
                    try:
                        os.dup2(nullfile.fileno(), desc)
                    except OSError:
                        pass
            try:
                os.execlp('%s' % server, server, ini_file)
            except OSError, e:
                sys.exit('unable to launch %s. Got error: %s'
                         % (server, str(e)))
            sys.exit(0)
        else:
            write_pid_file(pid_file, pid)

Since it forks, the parent exits successfully and the exception that gets caught happens in the child.
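
To make the point above concrete, here is a minimal sketch (Python 3, not the glance-control code itself) of the same fork/exec pattern. The server name `glance-no-such-server` and the file `dummy.conf` are hypothetical placeholders chosen so that the exec fails. The failure is raised only in the child; the parent learns about it only if it reaps the child.

```python
import os
import sys


def launch(server, ini_file):
    """Fork and exec `server`; return the child's pid without waiting."""
    pid = os.fork()
    if pid == 0:
        # Child: a missing executable raises OSError here, not in the parent.
        try:
            os.execlp(server, server, ini_file)
        except OSError as e:
            sys.stderr.write('unable to launch %s. Got error: %s\n'
                             % (server, e))
            # Leave the child immediately with a non-zero status.
            os._exit(1)
    # Parent: all it knows at this point is that fork() succeeded.
    return pid


if __name__ == '__main__':
    # Hypothetical, deliberately missing executable.
    pid = launch('glance-no-such-server', 'dummy.conf')
    # Without this waitpid() the parent would simply finish and exit 0.
    _, status = os.waitpid(pid, 0)
    print('child exit code:', os.WEXITSTATUS(status))
```

If the `waitpid` call is dropped, the parent's own exit status stays 0 regardless of what happened in the child, which is the behaviour reported in this bug.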

Jay Pipes (jaypipes) wrote : Re: [Bug 817032] Re: glance-control exits with 0 when glance-<server> script is not found

Hmm, interesting. This code was copied from Swift. You can assign this
to me, Jason, and I'll take care of it. Swift has updated a lot of
that code in the past year or so; I'll check out clayg's new hotness.

-jay


Jason Kölker (jason-koelker) wrote :

Unfortunately, I don't have permission to assign it to anyone. The new swift daemon would work better, as it does not rely on calling an external script but imports the code directly.

Jay Pipes (jaypipes) wrote :

Actually, some of the swift daemons work that way, some don't :) Assigning to myself...

Changed in glance:
importance: Undecided → Medium
status: New → Confirmed
assignee: nobody → Jay Pipes (jaypipes)
milestone: none → diablo-4
Jay Pipes (jaypipes)
Changed in glance:
milestone: diablo-4 → diablo-rbp
Thierry Carrez (ttx)
Changed in glance:
milestone: diablo-rbp → 2011.3
Jay Pipes (jaypipes)
Changed in glance:
milestone: 2011.3 → none
assignee: Jay Pipes (jaypipes) → nobody
Jay Pipes (jaypipes)
Changed in glance:
importance: Medium → Low
Eoghan Glynn (eglynn)
Changed in glance:
assignee: nobody → Eoghan Glynn (eglynn)
Eoghan Glynn (eglynn) wrote :

Is the core problem here not so much the exit code (this is the parent glance-control's exit code, as indicated above, so changing the child's exit code will have no effect on the value reported by the shell), but rather the fact that there is *no* indication whatsoever that the attempt to launch the service has failed?

If this opaqueness is the core problem, as opposed to the nature of the failure report, then we could consider this bug addressed by:

https://review.openstack.org/#change,3516

So if the scenario above were repeated using the --capture-output flag, e.g.:

./bin/glance-control scrubber start etc/glance-scrubber.conf --pid-file=/tmp/glance.pid --capture-output

then something like the following error message would appear in /var/log/messages:

Feb 2 13:15:36 neutrino glance-scrubber[17031]: unable to launch glance-scrubber. Got error: [Errno 2] No such file or directory

Would that be sufficient positive indication of launch failure?

Eoghan Glynn (eglynn) wrote :

In addition we could have the parent glance-control process wait a (configurable) short period for the child to exit ungracefully and if this occurs, inherit the non-zero status code from the child.

Of course it would get a little messy if multiple services failed to launch with 'glance-control all start'. In that case, the best we could do would be something like "last non-zero exit code wins".
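
The idea above could be sketched roughly as follows (Python 3). This is only an illustration of the proposal, not the code that was eventually merged: the function names `await_child` and `combine`, the default `timeout`, and the `128 + signal` convention for children killed by a signal are all assumptions made for this example.

```python
import os
import time


def await_child(pid, timeout=2.0, poll_interval=0.05):
    """Poll child `pid` for up to `timeout` seconds.

    Returns the child's exit code if it exits within the window.
    Returns 0 if it is still running when the window closes, on the
    assumption that a surviving child launched successfully.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # WNOHANG makes waitpid return (0, 0) while the child is alive.
        reaped, status = os.waitpid(pid, os.WNOHANG)
        if reaped == pid:
            if os.WIFEXITED(status):
                return os.WEXITSTATUS(status)
            # Killed by a signal: use the shell-style 128 + signal number.
            return 128 + os.WTERMSIG(status)
        time.sleep(poll_interval)
    return 0


def combine(exit_codes):
    """Reduce several exit codes with 'last non-zero exit code wins'."""
    result = 0
    for code in exit_codes:
        if code != 0:
            result = code
    return result
```

For 'glance-control all start', the parent could call `await_child` once per launched service and pass the collected codes to `combine` before calling `sys.exit`. The trade-off is that a child which fails after the waiting period is still reported as a success.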

OpenStack Infra (hudson-openstack) wrote : Fix proposed to glance (master)

Fix proposed to branch: master
Review: https://review.openstack.org/3670

Changed in glance:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to glance (master)

Reviewed: https://review.openstack.org/3670
Committed: http://github.com/openstack/glance/commit/593e8c2fa745de2926c957ebacbefd55fa40071d
Submitter: Jenkins
Branch: master

commit 593e8c2fa745de2926c957ebacbefd55fa40071d
Author: Eoghan Glynn <email address hidden>
Date: Thu Feb 2 15:37:13 2012 +0000

    Add --await-child option to glance-control.

    Fixes bug 817032

    Previously an immediate non-zero exit status from service
    launch was not reflected in the exit status returned from
    glance-control.

    Now the parent glance-control process configurably waits for
    the child to exit ungracefully and if this occurs, it inherits
    the non-zero status code from the child.

    Change-Id: Ibbe92a5bf40d095951a572d78ae07026d8a9313d

Changed in glance:
status: In Progress → Fix Committed
Eoghan Glynn (eglynn)
Changed in glance:
milestone: none → essex-4
Thierry Carrez (ttx)
Changed in glance:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in glance:
milestone: essex-4 → 2012.1