Jobs won't start using pdksh if script fd >= 10

Bug #757244 reported by Jacek Konieczny
This bug affects 2 people
Affects    Status        Importance  Assigned to          Milestone
upstart    Fix Released  High        Scott James Remnant
PLD Linux  Fix Released  Undecided   Unassigned

Bug Description

After upgrading Upstart to 1.2 I found some of my jobs not starting (exiting with code '127'). After some investigation I found out that Upstart prepends 'exec 10<&-' to the scripts of the failing jobs, to close the input pipe. The problem is that this does not work in some POSIX shells, which can handle only single-digit file descriptors in redirections. It works in the big and heavy bash, but not under pdksh (used as /bin/sh in e.g. PLD Linux).

Can Upstart be made to use some fixed file descriptor for this task, e.g. '3'?

Jacek Konieczny (jajcus-jajcus) wrote :

Attaching a quick hack (not an elegant solution) which solves the problem for me (the script fd is dup2()ed to fd #3 in the child process).

Arkadiusz Miśkiewicz (arekm) wrote :

Jacek also found this:

"Open files are represented by decimal numbers starting with zero. The largest possible value is implementation-dependent; however, all implementations support at least 0 to 9, inclusive, for use by the application. These numbers are called file descriptors. "

http://pubs.opengroup.org/onlinepubs/007908799/xcu/chap2.html

So POSIX shells are only required to support file descriptors 0-9.

Changed in upstart:
status: New → Triaged
importance: Undecided → High
Scott James Remnant (scott) wrote :

Thanks for the patch. Actually, I don't think that's a hacky way to do things, but there is an issue if the writing end of the error pipe happens to have that file descriptor: the dup2() would close it. I'll commit an expanded version of the patch.
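
The concern about the error pipe can be illustrated with a small sketch. This is not the code that was committed; `move_out_of_the_way` is a hypothetical helper name, and the approach (duplicating the conflicting descriptor above the target with `fcntl(F_DUPFD)`) is just one possible way to keep a later `dup2()` from silently closing it.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not from the Upstart source): if *fd currently
 * occupies `target`, move it to a descriptor above `target` so that a
 * later dup2(..., target) cannot clobber it.
 * Returns 0 on success, -1 on error (errno is set by fcntl). */
int move_out_of_the_way(int *fd, int target)
{
    if (*fd != target)
        return 0;                 /* no conflict, nothing to do */

    /* F_DUPFD returns the lowest free descriptor >= its argument. */
    int moved = fcntl(*fd, F_DUPFD, target + 1);
    if (moved == -1)
        return -1;

    close(*fd);                   /* free the target slot */
    *fd = moved;
    return 0;
}
```

In a child process this would presumably be called on the write end of the error pipe before the script descriptor is duplicated onto the fixed number.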

Scott James Remnant (scott) wrote :
Changed in upstart:
status: Triaged → Fix Committed
milestone: none → 1.3
assignee: nobody → Scott James Remnant (scott)
Yuri Zaporozhets (yuri-zaporozhets) wrote :

Actually, the fix committed in 1280 does not work in all cases, namely when init has 7 (seven) files open. In this case the communication channel has fd=8, and the first pipe call in job_process_run() returns (9, 10). This means that the script_fd parameter passed to job_process_spawn() also equals 9. In that case the following code breaks completely:

if (script_fd != -1) {
  int tmp = dup2(script_fd, JOB_PROCESS_SCRIPT_FD); /* dup2(9, 9) returns 9 */
  /*...*/
  close(script_fd); /* Close our current file descriptor */
  script_fd = tmp; /* ...which means a big trouble */
}

The correct way is, of course, not only to check script_fd against -1, but also check if it's already 9:

if ((script_fd != -1) && (script_fd != JOB_PROCESS_SCRIPT_FD)) {
  /*...*/
}
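
As a self-contained illustration of the guarded version, here is a minimal sketch. The function name `place_script_fd` and the `target` parameter are assumptions made for this example (the snippet above uses the JOB_PROCESS_SCRIPT_FD constant instead); only standard `dup2()` and `close()` are used.

```c
#include <unistd.h>

/* Hypothetical helper: make the script readable on `target`.
 * Returns the descriptor to read the script from, or -1 if there is no
 * script descriptor or dup2() failed. */
int place_script_fd(int script_fd, int target)
{
    /* Nothing to move: either there is no script pipe, or it already
     * sits on the wanted descriptor (the dup2(9, 9) case). */
    if (script_fd == -1 || script_fd == target)
        return script_fd;

    if (dup2(script_fd, target) == -1)
        return -1;

    close(script_fd);             /* safe: script_fd != target here */
    return target;
}
```

With the extra comparison, the close() can never hit the descriptor that the script is about to be read from.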

Changed in upstart:
status: Fix Committed → Incomplete
Scott James Remnant (scott) wrote :

Oh, I was apparently expecting dup2() to return -1 when given the same file descriptor for both arguments.
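
For reference, POSIX specifies that dup2() with two equal, valid descriptors returns that descriptor without closing it. The sketch below replays the unguarded sequence from the earlier comment to show the consequence; `unguarded_move_keeps_fd` is a made-up name for this demonstration, not a function from Upstart.

```c
#include <unistd.h>

/* Replays the unguarded dup2()+close() sequence.
 * Returns 1 if `target` is still usable afterwards, 0 if it was lost. */
int unguarded_move_keeps_fd(int fd, int target)
{
    int tmp = dup2(fd, target);   /* fd == target: returns target, no-op */
    if (tmp == -1)
        return 0;

    close(fd);                    /* fd == target: closes the only copy */

    /* Probe: dup2(x, x) fails with EBADF once x has been closed. */
    return dup2(tmp, tmp) == tmp;
}
```

When fd and target differ the function reports the descriptor as usable; when they are equal it reports it as lost, which matches the failure described for the (9, 10) pipe.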

Scott James Remnant (scott) wrote :
Changed in upstart:
status: Incomplete → Fix Committed
Changed in upstart:
status: Fix Committed → Fix Released
Lukasz Kies (kiesiu)
Changed in pld-linux:
status: New → Fix Released