Windows SBCL GUI crashes unless run on command-line

Bug #1256034 reported by Wilfredo Velázquez-Rodríguez
This bug affects 1 person
Affects: SBCL
Status: Fix Released
Importance: Medium
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Use case:

I create standalone sbcl applications on Windows by utilizing the save-lisp-and-die feature. Most of the time, these applications are UI applications that do not use the console and instead open up their own graphical windows for the user to interact with. In these, the console window that Windows (the O.S.) provides for SBCL is distracting.
The opening of IO towards this console window is done by the Windows (the O.S.) loader upon seeing the 'Portable Executable' flag "Subsystem" in the 'Optional Header' set to 3 (Windows Console). As part of my build process, I utilize a modified sbcl.exe that has its "Subsystem" flag set to 2 (Windows GUI). I feed it arguments via command-line for it to load up files etc and ultimately SB-EXT:SAVE-LISP-AND-DIE with a :TOPLEVEL specified.
This has worked great, until I installed the latest Windows x86 release (1.1.12). In this release the created application crashes when launched. After some experimenting with the available SBCL releases, I found that this broke as of release 1.1.2, with 1.1.1 working fine.

Notes:

I've found that the application works fine when launched from a console window, crashing only on the typical case of user double-click.

Releases 1.1.2 and 1.1.1 are from:

http://sourceforge.net/projects/sbcl/files/sbcl/1.1.2/

and

http://sourceforge.net/projects/sbcl/files/sbcl/1.1.1/

Respectively.

Test case:

This is a little bit annoying to test because the use is a little esoteric, but here's my attempt at providing the least painful test case I can think of:

1) Copy sbcl.exe, making it sbclw.exe
2) Edit sbclw.exe to modify the Optional Header's Subsystem field to 2 (from 3).
  * There are various ways of doing this..
    - I typically use the PE Editor from http://sourceforge.net/projects/pe-tools/ for doing this manually.
    - Alternatively, I can provide a lisp function that performs the modification programmatically
3) Create a shortcut to sbclw.exe, probably set its start directory to something predictable (C:\home\) and set its command-line arguments to

  --eval "(progn (with-open-file (ignore \"sbcltest.out\" :direction :output :if-exists :supersede)) (exit :code 0))"

4) Run sbclw through the shortcut through explorer. Notice that the file "sbcltest.out" is not created.

5) Run sbclw.exe at the command line with the same arguments:

  sbclw.exe --eval "(progn (with-open-file (ignore \"sbcltest.out\" :direction :output :if-exists :supersede)) (exit :code 0))"

  Notice the file "sbcltest.out" is now created.
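
The Subsystem edit in step 2 can also be scripted. What follows is only a minimal sketch in portable Common Lisp, not the function offered in step 2; the name SET-PE-SUBSYSTEM is hypothetical. It assumes a well-formed PE file and relies on the documented header layout: the 4-byte offset of the PE signature is stored at 0x3C, the COFF file header is 20 bytes long, and Subsystem is the 16-bit field at offset 68 of the Optional Header (the same offset for PE32 and PE32+).

```lisp
(defun set-pe-subsystem (path subsystem)
  "Overwrite the Subsystem field of the PE file at PATH with SUBSYSTEM
(2 = Windows GUI, 3 = Windows Console). Returns the previous value."
  (with-open-file (s path :direction :io
                          :element-type '(unsigned-byte 8)
                          :if-exists :overwrite)
    (flet ((read-le (pos nbytes)
             ;; Read an NBYTES-wide little-endian unsigned integer at POS.
             (file-position s pos)
             (let ((value 0))
               (dotimes (i nbytes value)
                 (setf value (logior value (ash (read-byte s) (* 8 i)))))))
           (write-u16 (pos value)
             ;; Write VALUE as a 16-bit little-endian integer at POS.
             (file-position s pos)
             (write-byte (ldb (byte 8 0) value) s)
             (write-byte (ldb (byte 8 8) value) s)))
      (unless (= (read-le 0 2) #x5A4D)               ; "MZ"
        (error "~A is not a PE file (no MZ header)." path))
      (let ((pe-offset (read-le #x3C 4)))            ; e_lfanew
        (unless (= (read-le pe-offset 4) #x00004550) ; "PE\0\0"
          (error "~A is not a PE file (no PE signature)." path))
        ;; 4-byte signature + 20-byte COFF header + 68 bytes into the
        ;; Optional Header.
        (let* ((subsystem-pos (+ pe-offset 4 20 68))
               (old (read-le subsystem-pos 2)))
          (write-u16 subsystem-pos subsystem)
          old)))))
```

With sbclw.exe already copied as in step 1, `(set-pe-subsystem "sbclw.exe" 2)` would mark it as a GUI executable and return the old value. The sketch does not update the header's CheckSum field.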

Windows occasionally catches the crash and reports:

Problem signature:
  Problem Event Name: BEX
  Application Name: sbclw.exe
  Application Version: 0.0.0.0
  Application Timestamp: 524e56e8
  Fault Module Name: StackHash_e07c
  Fault Module Version: 0.0.0.0
  Fault Module Timestamp: 00000000
  Exception Offset: 00000000
  Exception Code: c0000005
  Exception Data: 00000008
  OS Version: 6.1.7601.2.1.0.256.48
  Locale ID: 1033
  Additional Information 1: e07c
  Additional Information 2: e07c7abd1e6e01ceb89acff9ddb767b4
  Additional Information 3: e9af
  Additional Information 4: e9af0ada594c3a11c2328b3bdf30b090

I'm going to get myself learned up on SBCL source and see if something stands out to me in-between these changes that might cause this to break.

description: updated
Paul Khuong (pvk) wrote :

dto confirms he experiences pretty much the same thing on a range of Windows versions.

If you have time to look into this, the easiest would probably be to run git bisect between 1.1.1 and 1.1.2: it's usually easier to examine a single commit that causes the regression.

Changed in sbcl:
status: New → Confirmed
importance: Undecided → Medium
Wilfredo Velázquez-Rodríguez (zulu-inuoe) wrote :

I was able to trace this to commit

ed1910efb36f71b5ebe33b5ffffd7195e15644de

This is where fd's on Windows were replaced by using HANDLEs instead.
As stated in the documentation for GetStdHandle (http://msdn.microsoft.com/en-us/library/windows/desktop/ms683231(v=vs.85).aspx) it'll return NULL when the application has no standard handles available.

SB-IMPL:STREAM-REINIT queries these handles and makes fd streams from them (to assign to *stdin* *stdout* *stderr*) and it's where things are going bad. (Specifically, nil was being thrown as the 'fd' in make-fd-stream.)

When running a "GUI Application" on Windows, that application won't be allocated STDIN, STDOUT, or STDERR handles.
However, if the parent process of that application -did- have these handles allocated, then the child process will inherit them. This is why I was able to launch sbcl from the command-line fine, and not from explorer.exe.

The difference between using fd's and HANDLES in this case is that the CRT (lowio) file descriptors always exist, even if the handles behind them do not.

I'm looking at creating a /dev/null equivalent for cases where the standard handles are not available.

Wilfredo Velázquez-Rodríguez (zulu-inuoe) wrote :

Attached patch to direct *STDIN* *STDOUT* and *STDERR* to NUL on Windows when no standard handles are available. This is the equivalent of /dev/null

Stas Boukarev (stassats) wrote :

Applied with minor adjustments in c3462f08137286b19e5068a750a5bae1d98beac1, thanks!

Changed in sbcl:
status: Confirmed → Fix Committed
Stas Boukarev (stassats) wrote :

And now save-lisp-and-die with :application-type :gui will create a gui application, no third party tools necessary.
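
A build step using that option might look like the sketch below. MAIN, BUILD-GUI-APP and the default file name are hypothetical placeholders; only SB-EXT:SAVE-LISP-AND-DIE and its :TOPLEVEL, :EXECUTABLE and :APPLICATION-TYPE arguments come from SBCL. The read-time conditionals are there on the assumption that :APPLICATION-TYPE is only accepted by Windows builds.

```lisp
(defun main ()
  ;; Hypothetical entry point; a real application would open its
  ;; windows here instead of exiting immediately.
  (sb-ext:exit :code 0))

(defun build-gui-app (&optional (path "myapp.exe"))
  ;; Writes the executable to PATH and terminates the running SBCL,
  ;; so this is meant to be the last form of a build script.
  (sb-ext:save-lisp-and-die path
                            :toplevel #'main
                            :executable t
                            ;; Passed only on Windows builds of SBCL.
                            #+win32 :application-type #+win32 :gui))
```

Calling `(build-gui-app)` from a build script would then replace the manual Subsystem edit described in the original report.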

Changed in sbcl:
status: Fix Committed → Fix Released
Wilfredo Velázquez-Rodríguez (zulu-inuoe) wrote :

After more testing it appears that this patch was incomplete.. arguably.

There's an error in the case where the calling process (the one invoking CreateProcess, in this case cmd) has std handles. If the callee (in this case SBCL) calls GetStdHandle, it will receive whatever was supplied through CreateProcess, which in the great majority of cases is the stdio handles of the parent process.
These handles however, are not valid cross-process when the callee process (SBCL) is a GUI application (set by Subsystem) unless they're set up specifically for that purpose, such as a pipe.
Right now SBCL takes these bogus handles and attempts to operate on them. WinApi will fail in these cases.
Attached is a patch that verifies the validity of the std handles via GetFileType..

I believe this is the best approach for SBCL to take for usability purposes. Other environments fail on write 'silently' (WriteFile simply returns 0, printf returns 0, etc). But when we fail to write on SBCL we must signal an error (ANSI), which is typically a sledge-hammer in the cases where this is relevant.
For example, I ran into this issue because my program does many diagnostic (format t ...) calls which I can use by doing proper i/o redirection. I then invoked my program from the shell without i/o redirection and it just crashed as soon as a write was attempted.

To compensate for this possibility I have three options:

  1) Handle the error at every format
  2) Delegate the format calls to a central function that has error handling (and update all my existing format calls)
  3) Rely on my output to always be properly redirected.. which in the Windows environment is not particularly reliable, seeing as even the standard provided cmd.exe fails to do this

Options 1 and 2 solve the issue for -my- code, but if I load some other library (quicklisp, etc) that does output, I'm more or less screwed unless I wrap all calls to code that -might- call other libraries with error handling.
Option 3 to me is barely even an option. It's what I'm doing right now just to get around this issue but to me seems too restrictive.

So, I make the argument that if SBCL is unable to meaningfully perform I/O due to circumstances outside of its control (OS handing it bogus io handles), it should be redirected to NUL and considered to 'always succeed'.
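
On a release without such a fix, an application could approximate the same "always succeed" behaviour itself by rebinding the standard streams around its toplevel function. The following is only a sketch of that workaround, not the attached patch (which works at the handle level inside SBCL); CALL-WITH-NULL-STDIO is a hypothetical name. It uses the standard facts that a broadcast stream with no components discards output and a concatenated stream with no components is immediately at end-of-file.

```lisp
(defun call-with-null-stdio (thunk)
  "Call THUNK with the standard streams rebound to a stream that
discards all output and reports end-of-file on input."
  (let* ((null-stream (make-two-way-stream (make-concatenated-stream)
                                           (make-broadcast-stream)))
         (*standard-input* null-stream)
         (*standard-output* null-stream)
         (*error-output* null-stream)
         (*terminal-io* null-stream))
    (funcall thunk)))
```

Because the rebinding is dynamic, (format t ...) calls made by third-party libraries inside THUNK are covered as well. Code that writes to a stream it captured before the rebinding, or to a stream variable not listed here, is not.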

I apologize very much for my lack of proper testing of the previous fixes.

Stas Boukarev (stassats) wrote :

Committed in 6d5b30308bf7481577544b7ee469f5189a13e503.
