mingw is not fully ANSI, so %lf can break it

Bug #1552913 reported by David Mathog
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Inkscape Devlibs
Triaged
Medium
Unassigned

Bug Description

The poppler library in devlibs 61 breaks (s)printf "%lf" formatting. The language standard says that "%lf" must work the same as "%f", and it does, except when programs are linked with poppler from devlibs 61. This was discovered while hunting down inkscape bug 1538361.

  https://bugs.launchpad.net/inkscape/+bug/1538361

See message 40 in that thread for a test case and commands to reproduce the problem.

This is a major problem as it indirectly breaks almost all EMF/WMF imports on Windows.

The issue was introduced somewhere after devlibs 53, which was the version used the last time I did a Windows build.

Tags: emf wmf
Revision history for this message
David Mathog (mathog) wrote :

Made a new test program which is:

#include <stdlib.h>
#include <stdio.h>
#include <math.h>
int main(void){
  char huge[64000];
  double esc=0;
  double val = cos(esc);
  printf("val %lf\n",val);
  printf("esc %lf cos %lf sin %lf -sin %lf -cos %lf\n",
    esc,cos(esc),sin(esc),-sin(esc),-cos(esc));
  sprintf(huge,"esc %lf cos %lf sin %lf -sin %lf -cos %lf\n",
    esc,cos(esc),sin(esc),-sin(esc),-cos(esc));
  printf("%s\n",huge);
  exit(EXIT_SUCCESS);
}

Obtained current release of Dr. Memory. Installed on XP 32 bit machine.
In a DOS window did:

mingwenv
mingw32-g++ -O0 -ggdb -static-libgcc -static-libstdc++ -o printf_bug printf_bug.c -Lc:\progs\devlibs61/lib -lpoppler
"C:\Program Files\Dr. Memory\bin\drmemory.exe" -logdir C:/temp/logs -- printf_bug

This logged several errors in libpoppler (see attached)

then

delete printf_bug.exe
mingw32-g++ -O0 -ggdb -static-libgcc -static-libstdc++ -o printf_bug printf_bug.c
"C:\Program Files\Dr. Memory\bin\drmemory.exe" -logdir C:/temp/logs -- printf_bug

this logged no errors.

The error log from the first one is attached but it isn't very useful because libpoppler has no symbols in it.

Note that I also tried running this in the gdb debugger and using "stepi" to move into printf() and sprintf() functions. This was not helpful. "bt" did not show anything about where in poplar execution had gone.

Revision history for this message
David Mathog (mathog) wrote :

Log file for the run of the program when it was not linked with libpoppler.

Revision history for this message
David Mathog (mathog) wrote :

Will one of the devlibs people please build a libpoppler with debugging enabled and repeat this to see what the actual problem is? Thanks.

Revision history for this message
jazzynico (jazzynico) wrote :

@Eduard - Could you please test with the win64 devlibs and confirm you're also affected (or not)? Thanks!

Revision history for this message
Patrick Storz (ede123) wrote :

No, it seems devlibs64 do not show the problem.

When I compile the testcase with devlibs64 (r30) using
"g++ printf_bug.c -o printf_bug -LC:\devlibs64/lib -lpoppler" the output of printf_bug.exe is
val 1.000000
esc 0.000000 cos 1.000000 sin 0.000000 -sin -0.000000 -cos -1.000000
esc 0.000000 cos 1.000000 sin 0.000000 -sin -0.000000 -cos -1.000000
(which I assume is the intended output)

When I compile the testcase with devlibs (r59) using
"g++ printf_bug.c -o printf_bug -LC:\devlibs/lib -lpoppler" the output of printf_bug.exe is
val 0.000000
esc 0.000000 cos 0.000000 sin 0.000000 -sin 0.000000 -cos 0.000000
esc 0.000000 cos 0.000000 sin 0.000000 -sin 0.000000 -cos 0.000000

I obviously didn't follow all the instructions above, so if you need further input let me know.

Revision history for this message
jazzynico (jazzynico) wrote :

The attachment is a poppler-0.41 version compiled on Windows XP with --enable-build-type=fulldebug with TDM-GCC-4.6 and the current win32 devlibs.
I'm a bit too busy to help a lot more for now, but I hope it's going to give some clues.

Note that I've tested the printf_bug.c file with the devlibs-51 (compiled with TDM-GCC-5.1 with very recent dependencies) and the result is unfortunately the same as with the current official win32 devlibs.

Also note that some other libs (ImageMagick, Glib...) are becoming difficult to build on Windows XP and require at least 7 to compile correctly (with no code tweak). So I wonder if our bug could be specific to XP. I should have access to a Windows 7 computer with all dev tools installed in two weeks, so if the problem isn't solved by that time, I can try to compile with it.

Revision history for this message
jazzynico (jazzynico) wrote :

...and the attached libs still fails with the test printf_bug.c file.

Revision history for this message
David Mathog (mathog) wrote :

Keith Marshall on the mingw mailing list provided some key information.

In response to this:

I am still trying to understand how linking in a library _where no function is
ever called_ can change the behavior of the test program.

He replied:

Depending on where it appears in the linking order, it could substitute
an alternative printf() implementation; one which possibly redirects the
call to __mingw_printf(), rather than to __msvcrt_printf(). Or, perhaps
you've also added associated headers, which may activate the feature
test to select __mingw_printf() as default.

So, for inkscape, anything that brings msvcrt in is potentially problematic. It does things Microsoft's way, not the C/C++ standards way. One of the places they differ is that "%lf" in a printf() for MS is the same as "%LF", not
"%f". Since %f and %Lf expect different data types the wrong result is obtained. In other words, even though mingw is using gcc and it can be set to a particular C or C++ language standard there are places where what you get is NOT compliant with that standard. This is one of them. You can see just how illogical the result is by doing this in a mingw window:

gcc -std=gnu89 -o printf_bug printf_bug.c
./printf_bug

gcc -std=c99 -o printf_bug pritnf_bug.c
.printf_bug

On my system the first one works correctly and the second one does not. This breaks "%lf" handling by printf by specifically requesting a language standard that says how it should be handled!

Since at this time it isn't clear we are ever going to understand what is jumping around within mingw to determine which variant of printf() we get, the only safe thing to do is to program around the bug that results.

Be aware that one other place these libraries differ is in the handling of long doubles, where one expects a 64 bit format and the other an 80 bit format. See for instance the 3rd comment in this thread:

https://blogs.msdn.microsoft.com/oldnewthing/20140411-00/?p=1273

There are gcc switches that force results to be stored into memory (as 64 bits) and not accessed from registers (80 bits). On well behaved platforms this gets rid of "excess precision" and can result in numeric tests getting exactly the same answers, which they might not with excess precision. Given mingw's issues with long doubles, these switches may also cause issues in that environment. If anything in devlibs or inkscape uses long doubles and passes the result into one of the printf() functions there could be problems.

Revision history for this message
David Mathog (mathog) wrote :

Jason Crain says here:

https://bugs.freedesktop.org/show_bug.cgi?id=94400

that it may be due to building popper with:

   -D__USE_MINGW_ANSI_STDIO=1

which causes the MSVCRT library to be used instead of the usual mingw one. He notes that adding that define without the library is sufficient to break the test program.

Revision history for this message
jazzynico (jazzynico) wrote :

I can confirm (from config.log) that Poppler was compiled with -D__USE_MINGW_ANSI_STDIO=1 (automatically added when mingw is detected).

Building Poppler-0.41.0 without the flag seems to work correctly (with some format warnings..), and the printf_bug.c test now pass with the updated libs.

Is there a risk that such a change introduces regressions with Inkscape?

I'm going to try inkscape with that libs, but it'll probably require lots of testing.

Changed in inkscape-devlibs:
importance: Undecided → High
status: New → In Progress
assignee: nobody → jazzynico (jazzynico)
Revision history for this message
Patrick Storz (ede123) wrote :

Any idea why the devlibs64 might be unaffected (or how I could verify they're really not affected)?

I assume it's most likely they were compiled with "-D__USE_MINGW_ANSI_STDIO=1", too, unless Partha did some configure-magic when compiling.

Revision history for this message
jazzynico (jazzynico) wrote :

Just to confirm that Inkscape seems to work correctly with the updated Poppler libs. All the EMF files from Bug #1538361 and some local PDF test files all import correctly.

> I assume it's most likely they were compiled with "-D__USE_MINGW_ANSI_STDIO=1", too, unless Partha did some configure-magic when compiling.

I did not find a specific configure option, and had to revert the change applied in https://lists.freedesktop.org/archives/poppler/2013-June/010312.html
But don't hesitate to suggest a more elegant solution.

Revision history for this message
Patrick Storz (ede123) wrote :

I'm still not sure if I understand why this is actually happening... If I understand correctly
- __mingw_printf() is the ISO-compliant version of printf() provided by MinGW
- __msvcrt_printf() MSVCR provides a non-ISO-compliant version of printf().

If I'm not mistaken we want to use __mingw_printf() if possible, right?
The ISO-compliant version is what should also work with EMF/WMF import?

Now from investigating the MinGW header files "-D__USE_MINGW_ANSI_STDIO=1" actually *enables* usage of __mingw_printf(), so this should be what we want after all!

The only explanation I have so far is that either
a) There was a bug in __mingw_printf() at some point or
b) The redefinition makes poppler use __mingw_printf() at some point, but somehow breaks printf() to fall back to the MSVCR at other points in the code.

Either way, it might be due to the old version of mingw that is used in the TDM-GCC the we use for our 32-bit builds (tdm-gcc-4.6.1.exe dates back to 2011-09-23... Were your gcc-5.1 builds affected jazzynico?)
Also a lot of changes were made in mingw-w64 regarding printf(), so 64-bit builds might not be affected at all.

Revision history for this message
jazzynico (jazzynico) wrote :

> Now from investigating the MinGW header files "-D__USE_MINGW_ANSI_STDIO=1" actually *enables* usage of __mingw_printf(), so this should be what we want after all

Except if the mingw version doesn't handle %lf correctly. I'm not sure we use %lf elsewhere in our code.

> Were your gcc-5.1 builds affected jazzynico?

Yes, the poppler libs compiled with TDM-GCC-5.1 are also affected.

Revision history for this message
David Mathog (mathog) wrote :

After a brief sojourn among the helpful denizens of the mingw mailing list I think I can explain the situation.

The short answer is that mingw does not completely conform to the C99 standard.

The long answer is:

For this code:

  double val=1.0;
  printf("%lf\n",val);

(1) With the default language setting the compiler passes "%lf\n" and a 64 bit double to _printf(). If nothing else is specified the linker uses a _printf() from MSVCRT. That library interprets the "%lf" to mean long double, which it takes to be 64 bit. So "1.0" is emitted. Accidentally it is the correct value, with the two mistakes (it isn't long double, long double isn't 64 bits) cancelling each other out.

(2) For the -std=c99 case the compiler still passes "%lf\n" and a 64 bit double, but in this case the function which receives it is provided by mingw. It also interprets "%lf" to mean long double. However it expects the value to be a gnu long double of 80 bits. Consequently it reads 80 bytes instead of 64, which results in the printing of an incorrect value. It can be so wrong that a huge string results, causing a buffer overflow, and a crash.

(3) When poppler is linked to an object prepared as in (1) something carried in the library causes the mingw variant of _printf() to be used and not the MSVCRT one, and again the wrong value is emitted, as in (2).

There is nothing wrong with our code other than it expects a fully C99 (or somewhat earlier) library, and that isn't the case in mingw. When I get a chance I will pull out the printf "%lf" parts, and that should resolve the issue no matter which printf mingw ends up using.

Other things to be aware of with respect to mingw in terms of compatibility with our other platforms:

1. %ll format is not supported. (Presumably %llu too.)

  http://www.mingw.org/wiki/c99

2. long doubles are not supported with printf. At present there is no way to print a gnu 80 bit long double within the mingw environment. Specify a long double and you may or may not get 80 bits for the calculation, depending on the compiler switches. Best to avoid these as long as we are using mingw as a base platform.

   http://stackoverflow.com/questions/7134547/gcc-printf-and-long-double-leads-to-wrong-output-c-type-conversion-messes-u

Keith Marshall in the mingw group says that the fix for the "%lf" problem, making it follow the ANSI standard with -std=c99 instead of whatever Microsoft does, is in their code tree but it isn't released yet. Unclear it that will also fix the "%ll". No fix for the long double issue is expected anytime soon.

Revision history for this message
jazzynico (jazzynico) wrote :

@David - Wow, thanks for your investigations!
So if I understand correctly the bug is not in the devlibs but in the toolchain, with no available fix for now. I suggest we keep the report open because you added very valuable information, but rename it to reflect the real cause of the bug.

Revision history for this message
David Mathog (mathog) wrote :

Renamed it. If that isn't close enough please suggest another title.

summary: - poppler in devlibs 61 breaks printf %lf
+ mingw is not fully ANSI, so %lf can break it
Revision history for this message
jazzynico (jazzynico) wrote :

No problem. Thanks David!

Changed in inkscape-devlibs:
status: In Progress → Triaged
assignee: jazzynico (jazzynico) → nobody
importance: High → Medium
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.