"could not decode image" when path contains non-latin chars
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Hugin | Confirmed | Medium | Unassigned | |
Bug Description
Hugin fails to add files to a project if there are non-Latin characters (e.g. Cyrillic) in the path, and gives a "failed to decode image" message.
Also, when cyrillic letters are used in project file name, they are converted into ISO latin codepage when the file is created. I can supply the screenshots to make the point clear.
Both tested on build 2925 on WinXP

maximtee (maximtee) wrote : | #1 |

Lukas Jirkovsky (l-jirkovsky) wrote : | #2 |
Logged In: YES
user_id=2036447
Originator: NO
Confirming that this is Windows only. This is probably more wxWidgets' fault than Hugin's; it seems that Unicode characters are not handled properly.
To maximtee: Are you sure that your wxMSW (wxWidgets) build supports Unicode?

maximtee (maximtee) wrote : | #3 |
Logged In: YES
user_id=2021593
Originator: YES
I'm using whatever wxwidgets get installed with Hugin build, not sure if it has (or needs to have) specific Unicode support.
I can confirm that both bugs do not exist in "beta 4" released on 2007-02-03 and on "beta 5" released on 2007-05-10 (not sure what SVN builds they are) and can be reproduced on SVN builds at least since SVN2925. All tested on same WinXP system.

maximtee (maximtee) wrote : | #4 |
Logged In: YES
user_id=2021593
Originator: YES
The version is wxMSW-2.8.7, it appears to be the latest stable release and its readme claims that "Unicode is fully supported under Windows NT/2000".

Pablo d'Angelo (pablo.dangelo) wrote : | #5 |
Logged In: YES
user_id=30308
Originator: NO
The problem is that the platform independent core library is written in ordinary C/C++. This means that fopen() calls etc. are used. This is not a problem on linux, where filenames are usually UTF-8 encoded, and thus can be stored safely in 8 bit strings.
But unfortunately, this doesn't work for Windows, where fopen expects 8 bit strings to be in the local codepage, which is not UTF-8. So while the hugin GUI (wxWidgets) can be translated into korean, it can handle only filenames in the local codepage, due to the restriction in the hugin base library. For example, I can use german umlauts (they are in my local codepage), but not cyrillic chars etc.
Fixing this would require a complete platform abstraction for file access in the hugin base library and in all libraries it depends on (mostly vigra, plus maybe some image libraries). This is potentially a lot of work and needs quite some thinking to get right.
Until this happens, hugin supports only filenames that can be encoded in the local codepage.
Pablo
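To illustrate the kind of platform abstraction described above, here is a minimal sketch. It is not Hugin's actual code. It assumes that filenames are kept as UTF-8 in 8-bit strings everywhere and are converted to UTF-16 only at the point where a file is opened on Windows. The helper name `fopen_utf8` is hypothetical; `MultiByteToWideChar` and `_wfopen` are the Win32 and Microsoft CRT calls.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not part of Hugin): open a file whose name is given
// as UTF-8, independent of the local Windows codepage.
std::FILE* fopen_utf8(const std::string& utf8_path, const char* mode)
{
    assert(mode != nullptr);
#ifdef _WIN32
    // Convert a UTF-8 string to UTF-16; returns an empty string on failure.
    auto widen = [](const std::string& s) -> std::wstring {
        if (s.empty()) return std::wstring();
        int n = MultiByteToWideChar(CP_UTF8, 0, s.data(),
                                    static_cast<int>(s.size()), nullptr, 0);
        if (n <= 0) return std::wstring();
        std::wstring w(static_cast<size_t>(n), L'\0');
        MultiByteToWideChar(CP_UTF8, 0, s.data(),
                            static_cast<int>(s.size()), &w[0], n);
        return w;
    };
    std::wstring wpath = widen(utf8_path);
    if (wpath.empty()) return nullptr;
    // The wide CRT call bypasses the ANSI codepage entirely.
    return _wfopen(wpath.c_str(), widen(mode).c_str());
#else
    // On Linux and similar systems the 8-bit name is passed through unchanged.
    return std::fopen(utf8_path.c_str(), mode);
#endif
}
```

Every fopen() call in the base library, and every equivalent stream constructor, would have to go through a wrapper of this kind, which is why the change touches so many places.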

maximtee (maximtee) wrote : | #6 |
Logged In: YES
user_id=2021593
Originator: YES
the bug is still there on SVN2978 on WinXP
In addition to my previous comment in response to Pablo's post, I would like to make a point that both bugs do not exist in "beta 4" released on 2007-02-03 and on "beta 5" released on 2007-05-10.
As I believe this is important enough to be fixed for the next release, I change group to 0.7.0

maximtee (maximtee) wrote : | #7 |
Logged In: YES
user_id=2021593
Originator: YES
I have noticed that there are only 5 references to fopen in 3 files (see below). Am I right in assuming that these would be the places to start fixing this bug?
Find all " fopen", Subfolders, Find Results 1, "Entire Solution"
D:\huginbase\
D:\huginbase\
D:\huginbase\
D:\huginbase\
D:\huginbase\
Matching lines: 5 Matching files: 3 Total files searched: 168

Pablo d'Angelo (pablo.dangelo) wrote : | #8 |
Logged In: YES
user_id=30308
Originator: NO
Hi Maxim,
Also, all calls in the libraries I use for opening image files etc. need to be changed (since this is C++, ofstream and ifstream are also used, not only fopen). I can't make the base library UTF-16 compatible so shortly before a release. As you mentioned that earlier versions of hugin work with Cyrillic characters, I'm trying to recreate the behavior of the older version.
wxWidgets provides several converters from Unicode wxString, each designed for a specific task (separate functions for terminal, GUI and filename strings). Some time after beta 4 and before SVN 2925 I switched all calls that convert filenames from the generic conversion (wxConvCurrent) to the filename conversion (wxConvFileName). I hoped that this change would improve support for characters in the local codepage, but it seems it had the opposite effect.
I have reverted back to wxConvCurrent (for windows only) in SVN 3066. I hope this allows you to use the cyrillic characters which are probably supported by your local encoding.
Pablo

maximtee (maximtee) wrote : | #9 |
Logged In: YES
user_id=2021593
Originator: YES
Pablo,
thanks for a follow up.
I have tested SVN 3066 and unfortunately it does not solve any of the problems: still get "could not decode image" when adding files with cyrillics in path and when project is saved with cyrillics in file name, it is converted to ISO latin codepage.
standing by for further testing,
Maxim.

Pablo d'Angelo (pablo.dangelo) wrote : | #10 |
Logged In: YES
user_id=30308
Originator: NO
Very strange. I assume that all that worked in SVN 2929 or earlier?
On my german windows machine, I can use umlauts without problems.
What codepage do you use under windows, so that I can try that on my machine?
You can try different wxWidgets conversion settings by changing
#ifdef __WXMSW__
#define HUGIN_CONV_FILENAME (*wxConvCurrent)
to
#define HUGIN_CONV_FILENAME (wxConvLocal)
and see if it makes any difference?
ciao
Pablo

maximtee (maximtee) wrote : | #11 |
Logged In: YES
user_id=2021593
Originator: YES
I was able to reproduce the bug on SVN 2925 and all later builds that I tested.
I'm positive the bug was not there on "beta 4" released on 2007-02-03 and on "beta 5" released on 2007-05-10 (not sure what SVN builds they are).
I'm using WinXP, English version, Regional settings as follows:
(Control Panel/Regional Options) Locale set to: Russian;
(Control panel/Advanced) Language for non-unicode programs: Russian
I believe these settings set OEM codepage 866 and ANSI codepage 1251.
I will try wxConvCurrent patch that you suggested and report here.
I would be eager to try earlier builds to get to the point where the bug is introduced. Can someone hint what SVN was "beta 5" released 2007-05-10? Shall I expect that it will build with the current toolchain?
Maxim.
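As a small illustration of why these settings matter, the sketch below encodes one Cyrillic letter in the ANSI codepage (1251), the OEM codepage (866) and UTF-8. The same filename is a different byte sequence in each, so an 8-bit string is only meaningful if both sides agree on the encoding. The function names are hypothetical, and the code only covers the range U+0410..U+044F (А..я); real code would use a conversion library.

```cpp
#include <cassert>
#include <string>

// Windows-1251 places А..я contiguously at 0xC0..0xFF.
unsigned char to_cp1251(char32_t cp)
{
    assert(cp >= 0x0410 && cp <= 0x044F);
    return static_cast<unsigned char>(0xC0 + (cp - 0x0410));
}

// CP866 splits the range: А..п at 0x80..0xAF, р..я at 0xE0..0xEF.
unsigned char to_cp866(char32_t cp)
{
    assert(cp >= 0x0410 && cp <= 0x044F);
    if (cp <= 0x043F)
        return static_cast<unsigned char>(0x80 + (cp - 0x0410));
    return static_cast<unsigned char>(0xE0 + (cp - 0x0440));
}

// Code points in this range always take two bytes in UTF-8.
std::string to_utf8(char32_t cp)
{
    assert(cp >= 0x0410 && cp <= 0x044F);
    std::string out;
    out += static_cast<char>(0xC0 | (cp >> 6));
    out += static_cast<char>(0x80 | (cp & 0x3F));
    return out;
}
```

For example, the letter Я (U+042F) is the single byte 0xDF in codepage 1251, 0x9F in codepage 866, and the two bytes 0xD0 0xAF in UTF-8.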

Pablo d'Angelo (pablo.dangelo) wrote : | #12 |
Logged In: YES
user_id=30308
Originator: NO
A further update:
I have reproduced the bug on a Windows machine. It is not related to the conversion settings; all three variants seem to produce the same result. I have traced the error back to the vigra image import library, which hasn't changed in ages.
The problem seems to be that the C++ io functions do not like the filenames. fopen() works, but std::ifstream (used by most vigra image decoders) doesn't.
beta 4 and beta 5 were built with MSVC 2003 using a custom MSVC solution, not the one created with CMake. So I believe there might be a problem with the MSVC 2008 build. Maybe it can be solved with different compiler flags (special options, different runtime library), but I don't know. It would probably be best to test with a very simple C++ program that uses ifstream to open a file and see if the errors also happen there. Hugin uses the static runtime libraries (which is discouraged by Microsoft, and they might not be supported that well anymore), so maybe switching to the DLL variants might help (this will also require a complete recompilation of all dependencies! ARGHHHHH!). So a simple test program would be good. I don't have time to follow this issue in more detail right now, but maybe you could test with a test program such as:
#include <fstream>
#include <stdio.h>
int main()
{
char * filename = "some_filename_
FILE * f = fopen(filename, "rb"); /* mode assumed: the original line is truncated here */
if (f) {
printf("OK: fopen %s worked\n", filename);
fclose(f);
} else {
perror("fopen failed");
}
std::ifstream fin2(filename, std::ios::binary);
if (!fin2.good()) {
printf("FAILED: std::ifstream in binary mode\n");
} else {
printf("OK: std::ifstream in binary mode\n");
}
}
Try to compile it with different runtime libs (hugin uses /MT and /MTd), and see if it makes a difference.
Pablo

nobody (nobody-users) wrote : | #13 |
Logged In: NO
I did some more testing on a Windows XP machine set to Russian encoding:
(Control panel/Advanced) Language for non-unicode programs: Russian
using the simple program below, and it works nicely with russian filenames when compiled with MSVC 2003 (can open the file with both fopen and std::ifstream), but the MSVC 2008 executable fails to open the files using std::ifstream (fopen still works). Looks like a bug in MSVC 2008.
Unfortunately I can't build the windows version with MSVC 2003 right now, as CMake refuses to work properly on the machine I have installed MSVC 2003. Additionally I'll have to recompile most dependencies with MSVC 2003 again first (which is a major pain in the ass!).
Here is the simple test program to reproduce the bug:
#include <fstream>
#include <stdio.h>
int main(int argc, char * argv[])
{
char * filename = argv[1];
printf("Trying to open file %s\n", filename);
FILE * f = fopen(filename, "rb"); /* mode assumed: the original line is truncated here */
if (f) {
printf("OK: fopen %s worked\n", filename);
fclose(f);
} else {
perror("fopen failed");
}
std::ifstream fin2(filename, std::ios::binary);
if (!fin2.good()) {
printf("FAILED: std::ifstream in binary mode\n");
} else {
printf("OK: std::ifstream in binary mode\n");
}
}

fmannan (fmannan) wrote : | #14 |
Logged In: YES
user_id=1048760
Originator: NO
Attachment contains: 1) Hugin and 2) test code
Built using MSVC 2008 Professional Edition.
File Added: 1908349_test.7z.001

fmannan (fmannan) wrote : | #15 |
- RE: "could not decode image" when path contains non-latin chars (244.1 KiB, application/octet-stream)
The file 1908349_test.7z.001 was added: Built using MSVC 2008 Professional Edition

fmannan (fmannan) wrote : | #16 |
Logged In: YES
user_id=1048760
Originator: NO
File Added: 1908349_test.7z.002

fmannan (fmannan) wrote : | #17 |
- RE: "could not decode image" when path contains non-latin chars (244.1 KiB, application/octet-stream)
The file 1908349_test.7z.002 was added: None

fmannan (fmannan) wrote : | #18 |
Logged In: YES
user_id=1048760
Originator: NO
File Added: 1908349_test.7z.003

fmannan (fmannan) wrote : | #19 |
- RE: "could not decode image" when path contains non-latin chars (244.1 KiB, application/octet-stream)
The file 1908349_test.7z.003 was added: None

fmannan (fmannan) wrote : | #20 |
Logged In: YES
user_id=1048760
Originator: NO
File Added: 1908349_test.7z.004

fmannan (fmannan) wrote : | #21 |
- RE: "could not decode image" when path contains non-latin chars (244.1 KiB, application/octet-stream)
The file 1908349_test.7z.004 was added: None

fmannan (fmannan) wrote : | #22 |
Logged In: YES
user_id=1048760
Originator: NO
File Added: 1908349_test.7z.005

fmannan (fmannan) wrote : | #23 |
- RE: "could not decode image" when path contains non-latin chars (244.1 KiB, application/octet-stream)
The file 1908349_test.7z.005 was added: None

fmannan (fmannan) wrote : | #24 |
Logged In: YES
user_id=1048760
Originator: NO
File Added: 1908349_test.7z.006

fmannan (fmannan) wrote : | #25 |
Logged In: YES
user_id=1048760
Originator: NO
File Added: 1908349_test.7z.007

fmannan (fmannan) wrote : | #26 |
- RE: "could not decode image" when path contains non-latin chars (244.1 KiB, application/octet-stream)
The file 1908349_test.7z.006 was added: None

fmannan (fmannan) wrote : | #27 |
- RE: "could not decode image" when path contains non-latin chars (244.1 KiB, application/octet-stream)
The file 1908349_test.7z.007 was added: None

fmannan (fmannan) wrote : | #28 |
Logged In: YES
user_id=1048760
Originator: NO
File Added: 1908349_test.7z.008

fmannan (fmannan) wrote : | #29 |
- RE: "could not decode image" when path contains non-latin chars (217.5 KiB, application/octet-stream)
The file 1908349_test.7z.008 was added: None

tksharpless (tksharpless-users) wrote : | #30 |
Logged In: YES
user_id=1511901
Originator: NO
I used MSVC 2005 Standard Edition to build the test program, which eventually took the form below. It shows the same failures as one built with 2008 Express -- see comments with code.
I think this is not really a bug, though it definitely is a limitation of Windows due to the need to support codepage-based character sets. The best solution is clearly to pass only unicode file names to fstream c'tors, then the codepage issues disappear.
--Tom
/* RussianBug.cpp
Test program to investigate a bug reported in hugin where
std::ifstream fails to open files with Cyrillic names.
Bug tracker link:
http://
Original by Pablo d'Angelo and MaxTee, who reported the bug
and says..
"
I'm using WinXP, English version, Regional settings as follows:
(Control Panel/Regional Options) Locale set to: Russian;
(Control panel/Advanced) Language for non-unicode programs: Russian
I believe these settings set OEM codepage 866 and ANSI codepage 1251.
"
TKS mods:
Report the effective codepage number.
Try ifstream() with 4 filename flavors: argv[]; argv[] translated
to unicode with current codepage and with Russian one; same
argument read from commandline as unicode.
TKS findings:
The global Windows codepage setting has no effect on this program.
The codepage it reports is the one set in the "advanced" option for
non-unicode programs.
That codepage determines whether the commandline arguments are read
correctly into argv[], and apparently also whether ifstream() can
translate them correctly into unicode (which is the eventual format
passed to the OS). If they are not read correctly, unknown chars get
replaced by '?', and translating to unicode fails.
The ifstream c'tor has some polymorphous ability to accept either ANSI
or unicode filename arguments; however the ANSI ones must be supported
by the effective codepage.
When translation fails, the eventual result is a "file not found" error
from the OS -- nobody notices the string format problem.
If the commandline is read as unicode, then the special codepage
is not needed, and ifstream( unicode_name ) always succeeds.
*/
#include <fstream>
#include <stdio.h>
#include <windows.h>
int main(int argc, char * argv[])
{
unsigned codepage = GetACP();
printf("\nThe current Windows code page is %u\n", codepage);
/* read the command line as Unicode; GetCommandLineW is needed because
   CommandLineToArgvW takes a wide string even in ANSI builds */
int Wargc;
LPWSTR * Wargv = CommandLineToArgvW( GetCommandLineW(), & Wargc );
for(int i = 1; i < argc; i++ ){
printf("\n(ANSI) argv[%d] is '%s'\n", i, argv[i]);
printf("Targv = argv translated to Unicode with current codepage\n" );
printf("Rargv = argv translated to Unicode with Russian codepage\n" );
printf("Wargv = argv read as Unicode\n");
printf("\n");
wchar_t Targv[200];
int k = MultiByteToWide
CP_ACP, // (use current codepage),
0, // DWORD dwFlags,
argv[i], // LPCSTR lpMultiByteStr,
-1, // (is null terminated)
Targv, // LPWSTR lpWideCharStr,
200 // int cchWideChar
);
wchar_t Rargv[200];
k = ...
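
The finding that a Unicode filename always opens can be sketched with standard C++17, which postdates this discussion. This is a hedged illustration, not the code from the attachment: `std::filesystem::path` stores the name in the platform's native encoding (wide characters on Windows), and `std::ifstream` accepts a path object directly, so no codepage conversion takes place. The helper name `can_open_binary` is hypothetical.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>

// Hypothetical helper: returns true if the file can be opened for binary
// reading. The path is handed to the stream in its native encoding.
bool can_open_binary(const std::filesystem::path& file)
{
    assert(!file.empty());
    std::ifstream in(file, std::ios::binary);
    return in.good();
}
```

On Windows the path would be built from a wide string such as an element of Wargv; on Linux it can be built from the ordinary 8-bit argv string.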

maximtee (maximtee) wrote : | #31 |
Logged In: YES
user_id=2021593
Originator: YES
Tom,
Your suggestion to pass only unicode file names to fstream c'tors seems like the way to move forward.
Problem is, I'm not fluent enough with the codebase to undertake the patch on my own.
Any suggestion on how to tackle this would be appreciated,
Max.

tksharpless (tksharpless-users) wrote : | #32 |
Logged In: YES
user_id=1511901
Originator: NO
Hi Max
Although the best (and I think finally inevitable) solution is to make hugin use only Unicode strings, that will be hard, as Pablo has pointed out. There is still a lot of code that assumes ANSI strings. Newer code uses UTF-8 Unicode, which can "pass" for ANSI so long as one only uses chars in the current codepage; but in cases such as this one the masquerade breaks down.
Finding and fixing all the ANSI code would be a long, mainly thankless task needing very good debugging tools and plenty of motivation. It may be, as Pablo suggests, that on Linux we will eventually be able to build old ANSI code for UTF-8, without actually changing it; but I don't think that will ever be possible on Windows.
So actually the only practical solution for the present bug is to work around it by using ANSI filenames. However, that need not mean changing your Cyrillic names, because every NTFS file does have an ANSI name: the hidden DOS (8.3) name, which you can see with the command "dir /x". That is supposed to be a valid alternative name for the file in all situations, so it might work here.
Regards, Tom
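A minimal sketch of the short-name workaround, assuming the caller already has the path as a wide string. `GetShortPathNameW` is the Win32 call that returns the 8.3 form; the helper name `short_path_or_same` is hypothetical. Note that 8.3 name creation can be disabled on a volume, in which case the call may simply return the long name, so this is a workaround rather than a guarantee.

```cpp
#include <cassert>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: return the 8.3 short form of an existing path on
// Windows, or the input unchanged if no short form can be obtained.
std::wstring short_path_or_same(const std::wstring& long_path)
{
    assert(!long_path.empty());
#ifdef _WIN32
    // First call: ask for the required buffer size (includes the terminator).
    DWORD needed = GetShortPathNameW(long_path.c_str(), nullptr, 0);
    if (needed == 0) return long_path;   // e.g. the file does not exist
    std::wstring buf(needed, L'\0');
    // Second call: on success the return value excludes the terminator.
    DWORD written = GetShortPathNameW(long_path.c_str(), &buf[0], needed);
    if (written == 0 || written >= needed) return long_path;
    buf.resize(written);
    return buf;
#else
    // Other platforms have no short-name concept; pass the path through.
    return long_path;
#endif
}
```

The short name contains only characters that survive the ANSI codepage, so it could be handed to code that still uses 8-bit fopen() or std::ifstream.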
Changed in hugin:
status: New → Triaged
tags: added: codepage hugin international

Vladimír Jícha (jech) wrote : | #34 |
This bug has been in Hugin for a very long time, at least since I started using it 4 years ago. I just tried version 2010.4 RC1 and it fails right at the beginning when using CPFind. The root problem is that some of the tools that Hugin uses do not support Unicode filenames on Windows.
There is a quick and dirty fix which I already suggested: use short paths. Also, the user should at least be warned about the problem when trying to open or save such files.
I think that using .bat files on Windows is not a good idea in the first place. Windows shell is really bad and should be avoided if possible.
Thank you for testing 2010.4 and confirming that this is still an issue. Unfortunately it won't be fixed for this release. Please keep using the workaround (avoid non-latin chars) until a fix is contributed and applied.
Changed in hugin:
status: Triaged → Confirmed
tags: added: windows

Guy Rutenberg (guyrutenberg) wrote : | #36 |
This bug also affects Linux (at least Debian Jessie, with a UTF-8 locale) when the path contains non-ASCII characters. A possible workaround is to save the project and then stitch using the Batch Processor which, surprisingly, isn't affected by it.

Guy Rutenberg (guyrutenberg) wrote : | #37 |
I did some more debugging on this issue. The failure stems from the call to wxExecute(...) in PanoPanel:
The solution would either be to replace the call to wxExecute, or wait for wxWidgets to fix it.

Guy Rutenberg (guyrutenberg) wrote : | #38 |
- 0001-Fix-handling-on-non-ascii-paths-when-stitching.patch (2.6 KiB, text/plain)
I've attached a patch that seems to fix the bug (at least for me). It works by avoiding the buggy argument splitting of wxWidgets, and instead relying on passing an array of arguments to wxExecute after correctly handling wide-char to multibyte char conversion if needed.
Logged In: YES
user_id=2021593
Originator: YES
Tested on SVN2948 on WinXP -- the bug is still there.