No escaping of special characters when reading files (maps, save games) leads to crash

Bug #1530124 reported by Hans Joachim Desserud
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
widelands
Fix Released
Low
Unassigned

Bug Description

I was investigating bug 1526916, when I got the idea to test how files with special characters were treated.

There's two ways to trigger this (can possibly argue it should be seperate bugs, but I reckon the fix would be in a central place. Let me know, and I'll split them out).

A:
1. Start a new game.
2. Click save
3. Enter the following name:
<test>
4. Quit the game to main menu.
5. Attempt to enter the load game menu.

B:
1. Go to the editor.
2. Open the menu and click save map.
3. When entering a map (or directory name), use
<test>
4. Attempt to open the save map or go back to the main menu and attempt to start a new game.

In both cases Widelands will crash with an error message saying non-existing file or directory. Interestingly enough, the file "~/.widelands/save/<test>.wgf" does exist. It looks like Widelands has a problem loading it due to the special characters though, and when deleting it, I had to run `rm \<test\>.wgf` in order to escape the special characters. Presumably there might be similar issues with other special characters which require escaping.

(For anyone testing this, you NEED TO MANUALLY go in and remove the file/directory you just created to get Widelands WORKING AGAIN.)

I have to admit I don't know what the proper fix is, especially regarding cross-platform. I believe some filesystems (like NTFS) are much stricter on which characters can be part of file names, so what might be created and work fine on one system could fail on another. (I have no idea how this affects multiplayer where the host distributes a save game or map to other players, but I reckon this might cause problems as well.) An easy fix would be to "add escaping where needed when loading files", but I don't know whether there's a straight-forward way to do this. Preferably we should utilize some existing library we already depend on here, as it is more likely they have already ironed out all the edge cases.

Widelands r7679 on Ubuntu 15.10.

(Regarding importance: I normally rank crashers at least at medium, but this is an odd edge case, and I can't remember seeing this in the wild before)

Tags: crash

Related branches

Revision history for this message
kaputtnik (franku) wrote :
Revision history for this message
GunChleoc (gunchleoc) wrote :

Good catch!

We can't just test for alphanumeric, because we want to allow Arabic and Chinese etc. file names. If our libraries have nothing usable, we'll need to research a list of disallowed characters for all OSes.

Revision history for this message
TiborB (tiborb95) wrote :

I also think that charactes like /\:?<>{}[], simply should not be part of names.....

Revision history for this message
SirVer (sirver) wrote :

#2: That might be a daunting task. Many OSes are notably sloppy around encoding of filenames and there is no clear superset of supported code points. A common workaround is the WTF8 format [1] that steers around known incompatibilities for as long as possible - but eventually you have to decide if and how you want to accept illegal characters.

[1] https://simonsapin.github.io/wtf-8/

Another solution would be to only provide slots to the user instead of allowing freeform names or to name the save game automatically (i.e. timestamp + map name).

Revision history for this message
wl-zocker (wl-zocker) wrote :

I don't like it when games restrict me in the number of savegames I can have. Furthermore, having as many savegames as I need comes in very handy when I am searching a bug where I need to create multiple savegames during one game.

Also maps are affected. Users want to share their maps with others (for multiplayer or on our website), and I think that having slots or cryptic filenames (map_created_on_31_12_15_11_08) are a bad idea.

My suggestion is to check before saving if the given name is legal. If not, ask the user for another one. When loading a map/savegame whose filename contains an illegal character, don't show it/show it but make it unable to load. This case can only happen when the user has manually renamed the file, and then he cannot expect everything to work fine.

Illegal characters can surely be found in a list somewhere (see #1), surely also for non-latin based languages. If in doubt, we should simply include some more characters. As hjd already mentioned, this is an edge case, so I don't think it's worth spending too much effort on this task.

Revision history for this message
Hans Joachim Desserud (hjd) wrote :

>We can't just test for alphanumeric, because we want to allow Arabic and Chinese etc. file names.

Maybe it's me being naive, but I thought that for instance a regex for alphanumeric characters would recognize/allow characters from any alphabet? (Or does that fall under general encoding problems...)

Revision history for this message
GunChleoc (gunchleoc) wrote :

#6 I don't know, I will have to research this. The term "alphanumeric" can mean a-z, A-Z, 0-9 only, which would exclude accented letters, never mind non-Latin scripts.

I already have some checker functions in bidi.h that I need for the font renderer, which relies on code blocks, maybe there is some reuse potiential there. I will need to do more research.

Revision history for this message
Hans Joachim Desserud (hjd) wrote :

GunChleoc: Ok, I don't know enough about this either. I see that the \w option on https://en.wikipedia.org/wiki/Regular_expression#Examples mentions Unicode characters, but it's not clear to me whether it would cover everything or not.

Revision history for this message
GunChleoc (gunchleoc) wrote :

The crash is actually caused by the font handlers, which interpret <some_text> as tags.

After fixing this, I tried a bunch of special characters on Ubuntu, and they all went through, no problem.

Changed in widelands:
status: New → In Progress
assignee: nobody → GunChleoc (gunchleoc)
milestone: none → build19-rc1
GunChleoc (gunchleoc)
Changed in widelands:
status: In Progress → Fix Committed
assignee: GunChleoc (gunchleoc) → nobody
Revision history for this message
kaputtnik (franku) wrote :
Revision history for this message
GunChleoc (gunchleoc) wrote :

Does your copy of regression_test.py contain datadir_for_testing?

I call single tests like this now:

./widelands --scenario=test/maps/lua_testsuite.wmf --datadir_for_testing=.

./widelands --verbose=true --datadir=data --datadir_for_testing=. --disable_fx=true --disable_music=true --language=en_US --scenario=test/maps/ship_transportation.wmf --script=test/maps/ship_transportation.wmf/scripting/test_rip_first_port_with_worker_in_portdock.lua

Revision history for this message
kaputtnik (franku) wrote :

> Does your copy of regression_test.py contain datadir_for_testing?

Do you asking me? Yes.

> I call single tests like this now:

.> /widelands --scenario=test/maps/lua_testsuite.wmf --datadir_for_testing=.

That works. But if i use

$:> python2.7 regression_test.py -b ./widelands --scenario=test/maps/lua_testsuite.wmf --datadir_for_testing=.

The following appears:

usage: regression_test.py [-h] [-r REGEXP] [-n] [-k] [-b BINARY]
regression_test.py: error: unrecognized arguments: --scenario=test/maps/lua_testsuite.wmf --datadir_for_testing=.

Seems to me that regression_test.py is not able to parse commandline options regarding to the binary (./widelands --..)

Revision history for this message
SirVer (sirver) wrote :

Moved the regression_test problem into a separate bug: https://bugs.launchpad.net/widelands/+bug/1541697

GunChleoc (gunchleoc)
Changed in widelands:
status: Fix Committed → Fix Released
Revision history for this message
GunChleoc (gunchleoc) wrote :

Fixed in build19-rc1.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Bug attachments

Remote bug watches

Bug watches keep track of this bug in other bug trackers.