Rhythmbox shouldn't use illegal / special characters in filenames (e.g. "?", question mark) when ripping to NTFS volumes

Bug #318625 reported by John Ward
This bug affects 3 people
Affects: rhythmbox (Ubuntu)
Status: Incomplete
Importance: Low
Assigned to: Ubuntu Desktop Bugs

Bug Description

Whenever a song name has a special character like a "question mark" (?) in it (for example, "Where is my Mind?" or "Are You In?"), Rhythmbox rips the song to the drive and names the file with the special character, for example:

"Where is my Mind?.mp3" or "Are You In?.mp3"

On NTFS volumes (and probably FAT32 and older FAT variants) these characters (such as "?", question mark) are illegal and make the file inaccessible and useless within Windows, because Windows doesn't recognise or read files whose names contain illegal characters. Deleting, renaming or reading the file becomes impossible on Windows systems.

TEST CASE:

[1] - Rip a song from a CD using Rhythmbox that has a special character in its name like a "?" question mark (Where is my Mind?) and make sure it rips to an NTFS volume.

[2] - Exit Linux, go into a Windows system and try and play, delete or rename the file that has the special character in it. You'll find that you get an error message about the usage of illegal characters anytime you try and delete or rename the file and the file will not be playable by programmes within Windows that are designed to read it, such as Windows Media Player, Winamp or VLC etc.

[3] - Conclusion: When illegal characters are set on files by Rhythmbox (or Linux in general) on NTFS volumes, those files become completely inaccessible within Windows. You have to return to Linux to delete, rename or read / play the files.

Maybe this is an issue for the NTFS-3G driver: it should prevent the incorrect naming of files that go onto NTFS volumes.

There should probably be an option within Rhythmbox to strip special characters from the filename (not necessarily the tag, as there they are compatible) or replace them with something compatible.
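The stripping or replacement proposed above could look roughly like the following sketch. It is not Rhythmbox code (Rhythmbox is written in C); the function name `sanitize_filename` and the default replacement `_` are hypothetical, and the character set is the one listed later in this report as illegal under Windows XP.

```python
# Characters listed later in this report as illegal in Windows file names.
WINDOWS_ILLEGAL_CHARS = '\\/:*?"<>|'


def sanitize_filename(name, replacement="_"):
    """Return `name` with each Windows-illegal character replaced.

    `name` is a single file name (one path component), not a full path,
    because "/" and "\\" are themselves replaced.
    Pass replacement="" to strip the characters instead.
    """
    if any(ch in WINDOWS_ILLEGAL_CHARS for ch in replacement):
        raise ValueError("replacement must not itself be an illegal character")
    return "".join(
        replacement if ch in WINDOWS_ILLEGAL_CHARS else ch for ch in name
    )


print(sanitize_filename("Where is my Mind?.mp3"))            # Where is my Mind_.mp3
print(sanitize_filename("Are You In?.mp3", replacement=""))  # Are You In.mp3
```

Only the filename is changed here; the tag could keep the original title, as suggested above.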

The above proposal is similar to bug #264283.

John Ward (automail)
description: updated
Chris Coulson (chrisccoulson) wrote :

Can you set these illegal filenames in Nautilus?

Changed in rhythmbox:
status: New → Incomplete
Chris Coulson (chrisccoulson) wrote :

Actually, you don't need to answer that. I'm going to assign to ntfs-3g now as this almost certainly isn't a problem in Rhythmbox itself.

Changed in rhythmbox:
status: Incomplete → New
John Ward (automail) wrote :

Although I don't need to answer, it's true: I have no problem creating files with illegal characters in their names using Nautilus, so this is a broader problem with the NTFS driver or the kernel's control over NTFS volumes.

Szabolcs Szakacsits (szaka) wrote :

This is definitely not an NTFS-3G problem. The '?' is a perfectly legal NTFS character in the POSIX namespace NTFS-3G uses: http://ntfs-3g.org/support.html#posixfilenames2

affects: ntfs-3g (Ubuntu) → rhythmbox (Ubuntu)
John Ward (automail) wrote :

The fact is that "?" shouldn't be a legal character, as it can't be used within Explorer or Command Prompt in Windows (XP anyway).

The following characters are illegal in Windows (under XP at least), see the screenshot too:

\ / : * ? " < > |
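As a rough illustration, the list above can be used from Linux to locate files that Windows will refuse to handle. This is only a sketch: the helper names `illegal_chars_in` and `find_unsafe_names` and the mount point in the usage example are hypothetical, and the character set is just the nine characters listed here (Windows has further naming rules that are not checked).

```python
import os

# The nine characters listed above as illegal in Windows file names.
ILLEGAL = set('\\/:*?"<>|')


def illegal_chars_in(name):
    """Return the illegal characters found in `name`, in order of first appearance."""
    found = []
    for ch in name:
        if ch in ILLEGAL and ch not in found:
            found.append(ch)
    return found


def find_unsafe_names(root):
    """Walk `root` and return the sorted paths whose last component has an illegal character."""
    unsafe = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if illegal_chars_in(name):
                unsafe.append(os.path.join(dirpath, name))
    return sorted(unsafe)


if __name__ == "__main__":
    # "/media/windows" is a placeholder for wherever the NTFS volume is mounted.
    for path in find_unsafe_names("/media/windows"):
        print(path)
```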

John Ward (automail) wrote :
Pedro Villavicencio (pedro) wrote :

Thanks for your report. This is something to send directly upstream at http://bugzilla.gnome.org; for forwarding instructions, please read http://wiki.ubuntu.com/Bugs/Upstream/GNOME. Thanks in advance.

Changed in rhythmbox (Ubuntu):
assignee: nobody → Ubuntu Desktop Bugs (desktop-bugs)
importance: Undecided → Low
Nigel Babu (nigelbabu) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. We are sorry that we do not always have the capacity to look at all reported bugs in a timely manner. There have been many changes in Ubuntu since the time you reported the bug, and your problem may have been fixed by some of the updates. If you could test the current Ubuntu development version, this would help us a lot. If you can test it and it is still an issue, we would appreciate it if you could upload updated logs by running apport-collect 318625, along with any other logs that are relevant to this particular issue.

Changed in rhythmbox (Ubuntu):
status: New → Incomplete
John Ward (automail) wrote :

No problem, I'll try and replicate on Ubuntu 9.10 64-bit soon.

John Ward (automail) wrote :

Sorry about the delay on this. I've been having a few other issues with Karmic and have mostly been using Windows 7 for the time being, but I'm going for a fresh install of Karmic 9.10 64-bit tonight, and I'll add a large NTFS partition to the drive as well to test out this problem.

John Ward (automail) wrote :

I can still confirm this bug.

I have installed Karmic 9.10 64-bit, updated fully (Rhythmbox is 0.12.5), and it still creates "?" in a song's filename. Another test was simply to create files on the NTFS partition with "?" (question mark) and ":" (colon) characters in the name, using Nautilus and nano in the terminal. I can do this without any problem, even though a Windows system would not be able to manipulate them in any way.

John Ward (automail) wrote :

Obviously this is only a big deal if you're dual booting with a Windows system, but at the same time you wouldn't necessarily create an NTFS filesystem on a Linux-only box (you would more than likely use an ext3 or ext4 /home, or just a large ext3 or ext4 partition). So the fact is that this problem will occur mainly for people who are also sharing their NTFS partition with a Windows system.

Chris Coulson (chrisccoulson) wrote :

I'm not convinced that Rhythmbox should be special-casing certain filesystems here. It seems like the wrong place to do it. Shall we also add special cases for FAT/NTFS filesystems in Totem, gedit, nautilus, f-spot, GIMP, banshee, pitivi, openoffice etc?

John Ward (automail) wrote :

I'm not confident enough to propose elaborate solutions to these problems, especially at the technical end of things. I do agree that Rhythmbox isn't the problem here, although one of the simplest solutions would be to make it simple to enable "special character stripping". The problem is in the NTFS-3G driver, I think. It shouldn't impose full POSIX compatibility on a filesystem that is associated with an OS that isn't POSIX compliant. The developer suggests it's needed for full flexibility and interoperability:

From: http://www.tuxera.com/community/ntfs-3g-faq/#posixfilenames2
"NTFS supports several filename namespaces at the same time: DOS, Win32 and POSIX. While the NTFS-3G driver handles all of them, it always creates new files in the POSIX namespace for maximum portability and interoperability reasons. This means that filenames are case sensitive and all characters are allowed except '/' and '\0'. This is perfectly legal on Windows, though some application may get confused. If you find so then please report it to the developer of the relevant Windows software."

His first suggestion that it's for "maximum portability and interoperability" reasons is instantly quashed by the fact that the files are unusable and completely incompatible within Windows.

You'll notice he says towards the end "This is perfectly legal on Windows, though some applications may get confused... please report it to the developer...". Windows Explorer is the first thing that gets confused, and the command prompt can't manipulate the file either, so it's not just some Windows application but the Windows system itself. I have not tested compatibility right through Vista and 7, but the problem is definitely apparent in XP and I assume in those two as well. The NTFS-3G driver has got to prevent the use of special characters in the same way that Nautilus prevents forward slashes being used on ext4 partitions.

The way I see it, preventing the characters is a much safer option than allowing them.

Rygle (rygle) wrote :

Isn't the whole idea of NTFS-3G to allow proper and full compatibility with the *Windows* implementation of NTFS? Isn't it to allow true interoperability? If so, the fact that Windows considers certain characters illegal should mean that NTFS drivers on other platforms treat them as illegal too, purely on the premise that the goal is full interoperability.

Yes, the fact that Windows doesn't like certain characters in its implementation of NTFS is likely an aberration from the POSIX namespace guidelines, but then it is the *reference implementation*.

Additionally, consider that there are many exceptions to rules within the Win32 API. The WINE project is constantly working around exceptions that are undocumented and/or that go against existing documentation on aspects of the API.

Making workarounds in every little app is redundant and counter productive when it could and, in my view, should be solved by making this simple change in NTFS-3G.

John Ward (automail) wrote :

I'm tempted to email this bug report to the NTFS-3G developer and possibly see some action on it. It's obviously not a major bug, but it's been like this for about 2 years now (probably longer), and when it comes to dealing with it inside a Windows OS it's a serious problem. The files are essentially dead. No deletion, renaming, copying or moving is possible.

Rygle (rygle) wrote :

A follow-up on my last comment.

Please note that Wine has a "bug-for-bug" policy when it comes to working with the Windows API.
See http://wiki.winehq.org/WineFeatures

In other words, if the actual implementation disagrees with the documentation, then the implementation wins. There is a lot of sense to this policy, as the end goal is full interoperability, or "Binary Compatibility" as mentioned on the page I have referenced.

This should be the same with the NTFS-3G driver. If the NTFS documentation says full POSIX namespace support, but the NTFS implementation differs, the implementation should win. In our case, there are characters that are legal according to the documentation (even if it is not Microsoft's documentation), but no known official implementation of NTFS allows full POSIX namespace support.

For the sake of full interoperability, this bug should not be fixed in individual apps, but in the NTFS-3G code, just like similar issues are dealt with on a "bug-for-bug" basis in the Wine code.

Rygle (rygle) wrote :

Some good news on this bug. I put up a comment on the NTFS-3G forums and got a very positive reply from Jean-Pierre, one of the lead programmers for NTFS-3G:

I asked...
---------
Could you please include a switch in NTFS-3G to allow a workaround for this Windows limitation? (i.e. to turn off use of special characters in order to allow compatibility with Windows for the many users who wouldn't have a clue how to re-mount/export their NTFS filesystems using Samba)
---------

And Jean-Pierre replied...
---------
This is available in the release candidate advanced ntfs-3g-2010.5.16AR.1, see http://pagesperso-orange.fr/b.andre/advanced-ntfs-3g.html
The option to use is "windows_names", documented in the manual of the said version.
---------
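For anyone wanting to try the option Jean-Pierre mentions, a minimal sketch of the mount invocation follows. It only builds and prints the command rather than running it. The helper name `ntfs3g_mount_command`, the device `/dev/sdXN` and the mount point `/mnt/windows` are placeholders, and it assumes an installed ntfs-3g version that supports "windows_names".

```python
import shlex


def ntfs3g_mount_command(device, mountpoint, windows_names=True):
    """Build (but do not run) a mount command for an NTFS volume via ntfs-3g."""
    options = ["rw"]
    if windows_names:
        # Ask ntfs-3g to refuse names that Windows does not allow.
        options.append("windows_names")
    return ["mount", "-t", "ntfs-3g", "-o", ",".join(options), device, mountpoint]


# Placeholder device and mount point; substitute your own.
cmd = ntfs3g_mount_command("/dev/sdXN", "/mnt/windows")
print(shlex.join(cmd))  # mount -t ntfs-3g -o rw,windows_names /dev/sdXN /mnt/windows
```

Running the printed command needs root privileges. With the option active, creating a file such as "Are You In?.mp3" on the volume should be rejected by the driver rather than silently succeeding; check the manual of the installed version for the exact behaviour.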

Hopefully this will help a lot in this area, but it would need to be integrated into Ubuntu by someone cleverer than me...

This can be found on the Tuxera site - http://tuxera.com/forum/viewtopic.php?f=2&t=763&p=14549&sid=053f5d02fb19e22aeda1019f6225c76d#p14549

Rygle (rygle) wrote :

BTW, this issue is possibly a duplicate of another Ubuntu bug. I have also commented about this development there.
https://bugs.launchpad.net/ubuntu/+bug/230906

And also in the Ubuntu forums here:
http://ubuntuforums.org/showthread.php?p=9424314

John Ward (automail) wrote :

I'll make it a duplicate as the other bug is more precise to the point of the overall NTFS-3G problem, whereas this is too specifically about Rhythmbox and the problem clearly lies outside of Rhythmbox.

Brian Eaton (brian-speakingincode) wrote :

As a hint to others who stumble across this bug: the same problem comes up on CIFS shares. The workaround there is to mount the share with the "mapchars,iocharset=utf8" options.
