Can't create UDF with non-ASCII names (valid UTF-8)

Bug #557160 reported by Milan Bouchet-Valat on 2010-04-07
This bug affects 6 people
Affects                      Importance  Assigned to
Ubuntu One Client (status tracked in Trunk)
  Stable-1-2                 High        Guillermo Gonzalez
  Trunk                      High        Guillermo Gonzalez
ubuntuone-client (Ubuntu)    High        dobey
  Lucid                      Undecided   Unassigned

Bug Description

Binary package hint: ubuntuone-client

It seems that Ubuntu One fails to share directories whose names contain non-ASCII Unicode characters, in my case accented French characters ('é'). Choosing "Synchronize with Ubuntu One" from Nautilus doesn't work, but it works fine once I rename the folder to remove the accent.

This may be related to bug 515920, which is about u1sync failing with non-ASCII chars.

The log from ~/.cache/ubuntuone/log/syncdaemon.log doesn't show any activity, even though the directory to sync seems to be acknowledged:

2010-04-07 11:31:23,703 - ubuntuone.SyncDaemon.DBus - DEBUG - Folders.create: dbus.String(u'/home/milan/Cachan/M\xe9moire')
2010-04-07 11:31:23,704 - ubuntuone.SyncDaemon.VM - DEBUG - create udf: '/home/milan/Cachan/M\xc3\xa9moire'
2010-04-07 11:31:23,705 - ubuntuone.SyncDaemon.ActionQueue - DEBUG - CreateUDF share:--- node:marker:/home/milan/Cachan/Mémoire CreateUDF() queueing in the %s META_QUEUE
2010-04-07 11:32:55,805 - ubuntuone.SyncDaemon.Main - NOTE - ---- MARK (state: <State: 'READY' (queues WORKING_ON_BOTH connection 'Not User With Network')>; queues: metadata: 2; content: 4; hash: 0, fsm-cache: hit=634 miss=14799) ----
2010-04-07 11:34:55,805 - ubuntuone.SyncDaemon.Main - NOTE - ---- MARK (state: <State: 'READY' (queues WORKING_ON_BOTH connection 'Not User With Network')>; queues: metadata: 2; content: 4; hash: 0, fsm-cache: hit=634 miss=14799) ----
2010-04-07 11:36:55,805 - ubuntuone.SyncDaemon.Main - NOTE - ---- MARK (state: <State: 'READY' (queues WORKING_ON_BOTH connection 'Not User With Network')>; queues: metadata: 2; content: 4; hash: 0, fsm-cache: hit=634 miss=14799) ----
2010-04-07 11:38:55,805 - ubuntuone.SyncDaemon.Main - NOTE - ---- MARK (state: <State: 'READY' (queues WORKING_ON_BOTH connection 'Not User With Network')>; queues: metadata: 2; content: 4; hash: 0, fsm-cache: hit=634 miss=14799) ---
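
The first two log lines above show the same path in two forms: the DBus layer logs a Unicode string (M\xe9moire), while the VM layer logs its UTF-8 byte encoding (M\xc3\xa9moire). A minimal sketch of that relationship, written for Python 3 rather than the Python 2.6 the client runs on; the variable names are illustrative only:

```python
# The path as the DBus layer logs it: text in which
# U+00E9 (é) is a single code point.
path_text = '/home/milan/Cachan/M\xe9moire'

# The path as the VM layer logs it: the UTF-8 encoding,
# where é becomes the two bytes 0xC3 0xA9.
path_bytes = path_text.encode('utf-8')

print(repr(path_bytes))

# Decoding with the same codec round-trips to the original text.
print(path_bytes.decode('utf-8') == path_text)
```

Both forms are valid; trouble only starts when code mixes them without an explicit encode or decode step.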

---------------------------------------------------------------------

TEST CASE:
To test from the terminal, run:

u1sdtool -c
mkdir ~/ÚDF\ Tëst\ éôñßÿç
touch $HOME/ÚDF\ Tëst\ éôñßÿç/test
u1sdtool --create-folder=$HOME/ÚDF\ Tëst\ éôñßÿç

From the GUI:

1. Open Places->Home Folder
2. Click File->Create Folder
3. Name folder: ÚDF Tëst éôñßÿç
4. Double-click on ÚDF Tëst éôñßÿç
5. Right-click and select Create Document->Empty File
6. Name the file "test"
7. Right click on "ÚDF Tëst éôñßÿç" folder and select "Synchronize on Ubuntu One"

Result: https://one.ubuntu.files should show the ÚDF Tëst éôñßÿç folder

---------------------------------------------------------------------

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: ubuntuone-client 1.1.91-0ubuntu1
ProcVersionSignature: Ubuntu 2.6.32-19.28-generic 2.6.32.10+drm33.1
Uname: Linux 2.6.32-19-generic i686
Architecture: i386
Date: Wed Apr 7 11:39:35 2010
EcryptfsInUse: Yes
PackageArchitecture: all
ProcEnviron:
 LANGUAGE=fr_FR:fr:en_GB:en
 PATH=(custom, user)
 LANG=fr_FR.UTF-8
 SHELL=/bin/bash
SourcePackage: ubuntuone-client
UbuntuOneClientConfig:
 [ubuntuone]
 connected = False
 connect = 1
 show_applet = 0
 bookmarked = True
UbuntuOneSyncdaemonConfig:
 [bandwidth_throttling]
 read_limit = 2097152
 write_limit = 2097152
 on = False
UbuntuOneSyncdaemonExceptionsLog:

Milan Bouchet-Valat (nalimilan) wrote :
Changed in ubuntuone-client (Ubuntu):
importance: Undecided → Medium
description: updated
Mitch Towner (kermiac) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. This particular bug has already been reported and is a duplicate of bug 368626, so it is being marked as such. Please look at the other bug report to see if there is any missing information that you can provide, or to see if there is a workaround for the bug. Additionally, any further discussion regarding the bug should occur in the other report. Please continue to report any other bugs you may find.

The issue was non UTF-8 compliant filenames in one or more of the files/folders in your ubuntuone folder. This is a known issue. The workaround is to keep the file names & folder names UTF-8 compliant until this issue is resolved.

You can find invalid filenames using the following script: http://people.canonical.com/~roman.yepishev/ubuntuone-scripts/utf8-filename-check.py
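
The linked script is not reproduced here. As a rough idea of what such a check involves, the following Python 3 sketch walks a directory tree using byte paths and reports names that fail to decode as UTF-8. The function name find_invalid_utf8_names is made up for illustration, and this is not the script's actual code:

```python
import os


def find_invalid_utf8_names(root):
    """Return the byte paths under root whose names are not valid UTF-8."""
    invalid = []
    # Passing a bytes path makes os.walk yield raw, undecoded names.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root)):
        for name in dirnames + filenames:
            try:
                name.decode('utf-8')
            except UnicodeDecodeError:
                invalid.append(os.path.join(dirpath, name))
    return invalid


if __name__ == '__main__':
    # Assumption: the synced folder lives at ~/Ubuntu One.
    for path in find_invalid_utf8_names(os.path.expanduser('~/Ubuntu One')):
        print(repr(path))
```

A name such as Mémoire stored as UTF-8 passes this check, which is why a clean result does not rule out the problem described in this report.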

Changed in ubuntuone-client (Ubuntu):
status: New → Confirmed
Milan Bouchet-Valat (nalimilan) wrote :

Sorry, but this is not a duplicate. As I explained in the description, renaming the top directory is enough to fix the problem, and it's in valid UTF-8. I've been using UTF-8 for ages, and all the files in this folder are correctly named:
$ python utf8-filename-check.py
You don't have any filenames with broken names

So this bug is really more of a problem, because it means all non-English scripts fail to synchronize. Sounds strange to me that I'm the only one seeing this.

Milan Bouchet-Valat (nalimilan) wrote :

Though, it's true that something is going wrong with encoding here:
2010-04-06 10:51:50,172 - ubuntuone-preferences - ERROR - [Failure instance: Traceback (failure with no frames): <class 'dbus.exceptions.DBusException'>: org.freedesktop.DBus.Python.KeyError: Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.6/dbus/service.py", line 702, in _message_cb
    retval = candidate_method(self, *args, **keywords)
  File "/usr/lib/python2.6/dist-packages/ubuntuone/syncdaemon/dbus_interface.py", line 1294, in get_info
    mdobj = self.fs.get_by_path(path.encode('utf-8'))
  File "/usr/lib/python2.6/dist-packages/ubuntuone/syncdaemon/filesystem_manager.py", line 549, in get_by_path
    mdid = self._idx_path[path]
KeyError: '/home/milan/.ubuntuone/Purchased from Ubuntu One'
]

But I can't tell why ~/.ubuntuone/Purchased from Ubuntu One is the problem here.

summary: - Can't sync folders with non-ASCII names
+ Can't sync folders with non-ASCII names (valid UTF-8)
Roman Yepishev (rye) on 2010-04-27
summary: - Can't sync folders with non-ASCII names (valid UTF-8)
+ Can't create UDF with non-ASCII names (valid UTF-8)
Rick McBride (rmcbride) wrote :

Chinese and Japanese characters work fine. The issue must be caused by a subset of valid UTF-8 characters.

Roman Yepishev (rye) on 2010-04-27
Changed in ubuntuone-client (Ubuntu):
assignee: nobody → Roman Yepishev (rye)
Roman Yepishev (rye) wrote :

First item to fix is the udf name decoding in AQ:

CreateUDF.__init__:
# XXX Unicode boundary?
self.name = name.decode('utf8')

However after this is applied the UDF is created on the server properly but notification about new UDF fails with:

2010-04-29 11:21:48,741 - ubuntuone.SyncDaemon.EQ - ERROR - Error encountered while handling: VM_UDF_CREATED in <ubuntuone.syncdaemon.dbus_interface.EventListener object at 0xabcb9ac>
Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/ubuntuone/syncdaemon/event_queue.py", line 794, in _dispatch
    method(*args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/ubuntuone/syncdaemon/dbus_interface.py", line 640, in handle_VM_UDF_CREATED
    self.dbus_iface.folders.emit_folder_created(udf)
  File "/usr/lib/python2.6/dist-packages/ubuntuone/syncdaemon/dbus_interface.py", line 1322, in emit_folder_created
    udf_dict = self._get_udf_dict(folder)
  File "/usr/lib/python2.6/dist-packages/ubuntuone/syncdaemon/dbus_interface.py", line 1250, in _get_udf_dict
    udf_dict[unicode(k)] = unicode(v)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 2: ordinal not in range(128)
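
In Python 2, calling unicode(v) on a byte string decodes it with the default ASCII codec, so any UTF-8 encoded non-ASCII byte raises the error above. A sketch of that failure and of an explicit UTF-8 decode, written for Python 3; to_text and build_udf_dict are hypothetical stand-ins rather than the syncdaemon's code, and the Cyrillic path is invented:

```python
def to_text(value):
    """Convert a value to text, decoding byte strings explicitly as UTF-8."""
    if isinstance(value, bytes):
        return value.decode('utf-8')
    return str(value)


def build_udf_dict(udf_attrs):
    """Hypothetical stand-in for _get_udf_dict: text keys and text values."""
    return {to_text(k): to_text(v) for k, v in udf_attrs.items()}


# Invented example path: a Cyrillic name encoded as UTF-8.
# Its third byte (position 2) is 0xd0, which is outside the ASCII range.
raw_path = '~/\u0434\u043e\u043c'.encode('utf-8')

try:
    # Roughly what the implicit Python 2 conversion amounts to.
    raw_path.decode('ascii')
except UnicodeDecodeError as exc:
    print(exc)

# Decoding with the codec the bytes were actually written in succeeds.
print(build_udf_dict({'path': raw_path, 'subscribed': True}))
```

The point of the sketch is that the decode step names its codec instead of relying on the interpreter's default.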

Flemming Christensen (laoshi) wrote :

Like Milan I get
$ python utf8-filename-check.py
You don't have any filenames with broken names

But waiting-metadata gives:
Oops, an error ocurred:
Traceback (most recent call last):
Failure: dbus.exceptions.DBusException: org.freedesktop.DBus.Python.UnicodeEncodeError: Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.6/dbus/service.py", line 702, in _message_cb
    retval = candidate_method(self, *args, **keywords)
  File "/usr/lib/python2.6/dist-packages/ubuntuone/syncdaemon/dbus_interface.py", line 204, in waiting_metadata
    waiting_metadata.append(str(cmd))
  File "/usr/lib/python2.6/dist-packages/ubuntuone/syncdaemon/action_queue.py", line 1442, in __str__
    for attr in str_attrs]
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 17: ordinal not in range(128)

And this seems to block synchronization: the offending file, whatever it may be, is not simply ignored.
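
The traceback above is the mirror image of a decode error: in Python 2, str() applied to a Unicode string that contains non-ASCII characters encodes it with the ASCII codec and fails. A sketch for Python 3 showing the failing conversion and one possible way to produce a log-safe rendering; ascii_safe and the file name are hypothetical and not taken from action_queue.py:

```python
def ascii_safe(text):
    """Return an ASCII-only rendering of text, escaping everything else."""
    return text.encode('ascii', 'backslashreplace').decode('ascii')


# Invented file name containing ä, ö and å.
name = 'Documents/r\xe4ksm\xf6rg\xe5s'

try:
    # Roughly what the implicit Python 2 str() conversion amounts to.
    name.encode('ascii')
except UnicodeEncodeError as exc:
    print(exc)

# Escaping instead of failing keeps the command printable.
print(ascii_safe(name))
```

Escaping is only one option; encoding the string as UTF-8 before formatting would keep the characters readable in a UTF-8 terminal.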

Roman Yepishev (rye) on 2010-05-09
Changed in ubuntuone-client (Ubuntu):
assignee: Roman Yepishev (rye) → Guillermo Gonzalez (verterok)
Roman Yepishev (rye) wrote :

Flemming Christensen (laoshi), you are experiencing bug 561638.

Changed in ubuntuone-client (Ubuntu):
importance: Medium → High
Changed in ubuntuone-client:
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Guillermo Gonzalez (verterok)
tags: added: chicharra chicharra-maverick
Flemming Christensen (laoshi) wrote :

Accents are not the only trigger; other characters provoke the same fault. I removed the accents from my filenames using pyrenamer, and then met "can't encode character u'\xe6'" (which appears to be the Danish character æ). I altered those filenames, and then met "can't encode character u'\xf8'" (the character ø). I have altered those filenames too, and expect to meet "can't encode character u'\xe5'" (the Danish character å) next.
Will follow up on this.
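
The escapes quoted in these error messages are Unicode code points in hexadecimal. A small Python 3 sketch that looks up a few of them; the describe helper is made up for illustration:

```python
import unicodedata


def describe(ch):
    """Return (code point, Unicode name, is-ASCII flag) for one character."""
    return ('U+%04X' % ord(ch), unicodedata.name(ch), ord(ch) < 128)


# \xe5 is the small å; \xc5 is the capital Å.
for ch in '\xe4\xe6\xf8\xe5\xc5':
    print(ch, describe(ch))
```

Every one of these characters lies above code point 127, so each fails the same ASCII conversion regardless of whether it carries an accent.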

tags: added: u1-lucid-sru
Changed in ubuntuone-client (Ubuntu):
milestone: none → lucid-updates
Changed in ubuntuone-client:
status: Confirmed → In Progress
Changed in ubuntuone-client (Ubuntu):
status: Confirmed → New
Changed in ubuntuone-client (Ubuntu):
status: New → Triaged
description: updated
Changed in ubuntuone-client (Ubuntu):
assignee: Guillermo Gonzalez (verterok) → Rodney Dawes (dobey)
Flemming Christensen (laoshi) wrote :

I have added the test case but nothing new happens.
The error shows up because of the directories I am trying to synchronize, and it blocks sync'ing.
When I get the exception (now: UnicodeEncodeError: 'ascii' codec can't encode character u'\xe6' in position 11: ordinal not in range(128)), sync'ing grinds to a halt.
I have tried to stop sync'ing the offending directories by right-clicking and choosing "stop sync"; the menu entry is then greyed out, and apparently nothing else happens.
The syncdaemon-exceptions log says:
2010-05-20 00:03:47,051 - ubuntuone.SyncDaemon.DBus - ERROR - Unable to handle VM_VOLUME_DELETE_ERROR for volume_id=dbus.String(u'pending')
Since my last post I have been almost continually connected, but nothing happens: u1sdtool --status reports 'processing queues' ad infinitum.

The strange thing is that æ, ø and å don't seem to have any influence in some instances. In my Documents directory, directories with these characters do show up, and in some of them so do files whose names contain these characters, but at some point, seemingly in nested directories, the problem arises.

As of now I am only able to proceed with sync'ing after starting anew: unsubscribing, removing the folders, and running sudo rm -rf ~/.local/share/ubuntuone && rm -rf ~/.cache/ubuntuone, which is rather tedious.

Flemming Christensen (laoshi) wrote :

Today the ASCII error does not show up in waiting-metadata. Instead there is a list of ListDir entries followed by seemingly unending Queries:
.....
 ListDir(share_id=a69fe834-7dd3-4c3d-9370-04e102015c61, node_id=4e9fe015-d2b4-496f-a130-bc5f3a08024a, server_hash=sha1:78c8460d4e8a5ef9abde7e42b94edcb169f01f6b)
 ListDir(share_id=ea22e1fe-de5b-45d8-9fd2-54fb50d468ec, node_id=fb453e65-23ab-4c1c-8a7c-71b47a2188ad, server_hash=sha1:dc0ddf1f8feaf72f6d50b05097408d3cee48ed46)
 ListDir(share_id=ea22e1fe-de5b-45d8-9fd2-54fb50d468ec, node_id=f17b9468-77af-4576-8ed4-d6fe0ffc87a8, server_hash=sha1:461b43d95a7db614f7e48804e60ff738396051de)
 GetPublicFiles
 Query
 Query
 Query
 Query
 Query
 Query
 Query
.....

dobey (dobey) on 2010-05-20
Changed in ubuntuone-client:
status: In Progress → Fix Committed

Accepted ubuntuone-client into lucid-proposed; the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

Changed in ubuntuone-client (Ubuntu Lucid):
status: New → Fix Committed
tags: added: verification-needed
Joshua Hoover (joshuahoover) wrote :

Test case passed with 1.2.2-0ubuntu2 proposed installed.

Martin Pitt (pitti) wrote :

Since it was fixed in trunk a while ago, I assume that maverick's current version (1.3.2) has the fix as well. Closing all the maverick tasks.

tags: added: verification-done
removed: verification-needed
Changed in ubuntuone-client (Ubuntu):
status: Triaged → Fix Released
Martin Pitt (pitti) wrote :

ubuntuone-client (1.2.2-0ubuntu1) lucid-proposed; urgency=low

  * New upstream release.
    - Properly handle valid UTF-8 non-ASCII names for UDFs (LP: #557160)
    - Fix nautilus crash when running u1sdtool --subscribe-folder (LP: #570261)
    - Cannot reactivate "File sync" on services tab (LP: #570721)
    - Retry interrupted uploads (LP: #575817)
    - Improve logging at INFO level (LP: #578248)
    - u1sdtool --delete-folder with invalid id hangs (LP: #583412)
    - ubuntuone-syncdaemon crashed with OSError (LP: #452682)
    - ubuntuone-preferences "Got empty result for devices list." (LP: #576263)
  * Remove fix-571548.patch and fix-567223.patch; included upstream now.
 -- Rodney Dawes <email address hidden> Wed, 16 Jun 2010 13:42:32 -0400

Changed in ubuntuone-client (Ubuntu Lucid):
status: Fix Committed → Fix Released