UnicodeDecodeError on stderr from logging non-ascii message

Bug #714449 reported by Alexander Belchenko on 2011-02-07
This bug affects 1 person
Affects  Status        Importance  Assigned to          Milestone
bzr      Fix Released  Medium      Martin Packman (gz)  2.5b4

Bug Description

C:\work\MyCode\Visualisator\Docs>bzr st
  Список требуемых доработок 2011-02-04.doc
  Установка связи.doc

C:\work\MyCode\Visualisator\Docs>bzr ci "Установка связи.doc" -m "выделена фраза про размерность Session ID"
Committing to: C:/work/MyCode/Visualisator/Docs/
Traceback (most recent call last):
  File "logging\__init__.pyo", line 799, in emit
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 80: ordinal not in range(128)
bzr: ERROR: Path(s) are not versioned: "╨г╤Б╤В╨░╨╜╨╛╨▓╨║╨░ ╤Б╨▓╤П╨╖╨╕.doc"

The last error is correct, although the file name is printed as raw UTF-8 bytes; that is an already known error (sigh).

I don't understand the traceback part there. Is it really needed?

C:\work\MyCode\Visualisator\Docs>bzr version
Bazaar (bzr) 2.3b5
  Python interpreter: C:\Program Files\Bazaar\python26.dll 2.6.6
  Python standard library: C:\Program Files\Bazaar\lib\library.zip
  Platform: Windows-XP-5.1.2600-SP3
  bzrlib: C:\Program Files\Bazaar\lib\library.zip\bzrlib
  Bazaar configuration: C:\Documents and Settings\modul98\Application Data\bazaar\2.0
  Bazaar log file: C:\work\.bzr.log

Related branches

bzr-core: Pending requested 2011-12-05
Martin Packman (gz) wrote :

There's arguably a logging bug here as well, but this is yet another problem with stringifying exceptions that may contain non-ascii data.

The relevant stack from the exception, with Python 2.4 and an oldish bzr:

> ...\lib\logging\__init__.py(740)emit()
-> self.stream.write(fs % msg.encode("UTF-8"))
(Pdb) w
-> exit_val = bzrlib.commands.main()
-> ret = run_bzr_catch_errors(argv)
-> return exception_to_return_code(run_bzr, argv)
-> return the_callable(*args, **kwargs)
-> ret = run(*run_argv)
-> return self.run(**all_cmd_args)
-> exclude=safe_relpath_files(tree, exclude))
-> result = unbound(self, *args, **kwargs)
-> result = WorkingTree3.commit(self, message, revprops, *args, **kwargs)
-> result = unbound(self, *args, **kwargs)
-> possible_master_transports=possible_master_transports,
-> possible_master_transports=possible_master_transports)
-> return _do_with_cleanups(
-> result = func(*args, **kwargs)
-> note("aborting commit write group: %r" % (e,))
-> _bzr_logger.info(*args, **kwargs)
-> apply(self._log, (INFO, msg, args), kwargs)
-> self.handle(record)
-> self.callHandlers(record)
-> hdlr.handle(record)
-> self.emit(record)
> ...\lib\logging\__init__.py(740)emit()
-> self.stream.write(fs % msg.encode("UTF-8"))
(Pdb) msg
'aborting commit write group: PathsNotVersionedError(Path(s) are not versioned: "\xd0\xa3\xd1\x81\xd1\x82\xd0\xb0\xd0\xbd\xd0\xbe\xd0\xb2\xd0\xba\xd0\xb0")'
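
The failing step in that last frame can be reproduced in isolation. On Python 2, calling .encode("UTF-8") on a byte string first decodes it implicitly with the ascii codec, and that hidden decode is what raises. The sketch below spells the decode out explicitly so it behaves the same on Python 2 and 3; the helper name py2_style_encode is hypothetical and not part of bzrlib or logging.

```python
# Bytes copied from the pdb output above (the UTF-8 encoded file name).
RAW_MSG = (b'Path(s) are not versioned: '
           b'"\xd0\xa3\xd1\x81\xd1\x82\xd0\xb0\xd0\xbd\xd0\xbe\xd0\xb2\xd0\xba\xd0\xb0"')


def py2_style_encode(byte_msg):
    """Mimic what Python 2 does for str.encode("UTF-8") on a byte string.

    Hypothetical helper: Python 2 decodes the bytes with the default
    'ascii' codec before encoding, and the decode is the step that fails.
    """
    return byte_msg.decode("ascii").encode("utf-8")


try:
    py2_style_encode(RAW_MSG)
    outcome = "ok"
except UnicodeDecodeError as e:
    # e.start is the offset of the first byte the ascii codec rejected.
    bad_byte = ord(RAW_MSG[e.start:e.start + 1])
    outcome = "UnicodeDecodeError: byte 0x%02x" % bad_byte

print(outcome)
```

The rejected byte is the same 0xd0 as in the reported traceback, although the position differs because this message is shorter than the one logged during the commit.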

Changed in bzr:
importance: Undecided → Medium
status: New → Confirmed
Martin Packman (gz) wrote :

Okay, let's focus this bug on the traceback from logging, which is basically the WONTFIXed upstream issue <http://bugs.python.org/issue6991>, but we can get around it by making bzrlib smarter about which handlers it gives the logging module.
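
One way to read "smarter handlers" is a handler that does its own decoding and encoding with a forgiving error policy instead of leaving it to the stock stream handler. The following is only a sketch under that assumption, not the code that landed in bzrlib; the class name SafeEncodedHandler, the logger name, and the choice to treat byte strings as UTF-8 are all hypothetical.

```python
import io
import logging


class SafeEncodedHandler(logging.Handler):
    """Hypothetical handler that writes to a byte stream and never raises
    a Unicode error for non-ascii messages."""

    def __init__(self, stream, encoding="utf-8", errors="replace"):
        logging.Handler.__init__(self)
        self.stream = stream
        self.encoding = encoding
        self.errors = errors

    def emit(self, record):
        try:
            line = self.format(record)
            if isinstance(line, bytes):
                # Assumption: byte strings are UTF-8; bad bytes become U+FFFD.
                line = line.decode("utf-8", "replace")
            # Characters the target encoding lacks are replaced, not fatal.
            self.stream.write(line.encode(self.encoding, self.errors) + b"\n")
        except Exception:
            self.handleError(record)


buf = io.BytesIO()
log = logging.getLogger("sketch.safe_handler")  # hypothetical logger name
log.propagate = False
log.setLevel(logging.INFO)
log.addHandler(SafeEncodedHandler(buf, encoding="ascii"))

# Three Cyrillic letters, written to a stream that can only hold ascii.
log.info(u"Path(s) are not versioned: \u0423\u0441\u0442")
written = buf.getvalue()
print(written)
```

With an ascii target the three Cyrillic letters degrade to question marks, which loses information but keeps the traceback off stderr.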

description: updated
summary: - try to commit unknown non-ascii file: got a UnicodeError traceback
+ UnicodeDecodeError on stderr from logging non-ascii message
Martin Packman (gz) wrote :

Annoyingly, the underlying trigger in commit has gone, as the way "aborting commit write group" is logged has changed. Using repr on other exceptions with non-ascii components would still hit the same problem, though.

Changed in bzr:
assignee: nobody → Martin Packman (gz)
status: Confirmed → In Progress
Martin Packman (gz) on 2011-12-05
Changed in bzr:
milestone: none → 2.5b4
status: In Progress → Fix Released