Bazaar

Logging with a non-ascii string raises UnicodeError in the standard logging package

Bug #317779 reported by philyoon on 2009-01-16

10

This bug affects 1 person

Affects		Status	Importance	Assigned to	Milestone
	Bazaar	Confirmed	Low	Unassigned

Bug Description

This is due to the fact that logging.StreamHandler.emit method tries to encode the input with the 'encoding' attribute of the stream. I think you should not use StreamWriter as a file-like object, since StreamWriter isn't intended to accept str inputs. The fix for this issue follows:

=== modified file 'bzrlib/trace.py'
--- bzrlib/trace.py 2009-01-08 17:53:46 +0000
+++ bzrlib/trace.py 2009-01-16 07:39:12 +0000
@@ -269,9 +269,7 @@
         r'%Y-%m-%d %H:%M:%S')
     # after hooking output into bzr_log, we also need to attach a stderr
     # handler, writing only at level info and with encoding
- writer_factory = codecs.getwriter(term_encoding)
- encoded_stderr = writer_factory(sys.stderr, errors='replace')
- stderr_handler = logging.StreamHandler(encoded_stderr)
+ stderr_handler = logging.StreamHandler(sys.stderr)
     stderr_handler.setLevel(logging.INFO)
     logging.getLogger('bzr').addHandler(stderr_handler)

Tags:

Revision history for this message

philyoon (philyoon) wrote on 2009-01-16:

#1

diff.txt Edit (670 bytes, text/plain)

Whitespaces in the description seem to be merged. Attaching the diff file.

Revision history for this message

methane (songofacandy) wrote on 2009-12-01:

#2

It may be happen because non-ascii byte string was logged.
All of non-ascii string should be unicode str.

Revision history for this message

Martin Pool (mbp) wrote on 2009-12-02:

#3

I'm trying to get rid of this use of logging in https://code.edge.launchpad.net/~mbp/bzr/remove-logging so this bug should be taken into account there, and vice versa.

Revision history for this message

John A Meinel (jameinel) wrote on 2010-02-11:

#4

Martin, do you have an idea of the status of this bug now that some of your logging changes have landed?

Changed in bzr:
importance:	Undecided → Low
status:	New → Incomplete

Revision history for this message

Martin Pool (mbp) wrote on 2010-02-11:

#5

I wouldn't assume it's fixed because not all of that work has landed yet. We do still using logging.

Changed in bzr:
status:	Incomplete → Confirmed
tags:	added: logging trace unicode

Martin Pool (mbp) on 2010-03-15

Changed in bzr:
assignee:	nobody → Martin Pool (mbp)
status:	Confirmed → In Progress

Revision history for this message

Martin Pool (mbp) wrote on 2010-03-30:

#6

You can reproduce this from a Python shell with

>>> trace.note(u"\u68ee".encode("utf-8"))
Traceback (most recent call last):
File "/usr/lib/python2.6/logging/__init__.py", line 791, in emit
stream.write(fs % msg.encode("UTF-8"))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 0: ordinal not in range(128)

just emitting a unicode object does work, and it gets converted according to the error handling of the converter:

>>> trace.note(u"\u68ee")
?

With this patch applied, in a C locale, you cannot log unicode strings, but you can log non-ascii byte strings and they get passed through as such:

In [10]: trace.note(u"\u68ee hello")
Traceback (most recent call last):
File "/usr/lib/python2.6/logging/__init__.py", line 791, in emit
stream.write(fs % msg.encode("UTF-8"))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 0: ordinal not in range(128)

In [11]: trace.note(u"\u68ee hello".encode('utf-8'))
森 hello

I'm not sure that's really an improvement though. I think generally we want the input to the log functions to be unicode strings, and for them to have errors=replace treatment going to the log file or stderr.

Can you tell me more about the context where you hit this?

Changed in bzr:
assignee:	Martin Pool (mbp) → nobody
status:	In Progress → Confirmed

Jelmer Vernooij (jelmer) on 2017-11-09

tags:

added: check-for-breezy

Report a bug

This report contains Public information

Everyone can see this information.

Duplicates of this bug

Bug #397562

You are

Subscribing...

Edit bug mail

Other bug subscribers

Bug attachments

diff.txt Edit

Add attachment

Remote bug watches

Bug watches keep track of this bug in other bug trackers.