pocket-lint

Wrong line length for lines with Unicode characters.

Bug #889648 reported by Adi Roiban on 2011-11-12


Affects      Status        Importance  Assigned to  Milestone
pocket-lint  Fix Released  High        Adi Roiban   pocket-lint 0.5.22

Bug Description

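Here is an example of a line with Unicode characters that is still an encoded byte string after the text is split into lines, so its length is counted in bytes rather than in characters: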

>>> initial_line = u'mâț mițișor:x:2000:2010:Mâț Mițișor,,,,:/home/mâț mițișor:/bin/bash\n'
>>> line_encoded = "u'm\xc3\xa2\xc8\x9b mi\xc8\x9bi\xc8\x99or:x:2000:2010:M\xc3\xa2\xc8\x9b Mi\xc8\x9bi\xc8\x99or,,,,:/home/m\xc3\xa2\xc8\x9b mi\xc8\x9bi\xc8\x99or:/bin/bash\\n'"

>>> len(line_encoded)
84
>>> len(line_encoded.decode('utf-8'))
72
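
The gap between the two numbers in the session above comes from the non-ASCII characters, each of which takes two bytes in UTF-8. As a minimal sketch of the same measurement in Python 3 (the session above is Python 2, and the variable names here are only illustrative):

```python
# Python 3 sketch: the same passwd-style line measured in bytes and in characters.
line = 'mâț mițișor:x:2000:2010:Mâț Mițișor,,,,:/home/mâț mițișor:/bin/bash\n'
encoded = line.encode('utf-8')

byte_length = len(encoded)  # what a checker sees if it measures undecoded bytes
char_length = len(line)     # what a reader sees in an editor

# Every non-ASCII character in this line is encoded as two bytes in UTF-8,
# so the difference equals the number of non-ASCII characters.
non_ascii = sum(1 for ch in line if ord(ch) > 127)
assert byte_length - char_length == non_ascii
```

A line-length check that compares `byte_length` against the limit will flag lines like this one earlier than a check that uses `char_length`.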

----

I am not sure whether always converting long lines to Unicode will solve all problems.

Maybe we can change pocket-lint to be smart and convert a line only if the header of the Python file contains a Unicode encoding declaration.
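
One way the "convert only when declared" idea could look is sketched below in Python 3. This is not pocket-lint's code or the fix in the attached branch. The helper names `declared_encoding` and `line_lengths` are hypothetical, and the pattern follows the encoding declaration format described in PEP 263.

```python
import re

# Pattern for an encoding declaration in the style of PEP 263; it may
# appear only on the first or second line of a Python source file.
CODING_RE = re.compile(rb'^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)')


def declared_encoding(source_bytes):
    """Return the declared encoding name, or None if there is none."""
    for line in source_bytes.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1).decode('ascii')
    return None


def line_lengths(source_bytes):
    """Hypothetical helper: yield (line_number, length) for each line.

    Lines are measured in characters when an encoding is declared,
    and in bytes otherwise.
    """
    encoding = declared_encoding(source_bytes)
    for number, line in enumerate(source_bytes.splitlines(), start=1):
        if encoding is not None:
            yield number, len(line.decode(encoding))
        else:
            yield number, len(line)
```

With this sketch, a file that starts with `# -*- coding: utf-8 -*-` has its lines measured in characters, while a file without a declaration keeps the byte-based length. Whether that fallback is the right behaviour is exactly the open question raised above.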

I will attach a branch with a naive fix and the required tests.

Please let me know how you think this problem should be solved.

Cheers,
Adi

Related branches

lp:~adiroiban/pocket-lint/bug-889648

Merged into lp:pocket-lint at revision 402

Curtis Hovey: Approve (code) on 2011-11-13

Curtis Hovey (sinzui) on 2011-11-13

Changed in pocket-lint:
milestone:	none → future
status:	New → Fix Committed
importance:	Undecided → High
assignee:	nobody → Adi Roiban (adiroiban)

Curtis Hovey (sinzui) on 2012-01-04

Changed in pocket-lint:
status:	Fix Committed → Fix Released
