expand doesn't support multibyte characters (utf-8)

Bug #535102 reported by Herbert Thielen
This bug affects 1 person
Affects             Status   Importance  Assigned to  Milestone
coreutils           Unknown  Unknown
coreutils (Ubuntu)  Triaged  Low         Unassigned

Bug Description

Binary package hint: coreutils

'expand' doesn't replace tab characters with the correct number of spaces when the input contains multibyte characters, e.g. German umlauts in a UTF-8 locale.
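
To illustrate the kind of miscount described here, the following is a minimal Python sketch, not the attached patch and not the coreutils code. It assumes the column counter advances once per byte in the failing case and once per decoded character in the expected case; the function name `expand_line` and its `multibyte` parameter are made up for this example. It also assumes every character is one column wide, which does not hold for combining or double-width characters.

```python
def expand_line(line: bytes, tabsize: int = 8, multibyte: bool = True) -> bytes:
    """Replace each tab with spaces up to the next tab stop.

    Hypothetical helper for illustration only.
    multibyte=False advances the column once per byte.
    multibyte=True advances the column once per decoded UTF-8 character.
    """
    # latin-1 maps every byte to exactly one code point, so iterating over
    # the decoded string is the same as iterating over the raw bytes.
    encoding = "utf-8" if multibyte else "latin-1"
    out = []
    column = 0
    for unit in line.decode(encoding):
        if unit == "\t":
            pad = tabsize - column % tabsize  # distance to the next tab stop
            out.append(" " * pad)
            column += pad
        else:
            out.append(unit)
            column += 1
    return "".join(out).encode(encoding)


if __name__ == "__main__":
    sample = "ä\tx".encode("utf-8")  # 'ä' is two bytes in UTF-8
    print(expand_line(sample, multibyte=False).decode("utf-8"))
    print(expand_line(sample, multibyte=True).decode("utf-8"))
```

With the per-byte count, the two-byte "ä" moves the column to 2, so the tab is padded with 6 spaces and the following text ends up one column short of the tab stop. With the per-character count the tab is padded with 7 spaces and the text lines up at column 8.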

A small example test file is attached.

I've attached a patch against coreutils 8.4 that works for me, but I'm not experienced in wide-character programming, so there may still be problems; more testing and review are needed.

Thanks for the work, best regards

Herbert.

Herbert Thielen (thielen) wrote :

sample input file attached

Herbert Thielen (thielen) wrote :

expected output file attached

Herbert Thielen (thielen) wrote :

The upstream bug report is filed at savannah.gnu.org as #29138:
https://savannah.gnu.org/bugs/index.php?29138

C de-Avillez (hggdh2) wrote :

Marking as Confirmed/Low; added upstream task. In general, coreutils does not handle multibyte strings.

@Herbert: thank you for opening the bug upstream.

Changed in coreutils (Ubuntu):
importance: Undecided → Low
status: New → Confirmed
tags: added: patch
tags: added: patch-upstreaminput
removed: patch
Daniel Hahler (blueyed)
Changed in coreutils (Ubuntu):
status: Confirmed → Triaged
tags: added: kernel-series-unknown
tags: removed: kernel-series-unknown