coreutils: sort -u with another flag wrongly considers rows starting with non-ASCII characters as duplicates

Bug #1282064 reported by Aapo Rantalainen
This bug affects 1 person
Affects             Status  Importance  Assigned to  Milestone
coreutils (Ubuntu)  New     Undecided   Unassigned

Bug Description

Ubuntu 13.10
sort (GNU coreutils) 8.20

Test case:
#!/bin/bash
#create test-data:
#لbar
#äbar

#first word: Arabic LAM=ل (d9 84) +bar
echo -en "\xd9\x84" > test.txt
echo "bar" >> test.txt

#second word: Scandinavian ä (c3 a4) +bar
echo -en "\xc3\xa4" >> test.txt
echo "bar" >> test.txt

echo sort, sort -u, sort -d,g,i,M,h,n =2
sort test.txt | wc -l
sort test.txt -u | wc -l
sort test.txt -d | wc -l
sort test.txt -g | wc -l
sort test.txt -i | wc -l
sort test.txt -M | wc -l
sort test.txt -h | wc -l
sort test.txt -n | wc -l

echo sort -u with -b,-f,-R,-r,-V =2
sort test.txt -u -b| wc -l
sort test.txt -u -f| wc -l
sort test.txt -u -R| wc -l
sort test.txt -u -r| wc -l
sort test.txt -u -V| wc -l

echo sort -u with -d,-g,-i,-M,-h,-n =1
sort test.txt -u -d| wc -l
sort test.txt -u -g| wc -l
sort test.txt -u -i| wc -l
sort test.txt -u -M| wc -l
sort test.txt -u -h| wc -l
sort test.txt -u -n| wc -l
-----

Actual results:
sort, sort -u, sort -d,g,i,M,h,n =2
2
2
2
2
2
2
2
2
sort -u with -b,-f,-R,-r,-V =2
2
2
2
2
2
sort -u with -d,-g,-i,-M,-h,-n =1
1
1
1
1
1
1

Expected results:
Every command prints 2.
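
For quicker re-testing, the test case above can be condensed into a loop. This is only a sketch: the temporary file, the variable names and the loop layout are illustrative choices and not part of the original report. The bytes are written with octal escapes so that `printf` behaves the same in shells without `\x` support. No expected counts are shown because they depend on the sort version and locale in use.

```shell
#!/bin/bash
# Sketch: condensed form of the reporter's test case.
# datafile, flag and count are hypothetical names chosen for this example.
datafile=$(mktemp)

# Line 1: Arabic LAM (hex d9 84 = octal 331 204) followed by "bar".
# Line 2: ä (hex c3 a4 = octal 303 244) followed by "bar".
printf '\331\204bar\n\303\244bar\n' > "$datafile"

# Baseline: plain sort never drops lines.
echo "sort: $(sort "$datafile" | wc -l)"

# Print how many lines survive sort -u combined with each flag from the report.
for flag in -b -f -R -r -V -d -g -i -M -h -n; do
    count=$(sort -u "$flag" "$datafile" | wc -l)
    echo "sort -u $flag: $count"
done
```

Running the same loop with a different locale set in the environment (for example `LC_ALL=C`) may help narrow down whether the behaviour is locale-dependent; that is a suggestion for investigation, not a confirmed diagnosis.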
