coreutils: sort -u combined with another flag wrongly considers rows starting with non-ASCII characters as duplicates
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
coreutils (Ubuntu) | New | Undecided | Unassigned |
Bug Description
Ubuntu 13.10
sort (GNU coreutils) 8.20
Test case:
#!/bin/bash
#create test-data:
#لbar
#äbar
#first word: Arabic LAM=ل (d9 84) +bar
echo -en "\xd9\x84" > test.txt
echo "bar" >> test.txt
#second word: Scandinavian ä (c3 a4) +bar
echo -en "\xc3\xa4" >> test.txt
echo "bar" >> test.txt
echo sort, sort -u, sort -d,g,i,M,h,n =2
sort test.txt | wc -l
sort test.txt -u | wc -l
sort test.txt -d | wc -l
sort test.txt -g | wc -l
sort test.txt -i | wc -l
sort test.txt -M | wc -l
sort test.txt -h | wc -l
sort test.txt -n | wc -l
echo sort -u with -b,-f,-R,-r,-V =2
sort test.txt -u -b| wc -l
sort test.txt -u -f| wc -l
sort test.txt -u -R| wc -l
sort test.txt -u -r| wc -l
sort test.txt -u -V| wc -l
echo sort -u with -d,-g,-i,-M,-h,-n =1
sort test.txt -u -d| wc -l
sort test.txt -u -g| wc -l
sort test.txt -u -i| wc -l
sort test.txt -u -M| wc -l
sort test.txt -u -h| wc -l
sort test.txt -u -n| wc -l
-----
Actual results:
sort, sort -u, sort -d,g,i,M,h,n =2
2
2
2
2
2
2
2
2
sort -u with -b,-f,-R,-r,-V =2
2
2
2
2
2
sort -u with -d,-g,-i,-M,-h,-n =1
1
1
1
1
1
1
Expected results:
All commands should print 2.
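The test case above can be condensed into a loop, which may make it easier to rerun on other coreutils versions or locales. This is only a sketch: the names `tmp` and `count` are hypothetical helpers, not part of the original report, and the test bytes are written as octal escapes (`\331\204` = d9 84, `\303\244` = c3 a4) on the assumption that this is more portable across `printf` implementations than `\x` escapes. Several of the flags (`-R`, `-V`, `-h`) are assumed to be available, as they are in GNU sort.

```shell
#!/bin/bash
# Condensed reproduction sketch; `tmp` and `count` are illustrative names.
tmp=$(mktemp)
trap 'rm -f "$tmp"' EXIT

# Two lines: LAM (d9 84) + "bar", then a-umlaut (c3 a4) + "bar".
printf '\331\204bar\n\303\244bar\n' > "$tmp"

# Run sort with the given flags on the test file and print the line count.
count() {
    sort "$@" "$tmp" | wc -l | tr -d ' '
}

echo "locale: LANG=${LANG:-unset} LC_ALL=${LC_ALL:-unset}"

# Flags that did not trigger the problem in the report, then those that did.
for flag in -b -f -R -r -V -d -g -i -M -h -n; do
    echo "sort -u $flag: $(count -u "$flag")"
done
```

On the system described in this report (coreutils 8.20), the last six flags are the ones that printed 1 instead of 2. Printing the locale alongside the counts is a design choice to help tell whether the result depends on the locale settings.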