sed stops replacing when reaching a special character

Bug #447866 reported by lovinglinux
This bug affects 1 person
Affects Status Importance Assigned to Milestone
sed (Ubuntu)

Bug Description

Binary package hint: sed

When filtering a large file (~600,000 lines) with sed, it stops replacing if it encounters a special character.

For example, when using the regular expression below to remove all characters except numbers:

sed -e 's/[^0123456789]//g'

and if the file contains the following line:


the output is:


instead of:


When using the regular expression below to remove all characters before the numbers:

sed -e 's/.*999/Range:/g'

the output is:


instead of:


It only happens with files containing a large number of lines.

I applied the same regular expressions to the same file with Perl, and the output was correct.
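
Since the failure is tied to a large file, it may help to check whether the input contains bytes outside printable ASCII, which a UTF-8 locale may reject as invalid characters. The following is a minimal sketch, assuming a grep that supports the -a option (GNU and BSD grep do); the function name count_suspect_lines and the sample input are hypothetical:

```shell
# Hypothetical helper: count input lines that contain any byte outside
# printable ASCII and whitespace. LC_ALL=C makes grep work byte by byte,
# and -a keeps grep from treating the input as a binary file.
count_suspect_lines() {
    LC_ALL=C grep -a -c '[^[:print:][:space:]]'
}

# Sample input: one clean line and one line with a stray 0x88 byte (octal 210).
suspect=$(printf 'abc 123\nAAAA\210BBBB 999\n' | count_suspect_lines)
printf '%s\n' "$suspect"
```

Note that this check also counts lines with legitimate multi-byte text (for example valid UTF-8), so a non-zero count only narrows down where to look.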

ProblemType: Bug
Architecture: i386
Date: Sat Oct 10 05:49:02 2009
DistroRelease: Ubuntu 9.10
NonfreeKernelModules: nvidia
Package: sed 4.2.1-1
ProcVersionSignature: Ubuntu 2.6.31-12.41-generic
SourcePackage: sed
Uname: Linux 2.6.31-12-generic i686

lovinglinux (lovinglinux) wrote :
Paolo Bonzini (bonzini) wrote :

If you don't know the charset of the file, you should set the LANG or LC_CTYPE variables to "C":

$ echo $'AAAA\x88BBBB' | sed -e 's/[^0123456789]//g' | od -x
0000000 0a88
$ echo $'AAAA\x88BBBB' | LANG=C sed -e 's/[^0123456789]//g' | od -x
0000000 000a

Perl does indeed behave differently from sed in a non-C locale; psed removes the byte:

$ echo $'AAAA\x88BBBB' | psed 's/[^0123456789]//g' | od -x
0000000 000a
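
The workaround above can be wrapped in a small script. This is only a sketch: it uses LC_ALL rather than LANG because LC_ALL overrides both LANG and LC_CTYPE when it is set, and the function name strip_non_digits and the sample input are hypothetical.

```shell
# Hypothetical wrapper around the sed expression from the report.
# With LC_ALL=C, sed treats every byte as a single character, so
# [^0123456789] also matches bytes that are invalid in the user's locale.
strip_non_digits() {
    LC_ALL=C sed -e 's/[^0123456789]//g'
}

# Sample line with a stray 0x88 byte (octal 210) between the letters.
result=$(printf 'AAAA\210BBBB 12345\n' | strip_non_digits)
printf '%s\n' "$result"
```

Setting the variable only for the sed invocation leaves the locale of the rest of the shell session unchanged.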

Paolo Bonzini (bonzini)
Changed in sed (Ubuntu):
status: New → Invalid
lovinglinux (lovinglinux) wrote :

I don't see why it should be considered invalid, so I'm changing the status back to new.

Changed in sed (Ubuntu):
status: Invalid → New