Mawk does not support Posix character classes in expressions

Bug #69724 reported by Sam Trenholme on 2006-11-01
This bug affects 11 people
Affects          Status      Importance   Assigned to   Milestone
mawk (Debian)    Confirmed   Unknown
mawk (Ubuntu)    Triaged     Wishlist     Unassigned

Bug Description

Binary package hint: mawk

Mawk does not support POSIX character classes, such as [:upper:] and [:lower:], in regular expressions. This makes it more difficult to write portable Awk scripts, since [A-Z] matches lower-case characters in Gawk under non-English locales.

For example:

$ echo x | mawk '/[[:lower:]]/'
$ echo x | gawk '/[[:lower:]]/'
x
$
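
The check above can be wrapped in a small feature test. This is only a sketch: `supports_posix_classes` is a hypothetical helper name, not part of mawk or gawk, and the list of implementations to probe is an assumption about what might be installed.

```shell
# Hypothetical helper: succeeds if the given awk command matches "x"
# against the POSIX character class [[:lower:]].
supports_posix_classes() {
    [ "$(printf 'x\n' | "$1" '/[[:lower:]]/' 2>/dev/null)" = "x" ]
}

# Report on whichever of these implementations happen to be installed
# (the candidate list is an assumption; adjust as needed).
for impl in mawk gawk awk; do
    if command -v "$impl" >/dev/null 2>&1; then
        if supports_posix_classes "$impl"; then
            echo "$impl: POSIX character classes supported"
        else
            echo "$impl: POSIX character classes NOT supported"
        fi
    fi
done
```

Implementations that are not installed are skipped silently, so the loop prints nothing on a system without any awk.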

Attached is a patch to fix this issue.


- Sam

description: updated
Jean-Baptiste Lallement (jibel) wrote :

Thanks for your report.

We are confirming this because the mawk manual states that "AWK uses extended regular expressions as with egrep", and character classes are defined in POSIX.2, which mawk is said to support.

Changed in mawk:
status: New → Confirmed
Thomas Dickey (dickey-his) wrote :

One problem with the patch is that it's hardcoded and doesn't
use the locale information. I've written an alternate form which
is in the 20090727 version of mawk at
    http://invisible-island.net/mawk/

Changed in mawk (Debian):
status: Unknown → Confirmed
Erick Brunzell (lbsolost) wrote :

I've been testing new versions of the Boot Info Script for its developers and received the following input:

"mawk (default in Debian/Ubuntu) doesn't work for some of the new code:
- can't extract embedded grub4dos config file
- calculates wrong values for GiB/GB stuff (filefrag -v)"

It sounds like we can work around this using gawk, but it was suggested that we might consider this newer version of mawk for Natty:

http://invisible-island.net/mawk/

Is something like that possible?

One of the developers of the Boot Info Script has confirmed that, "This mawk version, don't seem to have a problem with my current code". You can read info about the Boot Info Script here:

http://bootinfoscript.sourceforge.net/

It's proven to be an invaluable tool in diagnosing boot problems, particularly since the introduction of grub 2.

There is no indication at the moment that Debian is headed in that direction:

http://packages.debian.org/search?keywords=mawk

Erick Brunzell (lbsolost) wrote :

Re: my last post. The Boot Info Script developer I spoke of has filed a new bug report with a lot of information at bug 716920.

The mawk version available on: http://invisible-island.net/mawk/
solves this problem (and several other mawk 1.3.3 bugs too).

$ # mawk 1.3.3 of Ubuntu 10.10
$ echo x | mawk '/[[:lower:]]/'

$ # mawk 1.3.4
$ echo x | ../mawk-1.3.4-20100625/mawk '/[[:lower:]]/'
x

$ echo x | gawk '/[[:lower:]]/'
x

Daniel Miller (bonsaiviking) wrote :

This bug affects the bash-completion package. See related bug: https://bugs.launchpad.net/ubuntu/+source/bash-completion/+bug/778679

Jerome Potts (jerome-potts) wrote :

Why are we still using the old 1.3.3 version, and not Thomas Dickey's 1.3.4?

i30817 (i30817) wrote :

why indeed

Andreas Hasenack (ahasenack) wrote :

The bash completion bug was fixed.

I'm inclined to say that if character classes are needed, gawk should be used. It looks like mawk has stalled, and the call to switch to Thomas Dickey's fork isn't mine to make. We are following Debian on this one.
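
For scripts that need character classes but cannot assume which awk is installed, one possible approach is to probe the candidates at run time and use the first one that passes. This is a sketch only: `pick_awk` is a hypothetical helper name, and the candidate order (gawk first) is an assumption.

```shell
# Hypothetical helper: print the first candidate command that matches "x"
# against [[:lower:]]; return non-zero if none of them does.
pick_awk() {
    for candidate in "$@"; do
        # Skip candidates that are not available at all.
        command -v "$candidate" >/dev/null 2>&1 || continue
        if [ "$(printf 'x\n' | "$candidate" '/[[:lower:]]/' 2>/dev/null)" = "x" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Usage: prefer gawk, fall back to any other awk that handles the classes.
if AWK=$(pick_awk gawk mawk awk); then
    # Print only input lines that contain an upper-case letter.
    printf 'hello World\n' | "$AWK" '/[[:upper:]]/'
else
    echo "no awk with POSIX character class support found" >&2
fi
```

This keeps the decision out of the script body: the rest of the script calls "$AWK" and does not care which implementation was chosen.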

Changed in mawk (Ubuntu):
importance: Undecided → Wishlist
status: Confirmed → Triaged