gawk crashes when given too big regex group index
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
gawk | Fix Released | Undecided | Unassigned |
gawk (Ubuntu) | Fix Released | Undecided | Unassigned |
Bug Description
I have "gawk" version 1:3.1.6.
Running:
echo "abc" | valgrind gawk '{ print gensub(/(.)b(.)/, "\\4", 1)}'
==20299== Invalid read of size 4
==20299== at 0x410ECC: (within /usr/bin/gawk)
==20299== by 0x4113E9: do_gensub (in /usr/bin/gawk)
==20299== by 0x43D4EB: r_tree_eval (in /usr/bin/gawk)
==20299== by 0x412EF1: do_print (in /usr/bin/gawk)
==20299== by 0x43BC05: interpret (in /usr/bin/gawk)
==20299== by 0x43B955: interpret (in /usr/bin/gawk)
==20299== by 0x428070: do_input (in /usr/bin/gawk)
==20299== by 0x429CE0: main (in /usr/bin/gawk)
==20299== Address 0x56481c8 is 0 bytes after a block of size 16 alloc'd
==20299== at 0x4C278AE: malloc (vg_replace_
==20299== by 0x439941: (within /usr/bin/gawk)
==20299== by 0x439C77: re_search (in /usr/bin/gawk)
==20299== by 0x42C120: research (in /usr/bin/gawk)
==20299== by 0x41055E: (within /usr/bin/gawk)
==20299== by 0x4113E9: do_gensub (in /usr/bin/gawk)
==20299== by 0x43D4EB: r_tree_eval (in /usr/bin/gawk)
==20299== by 0x412EF1: do_print (in /usr/bin/gawk)
==20299== by 0x43BC05: interpret (in /usr/bin/gawk)
==20299== by 0x43B955: interpret (in /usr/bin/gawk)
==20299== by 0x428070: do_input (in /usr/bin/gawk)
==20299== by 0x429CE0: main (in /usr/bin/gawk)
==20299==
==20299== Invalid read of size 4
==20299== at 0x410EDB: (within /usr/bin/gawk)
==20299== by 0x4113E9: do_gensub (in /usr/bin/gawk)
==20299== by 0x43D4EB: r_tree_eval (in /usr/bin/gawk)
==20299== by 0x412EF1: do_print (in /usr/bin/gawk)
==20299== by 0x43BC05: interpret (in /usr/bin/gawk)
==20299== by 0x43B955: interpret (in /usr/bin/gawk)
==20299== by 0x428070: do_input (in /usr/bin/gawk)
==20299== by 0x429CE0: main (in /usr/bin/gawk)
==20299== Address 0x5648208 is 0 bytes after a block of size 16 alloc'd
==20299== at 0x4C278AE: malloc (vg_replace_
==20299== by 0x43994E: (within /usr/bin/gawk)
==20299== by 0x439C77: re_search (in /usr/bin/gawk)
==20299== by 0x42C120: research (in /usr/bin/gawk)
==20299== by 0x41055E: (within /usr/bin/gawk)
==20299== by 0x4113E9: do_gensub (in /usr/bin/gawk)
==20299== by 0x43D4EB: r_tree_eval (in /usr/bin/gawk)
==20299== by 0x412EF1: do_print (in /usr/bin/gawk)
==20299== by 0x43BC05: interpret (in /usr/bin/gawk)
==20299== by 0x43B955: interpret (in /usr/bin/gawk)
==20299== by 0x428070: do_input (in /usr/bin/gawk)
==20299== by 0x429CE0: main (in /usr/bin/gawk)
Also see this one:
$ echo "[abc,def,ghi]" | gawk '{ print gensub(
Segmentation fault
Finally, this one prints a glibc memory corruption warning:
echo "[abc,def,ghi]" | gawk '{ print gensub(
*** glibc detected *** gawk: realloc(): invalid next size: 0x0000000001d2aae0 ***
======= Backtrace: =========
/lib/libc.
/lib/libc.
/lib/libc.
gawk[0x410a3e]
gawk(do_
gawk(r_
gawk(do_
gawk(interpret+
gawk(interpret+
gawk(do_
gawk(main+
/lib/libc.
gawk[0x406d79]
======= Memory map: ========
00400000-00454000 r-xp 00000000 08:02 2068367 /usr/bin/gawk
00654000-00655000 rw-p 00054000 08:02 2068367 /usr/bin/gawk
00655000-0065c000 rw-p 00655000 00:00 0
01d22000-01d43000 rw-p 01d22000 00:00 0 [heap]
7fc92c000000-
7fc92c021000-
7fc93180a000-
7fc931820000-
7fc931a20000-
7fc931a21000-
7fc931a22000-
7fc931b8a000-
7fc931d8a000-
7fc931d8e000-
7fc931d8f000-
7fc931d94000-
7fc931e18000-
7fc932017000-
7fc932018000-
7fc932019000-
7fc93201b000-
7fc93221b000-
7fc93221c000-
7fc93221d000-
7fc9322f3000-
7fc9323de000-
7fc93241d000-
7fc93242f000-
7fc932430000-
7fc932431000-
7fc932432000-
7fc932439000-
7fc93243c000-
7fc93243d000-
7fff3a428000-
7fff3a5fe000-
ffffffffff60000
Aborted
Changed in gawk:
status: New → Confirmed
Changed in gawk (Ubuntu):
status: Confirmed → Fix Released
I reported this bug upstream as well and they immediately suggested a potential fix:
Index: ChangeLog
===================================================================
RCS file: /d/mongo/[...]cvsrep/gawk-stable/ChangeLog,v
retrieving revision 1.101
diff -u -r1.101 ChangeLog
--- ChangeLog 16 Apr 2009 20:02:25 -0000 1.101
+++ ChangeLog 22 Apr 2009 04:43:41 -0000
@@ -1,3 +1,11 @@
+Wed Apr 22 07:42:05 2009 Arnold D. Robbins <email address hidden>
+
+ * builtin.c (sub_common): In code for handling \<dig> replacements,
+ first make sure that <dig> is within the range of parentheses sets
+ given, and then make sure that the subpattern start is not -1, meaning
+ that something actually matched. Thanks to Martin Olsson
+ <email address hidden> for the bug report.
+
Thu Apr 16 22:59:32 2009 Arnold D. Robbins <email address hidden>
* eval.c (func_call): Save nloops_active; if after function returns
Index: builtin.c
===================================================================
RCS file: /d/mongo/[...]cvsrep/gawk-stable/builtin.c,v
retrieving revision 1.31
diff -u -r1.31 builtin.c
--- builtin.c 27 Mar 2009 08:01:13 -0000 1.31
+++ builtin.c 22 Apr 2009 04:40:15 -0000
@@ -2544,15 +2544,17 @@
if (backdigs) { /* gensub, behave sanely */
if (ISDIGIT(scan[1])) {
int dig = scan[1] - '0';
- char *start, *end;
+ if (dig < NUMSUBPATS(rp, t->stptr) && SUBPATSTART(rp, tp->stptr, dig) != -1) {
+ char *start, *end;
- start = t->stptr
- + SUBPATSTART(rp, t->stptr, dig);
- end = t->stptr
- + SUBPATEND(rp, t->stptr, dig);
-
- for (cp = start; cp < end; cp++)
- *bp++ = *cp;
+ start = t->stptr
+ + SUBPATSTART(rp, t->stptr, dig);
+ end = t->stptr
+ + SUBPATEND(rp, t->stptr, dig);
+
+ for (cp = start; cp < end; cp++)
+ *bp++ = *cp;
+ }
scan++;
} else /* \q for any q --> q */
*bp++ = *++scan;
This fix is not yet checked in, and I'm not sure it will be the final fix. Let's keep an eye on the upstream changelog:
http://cvs.savannah.gnu.org/viewvc/gawk-stable/ChangeLog?root=gawk&view=log
Hopefully this bug will be fixed upstream and a new release will be packaged for karmic (the gawk package was never updated for jaunty).
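For anyone who wants to see the guard from the proposed patch in isolation, here is a minimal standalone sketch. It is not gawk code: it uses the POSIX `regex.h` API instead of gawk's internal `SUBPATSTART`/`NUMSUBPATS` macros, and the names `gensub_first_sketch`, `append_bytes` and `MAX_GROUPS` are made up for illustration. It replaces only the first match and handles only `\N` and `\q` in the replacement (no `&`). The point is the two checks before copying: the digit must be inside the range of groups the pattern has, and the group must have actually matched (start offset not -1).

```c
#include <assert.h>
#include <ctype.h>
#include <regex.h>
#include <string.h>

#define MAX_GROUPS 10  /* \0 .. \9: a backreference is a single digit */

/* Hypothetical helper: append len bytes to out, keeping it NUL-terminated. */
static int append_bytes(char *out, size_t outsz, size_t *pos,
                        const char *src, size_t len)
{
    if (*pos + len >= outsz)
        return -1;                      /* would overflow the output buffer */
    memcpy(out + *pos, src, len);
    *pos += len;
    out[*pos] = '\0';
    return 0;
}

/* Hypothetical sketch of a gensub-like "replace first match" with guarded
 * backreferences. Returns 0 on success, -1 on error. */
int gensub_first_sketch(const char *pattern, const char *subject,
                        const char *repl, char *out, size_t outsz)
{
    regex_t re;
    regmatch_t m[MAX_GROUPS];
    size_t pos = 0;

    assert(out != NULL && outsz > 0);
    out[0] = '\0';

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    size_t ngroups = re.re_nsub + 1;    /* group 0 is the whole match */
    if (ngroups > MAX_GROUPS)
        ngroups = MAX_GROUPS;
    int rc = regexec(&re, subject, ngroups, m, 0);
    regfree(&re);

    if (rc != 0)                        /* no match: copy subject unchanged */
        return append_bytes(out, outsz, &pos, subject, strlen(subject));

    /* Text before the match. */
    if (append_bytes(out, outsz, &pos, subject, (size_t)m[0].rm_so) != 0)
        return -1;

    for (const char *p = repl; *p != '\0'; p++) {
        if (p[0] == '\\' && isdigit((unsigned char)p[1])) {
            size_t dig = (size_t)(p[1] - '0');
            /* The guard: group index in range AND the group matched. */
            if (dig < ngroups && m[dig].rm_so != -1) {
                size_t len = (size_t)(m[dig].rm_eo - m[dig].rm_so);
                if (append_bytes(out, outsz, &pos,
                                 subject + m[dig].rm_so, len) != 0)
                    return -1;
            }
            p++;                        /* skip the digit either way */
        } else {
            if (p[0] == '\\' && p[1] != '\0')
                p++;                    /* \q for any q --> q */
            if (append_bytes(out, outsz, &pos, p, 1) != 0)
                return -1;
        }
    }

    /* Text after the match. */
    return append_bytes(out, outsz, &pos, subject + m[0].rm_eo,
                        strlen(subject + m[0].rm_eo));
}
```

With this sketch, an out-of-range reference such as `\4` against `(.)b(.)` simply expands to nothing instead of reading past the end of the match registers, which is the behaviour the ChangeLog entry describes.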