libPCRE3 8.31 regex matching is not working

Bug #1361610 reported by Mark Ebbers on 2014-08-26
This bug affects 2 people
Affects          Importance   Assigned to
pcre3 (Ubuntu)   Undecided    Unassigned
poco (Ubuntu)    Undecided    Unassigned

Bug Description

It looks like libPCRE3 8.31, as included in Ubuntu 13.10, does not handle regex matching correctly. This also affects the libPocoFoundation RegularExpression classes.

I attached a proof-of-concept that tests a good and a bad haystack string against a regex pattern with Poco, Boost and PCRE. I tested it on two different machines, with the following output:

Ubuntu 13.10 3.11.0-13-generic #20-Ubuntu SMP Wed Oct 23 17:26:33 UTC 2013 i686 i686 i686 GNU/Linux
Poco 0x01030600 on Linux 3.11.0-13-generic @ i686
Poco match 1234567890 to pattern ^[0-9]{10} matches? no --> NOT CORRECT
Poco match 123456789 to pattern ^[0-9]{10} matches? no
Boost match 1234567890 to pattern ^[0-9]{10} matches? yes
Boost match 123456789 to pattern ^[0-9]{10} matches? no
PCRE 8.31 2012-07-06
PCRE match 1234567890 to pattern ^[0-9]{10} matches? yes
PRCE match 123456789 to pattern ^[0-9]{10} matches? yes --> NOT CORRECT

Ubuntu 12.04.3 LTS 3.8.0-29-generic #42~precise1-Ubuntu SMP Wed Aug 14 15:31:16 UTC 2013 i686 i686 i386 GNU/Linux
Poco 0x01030600 on Linux 3.8.0-29-generic @ i686
Poco match 1234567890 to pattern ^[0-9]{10} matches? yes --> OK
Poco match 123456789 to pattern ^[0-9]{10} matches? no
Boost match 1234567890 to pattern ^[0-9]{10} matches? yes
Boost match 123456789 to pattern ^[0-9]{10} matches? no
PCRE 8.12 2011-01-15
PCRE match 1234567890 to pattern ^[0-9]{10} matches? yes
PRCE match 123456789 to pattern ^[0-9]{10} matches? no --> OK
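
For reference, the expected result for this pattern can be checked without any of the three libraries. The following is a minimal sketch, not the attached proof-of-concept: it uses std::regex as an independent engine, and the function name matchesTenDigits is made up for illustration. It assumes a compiler with a working std::regex (the GCC 4.8 shipped with 14.04 does not have one).

```cpp
#include <regex>
#include <string>

// Hypothetical helper, not part of the attached proof-of-concept.
// Returns true when `haystack` starts with ten ASCII digits, i.e. when the
// pattern from this report, ^[0-9]{10}, matches at the start of the string.
bool matchesTenDigits(const std::string& haystack) {
    static const std::regex pattern("^[0-9]{10}");
    return std::regex_search(haystack, pattern);
}
```

With this helper, "1234567890" should match and "123456789" should not, which is what the Boost lines and the 12.04 output report.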


Mark Ebbers (m-ebbers) wrote :
Changed in pcre3 (Ubuntu):
status: New → Confirmed
Changed in poco (Ubuntu):
status: New → Confirmed

Output in 14.04 (13.10 is EOL)
Poco 0x01030600 on Linux 3.13.0-34-generic @ x86_64
Poco match 1234567890 to pattern ^[0-9]{10} matches? no
Poco match 123456789 to pattern ^[0-9]{10} matches? no
Boost match 1234567890 to pattern ^[0-9]{10} matches? yes
Boost match 123456789 to pattern ^[0-9]{10} matches? no
PCRE 8.31 2012-07-06
PCRE match 1234567890 to pattern ^[0-9]{10} matches? yes
PRCE match 123456789 to pattern ^[0-9]{10} matches? yes

Changed in poco (Ubuntu):
status: Confirmed → Invalid
Changed in poco (Ubuntu):
status: Invalid → Confirmed
Changed in poco (Ubuntu):
status: Confirmed → Invalid
no longer affects: collada-dom2.4-dp (Ubuntu)
Changed in poco (Ubuntu):
status: Invalid → Confirmed
Yoshiki Kanemoto (yocchiman) wrote :

In Ubuntu 14.04, libPocoFoundation.so exports some of pcre's global variables.
I found that an executable which links against both libPocoFoundation.so and libpcre.so gets wrong results from the pcre functions.

nm -D --defined-only /usr/lib/libPocoFoundation.so | grep pcre
0000000000110800 R _pcre_OP_lengths
000000000010f2a0 R _pcre_ucd_records
000000000010d0a0 R _pcre_ucd_stage1
00000000001032a0 R _pcre_ucd_stage2
00000000001106e0 R _pcre_ucp_gentype
00000000001107e0 R _pcre_utf8_table1
00000000001107d8 R _pcre_utf8_table1_size
00000000001107c0 R _pcre_utf8_table2
00000000001107a0 R _pcre_utf8_table3
0000000000110760 R _pcre_utf8_table4
0000000000110120 R _pcre_utt
00000000001103e0 R _pcre_utt_names
0000000000110100 R _pcre_utt_size

Yoshiki Kanemoto (yocchiman) wrote :

LD_PRELOAD=/usr/lib/libPocoFoundation.so changes the result of the proof-of-concept (the Poco- and Boost-dependent code was removed from the proof-of-concept):

$ g++ regex.cc -lpcre
$ ./a.out
PCRE 8.31 2012-07-06
PCRE match 1234567890 to pattern ^[0-9]{10} matches? yes
PRCE match 123456789 to pattern ^[0-9]{10} matches? no
$ LD_PRELOAD=/usr/lib/libPocoFoundation.so ./a.out
PCRE 8.31 2012-07-06
PCRE match 1234567890 to pattern ^[0-9]{10} matches? yes
PRCE match 123456789 to pattern ^[0-9]{10} matches? yes
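
To see which loaded library actually supplies one of the duplicated symbols in a given process, something like the sketch below might help. It is only an illustration: providerOf is a made-up name, it relies on the glibc functions dlsym and dladdr, and on older glibc versions (including the one in 14.04) it has to be linked with -ldl.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // needed for RTLD_DEFAULT and Dl_info on glibc
#endif
#include <dlfcn.h>
#include <string>

// Hypothetical diagnostic helper. Looks `symbol` up in the global lookup
// scope, the same order the dynamic linker uses for the executable (so
// LD_PRELOAD and link order both influence it), and returns the path of
// the object that provides it. Returns an empty string if nothing is found.
std::string providerOf(const char* symbol) {
    void* address = dlsym(RTLD_DEFAULT, symbol);
    if (address == nullptr) {
        return std::string();
    }
    Dl_info info;
    if (dladdr(address, &info) == 0 || info.dli_fname == nullptr) {
        return std::string();
    }
    return info.dli_fname;
}
```

Calling providerOf("_pcre_OP_lengths") from the test program, once normally and once under LD_PRELOAD, should show whether the table comes from libpcre.so.3 or from libPocoFoundation.so.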

Changed in pcre3 (Ubuntu):
status: Confirmed → Invalid
Mark Ebbers (m-ebbers) wrote :

$ ./a.out
PCRE 8.31 2012-07-06
PCRE match 1234567890 to pattern ^[0-9]{10} matches? yes
PRCE match 123456789 to pattern ^[0-9]{10} matches? no
$ LD_PRELOAD=/usr/lib/libPocoFoundation.so ./a.out
PCRE 8.31 2012-07-06
PCRE match 1234567890 to pattern ^[0-9]{10} matches? yes
PRCE match 123456789 to pattern ^[0-9]{10} matches? yes

We migrated from 12.04 x86 to 14.04 x86_64 and still have this issue. Why is the issue marked Invalid?

Arne de Bruijn (arbruijn) wrote :

When I compiled your program I got errors about duplicate pcre types. I've resolved the errors by adding a pcrepcre namespace around the pcre include/function. It gives me the correct output:

Poco POCO_VERSION on Linux 3.13.0-24-generic @ x86_64
Poco match 1234567890 to pattern ^[0-9]{10} matches? yes
Poco match 123456789 to pattern ^[0-9]{10} matches? no
Boost match 1234567890 to pattern ^[0-9]{10} matches? yes
Boost match 123456789 to pattern ^[0-9]{10} matches? no
PCRE 8.31 2012-07-06
PCRE match 1234567890 to pattern ^[0-9]{10} matches? yes
PRCE match 123456789 to pattern ^[0-9]{10} matches? no

Maybe that helps?

Otherwise it seems Poco is simply not designed to be combined with a different (system) pcre, and you were just lucky it worked on Ubuntu 12.04.

Mark Ebbers (m-ebbers) wrote :

Arne, I compiled your code, got no warnings and got a different output!

Poco 0x01030600 on Linux 4.2.0-35-generic @ x86_64
Poco match 1234567890 to pattern ^[0-9]{10} matches? no
Poco match 123456789 to pattern ^[0-9]{10} matches? no
Boost match 1234567890 to pattern ^[0-9]{10} matches? yes
Boost match 123456789 to pattern ^[0-9]{10} matches? no
PCRE 8.31 2012-07-06
PCRE match 1234567890 to pattern ^[0-9]{10} matches? yes
PRCE match 123456789 to pattern ^[0-9]{10} matches? yes

System info
 linux-vdso.so.1 => (0x00007ffc9efc4000)
 libboost_regex.so.1.54.0 => /usr/lib/x86_64-linux-gnu/libboost_regex.so.1.54.0 (0x00007f87ad124000)
 libPocoFoundation.so.9 => /usr/lib/libPocoFoundation.so.9 (0x00007f87acdd8000)
 libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007f87acb99000)
 libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f87ac895000)
 libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f87ac67f000)
 libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f87ac2b9000)
 libicuuc.so.52 => /usr/lib/x86_64-linux-gnu/libicuuc.so.52 (0x00007f87abf40000)
 libicui18n.so.52 => /usr/lib/x86_64-linux-gnu/libicui18n.so.52 (0x00007f87abb39000)
 libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f87ab91a000)
 libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f87ab701000)
 libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f87ab4fd000)
 librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f87ab2f4000)
 libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f87aafee000)
 /lib64/ld-linux-x86-64.so.2 (0x00005582e37fc000)
 libicudata.so.52 => /usr/lib/x86_64-linux-gnu/libicudata.so.52 (0x00007f87a9780000)

Linux ws004 4.2.0-35-generic #40~14.04.1-Ubuntu SMP Fri Mar 18 16:37:35 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3)

Mark Ebbers (m-ebbers) wrote :

Additional info: when I change the linking order of pcre and poco, I get different output:

$ g++ regex.cc -lPocoFoundation -lpcre
$ ./a.out
Poco 0x01030600 on Linux 4.2.0-35-generic @ x86_64
Poco match 1234567890 to pattern ^[0-9]{10} matches? no
Poco match 123456789 to pattern ^[0-9]{10} matches? no
PCRE 8.31 2012-07-06
PCRE match 1234567890 to pattern ^[0-9]{10} matches? yes
PRCE match 123456789 to pattern ^[0-9]{10} matches? yes
$ g++ regex.cc -lpcre -lPocoFoundation
$ ./a.out
Poco 0x01030600 on Linux 4.2.0-35-generic @ x86_64
Poco match 1234567890 to pattern ^[0-9]{10} matches? yes
Poco match 123456789 to pattern ^[0-9]{10} matches? no
PCRE 8.31 2012-07-06
PCRE match 1234567890 to pattern ^[0-9]{10} matches? yes
PRCE match 123456789 to pattern ^[0-9]{10} matches? no

Our system makes heavy use of Poco's StringTokenizer and Regex classes, which are not working correctly on 14.04, and I'm most interested in a solution to get them working again on 14.04 :)

I'm not familiar with LD_PRELOAD and also do not fully understand replies like 'Yoshiki Kanemoto (yocchiman) wrote on 2015-05-01', so if anyone could tell me what the best approach here would be, I would be very grateful.

Arne de Bruijn (arbruijn) wrote :

I was afraid so.

You could try using the pcre.h header from the pcre embedded in Poco (apt-get source libPoco-dev, poco-1.3.6p1/Foundation/src/pcre.h), which I've attached as pocopcre.h, and not linking to the system pcre (just g++ regex.cc -lPocoFoundation).

That seems to work for me:
Poco POCO_VERSION on Linux 3.13.0-24-generic @ x86_64
Poco match 1234567890 to pattern ^[0-9]{10} matches? yes
Poco match 123456789 to pattern ^[0-9]{10} matches? no
PCRE 7.8 2008-09-05
PCRE match 1234567890 to pattern ^[0-9]{10} matches? yes
PRCE match 123456789 to pattern ^[0-9]{10} matches? no

If you need a more recent pcre, you need to change Poco somehow so that it does not expose the pcre functions, for example by renaming them, adding a namespace or using __attribute__ ((visibility ("hidden"))). Maybe the Poco developers can help you with that.
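
As a rough illustration of the namespace and visibility options, here is a minimal sketch. None of these names come from Poco or pcre; bundled_regex, countLeadingDigits and startsWithDigits are invented for the example, and the attribute is the GCC/Clang one mentioned above.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// GCC/Clang symbol visibility attributes (hypothetical macro names).
#define BUNDLED_HIDDEN __attribute__((visibility("hidden")))
#define BUNDLED_EXPORT __attribute__((visibility("default")))

namespace bundled_regex {
// Internal helper. When this is built into a shared library, the hidden
// attribute keeps it out of the dynamic symbol table, so it can neither
// override nor be overridden by a same-named symbol in another library.
BUNDLED_HIDDEN std::size_t countLeadingDigits(const std::string& text) {
    std::size_t n = 0;
    while (n < text.size() &&
           std::isdigit(static_cast<unsigned char>(text[n]))) {
        ++n;
    }
    return n;
}
}  // namespace bundled_regex

// Public entry point. This is the only symbol meant to be exported.
BUNDLED_EXPORT bool startsWithDigits(const std::string& text, std::size_t count) {
    return bundled_regex::countLeadingDigits(text) >= count;
}
```

If this were compiled into a shared library, nm -D --defined-only on it should list startsWithDigits but not countLeadingDigits. The same check applied to libPocoFoundation.so is what showed the _pcre_ symbols earlier in this report.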

Mark Ebbers (m-ebbers) wrote :

Arne,

Thank you for your quick reply, but I don't think this solves my problem. It only solves it for the proof-of-concept, which I made to demonstrate the bug in Poco's regex functionality.

Our system does not even link against libPCRE but uses Poco. If I change the proof-of-concept to a Poco-only solution (see attachment), I still have the problem of a non-functioning regex.

$ g++ regex.cc -lPocoFoundation
$ ./a.out
Poco 0x01030600 on Linux 4.2.0-35-generic @ x86_64
Poco match 1234567890 to pattern ^[0-9]{10} matches? no
Poco match 123456789 to pattern ^[0-9]{10} matches? no
$ ldd ./a.out
 linux-vdso.so.1 => (0x00007ffe94177000)
 libPocoFoundation.so.9 => /usr/lib/libPocoFoundation.so.9 (0x00007f14dd744000)
 libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f14dd440000)
 libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f14dd229000)
 libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f14dce64000)
 libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f14dcc46000)
 libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007f14dca07000)
 libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f14dc7ee000)
 libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f14dc5ea000)
 librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f14dc3e1000)
 libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f14dc0db000)
 /lib64/ld-linux-x86-64.so.2 (0x000055cbc9c93000)

$ LD_PRELOAD=/lib/x86_64-linux-gnu/libpcre.so.3 ./a.out
Poco 0x01030600 on Linux 4.2.0-35-generic @ x86_64
Poco match 1234567890 to pattern ^[0-9]{10} matches? yes
Poco match 123456789 to pattern ^[0-9]{10} matches? no
$ LD_PRELOAD=/lib/x86_64-linux-gnu/libpcre.so.3 ldd ./a.out
 linux-vdso.so.1 => (0x00007ffed2c4f000)
 /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007f34efa89000)
 libPocoFoundation.so.9 => /usr/lib/libPocoFoundation.so.9 (0x00007f34ef722000)
 libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f34ef41d000)
 libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f34ef207000)
 libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f34eee42000)
 libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f34eec23000)
 libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f34eea0a000)
 libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f34ee806000)
 librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f34ee5fd000)
 libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f34ee2f7000)
 /lib64/ld-linux-x86-64.so.2 (0x0000564cbde41000)

Mark Ebbers (m-ebbers) wrote :

I also opened an issue at the Poco project: https://github.com/pocoproject/poco/issues/1284

tags: added: upgrade-software-version
Changed in poco (Ubuntu):
status: Confirmed → Fix Released