Pattern matching problem in ebook-editor

Bug #1652621 reported by leastcommonancestor
This bug affects 1 person
Affects   Status         Importance   Assigned to   Milestone
calibre   Fix Released   Undecided    Unassigned

Bug Description

There appears to be a problem with pattern matching, which I first noticed with Calibre 2.75.1. It occurs with patterns in which a first part can match the empty string and a second part matches at least 2 characters, e.g. "a?yz".

Given the string "xxxxxxxyz" the resulting match is not "yz" but "xxxxxxx".
The pattern "a?y" works fine and finds "y" as one would expect.

I checked the pattern/string combination with the Python 2.7.12 interpreter and there was no error; however, the Python version used by Calibre is 2.7.9.
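
For reference, the interpreter check can be sketched roughly as follows. This is a minimal sketch that assumes the standard-library "re" module, which is not the regex engine the editor uses, so it only shows the result I would expect. The variable names are purely illustrative.

```python
import re

# Pattern and string from the report above.
pattern = r"a?yz"
text = "xxxxxxxyz"

# The optional "a" matches the empty string, so the match should be "yz".
match = re.search(pattern, text)
print(match.group())  # yz

# The shorter pattern, which also works in the editor.
short_match = re.search(r"a?y", text)
print(short_match.group())  # y
```

In the editor, the first search selects "xxxxxxx" instead of "yz".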

Reverting to Calibre 2.74.0 did not solve the problem.

Trying to replace the wrong search result with the empty string produced:

calibre, version 2.74.0
ERROR: Unhandled exception: <b>MemoryError</b>:

calibre 2.74 embedded-python: True is64bit: True
Linux-4.4.0-57-generic-x86_64-with-debian-stretch-sid Linux ('64bit', 'ELF')
('Linux', '4.4.0-57-generic', '#78-Ubuntu SMP Fri Dec 9 23:50:32 UTC 2016')
Python 2.7.9
Linux: ('debian', 'stretch/sid', '')
[…]
Traceback (most recent call last):
  File "site-packages/calibre/gui2/tweak_book/boss.py", line 946, in search
  File "site-packages/calibre/gui2/tweak_book/search.py", line 1441, in run_search
  File "site-packages/calibre/gui2/tweak_book/search.py", line 1365, in do_replace
  File "site-packages/calibre/gui2/tweak_book/editor/widget.py", line 303, in replace
  File "site-packages/calibre/gui2/tweak_book/editor/text.py", line 494, in replace
MemoryError

I'm running 64-bit Calibre 2.75.1 on Ubuntu 16.04.

When I used a similar pattern with "Replace all" on a moderately sized ebook, I had to kill calibre (losing some edits) after memory usage rose above 7 GB and the system started swapping.

Kovid Goyal (kovid) wrote : Re: calibre bug 1652621

This is caused by a bug in the regex engine the editor uses: https://bitbucket.org/mrabarnett/mrab-regex/issues/227/incorrect-behavior-for-operator-with

You will have to wait for a fix for it.

 status invalid

Changed in calibre:
status: New → Invalid
Kovid Goyal (kovid) wrote : Fixed in master

Fixed in branch master. The fix will be in the next release. calibre is usually released every Friday.

 status fixreleased

Changed in calibre:
status: Invalid → Fix Released