regex does not properly select match
| Affects | Status | Importance | Assigned to | Milestone | |
|---|---|---|---|---|---|
| calibre | Fix Released | Undecided | Unassigned | | |
Bug Description
The text selection when searching in an HTML file does not seem to be correct.
This is a minimal reproducible example:
```
ππππππ
<p> test</p>
```
with the regex:
```
test(?=<\/p>)
```
I verified it on `https:/
I really don't know _what_ it is doing exactly: the selection depends on where the search starts, and it also differs depending on whether `up` or `down` is set as the search direction.
Sometimes it _does_ give the correct match which makes it even more confusing.
It does **not** exhibit this behaviour with "normal" characters (e.g. "a", "b", ...); it only breaks with "weird" characters (e.g. "π", "π", ...).
Version: 7.16
OS:
Edition Windows 10 Home
Version 22H2
Installed on 28/12/2023
OS build 19045.4651
Experience Windows Feature Experience Pack 1000.19060.1000.0

It looks like this happens whenever the regex search passes over a multi-byte (non-single-byte) character while traversing the file.
So with search direction as "down" and `|` being the cursor position:
```
ππππππ |
<p> test</p>
```
with the same search will correctly match `test`, while:
```
|ππ
<p> test</p>
```
will match "> te", i.e. the selection is shifted two positions to the left (probably because of the 2 extra bytes).
Also note:
The search was in "regex" mode and not case sensitive; wrap was on and "dot all" was off.
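
The shift described above can be reproduced outside calibre. The following is only a sketch of the suspected cause, not calibre's actual code: it assumes the match offsets are counted in characters (code points) but are then applied to a buffer indexed in a different unit, here UTF-8 bytes. The function names `naive_selection` and `converted_selection` are made up for illustration, and the real mismatch in the editor may involve a different encoding.

```python
import re

# Same pattern as in the report; case-insensitive, "dot all" off.
PATTERN = re.compile(r"test(?=<\/p>)", re.IGNORECASE)


def naive_selection(text: str) -> bytes:
    """Apply code-point offsets directly to a UTF-8 buffer (the suspected bug)."""
    start, end = PATTERN.search(text).span()  # offsets in code points
    return text.encode("utf-8")[start:end]    # ...misused as byte offsets


def converted_selection(text: str) -> bytes:
    """Convert code-point offsets to byte offsets before slicing."""
    start, end = PATTERN.search(text).span()
    byte_start = len(text[:start].encode("utf-8"))
    byte_end = len(text[:end].encode("utf-8"))
    return text.encode("utf-8")[byte_start:byte_end]


ascii_doc = "aa\n<p> test</p>"
# "\u03c0" is the Greek letter pi, which takes 2 bytes in UTF-8.
pi_doc = "\u03c0\u03c0\n<p> test</p>"

print(naive_selection(ascii_doc))     # b'test'  (offsets agree for ASCII)
print(naive_selection(pi_doc))        # b'> te'  (2 characters, 2 extra bytes)
print(converted_selection(pi_doc))    # b'test'
```

With only ASCII before the match, both offset systems agree, which would explain why "normal" characters work. Each 2-byte character before the match moves the naive selection one position to the left, which matches the "> te" result when two such characters precede the cursor.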