[Upstream] Regular Expression Search for circumflex by itself does not find beginning of a paragraph

Bug #465309 reported by jimav
This bug affects 1 person
Affects                  Status     Importance  Assigned to  Milestone
LibreOffice              Invalid    Wishlist
OpenOffice               New        Unknown
libreoffice (Ubuntu)     Confirmed  Medium      Unassigned
openoffice.org (Ubuntu)  Won't Fix  Undecided   Unassigned

Bug Description

Binary package hint: openoffice.org

1) lsb_release -rd
Description: Ubuntu 12.04 LTS
Release: 12.04

2) apt-cache policy libreoffice-calc
libreoffice-calc:
  Installed: 1:3.5.3-0ubuntu1
  Candidate: 1:3.5.3-0ubuntu1
  Version table:
 *** 1:3.5.3-0ubuntu1 0
        500 http://us.archive.ubuntu.com/ubuntu/ precise-updates/main i386 Packages
        100 /var/lib/dpkg/status
     1:3.5.2-2ubuntu1 0
        500 http://us.archive.ubuntu.com/ubuntu/ precise/main i386 Packages

3) Expected behavior: in Writer, Calc, or the Macro Editor, open the Find & Replace window, check the Regular Expression checkbox, and enter a circumflex (^) in the Search for drop-down; the beginning of every paragraph should be found. The LO Wiki and the built-in LO help imply that a circumflex by itself in the Search for field matches the beginning of a paragraph:
http://help.libreoffice.org/Common/List_of_Regular_Expressions

4) What happens instead is that nothing is found.

WORKAROUND: Use Notepad++ 6.1.2 via WINE:
http://notepad-plus-plus.org/download/v6.1.2.html

apt-cache policy wine1.5
wine1.5:
  Installed: 1.5.4-0ubuntu1~ppa1~precise1+pulse17
  Candidate: 1.5.4-0ubuntu1~ppa1~precise1+pulse17
  Version table:
 *** 1.5.4-0ubuntu1~ppa1~precise1+pulse17 0
        500 http://ppa.launchpad.net/ubuntu-wine/ppa/ubuntu/ precise/main i386 Packages
        100 /var/lib/dpkg/status

ProblemType: Bug
Architecture: amd64
Date: Fri Oct 30 11:36:03 2009
DistroRelease: Ubuntu 9.10
Package: openoffice.org-core 1:3.1.1-4ubuntu2 [modified: var/lib/openoffice/basis3.1/program/services.rdb]
ProcEnviron:
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.31-12.41-generic
SourcePackage: openoffice.org
Uname: Linux 2.6.31-12-generic x86_64

Revision history for this message
jimav (james-avera) wrote :
Revision history for this message
WeatherGod (ben-v-root) wrote :

I can confirm that '^' does not work in OO 3.1.1 on Fedora 11, but '$' does. This should probably be directed upstream as well.

Changed in openoffice.org (Ubuntu):
status: New → Confirmed
Revision history for this message
WeatherGod (ben-v-root) wrote :

Finally upstreamed it.

Changed in openoffice:
importance: Undecided → Unknown
status: New → Unknown
Changed in openoffice:
status: Unknown → New
Chris Cheney (ccheney)
tags: added: karmic
jimav (james-avera)
summary: - regular expression ^ by itself does not work
+ regular expression ^ by itself does not work in Find & Replace of Basic
+ code
description: updated
Jack Leigh (leighman)
Changed in libreoffice (Ubuntu):
status: New → Confirmed
status: Confirmed → Triaged
Changed in openoffice.org (Ubuntu):
status: Confirmed → Triaged
Revision history for this message
papukaija (papukaija) wrote : Re: regular expression ^ by itself does not work in Find & Replace of Basic code

This bug is related to bug 669849.

Changed in openoffice.org (Ubuntu):
status: Triaged → Won't Fix
Revision history for this message
Björn Michaelsen (bjoern-michaelsen) wrote : migrating packaging from OpenOffice.org to Libreoffice

[This is an automated message.]
There are no new official OpenOffice.org releases in Ubuntu packaging anymore => Won't Fix

If the problem persists, please mark this bug as "also affects project Libreoffice" or "also affects distribution Libreoffice (Ubuntu)" if that has not happened already.

Please leave references to upstream OpenOffice.org bugs in place to allow cross pollination.

penalvch (penalvch)
summary: - regular expression ^ by itself does not work in Find & Replace of Basic
- code
+ Regular Expression Search for cirumflex by itself does find beginning of
+ a paragraph
Revision history for this message
penalvch (penalvch) wrote : Re: Regular Expression Search for cirumflex by itself does not find beginning of a paragraph

jimav, thank you for taking the time to report this bug and helping to make Ubuntu better. The issue you are reporting is an upstream one and it would be nice if somebody having it could send the bug to the developers of the software by following the instructions at http://wiki.documentfoundation.org/BugReport . If you have done so, please tell us the number of the upstream bug (or the link), so we can add a bugwatch that will inform us about the status. Thanks in advance.

description: updated
tags: added: i386 precise
summary: - Regular Expression Search for cirumflex by itself does find beginning of
- a paragraph
+ Regular Expression Search for cirumflex by itself does not find
+ beginning of a paragraph
Changed in df-libreoffice:
status: New → Incomplete
Changed in libreoffice (Ubuntu):
importance: Undecided → Medium
Bob Bib (bobbib)
summary: - Regular Expression Search for cirumflex by itself does not find
+ Regular Expression Search for circumflex by itself does not find
beginning of a paragraph
Revision history for this message
In , jimav (james-avera) wrote :

Expected behavior: in Writer, Calc, or the Macro Editor, open the Find & Replace window, check the Regular Expression checkbox, and enter a circumflex (^) in the Search for drop-down; the beginning of every paragraph should be found. The LO Wiki and the built-in LO help imply that a circumflex by itself in the Search for field matches the beginning of a paragraph:
http://help.libreoffice.org/Common/List_of_Regular_Expressions

What happens instead is that nothing is found.

NOTE: A dollar sign ($) by itself *does* work as expected, i.e., it matches the end of each line.

Revision history for this message
jimav (james-avera) wrote : Re: Regular Expression Search for circumflex by itself does not find beginning of a paragraph
penalvch (penalvch)
Changed in df-libreoffice:
importance: Undecided → Unknown
status: Incomplete → Unknown
summary: - Regular Expression Search for circumflex by itself does not find
- beginning of a paragraph
+ [Upstream] Regular Expression Search for circumflex by itself does not
+ find beginning of a paragraph
Changed in df-libreoffice:
importance: Unknown → Medium
status: Unknown → Confirmed
Revision history for this message
In , Cno (cno) wrote :

Hi Jim,

Please use "^." (without the quotes) to find the first character of a paragraph.
I think ^ is only used in combination with other characters.
See some examples and explanation in the help.

Regards,
Cor

Revision history for this message
In , jimav (james-avera) wrote :

No. ^. is not equivalent. ^. means to match the first character on the line, and if doing a replace then the first character would be deleted. ^ by itself matches the start of the line (not including any characters), and replacing it with something effectively inserts the "replacement" text at the start of the line. You could use something ugly like replacing ^(.) with ${1}PREFIX to avoid deleting the first character, but that would fail on blank lines which don't have any characters in them.

In any case, ^ (by itself) is a standard, well-defined regular expression syntax used everywhere else (Perl, Python, vim, etc.) and LibreOffice should not do something incompatible.
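
For illustration, the difference between ^ and ^. can be reproduced outside LibreOffice. The following is a minimal sketch using Python's re module, which is not the engine LibreOffice uses; the sample text and the "> " prefix are made up for the example, and lines stand in for paragraphs.

```python
import re

text = "first\n\nthird"  # three lines; the middle one is empty

# '^' alone is zero-width: nothing is consumed, so the replacement is inserted.
inserted = re.sub(r"^", "> ", text, flags=re.MULTILINE)

# '^.' consumes the first character, so a plain replacement deletes it,
# and the empty line is skipped because '.' has nothing to match there.
clobbered = re.sub(r"^.", "> ", text, flags=re.MULTILINE)

# Capturing the character preserves it, but the empty line is still skipped.
captured = re.sub(r"^(.)", r"> \1", text, flags=re.MULTILINE)

print(repr(inserted))   # '> first\n> \n> third'
print(repr(clobbered))  # '> irst\n\n> hird'
print(repr(captured))   # '> first\n\n> third'
```

Only the bare ^ pattern prefixes all three lines, which is the behavior this report asks for.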

Revision history for this message
In , jimav (james-avera) wrote :

If you are unsure how regular expression syntax should work (in industry-wide practice), there are many books and online references, for example

http://en.wikipedia.org/wiki/Regular_expression#POSIX_Basic_Regular_Expressions

Revision history for this message
In , Cno (cno) wrote :

Hi Jim,

OK, sorry & thanks for the explanation. (In the meantime I understood that the same applies to $, which cannot be used by itself to find the end of a paragraph.)
Did it ever work as expected, or is it something that has yet to be implemented?
In that case, this would be an enhancement...

Changed in df-libreoffice:
importance: Medium → Wishlist
Revision history for this message
In , jimav (james-avera) wrote :

AFAIK ^ has never worked correctly. I doubt anyone intentionally made Open Office regular expressions incompatible with industry practice, so I think this is a bug, not a missing feature.

-Jim

Revision history for this message
In , jimav (james-avera) wrote :

Incidentally $ does match the end of paragraphs (as documented), but seems to match the paragraph break (not just the -position- at the end of the paragraph), so paragraphs are merged forming a single new paragraph. Except only one of a group of successive empty paragraphs is matched.

Matching the para-break itself seems odd to me (as usually unhelpful), but might be intentional. However the fact that only some empty paragraphs are matched is almost certainly a bug.

EXAMPLE: In the following 1-line paragraphs, there are two empty paras between b and c (<P> indicates the paragraph symbol which is shown when displaying non-printing characters):
a<P>
b<P>
<P>
<P>
c<P>
Find-and-replace of $ with X replaces the 5 paragraphs with 2 paragraphs:
aXbX<P>
Xc<P>
As you can see, the 5 paragraphs were collapsed into two paragraphs, except the "paragraph break" was not removed for one of the empty paragraphs.
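
For comparison, here is a minimal sketch of what a conventional engine does with the same input. It uses Python's re module and models each paragraph as a list element, which is an assumption of this illustration and not a description of how LibreOffice stores or searches text.

```python
import re

# The five one-line paragraphs from the example; "" is an empty paragraph.
paragraphs = ["a", "b", "", "", "c"]

# '$' is zero-width: it matches the position at the end of each paragraph,
# so "X" is appended and no paragraph break is consumed.
result = [re.sub(r"$", "X", p) for p in paragraphs]

print(result)  # ['aX', 'bX', 'X', 'X', 'cX']
```

In this model all five paragraphs survive and both empty paragraphs are matched, unlike the merged two-paragraph result described above.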

Revision history for this message
In , jimav (james-avera) wrote :

Any thoughts about fixing this? It's still a problem in 4.3-alpha1

Note that searching for ^. is not a work-around because it will not match the start of empty paragraphs (the "." does not match). So if you want to prepend something to every paragraph in a selection which includes empty paragraphs, then ^ alone is necessary.

Revision history for this message
In , Cno (cno) wrote :

Isn't your case just covered by using
 & in search and
 \nFOO in replace?

For me that works in Writer

Revision history for this message
In , jimav (james-avera) wrote :

> Isn't your case just covered by using
> & in search and
> \nFOO in replace?

Maybe that was a typo. The above does not work (does nothing--not matched).
Can you suggest a work-around which inserts some text at the start of every line in Calc's Basic macro editor (including empty lines)? That's the problem this bug was originally about and which *should* be easy by replacing ^ with the desired text. That is standard regex behavior everywhere else in computerdom.

^ on its own should work (just like $ on its own does).

Revision history for this message
In , Cno (cno) wrote :

(In reply to comment #9)

> Can you suggest a work-around which inserts some text at the start of every
> line in Calc's Basic macro editor (including empty lines)?

The component of this issue is Writer .. ?

Revision history for this message
In , jimav (james-avera) wrote :

Not sure where the regex code is. It manifests in Writer and in the Basic macro editor in Calc.

Revision history for this message
In , Cno (cno) wrote :

Still a problem in 4.4.0alpha1
 > New

Revision history for this message
In , jimav (james-avera) wrote :

Maybe Component should be changed to Spreadsheet, because the problem is more simply visible when editing Basic macro code. It is common to want to insert spaces at the start of every line in a range (e.g. to "indent" the code one level), and replacing ^ with spaces does not work.
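
As a sketch of the indentation use case, the snippet below indents a range of lines by replacing ^ with spaces. It is written in Python rather than in the Basic editor's Find & Replace; the helper name indent_range, its parameters, and the sample macro text are hypothetical and exist only for this illustration.

```python
import re

def indent_range(code: str, first: int, last: int, prefix: str = "    ") -> str:
    """Insert prefix at the start of lines first..last (0-based, inclusive)."""
    lines = code.split("\n")
    for i in range(first, last + 1):
        # '^' is zero-width, so this also works when lines[i] is empty.
        lines[i] = re.sub(r"^", prefix, lines[i])
    return "\n".join(lines)

basic_code = 'Sub Demo\nx = 1\n\nMsgBox x\nEnd Sub'

# Indent the body of the Sub (lines 1 to 3), including the empty line.
print(indent_range(basic_code, 1, 3))
```

The empty line inside the range receives the prefix as well, which a search for ^. would not achieve.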

Revision history for this message
In , Gordon1drake (gordon1drake) wrote :

To add text to the beginning of every paragraph you can do it in two passes. The first pass finds the start of each paragraph plus its first character and replaces them with the desired text followed by that first character. The second pass finds empty paragraphs and replaces each with the desired text and a paragraph break.

Search For: ^.
Replace With: <text>&

Search For: ^$
Replace With: <text>\n

I don't know if LO has anything for the start of a line, whether it is the beginning of a paragraph or a line that has been word-wrapped.

Windows Vista 64
Version: 4.4.3.2
Build ID: 88805f81e9fe61362df02b9941de8e38a9b5fd16

Revision history for this message
In , jimav (james-avera) wrote :

> Search For: ^. etc.

No, that does not work, as explained in comment #2 (empty lines have nothing for the "." to match).

The ^ alone is supposed to match the start (but doesn't).

Revision history for this message
In , Gordon1drake (gordon1drake) wrote :

You have four paragraphs like this:
this works

this works
this works

First, run this:
Search For: ^.
Replace With: yes &

Then, run this:
Search For: ^$
Replace With: this also works\n

The result looks like this:
yes this works
this also works
yes this works
yes this works
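
The two passes can be mimicked in Python as a rough sketch. Paragraphs are modelled as list elements (an assumption of this illustration), so the \n from the second Replace With field is left out, and \g<0> plays the role that & plays in LibreOffice's Replace With field.

```python
import re

# The four paragraphs from the example; "" is the empty paragraph.
paragraphs = ["this works", "", "this works", "this works"]

# Pass 1: '^.' matches the first character of each non-empty paragraph;
# \g<0> re-inserts the matched character after the new text.
pass1 = [re.sub(r"^.", r"yes \g<0>", p) for p in paragraphs]

# Pass 2: '^$' matches only the paragraphs that are still empty.
pass2 = [re.sub(r"^$", "this also works", p) for p in pass1]

print("\n".join(pass2))
```

This prints the same four lines as the result shown above. The order of the passes matters: running the ^$ pass first would let the ^. pass prefix the newly filled paragraph too.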

Revision history for this message
In , jimav (james-avera) wrote :

Ok, I see what you are doing. That two-step procedure will work (but should not be needed).

Thanks for pointing it out.

Revision history for this message
Marcus Tomlinson (marcustomlinson) wrote :

This release of Ubuntu is no longer receiving maintenance updates. If this is still an issue on a maintained version of Ubuntu please let us know.

Changed in libreoffice (Ubuntu):
status: Triaged → Incomplete
Changed in df-libreoffice:
importance: Wishlist → Unknown
status: Confirmed → Unknown
Changed in df-libreoffice:
importance: Unknown → Wishlist
status: Unknown → Confirmed
Revision history for this message
Marcus Tomlinson (marcustomlinson) wrote :

Synchronising bug status with upstream.

Changed in libreoffice (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
In , Michael-warner-ut+libreoffice (michael-warner-ut+libreoffice) wrote :

*** This bug has been marked as a duplicate of bug 135538 ***

Changed in df-libreoffice:
status: Confirmed → Invalid