LibreOffice Writer Document Compare (regression)

Bug #782406 reported by Gretha on 2011-05-13
This bug affects 4 people
Affects: libreoffice (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

Binary package hint: libreoffice

Comparing two files has reverted to the old OOo behaviour of flagging entire long sequences of paragraphs as different, when in fact they are identical except perhaps for one or two minor typographical differences.

Gretha (g-r-e-cramer) wrote :

Nearly six months down the road, and this issue is still listed as "Status: New", receiving zero attention.

Yet, this is perhaps the most important drawback of LibreOffice in practice in a professional production environment, where documents are edited and updated on a routine basis and where document integrity is of paramount importance.

Moreover, as improved document comparison featured in the latest versions of OOo before migration to LibreOffice, why not at least simply reinstate that?

Gretha, thank you for taking the time to report this and helping to make Ubuntu better. Please execute the following command, as it will automatically gather debugging information, in a terminal:
apport-collect 782406
When reporting bugs in the future please use apport by using 'ubuntu-bug' and the name of the package affected. You can learn more about this functionality at https://wiki.ubuntu.com/ReportingBugs.

In order to compare documents one would click Edit -> Compare Document...

Could you also please post 2 documents that, when you compare them, demonstrate the comparison problem?

Changed in libreoffice (Ubuntu):
status: New → Incomplete
Gretha (g-r-e-cramer) wrote :

Dear Christopher, I am attaching 3 documents which together illustrate the comparison problem.

Version-1 is the original as submitted by the "author" to the "editor".
Version-2 is the final version after the editor has made some minor changes.
Version-1-to-2 is the result of LibreOffice Document Compare.

The changes made by the editor are as follows:

(1) one textual change ("so at" > "in order")
(2) two formatting changes ("essential" in bold and "refine it" in italics)
(3) split one long paragraph in three places for better readability

The editor has been a loyal user of OOo for over a decade. During that time, she has seen ample effort spent on changing the colour of the icons, while the totally dysfunctional and unhelpful Document Compare functionality has been left in the Stone Age. Irked by this, she also completely rewrites the footnote of the original version-1.

What help does Document Compare now offer her in identifying the changes from version-1 to version-2?

First, a whole multicolour flap of text is identified by LibreOffice as having been altered. The main body of text (excluding the footnote) now comprises 374 words. Of these, LibreOffice marks 261 (70%) as changed (inserted / deleted). Yet, she changed only two words (0.5%), inserted three paragraph splits (0.8%), and made two formatting changes (0.5%).
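
The percentages quoted above can be checked with a few lines of arithmetic. This is only a sketch using the figures from the comment; the variable names are illustrative, and the total is taken as the sum of the rounded shares, which is how the 1.8% quoted later in this report appears to have been derived.

```python
# Figures quoted in the comment above (main body of text, footnote excluded).
total_words = 374
marked_changed = 261
actual_edits = {
    "words changed": 2,
    "paragraph splits": 3,
    "formatting changes": 2,
}


def pct(count, total=total_words):
    """Share of the main text in percent, rounded to one decimal place."""
    return round(100 * count / total, 1)


flagged = pct(marked_changed)
shares = {name: pct(count) for name, count in actual_edits.items()}
# Assumption: the overall figure is the sum of the rounded shares.
actual_total = round(sum(shares.values()), 1)

print(f"flagged by Compare Document: {flagged}%")
for name, share in shares.items():
    print(f"{name}: {share}%")
print(f"actual edits in total: {actual_total}%")
```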

Second, the complete rewrite of the footnote, entirely changing both its meaning and its formatting, is totally ignored by Document Compare. LibreOffice does not flag any change whatsoever. Thereby it actually gives the totally false impression that, from version-1 to version-2, the footnote has not been touched at all.

Now imagine, she does not have a one-page document as here, but a 50-page report or a 300-page book.

What help does LibreOffice give a writer, a collaborative group of writers, or an editor when it comes to version control? They are simply swamped by massive swathes of text identified as having been altered, when in fact there might well be only an occasional change here and there.

And footnotes, so important for technical and other precise detail, are simply skipped - and worse, skipped without warning - by LibreOffice's Document Compare. (Perhaps the same holds true for other text, formatting or graphical elements as well, I have not checked.)

The version comparison algorithm of LibreOffice is just lazy, incompetent and misleading, and it simply offloads its task onto the human editor.

We might as well go back to our old Triumph and Olivetti typewriters of the 1970s. This is also why, after more than 10 years of OpenOffice and now LibreOffice, we still use Microsoft Office -- just for version control.

It would seem a challenge relished by any good trainee programmer to develop a competent file comparison algorithm for LibreOffice. At least, had I got the programming skills, I for one would love the challenge to solve it.

Gretha (g-r-e-cramer) wrote :
  • draft (34.3 KiB, application/vnd.oasis.opendocument.text)
Gretha (g-r-e-cramer) wrote :
  • final (31.3 KiB, application/vnd.oasis.opendocument.text)
Gretha (g-r-e-cramer) wrote :
Gretha (g-r-e-cramer) wrote :

Christopher, the terminal command apport-collect 782406 opens Opera which then displays a page saying "Invalid OpenID transaction".

Gretha, regarding your comments:
+ "Christopher, the terminal command apport-collect 782406 opens Opera which then displays a page saying "Invalid OpenID transaction"."

This may be due to Opera rejecting Launchpad cookies. If this is not the case please delete all cookies and other private data from Opera and try again. If this does not work please try making Firefox your default browser and then try it again.

Gretha, regarding your comments in the Bug Description:
+ "This ability to select Document Compare options (via Tools > Options...) has now been dropped again in LibreOffice 3.3.2..."

The same Document Compare options in OOo are still found in LO at Tools -> Options -> LibreOffice Writer -> Changes.

+ "Note: Document Comparison also still does not work at all in footnotes. Two documents with differences only in the text of footnotes are erroneously flagged as identical."

Quoting from http://help.libreoffice.org/Common/Compare_Document:
"When in Writer: The contents of footnotes, headers, frames and fields are ignored."

description: updated
Gretha (g-r-e-cramer) wrote :

Christopher, re #7 and #8 - terminal command apport-collect 782406

In Application Preferences, I changed the preferred browser from Opera to Firefox. The terminal command, however, ignores this, even after a full system restart, and stubbornly opens Opera. So I'm stuck there.

I then set Opera with the least possible restrictions for the Launchpad domain, including "Accept all cookies" (rather than just the normal "Accept only cookies from the site I visit"). On the terminal apport-collect 782406 instruction, Opera now opens a Launchpad login page. I enter the login details, and the page changes to one which headlines "Your Page Was Stale", with a link to the Launchpad home page. I click this link and log in on the Launchpad home page, and the "Your Page Was Stale" page reappears. Beats me.

Gretha (g-r-e-cramer) wrote :

Christopher, re #9

(1) "The same Document Compare options in OOo are still found in LO at Tools -> Options -> LibreOffice Writer -> Changes."

This is not quite correct. If I remember well, the ability to choose the more fine-grained comparison option was in what in LO now is Tools -> Options -> LibreOffice -> General. (I know it was a rather counter-intuitive place, and not on the Changes tab.) The option has, however, been dropped in LO since LO replaced OOo in Ubuntu.

Also, by the way, as you can see in the sample documents uploaded yesterday, the "Changed Attributes" selection in Tools -> Options -> LibreOffice Writer -> Changes does not work.

(2) You quote http://help.libreoffice.org/Common/Compare_Document:

I have also seen this, and it is simply stunning. It effectively tells you that you should never use footnotes, headers, frames or fields if there is even the slightest chance that you or someone else will ever edit the document and you then want to identify what has changed.

(3) Overall, the point is, as the 3 uploaded documents show by way of typical example, LO Writer Document Compare identifies 70% changed contents, when the actual changes are only 1.8% in total (0.5% in the text and 1.3% just formatting). And it does not even flag that it is unable to check your footnotes. It's pathetic for a top word processor.

This lack of even the most elementary decent document version comparison forces clients off Ubuntu back to Microsoft Windows with MS Office. I just get depressed when we've put another client on Ubuntu and they come back with this. For many, there is nothing more important than tight document version management and control - otherwise aircraft crash, oil platforms sink, telecomms satellites go dead, confidentiality cannot be guaranteed, legal compliance cannot be enforced, contracts end up in court, you name it.

Please push it hard with whoever pulls the strings on LO. Given the vast efforts spent on OOo and LO over the years, writing a decent fine-grained comparison algorithm must be trivial. (If you fail to convince the powers that be, though, then at least sincere thanks for trying!)

Gretha, let us move past the apport for now while I continue to look into the bug details. Marking back to New.

Changed in libreoffice (Ubuntu):
status: Incomplete → New
Gretha (g-r-e-cramer) wrote :

Christopher, re #12:
The apport issue is not important here, so I fully agree.

Gretha, keeping in mind the Writer limitations previously mentioned, I ran Compare Document on the originally attached files, starting from the Terminal:

cd ~/Desktop && wget -c https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/782406/+attachment/2610679/+files/LibreOffice-CompareDoc-Version-1.odt -O draft.odt && wget -c https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/782406/+attachment/2610680/+files/LibreOffice-CompareDoc-Version-2.odt -O final.odt && lowriter -nologo draft.odt

Edit -> Compare Document... -> final.odt -> Insert button

Writer marks 4 differences correctly, 2 Insertions and 2 Deletions.

Stripping out these differences, the footnote, and unflagged similarities, one has via the Terminal:

cd ~/Desktop && wget https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/782406/+attachment/2619286/+files/simple-draft.odt && wget https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/782406/+attachment/2619287/+files/simple-final.odt && lowriter -nologo simple-draft.odt

Edit -> Compare Document... -> simple-final.odt -> Insert button

which flags 2 differences:
1 Insertion: in order
1 Deletion: so at

Looking at how Word compares via the Terminal:

cd ~/Desktop && wget https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/782406/+attachment/2619289/+files/simple-draft.doc && wget https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/782406/+attachment/2619290/+files/simple-final.doc && wine ~/.wine/drive_c/Program\ Files/Microsoft\ Office/OFFICE11/WINWORD.EXE simple-draft.doc

Tools -> Compare and Merge Documents... -> uncheck Legal blackline -> uncheck Find formatting -> Merge drop down -> Merge into current document

which flags 1 difference:
1 Deletion: so at

So while Writer does not match one's expectation of how Word compares documents in this particular case, Writer goes further than Word and more accurately represents what actually occurred.

To boil this down, this report has been created in ignorance of how Writer performs document comparisons, which was discussed. Feel free to report any future bugs you may find.

lsb_release -rd
Description: Ubuntu 11.10
Release: 11.10

apt-cache policy libreoffice-writer
libreoffice-writer:
  Installed: 1:3.4.4-0ubuntu1
  Candidate: 1:3.4.4-0ubuntu1
  Version table:
 *** 1:3.4.4-0ubuntu1 0
        500 http://us.archive.ubuntu.com/ubuntu/ oneiric-updates/main i386 Packages
        100 /var/lib/dpkg/status
     1:3.4.3-3ubuntu2 0
        500 http://us.archive.ubuntu.com/ubuntu/ oneiric/main i386 Packages

Changed in libreoffice (Ubuntu):
status: New → Invalid
Gretha (g-r-e-cramer) wrote :

Christopher

I make two observations.

First, I am sorry to have to note that you repeatedly digress and go off-topic. Issues such as, first, apport (which clearly does not work properly) and, then, Microsoft Office are entirely beside the point. The point is the incompetence of the Compare Document functionality in LibreOffice Writer. This is all the more remarkable and disappointing, because OpenOffice in version 3.2 had finally made at least some significant progress here just when LibreOffice was forked.

Second, I am, amongst other things, first-line responsible for, at the last count, close to 18,000 installations of Ubuntu and -- since the investment-destroying and inefficient Unity iphone-lookalike desktop -- Xubuntu at over 100 clients across the world. This gives me a reasonably balanced picture of the most common issues clients struggle with. For (X)Ubuntu, top-ranking at present is Network Manager when using 3G dongles on netbooks (see bug 796872 which is still awaiting attention and remains unresolved). Second in key parts of the world still comes the absence of a synthetic fax modem simulator for Skype or similar. For LibreOffice, the primary issue still remains the dysfunctionality of Compare Document (this bug) and the regression here relative to OpenOffice.

For us, it is quite all right if the LibreOffice developers wish to use LO as a toy to hone their software writing skills. Just tell me, and we'll put clients back on Microsoft and on Macs, systems still run by the vast majority of our clients.

Specifically, in post #18, you meander off in a complex attempt

(a) to assert the equivalence of Microsoft Word and LibreOffice Writer or even the superiority of the latter; as well as

(b) to claim our ignorance of how LibreOffice actually performs document comparisons.

No doubt, as regards (a), you can construct a scenario with a carefully crafted text and/or manual intervention which happens to support your claim, but in general the assertion is both false and, more importantly, completely beside the point.

As regards (b), the document compare described in my earlier posts here was carried out by the book and to the letter. (See also LO Writer Help. And please don't bother and scare routine users of Office applications in a document production environment with terminal commands, otherwise I'll be more than happy to dish out some rolls of paper tape and stacks of uninterpreted punch cards left over from my earlier days in computing long before it was called IT.) You even go so far as to use your assertion to bump the matter and simply declare the issue invalid.

The key point, however, and I am sorry to have to repeat it, is this:

(1) A precise, complete, effective and efficient LibreOffice Compare Document functionality is an absolutely elementary function in any real document production environment where routine version management and control are standard and where document quality and integrity are important or even vital;

(2) LibreOffice Compare Document has unnecessarily and avoidably regressed since LibreOffice was forked from OpenOffice; and

(3) LibreOffice Compare Document is badly in need of f...

Gretha (g-r-e-cramer) wrote :
Changed in libreoffice (Ubuntu):
status: Invalid → Confirmed
Gretha (g-r-e-cramer) wrote :

Key points ... (continued from #19)

(3) LibreOffice Compare Document is badly in need of fundamental overhaul, radical improvement and proper fine-graining, and this also includes the removal of its implicit exclusion of frames, footnotes, headers, fields and other essential items of information contained in documents.

I am attaching (see #20) a single random document which once more illustrates these three points I am making. Since LibreOffice Document Compare is so bad that it in fact makes even this short document crash, I can only salvage it in the form of three screen grabs. I have compiled these in a single png image with added explanation and comments. It very much saddens me that this presentation (#20) is by no means an advertisement or recommendation for LibreOffice.

Open source software is an invaluable phenomenon. However, it can only make true and broad inroads beyond hobbyists and servers if the developers, on the one hand, and those like us, on the other, who promote it and actually put it in the workplace -- often in key production-critical contexts -- work hand in hand and in sympathy to effectively and efficiently resolve the issues reported by actual users of the systems and applications. We resolve almost all such issues in-house, but some are elusive, defying practical work-arounds, and often systemically rooted in key core components and applications.

We do our very best to support open source and to get it deployed wherever suitable. But when the effort proves one-way and when positive feedback sincerely aimed to improve matters routinely falls on deaf ears and is left unattended or, as in this case of LibreOffice, simply bumped and dumped, then, I am very sad indeed to say, we are clearly on the wrong track.

Gretha, this bug is being marked Won't Fix as you are stacking multiple Wishlist reports and potential bugs (the crash mentioned in the attachment) into this report. Feel free to file a new report, one Wishlist or bug per report, so they may be dealt with quickly and efficiently. Thank you for your understanding.

Changed in libreoffice (Ubuntu):
status: Confirmed → Invalid
status: Invalid → Won't Fix
Leo H (leo-h-hildebrandt) wrote :

@ Gretha

I was searching for reports on LibreOffice Writer Compare Document, because mostly it doesn't seem to work. Your submission hits the nail right on the head in my experience:

I start with an .odt document (saved as version 1) with just the following text (without the quotes):

"LibreOffice Writer. Compare Document."

I then modify it just by inserting a space at the start and a paragraph marker after the first period as follows (saved as version 2) (again without the quotes):

" LibreOffice Writer.
 Compare Document."

If I now compare versions 2 and 1, then I simply get version 1 inserted into version 2 (i.e., everything double) and everything marked as changed.
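
For reference, a character-level comparison of these two versions finds only the two inserted characters. The sketch below uses Python's standard `difflib.SequenceMatcher`; it is not the algorithm Writer uses, only an illustration of the result one would expect from a fine-grained comparison. The strings are the two versions quoted above.

```python
from difflib import SequenceMatcher

# Version 1 and version 2 exactly as described in the comment above.
v1 = "LibreOffice Writer. Compare Document."
v2 = " LibreOffice Writer.\n Compare Document."

# Compare character by character; autojunk is disabled so that no
# character is ever treated as ignorable noise.
matcher = SequenceMatcher(None, v1, v2, autojunk=False)
opcodes = matcher.get_opcodes()

inserted = [v2[j1:j2] for tag, i1, i2, j1, j2 in opcodes if tag == "insert"]
deleted = [v1[i1:i2] for tag, i1, i2, j1, j2 in opcodes if tag in ("delete", "replace")]

print("inserted:", inserted)  # the leading space and the paragraph break
print("deleted:", deleted)    # nothing was removed from version 1
```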

CONCLUSION: Compare Document in LibreOffice Writer is completely clueless and completely useless.

And Gretha, as regards Christopher Penalver's responses, I really admire your perseverance. He is just stalling. He systematically evades the issue (see CONCLUSION). With all due respect, he is probably one of those technocrats who love coding (and who might well be very good at it). But as so often in open source, real users and their needs are just seen as an ignorant meddling bunch and as a troublesome interfering nuisance.

Back to Microsoft Word, I guess (sorry).

Don't be discouraged!

Leo

Denis Prost (denis-prost) wrote :

Hi everyone,

To me, as to Gretha, this bug is the most annoying one for professional use of LibreOffice.
Unfortunately, it is still there on Ubuntu 12.04 beta 2 (LibreOffice 3.5.2.2).
Hope it will be fixed soon.
Regards

Denis

Denis Prost (denis-prost) wrote :

oops, I realized the documents I used for my comparison test were more different than I thought.
So the comparator was right.
Sorry for the noise.

Denis

Michael Grivas (mgrivas) wrote :

@ Gretha:
It seems to me that you're quite an expert in OSS. But, in that case, you should know that most OSS developers are not getting paid for what they do (at least not for the whole thing or at the same rate).
Then, one realizes that they will not fix any problem, because they do not have to do it.
Yes, an OSS project may be bad or even doomed by the users. But, OSS developers do not always care about that.

Now, in your case, I understand that you work for a big firm with an obvious large budget.
If you want the fixes you mention done, why don't you arrange a payment for a developer to fix it?
I would expect it to cost about half a mid-range western European salary.
And you would then help the whole community. And you would be happy (your clients especially).
And many more would be happy. And the developer would be happy. And it would not cost too much.

I am sure that if you ask, plenty would do that for you (and the whole community) for no more than 1000 euros.
Why don't you try that path?

mewalig (mewalig) wrote :

Have these issues been fixed? If not, I certainly agree with Gretha that doc compare must be taken seriously if anyone wants this office package to be taken seriously, and I would suspect that a large percentage of work that is being done on this package would be less important, to fewer people, than having doc compare functionality that is on par with MS Word (which, by the way, has plenty of room for improvement / bug fixes as well, especially when handling large documents with a large number of tables).

In any case, if it hasn't been fixed, could anyone point me to where the related code in the package is (eg a link to the specific online svn repository page or anything else that helps to cut down on having to navigate to find the right code)?

bitinerant (bitinerant) wrote :

Maybe I'm missing something, but this seems like a huge deal and still an issue in LibreOffice (4.2.8.2 on Ubuntu 14.04). I have tried variations of this procedure many times, always with the same result. The two texts differ in a few minor places.

To reproduce:

  * open Writer
  * copy text 1 from another application and paste into Writer via Edit > Paste Special > Unformatted Text
  * save and close
  * open a new Writer document and repeat the above with text 2
  * open document 1
  * click Edit > Compare Document and choose document 2
  * expected result: a few small insertions and deletions representing the differences between the texts
  * result: one huge insertion and one huge deletion
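
The expected result in the steps above can be sketched with a word-level comparison. This uses Python's standard `difflib`, not Writer's own comparison code, and the two sample sentences are hypothetical; only the "so at" to "in order" change is borrowed from earlier in this report.

```python
from difflib import SequenceMatcher


def word_changes(old_text, new_text):
    """Return (tag, old_words, new_words) for every span that differs."""
    old_words, new_words = old_text.split(), new_text.split()
    matcher = SequenceMatcher(None, old_words, new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # identical runs of words are not reported
        changes.append((tag, " ".join(old_words[i1:i2]), " ".join(new_words[j1:j2])))
    return changes


# Hypothetical sample texts differing in one two-word phrase.
text_1 = "We test the draft so at to refine it before release"
text_2 = "We test the draft in order to refine it before release"

changes = word_changes(text_1, text_2)
for tag, old, new in changes:
    print(f"{tag}: {old!r} -> {new!r}")
```

A comparison at this granularity reports one small replacement rather than one huge insertion and one huge deletion.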

Why is this marked "won't fix"?

bitinerant (bitinerant) wrote :

Here is the help documentation which describes this process:

    https://help.libreoffice.org/Common/Comparing_Versions_of_a_Document

This problem occurs on some documents but not others.

I had success in fixing the problem (I have tried it on only one document) by changing Tools > Options > LibreOffice Writer > Comparison from "Auto" to "By word".

That may narrow the bug down a bit. I can supply an example privately and file another bug report if necessary, but will do so only if someone will take a look at it.

Sorry, I forgot to mention that I have a 642 word document with 4 changes totalling 6 words, and yet the whole document is marked as changed, until I changed the Comparison option. I even changed the two documents to plain text, and it still marked the whole document as changed.
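
Why the comparison unit matters can be sketched as follows. This is an assumption-laden illustration built on Python's `difflib`, not a description of what Writer's "Auto" or "By word" settings do internally: it only shows that when whole paragraphs are the unit of comparison, one paragraph split plus one changed word leaves nothing matching, while comparing by word leaves almost everything matching. The sample texts are hypothetical.

```python
from difflib import SequenceMatcher


def changed_share(old_units, new_units):
    """Fraction of all units (old plus new) that lie outside matching blocks."""
    matcher = SequenceMatcher(None, old_units, new_units, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    total = len(old_units) + len(new_units)
    return (total - 2 * matched) / total


# Hypothetical texts: the edit splits the paragraph once and changes one word.
old = "alpha beta gamma delta epsilon zeta eta theta"
new = "alpha beta gamma delta\nepsilon zeta iota theta"

by_paragraph = changed_share(old.split("\n"), new.split("\n"))
by_word = changed_share(old.split(), new.split())

print(f"changed when comparing by paragraph: {by_paragraph:.1%}")
print(f"changed when comparing by word: {by_word:.1%}")
```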
