Do not consider timestamp differences as corruption

Bug #1097748 reported by Björn Michaelsen
This bug affects 1 person
Affects               Status        Importance  Assigned to       Milestone
LibreOffice           Fix Released  Critical
libreoffice (Ubuntu)  Fix Released  Undecided   Björn Michaelsen

Bug Description

On LibreOffice 3.5, see fdo#49819, fdo#54609 for details

Revision history for this message
In , Xing Li (diegomontoya) wrote :

Created attachment 61470
a user submitted docx file

We have attempted to load this particular .docx file with both 3.5.3 and 3.5.1rc1 on Windows 7 64-bit and CentOS 6 64-bit, and on both systems LibreOffice is unable to load the file. The only error shown is an input/output error popup dialog.

The file appears to be properly zipped and well structured, so it is unclear where the loading process goes wrong.

Expected result: .docx file loads.

Actual result: error popup.

Revision history for this message
In , S-joyemusequna (s-joyemusequna) wrote :

Confirmed with LOdev 3.6 (2012-05-10) version 3.6.0alpha0+ (Build ID: 9980e69) and LibO 3.4.5 on Windows Vista 64.

Revision history for this message
In , Korrawit Pruegsanusak (detective-conan-1412) wrote :

[REPRODUCIBLE] 3.5.3.2 Windows XP, show error popup.

Version field should be the earliest one with problem. <http://wiki.documentfoundation.org/BugReport_Details#Version> So, change to 3.4.5 per comment 1.

Revision history for this message
In , S-joyemusequna (s-joyemusequna) wrote :

Same problem with version 3.3.4 (tested under Windows XP).

Revision history for this message
In , Xing Li (diegomontoya) wrote :

Increasing the priority of this ticket: the bug has been confirmed by several testers and has a reproducible test file.

Revision history for this message
In , Michael Meeks (michael-meeks) wrote :

Looks like we detect a problem with the zip file:

(gdb) bt 15
#0 __cxxabiv1::__cxa_throw (obj=0x95443b8, tinfo=0xb06c5438 <typeinfo for com::sun::star::packages::zip::ZipIOException>, dest=
    0xb0689dc6 <com::sun::star::packages::zip::ZipIOException::~ZipIOException()>) at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:63
#1 0xb068abab in ZipFile::readLOC (this=0x9545670, rEntry=...) at /ssd/opt/libreoffice/master/package/source/zipapi/ZipFile.cxx:706

704 if ( bBroken && !bRecoveryMode )
705 throw ZipIOException("The stream seems to be broken!",
706 uno::Reference< XInterface >() );

#2 0xb068c05e in ZipFile::getDataStream (this=0x9545670, rEntry=..., rData=..., bIsEncrypted=0 '\000', aMutexHolder=...)
    at /ssd/opt/libreoffice/master/package/source/zipapi/ZipFile.cxx:577
#3 0xb06a9c0f in ZipPackageStream::getDataStream (this=0xad6363e0)
    at /ssd/opt/libreoffice/master/package/source/zippackage/ZipPackageStream.cxx:551
#4 0xac4df7b3 in OWriteStream_Impl::GetStream_Impl (this=0x9544278, nStreamMode=1, bHierarchyAccess=1 '\001')
    at /ssd/opt/libreoffice/master/package/source/xstor/owriteablestream.cxx:1357
#5 0xac4e2b0f in OWriteStream_Impl::GetStream (this=0x9544278, nStreamMode=1, bHierarchyAccess=1 '\001')
    at /ssd/opt/libreoffice/master/package/source/xstor/owriteablestream.cxx:1337
#6 0xac4fb209 in OStorage::openStreamElementByHierarchicalName (this=0xad632458, aStreamPath=..., nOpenMode=1)
    at /ssd/opt/libreoffice/master/package/source/xstor/xstorage.cxx:6241
#7 0xac4d3da1 in OHierarchyElement_Impl::GetStreamHierarchically (this=0xaf28ea38, nStorageMode=1,
    aListPath=std::vector of length 0, capacity 2, nStreamMode=1, aEncryptionData=...)
    at /ssd/opt/libreoffice/master/package/source/xstor/ohierarchyholder.cxx:106
#8 0xac4d404f in OHierarchyElement_Impl::GetStreamHierarchically (this=0xaf28e618, nStorageMode=1,
    aListPath=std::vector of length 0, capacity 2, nStreamMode=1, aEncryptionData=...)
    at /ssd/opt/libreoffice/master/package/source/xstor/ohierarchyholder.cxx:148
#9 0xac4d432d in OHierarchyHolder_Impl::GetStreamHierarchically (this=0xad63132c, nStorageMode=1,
    aListPath=std::vector of length 0, capacity 2, nStreamMode=1, aEncryptionData=...)
    at /ssd/opt/libreoffice/master/package/source/xstor/ohierarchyholder.cxx:42
#10 0xac4fb2bb in OStorage::openStreamElementByHierarchicalName (this=0xa2de4e04, aStreamPath=..., nOpenMode=1)
    at /ssd/opt/libreoffice/master/package/source/xstor/xstorage.cxx:6253
#11 0xa2f5cd3b in oox::docprop::(anonymous namespace)::lclGetRelatedStreams (rxStorage=..., rStreamType=...)
    at /ssd/opt/libreoffice/master/oox/source/docprop/ooxmldocpropimport.cxx:89
#12 0xa2f5d184 in oox::docprop::DocumentPropertiesImport::importProperties (this=0xad630368, rxSource=..., rxDocumentProperties=...)
    at /ssd/opt/libreoffice/master/oox/source/docprop/ooxmldocpropimport.cxx:155
#13 0xa096b8bc in writerfilter::dmapper::DomainMapper::DomainMapper (this=0x9540b90, xContext=..., xInputStream=..., xModel=..., eDocumentType=
    writerfilter::dmapper::DOCUMENT_OOXML) at /ssd/opt/libreoffice/master/writerfilter/source/dmapper/DomainM...


Revision history for this message
In , Michael Meeks (michael-meeks) wrote :

So - why would the directory timestamp differ from the stream header:

1083022683 = Mon, 26 Apr 2004 23:38:03 GMT
1083088142 = Tue, 27 Apr 2004 17:49:02 GMT
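
One way to check a file for this kind of mismatch is to compare, entry by entry, the timestamp recorded in the zip central directory with the one in the entry's local file header. The following is a minimal Python sketch of that comparison, not the logic in ZipFile::readLOC; the names `decode_dos_datetime` and `timestamp_mismatches` are made up for illustration, and it assumes the whole archive fits in memory.

```python
import io
import struct
import zipfile

# Fixed 30-byte part of a zip local file header (signature, version, flags,
# compression, mod time, mod date, CRC-32, sizes, name/extra lengths).
LOCAL_HEADER = struct.Struct("<4sHHHHHLLLHH")


def decode_dos_datetime(dos_date, dos_time):
    """Decode the packed MS-DOS date and time fields used by the zip format."""
    return (
        (dos_date >> 9) + 1980,   # year
        (dos_date >> 5) & 0x0F,   # month
        dos_date & 0x1F,          # day
        dos_time >> 11,           # hour
        (dos_time >> 5) & 0x3F,   # minute
        (dos_time & 0x1F) * 2,    # second (stored with 2-second resolution)
    )


def timestamp_mismatches(data):
    """Return (name, directory_time, local_header_time) for differing entries.

    `data` is the raw bytes of the archive (hypothetical helper, illustration only).
    """
    mismatches = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            fields = LOCAL_HEADER.unpack_from(data, info.header_offset)
            signature, dos_time, dos_date = fields[0], fields[4], fields[5]
            if signature != b"PK\x03\x04":
                raise ValueError("no local file header at the recorded offset")
            local_time = decode_dos_datetime(dos_date, dos_time)
            # info.date_time comes from the central directory record.
            if local_time != info.date_time:
                mismatches.append((info.filename, info.date_time, local_time))
    return mismatches
```

Calling `timestamp_mismatches(open("file.docx", "rb").read())` on a suspect file would list the entries whose two timestamps disagree; an empty list means the two records agree for every entry.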

As an immediate workaround, unzipping and re-zipping the file works fine :-)
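
For batches of affected files, the unzip-and-re-zip workaround could be scripted. This is a rough sketch using Python's standard `zipfile` module; `rezip_bytes` is a hypothetical helper name, and it assumes the entries are otherwise intact (names and CRCs match) so that Python can still read them. Writing each entry afresh produces a new local header from the same timestamp that goes into the directory, so the two records agree again.

```python
import io
import zipfile


def rezip_bytes(data):
    """Rewrite every entry of a zip archive into a fresh archive (bytes in, bytes out)."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as source, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as target:
        for info in source.infolist():
            # Build a new entry record from the directory's name and timestamp;
            # writestr() then emits a matching local header for it.
            fresh = zipfile.ZipInfo(info.filename, date_time=info.date_time)
            fresh.compress_type = zipfile.ZIP_DEFLATED
            target.writestr(fresh, source.read(info.filename))
    return out.getvalue()
```

The entry order is preserved, and the output should be written to a new file so that the original stays available for debugging.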

The question would be: how was this .docx produced ? and/or damaged.

Secondly - it looks like, as/when we hit this sort of error for .docx, we don't show a "this file is damaged" prompt and re-try loading in a more tolerant mode.

I guess that needs fixing too.

Revision history for this message
In , Michael Meeks (michael-meeks) wrote :

*** Bug 45207 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Michael Meeks (michael-meeks) wrote :

*** Bug 54968 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Michael Meeks (michael-meeks) wrote :

bug#54609 is a band-aid for basically the same issue as this - but of course the band-aid only works for some files.

Revision history for this message
In , Michael Meeks (michael-meeks) wrote :

Created attachment 67516
debugging patch

Attached patch allows the document to load by first detecting the zip exception and returning the right error - so we can get repair mode turned on.

However - I then force repair-mode on - since there seems to be no way to force it down through the domain-mapper & associated logic.

We get the flag set right coming into:

Breakpoint 1, WriterFilter::filter (this=0xaca5fa00, aDescriptor=uno::Sequence of length 13 = {...})
    at /ssd/opt/libreoffice/master/writerfilter/source/filter/ImportFilter.cxx:50
...
{Name = "RepairPackage", Handle = 0, Value = uno::Any 1 '\001', State =
    com::sun::star::beans::PropertyState_DIRECT_VALUE}

But that needs pushing down.

End goal: throw up a dialog, offering to repair, and import the file anyway. I can at least see the contents now with that hard-coded.

Revision history for this message
In , Xing Li (diegomontoya) wrote :

This is great news that this bug is traced and squashed.

Mike, your proposed end goal of "throw(ing) up a dialog" might be a problem for those who use the UNO or CLI component for file conversion, where interacting with a GUI popup dialog is not feasible in a --headless environment.

Perhaps the default should be forced repair, as your patch currently does, or the repair dialog should pop up only when "--headless" is not enabled, with forced repair otherwise.
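
To make the suggestion concrete, the policy could look roughly like the Python sketch below. Everything in it is hypothetical (`BrokenPackageError`, `load_with_repair`, and the `load` and `ask_user` callables are illustration only, not LibreOffice API): try a strict load first, and on failure repair without asking when headless, or ask the user otherwise.

```python
class BrokenPackageError(Exception):
    """Hypothetical error raised when a strict load rejects a package."""


def load_with_repair(load, headless, ask_user):
    """Try a strict load first, then fall back to a tolerant one.

    load(repair=...) is a hypothetical loader callable; ask_user() returns
    True when the user accepts the repair offer.
    """
    try:
        return load(repair=False)
    except BrokenPackageError:
        # No dialog is possible when headless, so repair without asking.
        if headless or ask_user():
            return load(repair=True)
        # The user declined the repair: report the original failure.
        raise
```

With this shape, a conversion service running with --headless never blocks on a dialog, while interactive users still get to decide.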

Revision history for this message
In , Michael Meeks (michael-meeks) wrote :

The bug is not yet fixed; this is a prototype patch. I still really want to know *why* these documents have inconsistent file / time-stamps in them, that's really unclear to me.

Xing - where did this document come from ? and/or how was it made ? - can you find that out ?

Revision history for this message
In , Xing Li (diegomontoya) wrote :

Just emailed the original user of this test file for more information, but the chance of a response is very low. With some luck, I hope to find another test case over the next few days.

Revision history for this message
In , Michael Meeks (michael-meeks) wrote :

pushed a fix to master, I'd appreciate widespread testing - it should complain the file is broken then allow it to be 'repaired' (ie. a sloppier more accepting import).

Unlikely to make 3.6.2 - perhaps (with some feedback) into 3.6.3 :-)

Revision history for this message
In , Libreoffice-bugs (libreoffice-bugs) wrote :

Michael Meeks committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=ff300e59e74ee88aa6a4981b57a51af416c9e991

fdo#49819 - allow slightly inconsistent docx files to be repaired

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.

Revision history for this message
In , Libreoffice-bugs (libreoffice-bugs) wrote :

Fridrich Štrba committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=5db7ac239278634c39cbb15f0173db0524b5dcd6

fdo#49819, fdo#54609: Do not consider timestamp differences as corruption


Revision history for this message
In , Libreoffice-bugs (libreoffice-bugs) wrote :

Fridrich Štrba committed a patch related to this issue.
It has been pushed to "libreoffice-3-6-2":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=736b9ee7bdd5f9fd0a65a7ab3d9ae3c283007f09&g=libreoffice-3-6-2

fdo#49819, fdo#54609: Do not consider timestamp differences as corruption

It will be available already in LibreOffice 3.6.2.


Revision history for this message
In , Libreoffice-bugs (libreoffice-bugs) wrote :

Fridrich Štrba committed a patch related to this issue.
It has been pushed to "libreoffice-3-6":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=afb9212cd39efcabd8a2f444d2f2979abb325a6a&g=libreoffice-3-6

fdo#49819, fdo#54609: Do not consider timestamp differences as corruption

It will be available in LibreOffice 3.6.3.


Revision history for this message
In , Xing Li (diegomontoya) wrote :

This corrupted docx file was created on Windows using Microsoft Word 2010. The user didn't provide much more info in response to our feedback request.

Revision history for this message
In , Michael Meeks (michael-meeks) wrote :

Marking fixed, as it is fixed ;-)

I guess Office 2010 is just producing bad .zip output - which is a shame.

Thanks for the pointer :-)

Revision history for this message
In , Harri Pitkänen (hatapitk) wrote :

*** Bug 44853 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Lo-bugs (lo-bugs) wrote :

With master cc1a112 pulled 2012-10-01, the problem is indeed fixed.

Not that there was much doubt after Michael's assurance, but what else
would I do with my just-completed build? <grin />

Changed in libreoffice (Ubuntu):
status: New → In Progress
assignee: nobody → Björn Michaelsen (bjoern-michaelsen)

Changed in df-libreoffice:
importance: Unknown → Critical
status: Unknown → Fix Released

Revision history for this message
Björn Michaelsen (bjoern-michaelsen) wrote :

Changed in libreoffice (Ubuntu):
status: In Progress → Fix Committed

Revision history for this message
Björn Michaelsen (bjoern-michaelsen) wrote :

lp#1097748: released on Fedora 17 and upstream, one-line change, making us more generous in reading broken MSO2010 files

Changed in libreoffice (Ubuntu):
status: Fix Committed → Fix Released

Revision history for this message
In , Lo-bugs (lo-bugs) wrote :

Markus,

Are you sure about the mime type you assigned to the attachment
"Chapter 2 - Pink Ball, Knight & Penguin.docx"? `file` reports
"Microsoft Word 2007+".

Terry.
