[SRU quantal] boost::unordered_multimap<>::erase(iterator, iterator) broken in boost1.49

Bug #1017125 reported by Björn Michaelsen
This bug affects 5 people
Affects                        Status        Importance  Assigned to     Milestone
Boost                          Fix Released  Unknown
LibreOffice                    Won't Fix     Medium
boost-mpi-source1.49 (Ubuntu)  Fix Released  Undecided   Unassigned
  Quantal                      Fix Released  Undecided   Unassigned
boost1.49 (Debian)             Fix Released  Unknown
boost1.49 (Fedora)             Won't Fix     Undecided
boost1.49 (Gentoo Linux)       New           Medium
boost1.49 (Ubuntu)             Fix Released  High        Unassigned
  Quantal                      Fix Released  High        Unassigned
gcc-4.7 (Ubuntu)               Invalid       Medium      Unassigned
  Quantal                      Invalid       Medium      Unassigned
libreoffice (Ubuntu)           Fix Released  Undecided   Unassigned
  Quantal                      Invalid       Undecided   Matthias Klose

Bug Description

[Impact]

 * possible root cause of bug 1067907, and potentially affects essentially every other client using boost::unordered
 * bug 1067907 alone has ~50 reported crashes per day

[Test Case]

 * compile and run the attached testcase

[Regression Potential]

 * minimal patch provided by upstream -- the bug has been fixed in later boost versions

[Other Info]

These were the original symptoms in LibreOffice that started the bug hunt. LibreOffice has since evaded the bug (without the root cause in boost being fixed up to now) by no longer using the broken boost method, with http://cgit.freedesktop.org/libreoffice/core/commit/?id=861e55bd889d9f5f5b37724b3615e9355e2d5c15&g=libreoffice-3-6 :

subsequentcheck sometimes crashes in
xmloff.Impress.XMLContentImporter::com::sun::star::document::XImporter

Testcode:
http://opengrok.libreoffice.org/xref/core/qadevOOo/tests/java/ifc/document/_XImporter.java
against service:
http://opengrok.libreoffice.org/xref/core/qadevOOo/tests/java/mod/_xmloff/Impress/XMLContentImporter.java

steps to reproduce:
cd xmloff
echo "-o xmloff.Impress.XMLContentImporter" > qa/unoapi/xmloff.sce
cat > qa/unoapi/knownissues.xcl << EOF
xmloff.Impress.XMLContentImporter::com::sun::star::lang::XInitialization
xmloff.Impress.XMLContentImporter::com::sun::star::document::XFilter
xmloff.Impress.XMLContentImporter::com::sun::star::container::XNamed
EOF
R=T; while test "$R" = "T"; do make subsequentcheck || R=F; done

expected result:
test passes without a crash

actual result:
crash

In , Anton (anton-redhat-bugs) wrote :

libreport version: 2.0.8
abrt_version: 2.0.7
backtrace_rating: 4
cmdline: /usr/lib64/libreoffice/program/soffice.bin --impress file:///home/anton/Customer%20Portal%20Internal_03202012.odp --splash-pipe=6
crash_function: std::list<Link, std::allocator<Link> >::begin
executable: /usr/lib64/libreoffice/program/soffice.bin
kernel: 3.3.0-3.tip0.uprobes.fc17.x86_64
pid: 18424
pwd: /home/anton
reason: Process /usr/lib64/libreoffice/program/soffice.bin was killed by signal 11 (SIGSEGV)
time: Fri 23 Mar 2012 09:42:27 CET
uid: 1000
username: anton

backtrace: Text file, 69241 bytes
dso_list: Text file, 20606 bytes
maps: Text file, 88339 bytes

environ:
:GIT_PS1_SHOWDIRTYSTATE=true
:XDG_VTNR=1
:XDG_SESSION_ID=2
:HOSTNAME=bandura.brq.redhat.com
:LC_MONETARY=en_GB.utf8
:IMSETTINGS_INTEGRATE_DESKTOP=yes
:GIO_LAUNCHED_DESKTOP_FILE_PID=18410
:GPG_AGENT_INFO=/tmp/keyring-XLyyPc/gpg:0:1
:SHELL=/bin/bash
:TERM=xterm-256color
:DESKTOP_STARTUP_ID=nautilus-18399-bandura.brq.redhat.com-libreoffice-0_TIME6183650
:HISTSIZE=1000
:XDG_SESSION_COOKIE=140f39c967431aadf43a7e3200000010-1332485955.765921-668108218
:GJS_DEBUG_OUTPUT=stderr
:G_MESSAGES_DEBUG=all
:LC_NUMERIC=en_GB.utf8
:OLDPWD=/usr/lib64/libreoffice/program
:QTDIR=/usr/lib64/qt-3.3
:GNOME_KEYRING_CONTROL=/tmp/keyring-XLyyPc
:QTINC=/usr/lib64/qt-3.3/include
:'GJS_DEBUG_TOPICS=JS ERROR;JS LOG'
:IMSETTINGS_MODULE=none
:USER=anton
:SSH_AUTH_SOCK=/tmp/keyring-XLyyPc/ssh
:USERNAME=anton
:SESSION_MANAGER=local/unix:@/tmp/.ICE-unix/1117,unix/unix:/tmp/.ICE-unix/1117
:GIO_LAUNCHED_DESKTOP_FILE=/usr/share/applications/libreoffice-impress.desktop
:MAIL=/var/spool/mail/anton
:PATH=/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/home/anton/.local/bin:/home/anton/bin
:DESKTOP_SESSION=gnome
:QT_IM_MODULE=xim
:PWD=/home/anton
:XMODIFIERS=@im=none
:EDITOR=vim
:KDE_IS_PRELINKED=1
:GNOME_KEYRING_PID=1109
:LANG=en_GB.utf8
:GDM_LANG=en_GB.utf8
:KDEDIRS=/usr
:LC_MEASUREMENT=en_GB.utf8
:GIT_PS1_SHOWUNTRACKEDFILES=true
:GDMSESSION=gnome
:SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
:HISTCONTROL=ignoredups
:XDG_SEAT=seat0
:HOME=/home/anton
:SHLVL=1
:GNOME_DESKTOP_SESSION_ID=this-is-deprecated
:SAL_ENABLE_FILE_LOCKING=1
:LOGNAME=anton
:PRINTER=brno1-3rd-cafe
:QTLIB=/usr/lib64/qt-3.3/lib
:CVS_RSH=ssh
:DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-7mbsOmI4UG,guid=a900ddf0010ccf66132eecc70000003f
:'LESSOPEN=||/usr/bin/lesspipe.sh %s'
:BROWSER=google-chrome
:WINDOWPATH=1
:XDG_RUNTIME_DIR=/run/user/anton
:DISPLAY=:0.0
:LC_TIME=en_GB.utf8
:XAUTHORITY=/var/run/gdm/auth-for-anton-S0MDkT/database
:LD_LIBRARY_PATH=/usr/java/jre1.7.0_03/lib/amd64/client:/usr/java/jre1.7.0_03/lib/amd64/server:/usr/java/jre1.7.0_03/lib/amd64/native_threads:/usr/java/jre1.7.0_03/lib/amd64

var_log_messages:
:Mar 23 09:42:27 bandura kernel: [ 6247.104768] soffice.bin[18424]: segfault at a0 ip 0000003bc9fba958 sp 00007fff1af30e50 error 4 in libvcllo.so[3bc9e00000+7dd000]
:Mar 23 09:42:28 bandura abrt[18500]: Saved core dump of pid 18424 (/usr/lib64/libreoffice/program/soffice.bin) to /var/spool/abrt/ccpp-2012-03-23-09:42:27-18424 (1...


In , Anton (anton-redhat-bugs) wrote :

Created attachment 572211
File: dso_list

In , Anton (anton-redhat-bugs) wrote :

Created attachment 572212
File: maps

In , Anton (anton-redhat-bugs) wrote :

Created attachment 572214
File: backtrace

In , Caolan (caolan-redhat-bugs) wrote :

some kind of busted lifecycle.

Was this on closing an impress frame ?

In , Anton (anton-redhat-bugs) wrote :

I can guess only, but IIRC it was on the close of impress.

In , Michael (michael-redhat-bugs) wrote :

*** Bug 823272 has been marked as a duplicate of this bug. ***

In , Caolan (caolan-redhat-bugs) wrote :

got to be on exit, and got to be the dtor of the "layouts" part of the right panel

In , Caolan (caolan-redhat-bugs) wrote :

But I can't reproduce it by having the layouts visible and closing impress, or by activating layouts, moving elsewhere and shutting down. Nothing visible under valgrind. *maybe* this is effectively a duplicate of bug 805743, in which case it's fixed now. Or more likely there is some unknown set of circumstances that makes it crash.

If it happens for you again, try to attach the presentation that was open at the time of the crash, in case it's a certain number of layouts in a presentation that's the trigger.

In , Björn Michaelsen (bjoern-michaelsen) wrote :

subsequentcheck sometimes crashes in xmloff.Impress.XMLContentImporter::com::sun::star::document::XImporter

Testcode:
http://opengrok.libreoffice.org/xref/core/qadevOOo/tests/java/ifc/document/_XImporter.java
against service:
http://opengrok.libreoffice.org/xref/core/qadevOOo/tests/java/mod/_xmloff/Impress/XMLContentImporter.java

steps to reproduce:
cd xmloff
echo "-o xmloff.Impress.XMLContentImporter" > qa/unoapi/xmloff.sce
cat > qa/unoapi/knownissues.xcl << EOF
xmloff.Impress.XMLContentImporter::com::sun::star::lang::XInitialization
xmloff.Impress.XMLContentImporter::com::sun::star::document::XFilter
xmloff.Impress.XMLContentImporter::com::sun::star::container::XNamed
EOF
R=T; while test "$R" = "T"; do make subsequentcheck || R=F; done

expected result:
test passes without a crash

actual result:
crash

attaching backtrace and log

In , Björn Michaelsen (bjoern-michaelsen) wrote :

Created attachment 63332
stacktrace

In , Björn Michaelsen (bjoern-michaelsen) wrote :

Created attachment 63334
log

In , Björn Michaelsen (bjoern-michaelsen) wrote :

@Thorsten: Backtrace looks like something messing up during document teardown. Do you have any suspicion/hint which one of the many destructors might be going wrong there?

In , Björn Michaelsen (bjoern-michaelsen) wrote :

Created attachment 63336
stacktrace with all threads

In , Björn Michaelsen (bjoern-michaelsen) wrote :

in case it helps: all the other impress testcases in xmloff trigger this too, so the bug description is just there to get a reproducible minimal testcase.

In , Björn Michaelsen (bjoern-michaelsen) wrote :

http://opengrok.libreoffice.org/xref/core/sd/source/ui/toolpanel/TaskPaneFocusManager.cxx#243
looks suspicious: erase() invalidates iterators, yet it is used in a loop

In , julien2412 (serval2412-6) wrote :

I'm not sure but :
"erase" invalidates the iterator in the for loop.
Then it breaks so we exit the inner/for loop but we keep on the outer/do loop (since bLinkRemoved =true), then iLink iterator var is recreated and reinitialized with begin.
In brief, yep "erase" invalidates, but then iLink is valid again.
Now perhaps I miss something obvious.

    234 do
    235 {
    236     bLinkRemoved = false;
    237     LinkMap::iterator iLink;
    238     for (iLink=mpLinks->begin(); iLink!=mpLinks->end(); ++iLink)
    239     {
    240         if (iLink->second.mpTargetWindow == pWindow)
    241         {
    242             RemoveUnusedEventListener(iLink->first);
    243             mpLinks->erase(iLink);
    244             bLinkRemoved = true;
    245             break;
    246         }
    247     }
    248 }
    249 while (bLinkRemoved);
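
To illustrate the point about iterator validity, here is a minimal sketch of two common ways to erase while looping. It uses std::unordered_multimap rather than the boost container, and LinkMap, remove_by_restart and remove_single_pass are made-up names for illustration, not code from TaskPaneFocusManager.cxx. The first variant mirrors the restart pattern of the quoted loop; the second continues from the iterator that erase() returns.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the LinkMap in the quoted code: key -> target name.
typedef std::unordered_multimap<int, std::string> LinkMap;

// Variant 1: restart pattern. After erase() the iterator is dead, so the
// inner loop is left immediately and the scan starts again from begin().
std::size_t remove_by_restart(LinkMap& links, const std::string& target)
{
    std::size_t removed = 0;
    bool linkRemoved;
    do
    {
        linkRemoved = false;
        for (LinkMap::iterator it = links.begin(); it != links.end(); ++it)
        {
            if (it->second == target)
            {
                links.erase(it);   // 'it' is invalid from here on
                ++removed;
                linkRemoved = true;
                break;             // leave the loop without touching 'it'
            }
        }
    } while (linkRemoved);
    assert(removed <= removed + links.size());  // trivial sanity check
    return removed;
}

// Variant 2: single pass, continuing from the iterator erase() returns.
std::size_t remove_single_pass(LinkMap& links, const std::string& target)
{
    std::size_t removed = 0;
    for (LinkMap::iterator it = links.begin(); it != links.end(); )
    {
        if (it->second == target)
        {
            it = links.erase(it);  // iterator to the element after the erased one
            ++removed;
        }
        else
        {
            ++it;
        }
    }
    return removed;
}
```

Both variants only ever use valid iterators. The restart variant rescans from begin() after each removal, which costs more but is not illegal.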

In , Björn Michaelsen (bjoern-michaelsen) wrote :

Julien: no, you are right.

In , Björn Michaelsen (bjoern-michaelsen) wrote :

Created attachment 63364
debug (vcl,sd) stacktrace

In , Björn Michaelsen (bjoern-michaelsen) wrote :

nasty: this one seems to be there only on gcc-4.7. Might still be our bug though.

Björn Michaelsen (bjoern-michaelsen) wrote :

Just to give some early warning to the gcc team -- this might be a bug in LibreOffice just as well as in gcc itself. No trouble with gcc-4.6 though.

In , julien2412 (serval2412-6) wrote :

Created attachment 63401
done.log

On pc Debian x86-64, with master sources updated today, I reproduced the bug. In my case, it failed each time, not sometimes only.

I followed this link to try to debug :
http://wiki.documentfoundation.org/Development/How_to_debug#Debugging_the_subsequent_testslow
but I didn't understand how it worked :
- I couldn't switch off TUI with "C-x a" (or perhaps I badly interpreted, I tried "Ctrl+x" then "a")
- LO StartCenter launched but nothing then
So I stopped everything with Ctrl-c

For information, here are some config elements :
Linux kernel : 3.2.0-2-amd64
gcc (Debian 4.6.3-1) 4.6.3
ldd (Debian EGLIBC 2.13-33) 2.13
java version "1.6.0_24"
OpenJDK Runtime Environment (IcedTea6 1.11.1) (6b24-1.11.1-6)
OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)

autogen.lastrun :
--with-system-odbc
--enable-ext-mysql-connector
--with-system-mysql
--enable-symbols
--enable-ext-barcode
--enable-ext-diagram
--enable-ext-google-docs
--enable-ext-hunart
--enable-ext-nlpsolver
--enable-ext-ct2n
--enable-ext-numbertext
--enable-ext-oooblogger
--enable-ext-pdfimport
--enable-postgresql-sdbc
--enable-ext-presenter-console
--enable-ext-presenter-minimizer
--enable-ext-report-builder
--enable-ext-scripting-beanshell
--enable-ext-scripting-javascript
--enable-ext-typo
--enable-ext-validator
--enable-ext-watch-window
--enable-ext-wiki-publisher
--enable-dbus
--enable-graphite
--enable-evolution2
--enable-werror
--enable-debug
--enable-dbgutil
--enable-crashdump
--enable-kde4
--enable-dependency-tracking
--enable-online-update

Changed in df-libreoffice:
importance: Unknown → Medium
status: Unknown → Confirmed
In , Björn Michaelsen (bjoern-michaelsen) wrote :

I can *not* reproduce this with Ubuntu/Linaro 4.6.3-5ubuntu2 on Ubuntu Quantal, but I can with Ubuntu/Linaro 4.7.0-13ubuntu1 on Ubuntu Quantal

In , Dtardon (dtardon) wrote :

Created attachment 63427
possible fix

I cannot reproduce it here, but the attached patch might help

In , Björn Michaelsen (bjoern-michaelsen) wrote :

Created attachment 63452
stacktrace after applying patch

patch does not seem to help, see attached stacktrace.

In , Björn Michaelsen (bjoern-michaelsen) wrote :

oh, great. Just had this on a pure gcc 4.6 build.

Since I have not ever seen this on LibreOffice 3.5, I am assuming this to be a regression.

Björn Michaelsen (bjoern-michaelsen) wrote : Re: LibreOffice crash in xmloff.Impress.XMLContentImporter::com::sun::star::document::XImporter with gcc-4.7

removing gcc-4.7 -- has been reproduced on a pure 4.6 build.

no longer affects: gcc-4.7 (Ubuntu)
In , Mstahl (mstahl) wrote :

we have similar crashes in rhbz for LO 3.5 Fedora packages,
though i think we closed them as they were not reproducible.

the 3.5 packages are for Fedora 17, which uses GCC 4.7.

so it seems this doesn't just affect tests, but real users also.

In , Björn Michaelsen (bjoern-michaelsen) wrote :

yes, this will affect endusers as it hits on closing an impress doc.
as for reproducibility: yes, this is a heisenbug to make things interesting. with looping the unoapi test, it can be reproduced after a few iterations. Maybe a race condition, or something other funky?

As said above I can reproduce this on both 4.7.1-2ubuntu1 (SVN 20120623/r188906) and 4.6.3-8ubuntu1 (SVN 20120624/r188916) on Ubuntu quantal, but so far have not reproduced this on 4.6.3-1ubuntu5 (only a few selected backports).

So this is either a) something in gcc between 4.6.3 and SVN r188916, b) boost: 1.48.0.2 on precise (not reproducible) vs. 1.49.0.1 on quantal (reproducible), or c) something else changing in the toolchain.

Hints for candidates of the "something else" kind are most welcome.

In , Björn Michaelsen (bjoern-michaelsen) wrote :

Hmmm, seems I'm too stupid to read .spec files:
http://pkgs.fedoraproject.org/gitweb/?p=libreoffice.git;a=blob;f=libreoffice.spec;h=e53252babd085b6621b37fbc0d6a51825069427f;hb=refs/heads/f17
has a BuildRequires boost-devel but no --with-system-libs or --with-system-boost, so I don't know if you are building with internal boost 1.44 or F17's boost 1.48.

But both ways, this seems to suggest our boost update (1.48->1.49) is innocent.

In , Björn Michaelsen (bjoern-michaelsen) wrote :

Possibly related:

https://lists.ubuntu.com/archives/ubuntu-devel/2012-June/035310.html

also note that I didn't recompile all build deps with 4.6 on quantal.

In , Björn Michaelsen (bjoern-michaelsen) wrote :

\o/
Yep, that seems to be the root cause.

Brutally forcing CXX0X off with:
http://anonscm.debian.org/gitweb/?p=pkg-openoffice/libreoffice.git;a=blob;f=patches/trying-to-force-CXX0X-off-for-ABI-incompatibility.diff;h=1fa52e078613a9125ba43823e0a68c2d6085aab5;hb=4d4f4d6035b67dc42d21bb705d3da6570f9f2c05

helps a lot. xmloff unoapi subsequentcheck surviving >20 iterations and counting ...

So I guess we need to get rid of "autodetecting" CXX0X and make it an explicit option at least (to be activated once the distros have all system libs moved over to CXX0X -- likely in one big incompatible ABI step).

In , Björn Michaelsen (bjoern-michaelsen) wrote :

meh, saw the crasher again. But that might not be our (LibreOffice) error, but one of the system packages (or our packages having their own build like libwp*) using the crappy and useless CXX0X ABI (useless as it is even incompatible with itself between 4.6 and 4.7, and might continue to be so).
So we need to make sure all those do *not* use --std=..cxx0x or use our internal version.

In , Björn Michaelsen (bjoern-michaelsen) wrote :

just rechecked on Ubuntu 12.04 LTS precise: 1282 full runs of xmloff_unoapi without one hiccup. So yeah, something did creep into cxx0x compiled stuff in Ubuntu 12.10 quantal in one/any of our deps. The fun is to find out what.

In , Björn Michaelsen (bjoern-michaelsen) wrote :

Seems we are dodging the bullet for now:

 http://gcc.gnu.org/ml/gcc/2012-07/msg00024.html

Björn Michaelsen (bjoern-michaelsen) wrote : Re: LibreOffice crash in xmloff.Impress.XMLContentImporter::com::sun::star::document::XImporter with gcc-4.7

attaching up-to-date stacktrace on quantal.

 this=0xa0

doesn't look quite right, so I assume the "wrong" std::list has done its thing unharmed (thereby corrupting the memory) and this is only the fallout.

In , Michael Meeks (michael-meeks) wrote :

So - IMHO this is not a libreoffice bug :-) and should be closed NOTOURBUG ...

Of course, if we can add a configure check to catch systems that are compiled with an older version of libstdc++ or somesuch that'd be great - we could prolly compile a small file that did some sizeof() checks in configure.

But hopefully the issue has gone away...
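
A rough sketch of the kind of probe such a configure check could compile, assuming the incompatibility shows up as a change in the size of std::list (this thread does not confirm that). The names list_object_size, layout_matches and print_list_object_size are hypothetical. The idea would be to build the same file once with and once without the C++0x flag and compare the reported sizes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <list>

// Size of std::list<int> as seen by this translation unit, with whatever
// compiler flags it was built with (hypothetical helper name).
std::size_t list_object_size()
{
    std::size_t size = sizeof(std::list<int>);
    assert(size >= sizeof(void*));  // a list object holds at least one pointer
    return size;
}

// Compare against a size obtained from another build, e.g. the same file
// compiled with a different -std= setting (hypothetical helper name).
bool layout_matches(std::size_t expected_size)
{
    return list_object_size() == expected_size;
}

// Print the size so that a configure script could capture and compare it.
void print_list_object_size()
{
    std::printf("sizeof(std::list<int>)=%lu\n",
                static_cast<unsigned long>(list_object_size()));
}
```

A size check like this can only catch layout changes that alter the object size; it says nothing about members that were reordered or reinterpreted.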

In , Björn Michaelsen (bjoern-michaelsen) wrote :

So, I did some painful research on this:

On Ubuntu precise (build and run) the bug is not there.

On Ubuntu quantal with gcc 4.7 the bug is there even after fixing the ABI incompatibility.

On Ubuntu precise with LibreOffice packages built on quantal (sticking to quantal versions of non-LibreOffice packages), the bug is still there. So whatever the root cause is, it is introducing the bug already at build time.

So I recompiled the packages on quantal with gcc 4.6 and retested on Ubuntu precise. Bug is still there, so it also is not the compiler update.

To make sure, I recompiled the exact compiler package from precise on quantal and then recompiled LibreOffice with that on quantal and tested the result on precise. The bug is _still_ there.

So, I am getting humbled by these results not to make any bold claims, but it seems to me that the bug is introduced by something changing _at_buildtime_ between precise and quantal (e.g. some dependencies) and it is _not_ gcc.

summary: LibreOffice crash in
xmloff.Impress.XMLContentImporter::com::sun::star::document::XImporter
- with gcc-4.7
Changed in df-libreoffice:
status: Confirmed → Won't Fix
In , Björn Michaelsen (bjoern-michaelsen) wrote :

Reopening, finally found the root cause of this it seems and LibreOffice is not really innocent. At:

http://opengrok.libreoffice.org/xref/core/sd/source/ui/toolpanel/TaskPaneFocusManager.cxx#229

we are generating an equal_range on an unsorted container, just to delete those in the next line. As erasing elements from that container invalidates iterators that is clearly illegal and one has to wonder how that ever worked at all.

Replacing line 229-230 with "mpLinks->erase(pWindow)" is not only simpler, cleaner and easier to read, it might actually be legal. There are some other abuses in that file that need a close look too.
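
For comparison, a minimal sketch of the two ways to drop all entries for one key, assuming a conforming container. std::unordered_multimap is used here as a stand-in, and Links, erase_by_range and erase_by_key are illustrative names, not LibreOffice code. On such a container both forms are expected to remove the same entries; the key form simply avoids handing a pair of iterators to erase().

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the container discussed above.
typedef std::unordered_multimap<int, int> Links;

// Range form: look up all entries for 'key' and pass the whole range to
// erase(iterator, iterator) in one call.
std::size_t erase_by_range(Links& links, int key)
{
    std::pair<Links::iterator, Links::iterator> range = links.equal_range(key);
    std::size_t before = links.size();
    links.erase(range.first, range.second);  // range iterators are not used afterwards
    return before - links.size();
}

// Key form: erase(key) removes every entry with that key and returns
// how many were removed.
std::size_t erase_by_key(Links& links, int key)
{
    std::size_t removed = links.erase(key);
    assert(links.count(key) == 0);  // nothing with this key may remain
    return removed;
}
```

Neither function touches an iterator after the erase call, so on a correct implementation the choice between them is a matter of readability.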

In , Björn Michaelsen (bjoern-michaelsen) wrote :

committed to master, waiting for review at:
https://gerrit.libreoffice.org/#/c/373/

Björn Michaelsen (bjoern-michaelsen) wrote : Re: LibreOffice crash in xmloff.Impress.XMLContentImporter::com::sun::star::document::XImporter

As per:
https://gerrit.libreoffice.org/#/c/373/

boost::unordered_map<>::erase(iterator, iterator) is broken in boost-1.49/gcc-4.7 on quantal (and likely too on gcc-4.6).

summary: - LibreOffice crash in
- xmloff.Impress.XMLContentImporter::com::sun::star::document::XImporter
+ boost::unordered_map<>::erase(iterator, iterator) broken on quantal
affects: boost (Ubuntu) → boost1.49 (Ubuntu)
summary: - boost::unordered_map<>::erase(iterator, iterator) broken on quantal
+ boost::unordered_multimap<>::erase(iterator, iterator) broken on quantal
In , Björn Michaelsen (bjoern-michaelsen) wrote :

tested again with internal boost 1.44 on quantal: 167 iterations without a problem so far. so closing as "not our bug" again, but still would welcome the patch below to be integrated to 3.6 as it might help and can't hurt.

Changed in boost1.49 (Ubuntu):
status: New → Confirmed
importance: Undecided → High
Björn Michaelsen (bjoern-michaelsen) wrote : Re: boost::unordered_multimap<>::erase(iterator, iterator) broken on quantal

As per upstream bug:
Building LibreOffice with its internal boost version (1.44) evades the bug on quantal. It works fine with boost 1.48 on precise, so it's a regression.
Using internal boost for LibreOffice is a workaround, but not a good one:
- there was a hiccup in the build with 1.44 (I had to rebuild codemaker manually once for the build to complete. Though that might be fixable).
- it would need careful reexamination of the interfaces between LibreOffice and its close deps for boost classes that might be incompatible between boost versions
- boost 1.49 would still be broken on quantal
- as this is a heisenbug it possibly affects a _lot_ of apps unknowingly
- there is no working alternative on quantal

As this is a regression between 1.48 and 1.49 I had a look at the diff in boost/unordered, but it's 1.5 KLOC of heavy template code, so no obvious quick fix from my side.

@doko: Can we have 1.48 on quantal too as a nonbroken boost implementation?

In , Libreoffice-bugs (libreoffice-bugs) wrote :

Bjoern Michaelsen committed a patch related to this issue.
It has been pushed to "libreoffice-3-6":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=861e55bd889d9f5f5b37724b3615e9355e2d5c15&g=libreoffice-3-6

fdo#51324 lp#1017125 rhbz#806236 rhbz#823272: erase on invalid iterators

It will be available in LibreOffice 3.6.1.

Steve Langasek (vorlon) wrote :

Bjoern, we definitely should not be including two copies of boost in main to work around this bug. Can you provide a minimal test case showing the problem with the boost code, so we can work on getting that fixed?

Steve Langasek (vorlon) wrote :

Also, is there any bug report with boost upstream for this?

Björn Michaelsen (bjoern-michaelsen) wrote :

No, there is no upstream bug at boost about that that I know of yet. I'll see if I get to providing a minimal testcase.

Changed in libreoffice (Ubuntu):
status: New → In Progress
In , Sbergman (sbergman) wrote :

(In reply to comment #26)
> Reopening, finally found the root cause of this it seems and LibreOffice is not
> really innocent.

Just for the record, it more looks like a problem of boost than of LibreOffice to me (though the commit that happens to fix it is fine in and of itself anyway, of course); quoting recent #libreoffice-dev:

Aug 07 11:56:18 <Sweetshark> caolan: could you please review https://gerrit.libreoffice.org/#/c/373/ for libreoffice-3-6 ?
Aug 07 11:57:28 <sberg> Sweetshark, but aCandidates.first/second are not used after erase, so the original code should be fine?
Aug 07 12:01:21 <Sweetshark> sberg: afaik the iterator are not guaranteed to be stable _inside_ an erase (at least a few stl pages warned about that).
Aug 07 12:02:15 <sberg> Sweetshark, and "Remove the links [plural!] from the given window" suggests that there can indeed be multiple entries for pWindow (after all, its an unordered_multimap)
Aug 07 12:02:47 <sberg> Sweetshark, "not guaranteed to be stable": that would render erase(iterator,iterator) completely useless
Aug 07 12:07:38 <Sweetshark> sberg: fact is: without that I crash after ~10 iterations, with the change it crashes after >100 iterations on a different issue here. So either we are doing something illegal (which -- as you say is unlikely), or boost-1.49/gcc4.7 is broken wrt that.
Aug 07 12:15:04 <sberg> Sweetshark, or, only removing a single entry per pWindow instead of all of them happens to mask some other error
Aug 07 12:17:05 <Sweetshark> sberg: huh? according to boost docs erase(key&) also kills _all_ pWindows
Aug 07 12:18:39 <sberg> Sweetshark, ah, right; odd, then
Aug 07 12:20:19 <Sweetshark> sberg: note I also replaced some of the std::lists with std::deques before to evade ABI breakage. however that did not fix the issue.

Björn Michaelsen (bjoern-michaelsen) wrote : Re: boost::unordered_multimap<>::erase(iterator, iterator) broken on quantal

Patched around with:
http://anonscm.debian.org/gitweb/?p=pkg-openoffice/libreoffice.git;a=commitdiff;h=576f40c827638e002752fee256c1f67b7b493007
in Ubuntu packaging. Upstreamed and backported to 3.6 upstream as:
https://gerrit.libreoffice.org/#/c/373/

However I could not reproduce this in a simple testcase so far although I tried. Looking further I found that LibreOffice uses an icky cast in this for the map, although unless I am mistaken, that should not lead to any issues on our archs (it might cause unhelpful hash collisions on windows 64-Bit though). Just to be safe and sane I also committed a better cast there:
https://gerrit.libreoffice.org/gitweb?p=core.git;a=commitdiff;h=03d64b736ac612f7ce2e7c40a0be04a6e23ae489

Hints or corrections (on why that might indeed have been troublesome) most welcome.

Changed in libreoffice (Ubuntu):
status: In Progress → Fix Committed
Björn Michaelsen (bjoern-michaelsen) wrote : Re: boost::unordered_multimap<>::erase(iterator, iterator) broken on quantal

Reproduction scenario: Note that the rand() stuff is needed, a sequence of 500 subsequent entries does not cut it (hash collisions needed to trigger the bug maybe?).
Compile with gcc (Ubuntu/Linaro 4.7.1-6ubuntu1) 4.7.1 and:
g++ -I/usr/include lp1017125.cxx && ./a.out
on quantal and it crashes here every time.

Björn Michaelsen (bjoern-michaelsen) wrote :

note: the crash happens on destruction of the map, not while iterating curiously.

Björn Michaelsen (bjoern-michaelsen) wrote :

Walking through the powers of 2 and then bisecting, it seems it is stable up to 12 elements and with 13 it begins to fall apart.
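
The attached lp1017125.cxx is not reproduced in this report. As a rough sketch of the shape such a test might take (an assumption, not the attachment itself): fill a multimap with rand() keys, erase each key's equal_range with erase(iterator, iterator), then let the map go out of scope. The function is templated so that boost::unordered_multimap<int, int> from <boost/unordered_map.hpp> could be substituted; StdMultiMap, fill_and_erase_ranges and the key_space parameter are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <unordered_map>
#include <utility>
#include <vector>

// Stand-in container so the sketch builds without Boost.
typedef std::unordered_multimap<int, int> StdMultiMap;

// Insert 'count' entries with pseudo-random keys in [0, key_space), then
// erase every key's equal_range via erase(first, last). Returns the number
// of entries left over. The map is destroyed when the function returns,
// which is the point where the report above saw the crash.
template <typename MultiMap>
std::size_t fill_and_erase_ranges(std::size_t count, unsigned seed, int key_space)
{
    assert(key_space > 0);
    std::srand(seed);

    MultiMap map;
    std::vector<int> keys;
    for (std::size_t i = 0; i < count; ++i)
    {
        int key = std::rand() % key_space;  // key_space is an assumption of this sketch
        keys.push_back(key);
        map.insert(std::make_pair(key, static_cast<int>(i)));
    }

    for (std::size_t i = 0; i < keys.size(); ++i)
    {
        std::pair<typename MultiMap::iterator, typename MultiMap::iterator> range
            = map.equal_range(keys[i]);
        map.erase(range.first, range.second);  // empty range if the key is already gone
    }
    return map.size();
}
```

With a conforming container, fill_and_erase_ranges<StdMultiMap>(13, 1, 8) returns 0 and the map is destroyed cleanly on return; 13 is the element count at which the comment above saw the boost 1.49 container start to fall apart.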

Steve Langasek (vorlon) wrote :

Marking triaged for gcc now that we have a test case.

Changed in gcc-4.7 (Ubuntu):
importance: Undecided → Medium
status: New → Triaged
Steve Langasek (vorlon)
Changed in gcc-4.7 (Ubuntu):
assignee: nobody → Matthias Klose (doko)
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libreoffice - 1:3.6.0~rc4-0ubuntu3

---------------
libreoffice (1:3.6.0~rc4-0ubuntu3) quantal-proposed; urgency=low

  * backport patch to evade fdo#51324 (LP: #1017125)
  * pure white progress bar is better for now (LP: #1026059)
  * reenable subsequentcheck
  * update patch queue:
    - readd split-binfilters-and-evo.diff from debian
    - add backported dont-let-autoextension-interfere-with-kfiledialog.diff
      from debian
    - remove obsolete force C++ ABI patch
    - remove obsolete lp-904212-add-missing-mimetypes-to-impress.desktop.diff
    - remove obsolete remove-broken-mysqlcon-version-check.diff
    - remove unoapi-test disabling patch, now that we seem to evade lp#1017125
  * reenable reportbuilder for universe ppa builds only (LP: #992232)
  * add packagekit patch
  * add unitymenus patch
 -- Bjoern Michaelsen <email address hidden> Tue, 21 Aug 2012 02:46:51 +0200

Changed in libreoffice (Ubuntu):
status: Fix Committed → Fix Released
Matthias Klose (doko) wrote :

using std::unordered_map in c++11 mode instead of the boost version does work, so I don't see what can/should be fixed on the GCC side. Is there now a boost upstream report as suggested earlier?

Changed in gcc-4.7 (Ubuntu):
status: Triaged → Invalid
Matthias Klose (doko) wrote :

forgot to say that the testcase has the same behaviour for 4.6 and 4.7

Changed in boost:
status: Unknown → New
Björn Michaelsen (bjoern-michaelsen) wrote :

Claimed upstream to be fixed with https://svn.boost.org/trac/boost/changeset/80894.

@doko: Since https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/1067907 likely is caused by the same root cause, is it possible to backport that fix, so that the next LibreOffice SRU is fixed with this?

Björn Michaelsen (bjoern-michaelsen) wrote :

The attached branch with the backported fix seems to indeed fix the testcase. Can someone pick this up and sponsor it? I would also suggest SRUing it to quantal as it possibly fixes bug 1067907.

Dimitri John Ledkov (xnox) wrote :

Subscribing ~ubuntu-sponsors.

For SRU, please fill out the template in the bug description as per https://wiki.ubuntu.com/StableReleaseUpdates#SRU_Bug_Template

Changed in boost1.49 (Ubuntu Quantal):
importance: Undecided → High
Changed in gcc-4.7 (Ubuntu Quantal):
status: New → Invalid
importance: Undecided → Medium
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package boost1.49 - 1.49.0-3.1ubuntu4

---------------
boost1.49 (1.49.0-3.1ubuntu4) raring; urgency=low

  * backport fix for boost unordered (LP: #1017125)
 -- Bjoern Michaelsen <email address hidden> Thu, 08 Nov 2012 02:02:20 +0100

Changed in boost1.49 (Ubuntu):
status: Confirmed → Fix Released
Björn Michaelsen (bjoern-michaelsen) wrote :
description: updated
summary: - boost::unordered_multimap<>::erase(iterator, iterator) broken on quantal
+ [SRU quantal] boost::unordered_multimap<>::erase(iterator, iterator)
+ broken in bosst1.49
summary: [SRU quantal] boost::unordered_multimap<>::erase(iterator, iterator)
- broken in bosst1.49
+ broken in boost1.49
Björn Michaelsen (bjoern-michaelsen) wrote :

please find attached the minimal debdiff to review for SRU to quantal. Package is available for pickup on chinstrap too.

Björn Michaelsen (bjoern-michaelsen) wrote :

Humbly assigning to doko for consideration as a quantal SRU.

Changed in libreoffice (Ubuntu Quantal):
assignee: nobody → Matthias Klose (doko)
Revision history for this message
In , Bjoern-michaelsen-e (bjoern-michaelsen-e) wrote :

boost::unordered_multimap<>::erase(iterator, iterator) is broken in boost1.49-1.51, causing spurious segfaults in applications that use it, including LibreOffice.

Reproducible: Always

Steps to Reproduce:
compile and run https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/1017125/+attachment/3271642/+files/lp1017125.cxx
Actual Results:
crash

Expected Results:
no crash
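
For readers without access to the attachment, here is a minimal sketch of the kind of check involved. It is not the attached lp1017125.cxx. The names `Map`, `erase_key_range` and `range_erase_is_consistent` are made up for illustration, and `std::unordered_multimap` is used as a stand-in so the snippet builds without Boost. To exercise the affected code, switch the typedef to `boost::unordered_multimap<int, int>` from `<boost/unordered_map.hpp>`. Whether this particular pattern triggers the crash on an affected Boost version is an assumption; it only shows the erase(iterator, iterator) call and a consistency walk afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>

// Stand-in container (assumption): replace with
// boost::unordered_multimap<int, int> to test the Boost implementation.
typedef std::unordered_multimap<int, int> Map;

// Erase every entry stored under `key` using the two-iterator erase
// overload and return how many entries were removed.
std::size_t erase_key_range(Map& m, int key) {
    std::pair<Map::iterator, Map::iterator> range = m.equal_range(key);
    const std::size_t before = m.size();
    m.erase(range.first, range.second);
    return before - m.size();
}

// Fill a multimap with 3 values for each of 8 keys, range-erase the even
// keys, then walk the container to confirm it is still consistent.
bool range_erase_is_consistent() {
    Map m;
    for (int k = 0; k < 8; ++k)
        for (int v = 0; v < 3; ++v)
            m.insert(std::make_pair(k, v));

    for (int k = 0; k < 8; k += 2)
        if (erase_key_range(m, k) != 3)
            return false;

    if (m.size() != 12)
        return false;

    // A corrupted container would typically crash or miscount here.
    std::size_t walked = 0;
    for (Map::const_iterator it = m.begin(); it != m.end(); ++it) {
        if (it->first % 2 == 0)
            return false;  // an erased key is still reachable
        ++walked;
    }
    if (walked != 12)
        return false;

    for (int k = 1; k < 8; k += 2)
        if (m.count(k) != 3)
            return false;
    return true;
}
```

Calling `range_erase_is_consistent()` from a small `main` and returning non-zero on failure would give a pass/fail result similar in spirit to the "crash" versus "no crash" outcome above.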

Revision history for this message
In , Diego Elio Pettenò (flameeyes) wrote :

Lovely.

Well, 1.52 is in ~arch right now ... but it'll take a while to get this in stable I'm afraid. Is there a patch?

Revision history for this message
In , Bjoern-michaelsen-e (bjoern-michaelsen-e) wrote :
Revision history for this message
Dimitri John Ledkov (xnox) wrote :

Do not accept boost1.49 without a matching boost-mpi-source1.49.
Otherwise the mpi package will become uninstallable with the sru.

Changed in boost-mpi-source1.49 (Ubuntu):
status: New → Fix Released
Revision history for this message
Björn Michaelsen (bjoern-michaelsen) wrote :

adding minimal debdiff for mpi

Revision history for this message
Björn Michaelsen (bjoern-michaelsen) wrote :

boost1.49_1.49.0-3.1ubuntu1.1 and boost-mpi-source1.49_1.49.0-3.1ubuntu1.1 for quantal SRU prepared for pickup.

Matthias Klose (doko)
Changed in gcc-4.7 (Ubuntu):
assignee: Matthias Klose (doko) → nobody
Revision history for this message
Adam Conrad (adconrad) wrote : Please test proposed package

Hello Björn, or anyone else affected,

Accepted boost1.49 into quantal-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/boost1.49/1.49.0-3.1ubuntu1.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please change the bug tag from verification-needed to verification-done. If it does not, change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in boost1.49 (Ubuntu Quantal):
status: New → Fix Committed
tags: added: verification-needed
Changed in boost-mpi-source1.49 (Ubuntu Quantal):
status: New → Fix Committed
Revision history for this message
Adam Conrad (adconrad) wrote :

Hello Björn, or anyone else affected,

Accepted boost-mpi-source1.49 into quantal-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/boost-mpi-source1.49/1.49.0-3.1ubuntu1.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please change the bug tag from verification-needed to verification-done. If it does not, change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in boost1.49 (Gentoo Linux):
importance: Unknown → Medium
status: Unknown → New
Changed in boost1.49 (Debian):
status: Unknown → New
Revision history for this message
Björn Michaelsen (bjoern-michaelsen) wrote :

After installing the -proposed package, the test doesn't segfault anymore for me on amd64 ...

tags: added: verification-done
removed: verification-needed
Revision history for this message
Scott Kitterman (kitterman) wrote : Update Released

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Scott Kitterman (kitterman) wrote :

SRU released to updates.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package boost-mpi-source1.49 - 1.49.0-3.1ubuntu1.1

---------------
boost-mpi-source1.49 (1.49.0-3.1ubuntu1.1) quantal-proposed; urgency=low

  * Backport fix for boost unordered (the mpi/universe portion of the patch).
    LP: #1017125.
 -- Bjoern Michaelsen <email address hidden> Wed, 14 Nov 2012 15:32:16 +0100

Changed in boost-mpi-source1.49 (Ubuntu Quantal):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package boost1.49 - 1.49.0-3.1ubuntu1.1

---------------
boost1.49 (1.49.0-3.1ubuntu1.1) quantal; urgency=low

  * SRU/cherrypick fix for boost unordered (LP: #1017125)
 -- Bjoern Michaelsen <email address hidden> Thu, 08 Nov 2012 02:02:20 +0100

Changed in boost1.49 (Ubuntu Quantal):
status: Fix Committed → Fix Released
Revision history for this message
In , Michael (michael-redhat-bugs) wrote :

just for the record this was most likely caused by
a bug in boost that is fixed in newer boost and
worked around in newer LibreOffice upstream.

Changed in boost:
status: New → Fix Released
Revision history for this message
Dimitri John Ledkov (xnox) wrote :

Fixed in boost instead.

Changed in libreoffice (Ubuntu Quantal):
status: New → Invalid
Changed in boost1.49 (Debian):
status: New → Fix Released
Revision history for this message
anonymous mouse (eeefafe) wrote :

I just had LibreOffice 3.5.7.2 Build ID: 350m1(Build:2) crash on me with a document. The crash report said that bug 1067907 was the bug. That bug says it is a duplicate of this bug. As such, I'm attaching the relevant document here.

It's a docx that I opened in Abiword and then saved as an Abiword .doc (which I think might actually be RTF, with merely a .doc extension). But it crashes LibreOffice. (The original .docx doesn't show up correctly in either darn program, but the bit I'm interested in does show up better in Abiword.)

I'm using Ubuntu 12.04 with Gnome 3.

Changed in boost1.49 (Fedora):
importance: Unknown → Undecided
status: Unknown → Won't Fix