Comment 6 for bug 1915828

Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

I'm pasting an email answer here so we keep track of all that has been discussed for this issue:

----

> Me and Dan have been working recently on a customer case reported as LP: #1915828. Turns out that some linker "magic" used inside libqb broke pacemaker (and potentially any package using QB_LOG_INIT_DATA from the library or even every packaged linked against libqb).

I have checked libqb commits and: https://paste.ubuntu.com/p/vTdYJdC4Hc/ . I see that upstream added some options in configure.ac related to the linking issue, even quoting debian/ubuntu option to override libtool variable by force. There is also an existing test for the __attribute__((section("__verbose"))) breakage detection now (gcc_has_attribute_section_visible variable in configure.ac.

> Dan found an ABI-compatible way of mitigating the problem by rebuilding pacemaker with the QB_KILL_ATTRIBUTE_SECTION defined. This strips the magic and makes pacemaker useful again.

I see the mitigation (QB_KILL_ATTRIBUTE_SECTION) comes from:

--
32555d8 tests: add a script to generate callsite-heavy logging client…
 ...so as to evaluate use of resources. In particular, the intention
 here is to uncover the observable differences between the same logging
 code built with callsite section (default when available) and
 purposefully (overriding that default by force) without it.
...
--

And they even added “tests/functional/log_callsite_bench_gen.py” to measure the impact of this mitigation.

I’m particularly worried with:


    Based on the above, we can conclude that leveraging the callsite
    section for logging as facilitated by the toolchain intrinsics is
    beneficial, especially for performance-critical applications (corosync
    being the showcase here). Therefore it's desired to struggle for
    retaining this nifty trick despite some troubles emerged with recent
    binutils releases (starting with 2.29) and the changed behaviour we
    relied on so far in respective ld.bfd linkers (as mentioned in
    preceding commits). That motive is immediately followed -- well,
    judging the impact fairly, actually outclassed -- with the intention
    to preserve binary compatibility (incl. continuous library support for
    callsite section offloading spread in the existing client space widely
    for quite some years already) to the utmost extent possible.
--

I believe this will be accepted by the SRU team but this, for sure, has to be mentioned in the public bug. I would add to [regression potential] the fact that the logging mechanism would stir heap more often (commit log has even a time execution delta).

Shouldn’t this bug also affect all the libqb0 rdepends ? I can see pacemaker, sbd, corosync and usbguard source packages.

> The problem is (more details in comments #3-#5) some symbols disappear from the package. Those symbols doesn't seem to be used anywhere explicitly, but we were wondering if it's ok to just drop those symbols or maybe to implement a change in libqb to create dummy constructors (e.g. https://pastebin.canonical.com/p/Y4fk747YfK/) to ensure the symbols are available just in case.

For the pacemaker fix I’ll let the SRU team to discuss whether they would like to have symbols (or not) the symbols in the new binary (after this fix). I don’t think those symbols are used elsewhere (from rdepends of libcrm* they would only be used by either pacemaker OR sbd).

I think next step here is to offer a patch and ask for the SRU team input/review.