Comment 5 for bug 2060275

Nathan Scott (nathans) wrote : Re: Fwd: [Bug 2060275] [NEW] pmproxy crash at startup in libpcp_web.so.1

Hi Martin,

On Tue, Apr 9, 2024 at 6:09 PM Martin Pitt <email address hidden> wrote:
>
> Nathan Scott [2024-04-09 17:30 +1000]:
> > > It's not really unknown, it's "just" a file conflict:
> >
> > Yeah - the unknown bit for me is "why tho" - I cannot see conflicting
> > files in those packages that would have any debug symbols (there's
> > some common directories... but no binaries shared AFAICS).
> >
> > > | dpkg: error processing archive build/deb/pcp-pmda-infiniband-dbgsym_6.2.1-0.20240409.f312285_amd64.deb (--install):
> > > | trying to overwrite '/usr/lib/debug/.build-id/57/02df011cfaf166b948e1fefde236eaf3a6ee65.debug', which is also in package pcp-dbgsym 6.2.1-0.20240409.f312285
> > > |
> > > | dpkg: error processing archive build/deb/pcp-testsuite-dbgsym_6.2.1-0.20240409.f312285_amd64.deb (--install):
> > > | trying to overwrite '/usr/lib/debug/.build-id/17/6edc7e590f766a2ea87b5decaeb994d7c48d24.debug', which is also in package pcp-dbgsym 6.2.1-0.20240409.f312285
> > >
> > > I.e. these are shipped in two different packages.
> >
> > "these"?
>
> These two files, i.e.
> /usr/lib/debug/.build-id/57/02df011cfaf166b948e1fefde236eaf3a6ee65.debug exists
> both in pcp-pmda-infiniband-dbgsym and pcp-dbgsym. Presumably they shouldn't be
> in the latter.

Yep, understood - but again, I'm not understanding why. AFAICS, there
are no files with the same names or contents between those packages.
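
One way to double-check that is to compare the file lists of the two
packages directly. A minimal sketch, assuming each package's file list has
been saved to a text file with one path per line (the common_paths helper
and the list file names are made up for illustration):

```shell
# common_paths LIST_A LIST_B
# Print every path that appears in both listing files.
# Each list is de-duplicated first, so a line that survives "uniq -d"
# must have come from both files.
common_paths() {
    { sort -u "$1"; sort -u "$2"; } | sort | uniq -d
}

# Hypothetical usage; the .deb names are placeholders. Directories are
# filtered out because common directories are expected to be shared.
#   dpkg-deb --fsys-tarfile first.deb  | tar -tf - | grep -v '/$' > first.list
#   dpkg-deb --fsys-tarfile second.deb | tar -tf - | grep -v '/$' > second.list
#   common_paths first.list second.list
```

Any path printed is shipped by both packages; no output means the lists
do not overlap.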

> > OK ... so that's pointing towards v3 archives a little bit, good.
> >
> > > > The limited stack we have suggests we're in pmproxy log discovery
> > > > code, in an inotify/libuv event, which does have v3-specific code.
> > > >
> > > > For those who can reproduce this, it'd be worth experimenting and
> > > > setting the following field back to 2 ... (requires pmlogger restart).
> > > >
> > > > $ grep PCP_ARCHIVE_VERSION /etc/pcp.conf
> > > > PCP_ARCHIVE_VERSION=3
>
> I created https://github.com/cockpit-project/cockpit/pull/20275 with an x120
> test amplification, and interestingly the overwhelming majority of test
> runs actually crash there. So with that I have fairly high confidence in
> the significance of test results when trying a change.
>
> I tested with
>
> sed -i 's/PCP_ARCHIVE_VERSION=3/PCP_ARCHIVE_VERSION=2/' /etc/pcp.conf
>
> This runs on image preparation, i.e. clean /var/log and no daemons running. The
> VM is freshly booted for each test, so no running pmlogger. There is no
> observed change: it still crashes the same way and with the same frequency
> ("almost every time").

OK, so it's not related to the recent v3 archive changes then.
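
For anyone repeating that experiment, here is a minimal sketch that makes
the same edit and then confirms it took effect. The set_archive_version
helper name is made up for illustration, and it assumes GNU sed (for -i)
and a PCP_ARCHIVE_VERSION line already present in the file:

```shell
# set_archive_version CONF VERSION
# Rewrite the PCP_ARCHIVE_VERSION line in CONF, then verify the result.
# Returns non-zero if the expected line is not found afterwards.
set_archive_version() {
    conf=$1
    version=$2
    sed -i "s/^PCP_ARCHIVE_VERSION=.*/PCP_ARCHIVE_VERSION=$version/" "$conf"
    grep -q "^PCP_ARCHIVE_VERSION=$version\$" "$conf"
}

# Hypothetical usage (needs root for the real file):
#   set_archive_version /etc/pcp.conf 2
```

As noted earlier in the thread, a running pmlogger has to be restarted
before the new value is used.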

I have a fairly recent Debian VM locally - I tried reproducing the problem
there but had no luck: it always starts and runs fine. I tried running it
under valgrind too, just in case, but again nothing.

We also don't see this issue on Fedora, CentOS or RHEL, and SuSE is not
reporting it either. Since (I think) all Debian versions are fine (can we
confirm?), it might be an Ubuntu-specific issue. Are there patches or
global compiler options unique to the Ubuntu versions where this is failing?
Otherwise, I'm fresh out of ideas.

I can't really justify time chasing this any further - I think we'll need some
assistance from Ubuntu folk to help us gain additional insights.

cheers.

--
Nathan