On Tue, Apr 9, 2024 at 6:09 PM Martin Pitt <email address hidden> wrote:
>
> Nathan Scott [2024-04-09 17:30 +1000]:
> > > It's not really unknown, it's "just" a file conflict:
> >
> > Yeah - the unknown bit for me is "why tho" - I cannot see conflicting
> > files in those packages that would have any debug symbols (there's
> > some common directories... but no binaries shared AFAICS).
> >
> > > | dpkg: error processing archive build/deb/pcp-pmda-infiniband-dbgsym_6.2.1-0.20240409.f312285_amd64.deb (--install):
> > > | trying to overwrite '/usr/lib/debug/.build-id/57/02df011cfaf166b948e1fefde236eaf3a6ee65.debug', which is also in package pcp-dbgsym 6.2.1-0.20240409.f312285
> > > |
> > > | dpkg: error processing archive build/deb/pcp-testsuite-dbgsym_6.2.1-0.20240409.f312285_amd64.deb (--install):
> > > | trying to overwrite '/usr/lib/debug/.build-id/17/6edc7e590f766a2ea87b5decaeb994d7c48d24.debug', which is also in package pcp-dbgsym 6.2.1-0.20240409.f312285
> > >
> > > I.e. these are shipped in two different packages.
> >
> > "these"?
>
> These two files, i.e.
> /usr/lib/debug/.build-id/57/02df011cfaf166b948e1fefde236eaf3a6ee65.debug exists
> both in pcp-pmda-infiniband-dbgsym and pcp-dbgsym. Presumably they shouldn't be
> in the latter.

Yep, understood - but again, I'm not understanding why. AFAICS, there
are no files with the same names or contents between those packages.

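
In case it helps anyone cross-check this, here is a rough sketch for comparing the file lists of two .deb files and for turning a conflicting .build-id path back into the build ID it encodes. The function names are made up for illustration; the only external tools assumed are dpkg-deb, tar, sort, uniq and sed.

```shell
# List regular-file paths inside a .deb; tar prints directories with a trailing "/".
deb_files() {
    dpkg-deb --fsys-tarfile "$1" | tar -tf - | grep -v '/$' | sort -u
}

# Print the paths that occur in both listings (one path per line in each file).
common_paths() {
    { sort -u "$1"; sort -u "$2"; } | sort | uniq -d
}

# Turn a .build-id debug path into the build ID it encodes
# (directory name = first two hex digits, file name = the rest).
debug_path_to_build_id() {
    printf '%s\n' "$1" |
        sed -n 's|.*/\.build-id/\([0-9a-f][0-9a-f]\)/\([0-9a-f]*\)\.debug$|\1\2|p'
}

# Example usage (A_DEB and B_DEB are placeholders for the two dbgsym packages):
#   deb_files "$A_DEB" > a.list
#   deb_files "$B_DEB" > b.list
#   common_paths a.list b.list
#   readelf -n /path/to/some/binary | grep 'Build ID'
```

Build IDs are normally derived from the binary's contents, so a shared ID usually means the same binary (or an identical copy) ends up in both packages. Comparing the `readelf -n` output of the installed binaries against the ID from the dpkg error might show which one that is.
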
> > OK ... so that's pointing towards v3 archives a little bit, good.
> >
> > > > The limited stack we have suggests we're in pmproxy log discovery
> > > > code, in an inotify/libuv event, which does have v3-specific code.
> > > >
> > > > For those who can reproduce this, it'd be worth experimenting and
> > > > setting the following field back to 2 ... (requires pmlogger restart).
> > > >
> > > > $ grep PCP_ARCHIVE_VERSION /etc/pcp.conf
> > > > PCP_ARCHIVE_VERSION=3
>
> I created https://github.com/cockpit-project/cockpit/pull/20275 with an x120
> test amplification, and interestingly the overwhelming majority of test
> runs actually crash there. So with that I have a fairly high confidence in
> the significance of test results when trying a change.
>
> I tested with
>
> sed -i 's/PCP_ARCHIVE_VERSION=3/PCP_ARCHIVE_VERSION=2/' /etc/pcp.conf
>
> This runs on image preparation, i.e. clean /var/log and no daemons running. The
> VM is freshly booted for each test, so no running pmlogger. There is no
> observed change, it still crashes the same way and with the same frequency
> ("almost every time").

OK, so it's not related to the recent v3 archive changes then.

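
For anyone repeating that experiment, a small sketch that sets the value regardless of what it was before and then confirms the line is really present. The function name is hypothetical, and `sed -i` assumes GNU sed, as in the command quoted above.

```shell
# Set PCP_ARCHIVE_VERSION in a pcp.conf-style file, then print the resulting
# line. Returns non-zero if the file has no such setting, so a silent no-op
# is not mistaken for a successful change.
set_archive_version() {
    conf=$1
    version=$2
    sed -i "s/^PCP_ARCHIVE_VERSION=.*/PCP_ARCHIVE_VERSION=$version/" "$conf" &&
        grep -x "PCP_ARCHIVE_VERSION=$version" "$conf"
}

# Example usage (as root; pmlogger needs a restart afterwards, as noted above):
#   set_archive_version /etc/pcp.conf 2
```

The final grep is the point of the sketch: it rules out the case where the substitution matched nothing and the test ran with the old setting.
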
I have a fairly recent Debian VM locally. I tried reproducing the problem
there but had no luck: it always starts and runs fine. I also tried running
it under valgrind just in case, but again nothing.

We also don't see this issue on Fedora, CentOS or RHEL, and SuSE is not
reporting it either. Since (I think) all Debian versions are fine (can we
confirm?), it might be an Ubuntu-specific issue. Are there patches or some
global compiler option(s) unique to the Ubuntu versions where this is failing?
Otherwise, I'm fresh out of ideas.

I can't really justify time chasing this any further - I think we'll need some
assistance from Ubuntu folk to help us gain additional insights.
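
One possible starting point for the compiler-option question, sketched under the assumption that any difference shows up in the packaging-level defaults reported by dpkg-buildflags. It would not show options that a distribution's compiler enables by itself. The function and file names are made up for illustration.

```shell
# Print the "FLAG=value" lines that appear in only one of two dumps, i.e. the
# flags whose values differ (or that exist on one side only).
diff_flags() {
    { sort -u "$1"; sort -u "$2"; } | sort | uniq -u
}

# Example usage:
#   dpkg-buildflags --dump > debian.flags     # run on a Debian host
#   dpkg-buildflags --dump > ubuntu.flags     # run on an affected Ubuntu host
#   diff_flags debian.flags ubuntu.flags
```

Because the output is sorted, both values of a differing flag end up on adjacent lines, which makes the comparison easy to read.
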

cheers.
--
Nathan