git-annex 10.20230126-3 fails to build from source

Bug #2019992 reported by Benjamin Drung
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
| --- | --- | --- | --- | --- |
| ghc (Ubuntu) | Fix Released | Undecided | Unassigned | |
| git-annex (Debian) | Fix Released | Unknown | | |
| git-annex (Ubuntu) | Fix Released | High | Sergio Durigan Junior | |

Bug Description

git-annex 10.20230126-3 fails to build from source:

```
[618 of 681] Compiling Assistant.WebApp.Types ( Assistant/WebApp/Types.hs, dist/build/git-annex/git-annex-tmp/Assistant/WebApp/Types.o, dist/build/git-annex/git-annex-tmp/Assistant/WebApp/Types.dyn_o )
/usr/bin/ld.gold: error: dist/build/git-annex/git-annex-tmp/Utility/Yesod.dyn_o: requires dynamic R_X86_64_PC32 reloc against 'UtilityziYesod_widgetFilezuw_closure' which may overflow at runtime; recompile with -fPIC
collect2: error: ld returned 1 exit status
`x86_64-linux-gnu-gcc' failed in phase `Linker'. (Exit code: 1)
```

Full amd64 build log: https://launchpadlibrarian.net/664477495/buildlog_ubuntu-mantic-amd64.git-annex_10.20230126-3_BUILDING.txt.gz
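
For readers unfamiliar with GHC symbol names: the symbol in the linker error is Z-encoded, which is how GHC mangles Haskell identifiers into names that are valid for the assembler and linker. The following is a minimal sketch of a decoder covering only a handful of common escapes; the `Z_DECODE` table and the `z_decode` helper are illustrative names, not part of any GHC tooling, and tuple or Unicode escapes are not handled.

```python
# Small subset of GHC's Z-encoding escapes (two-character code -> original character).
# Illustrative only: the full scheme has more escapes (tuples, Unicode, ...).
Z_DECODE = {
    "zi": ".", "zu": "_", "zm": "-", "zh": "#", "zq": "'",
    "zd": "$", "zp": "+", "ZC": ":", "zz": "z", "ZZ": "Z",
}


def z_decode(symbol: str) -> str:
    """Decode the Z-encoded escapes listed in Z_DECODE; leave other characters untouched."""
    out = []
    i = 0
    while i < len(symbol):
        pair = symbol[i:i + 2]
        if pair in Z_DECODE:
            out.append(Z_DECODE[pair])
            i += 2
        else:
            out.append(symbol[i])
            i += 1
    return "".join(out)


# Symbol taken from the linker error above.
print(z_decode("UtilityziYesod_widgetFilezuw_closure"))
# prints: Utility.Yesod_widgetFile_w_closure
```

Decoded this way, the failing relocation points at a closure belonging to the `Utility.Yesod` module, which matches the object file named in the error.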

Benjamin Drung (bdrung)
Changed in git-annex (Ubuntu):
importance: Undecided → High
status: New → Confirmed
tags: added: update-excuse
Simon Chopin (schopin) wrote :

I looked at this issue for quite a while. The package does *not* FTBFS in Debian. I tried passing various combinations of PIC and PIE flags to ghc, to no avail. Sadly, I wasn't able to even determine that the flags I gave made their way down to the linker invocation, since it's not possible for ghc to be loud in its invocation.

However, comparing the results of the Debian and Ubuntu compilation, the .o file contains in both cases the R_X86_64_PC32 relocation, which makes me think that it's a difference in linker behaviour. I wasn't able to find anything suspicious in our binutils delta with Debian.

I'm assigning this to GHC because it seems clearly a toolchain issue to me.

For whoever picks it up again, note that build parallelism is disabled at the top of d/rules. For your builds you might want to enable it again (simply use -j instead of -j1).

Sergio Durigan Junior (sergiodj) wrote :

Hey,

So, I picked it up from where Simon left off, and I managed to figure out what's causing the FTBFS. It's our use of -Wl,-Bsymbolic-functions. If we strip this flag from LDFLAGS, the build succeeds.

I would like to leave a few notes here, though.

- It's possible to obtain a verbose build from ghc by passing "-v" to it. Because the build system is using cabal, we actually need to instruct it to pass the flag down to ghc. You can do it by:

export BUILDEROPTIONS += --ghc-option=-v

- By comparing the actual linker invocation between Debian and Ubuntu, I noticed that it isn't only -Wl,-Bsymbolic-functions (and LTO) that differs. Debian is building the package using "-no-pie -fno-PIC", while Ubuntu builds it using "-shared" (but no "-fPIC"). I tried finding the reason for this discrepancy, but everything other than the versions of the toolchain tools seems to be the same.

- Passing "-fPIC" to ghc (by using --ghc-option=-fPIC) doesn't solve the issue, either. There is something else at play here. Unfortunately, I don't have the time to dive deeper into the issue right now.
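
The comparison of linker invocations described in the notes above can be partly automated. The following is a rough sketch, assuming the two linker command lines have already been copied out of the verbose build logs; the `flag_diff` helper and the two shortened command lines are hypothetical and only reuse flags mentioned in this report, they are not taken from the real logs.

```python
import shlex


def flag_diff(cmd_a: str, cmd_b: str) -> tuple[list[str], list[str]]:
    """Return the dash-prefixed tokens that appear in only one of two command lines."""
    a = shlex.split(cmd_a)
    b = shlex.split(cmd_b)
    only_a = [tok for tok in a if tok.startswith("-") and tok not in b]
    only_b = [tok for tok in b if tok.startswith("-") and tok not in a]
    return only_a, only_b


# Hypothetical, heavily shortened linker invocations (not copied from real logs).
debian_cmd = "gcc -no-pie -fno-PIC -o git-annex Main.o"
ubuntu_cmd = "gcc -shared -Wl,-Bsymbolic-functions -o git-annex Main.o"

only_debian, only_ubuntu = flag_diff(debian_cmd, ubuntu_cmd)
print("only in Debian:", only_debian)
print("only in Ubuntu:", only_ubuntu)
```

Real linker invocations are much longer and include response files and library paths, so a token-level comparison like this is only a first pass before reading the differences by hand.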

I'll upload a new version of git-annex with -Wl,-Bsymbolic-functions stripped soon, and will also propose the change to Debian.

Changed in git-annex (Ubuntu):
assignee: nobody → Sergio Durigan Junior (sergiodj)
Sergio Durigan Junior (sergiodj) wrote :

Something else worth noting: initially I thought that the problem could be related to gold, so I tried using ld and ld.bfd, but neither worked.

Interesting discussions on bugs that look similar to this one:

- https://github.com/haskell/haskell-ide-engine/issues/830

- https://gitlab.haskell.org/ghc/ghc/-/issues/20689

Changed in git-annex (Debian):
status: Unknown → New
Changed in git-annex (Debian):
status: New → Fix Released
Sergio Durigan Junior (sergiodj) wrote :

Sigh... This is more involved than I initially thought. After uploading the -Wl,-Bsymbolic-functions fix, I noticed that ppc64el, armhf, s390x and riscv64 were still failing to build the package. Then, I tracked down the problems on ppc64el and s390x to be LTO-related, so I did a new upload which fixed the issue. Now, I'm trying to track down the problem with armhf and riscv64.

I'll leave another comment if I can find anything useful.

Sergio Durigan Junior (sergiodj) wrote :
Changed in ghc (Ubuntu):
status: New → Fix Released
Changed in git-annex (Ubuntu):
status: Confirmed → Fix Released