Comment 2 for bug 1940505

In , Yitingwang16 (yitingwang16) wrote :

The issue can be reproduced when two callers sit in two .text sections (say sections A and B), followed by two sections that contain R_RISCV_ALIGN relocations (sections C and D), followed by a section that holds the called functions (section E).

Here is a picture that illustrates the problem:

                   ____________________________________
       high       |                                    |
        ^     ->  | section E: cc, dd                  | <-
        |     |   |____________________________________|  |
        |     |   |                                    |  |
        |     |   | section D: align2 (R_RISCV_ALIGN)  |  |
        |     |   |____________________________________|  |
        |     |   |                                    |  |
        |     |   | section C: align1 (R_RISCV_ALIGN)  |  |
        |     |   |____________________________________|  |
        |     |   |                                    |  |
        |     |   | section B : ff (R_RISCV_CALL)      |--
        |     |   |____________________________________|
        |     |   |                                    |
  address low --  | section A : _start (R_RISCV_CALL)  |
                  |____________________________________|

Consider this situation: sections A, B, C, D and E are linked in sequence. Sections C and D have R_RISCV_ALIGN relocations. Function _start() calls cc(), and function ff() calls dd(). Before relaxation, the section sizes are 8, 8, 64, 0xfffbc and 10 bytes respectively.

In the first relaxation pass (info->relax_pass == 0), the R_RISCV_CALL relocations in sections A and B cannot be relaxed to R_RISCV_JAL, because the call from _start() to cc() and the call from ff() to dd() do not fit in a 21-bit offset: the offset is 0x10001c. So nothing is done in this pass.

In the third relaxation pass (info->relax_pass == 2), the R_RISCV_ALIGN relocations in sections C and D can be relaxed. The sizes of sections C and D change to 34 and 0xfffba bytes. Now the offset of the call from _start() to cc() is 0xffffe, and the offset from ff() to dd() has the same value.
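
The range check behind these two passes can be sketched in a few lines of Python. This is only an illustration, not the binutils code: fits_jal is a hypothetical helper, and it assumes the standard JAL encoding (a 21-bit signed, 2-byte-aligned pc-relative offset, i.e. -0x100000 to 0xffffe).

```python
# Hypothetical helper, not taken from binutils: models the JAL reach only.
JAL_MIN = -(1 << 20)      # -0x100000
JAL_MAX = (1 << 20) - 2   #  0xffffe (offsets are multiples of 2)


def fits_jal(offset: int) -> bool:
    """Return True if a pc-relative offset can be encoded in a JAL."""
    return JAL_MIN <= offset <= JAL_MAX and offset % 2 == 0


# Offset reported before the R_RISCV_ALIGN relocations are relaxed.
print(hex(0x10001c), fits_jal(0x10001c))
# Offset reported after sections C and D have shrunk.
print(hex(0xffffe), fits_jal(0xffffe))
```

With the offsets quoted above, the first call is out of range and the second sits exactly on the upper limit, so any growth of the distance pushes it out of range again.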

In the second round of relaxation, in the first pass (info->relax_pass == 0), the R_RISCV_CALL relocation in section A can be relaxed to R_RISCV_JAL. After this relaxation, section A is 4 bytes smaller, so section B's base address and all of its symbols move down by 4 bytes. However, sections C, D and E do not move down by 4 bytes, because of the .balign restrictions placed in sections C and D.
But when the linker processes the relaxation of section B, it uses the original (not yet adjusted) section B symbol addresses to calculate the offset of the R_RISCV_CALL relocation (from ff() to dd()). That offset, 0xffffe, is within the 1 MiB range, so the R_RISCV_CALL relocation is relaxed to R_RISCV_JAL. This is a mistake!
When the linker finally performs the relocation (in perform_relocation()), section B's symbols have been adjusted, so the real distance from ff() to dd() is 4 bytes larger and no longer fits, and the bug is exposed.
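
The arithmetic of the failure can be replayed with a small Python sketch. It is a model of the description above, not of the linker: fits_jal, STALE_OFFSET and SHRINK_A are hypothetical names, the JAL range is assumed to be -0x100000 to 0xffffe, and section E is assumed to stay at the same address while section B moves down by 4 bytes.

```python
def fits_jal(offset: int) -> bool:
    """Hypothetical helper: True if offset is encodable in a JAL."""
    return -(1 << 20) <= offset <= (1 << 20) - 2 and offset % 2 == 0


STALE_OFFSET = 0xffffe  # ff() -> dd(), computed from the old section B address
SHRINK_A = 4            # bytes removed from section A when its call became a JAL

# Section B moves down by SHRINK_A while section E stays in place,
# so the real distance from ff() to dd() grows by the same amount.
actual_offset = STALE_OFFSET + SHRINK_A

print(fits_jal(STALE_OFFSET))   # the check the relaxation pass relied on
print(fits_jal(actual_offset))  # the check that matters at relocation time
print(hex(actual_offset))
```

The stale offset passes the check while the adjusted one does not, which matches the "relocation truncated to fit: R_RISCV_JAL" error shown in the test case below.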

I created a simple test case to trigger this issue:

    riscv64-unknown-linux-gnu-gcc -nostdlib -o a.out a.s align_1.s align_2.s c.s
    /tmp/ccgFgKhw.o: In function `ff':
    (.text+0x0): relocation truncated to fit: R_RISCV_JAL against symbol `dd' defined in .text section in /tmp/ccR0Ha6r.o