fails to build on armel/lucid

Bug #503185 reported by Alexander Sack on 2010-01-05
18
This bug affects 2 people
Affects Status Importance Assigned to Milestone
blcr (Ubuntu)
High
Unassigned

Bug Description

http://launchpadlibrarian.net/37339875/buildlog_ubuntu-lucid-armel.blcr_0.8.2-7_FAILEDTOBUILD.txt.gz

/bin/bash ../libtool --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -I../../libcr -I.. -D_GNU_SOURCE -D_REENTRANT -I../include -I../../include -I../../libcr/arch/arm/ -Wall -Wno-unused-function -Wall -g -O2 -MT libcr_la-cr_async.lo -MD -MP -MF .deps/libcr_la-cr_async.Tpo -c -o libcr_la-cr_async.lo `test -f 'cr_async.c' || echo '../../libcr/'`cr_async.c
libtool: compile: gcc -DHAVE_CONFIG_H -I. -I../../libcr -I.. -D_GNU_SOURCE -D_REENTRANT -I../include -I../../include -I../../libcr/arch/arm/ -Wall -Wno-unused-function -Wall -g -O2 -MT libcr_la-cr_async.lo -MD -MP -MF .deps/libcr_la-cr_async.Tpo -c ../../libcr/cr_async.c -fPIC -DPIC -o .libs/libcr_la-cr_async.o
/tmp/ccb5iUfU.s: Assembler messages:
/tmp/ccb5iUfU.s:722: Error: selected processor does not support `swp r1,r2,[r5]'
/tmp/ccb5iUfU.s:724: Error: selected processor does not support `swpne r0,r1,[r5]'
/tmp/ccb5iUfU.s:776: Error: selected processor does not support `swp r1,r2,[r5]'
/tmp/ccb5iUfU.s:778: Error: selected processor does not support `swpne r2,r1,[r5]'
/tmp/ccb5iUfU.s:885: Error: selected processor does not support `swp r1,r3,[r5]'
/tmp/ccb5iUfU.s:887: Error: selected processor does not support `swpne r0,r1,[r5]'
/tmp/ccb5iUfU.s:907: Error: selected processor does not support `swp r0,r1,[r3]'
/tmp/ccb5iUfU.s:909: Error: selected processor does not support `swpne r1,r0,[r3]'
make[3]: *** [libcr_la-cr_async.lo] Error 1
make[3]: Leaving directory `/build/buildd/blcr-0.8.2/build/libcr'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/build/buildd/blcr-0.8.2/build'

Related branches

Alexander Sack (asac) on 2010-01-05
Changed in blcr (Ubuntu):
status: New → Confirmed
importance: Undecided → High
assignee: nobody → David Sugar (dyfet)
tags: added: armel thumb2
Alan (awoodland) wrote :

Am I correct in thinking armel for Ubuntu is targeting a different arm varient than armel in Debian then? Any arm gurus know of an easy workaround?

It builds fine on armel Debian:
 https://buildd.debian.org/fetch.cgi?pkg=blcr;ver=0.8.2-7;arch=armel;stamp=1261371421

Alan (awoodland) wrote :

Further research indicates that SWP has been deprecated and subsequently dropped from the instruction set in newer arm processors:

http://www.doulos.com/knowhow/arm/Hints_and_Tips/Implementing_Semaphores/

I'll see about putting together a patch with upstream to correct this.

Matthias Klose (doko) wrote :

the atomic builtins available in gcc-4.4 and newer should be used instead of inline assembler if possible

Hi,

I noticed today that BLCR is failing to build on armel on Ubuntu. This
problem is reported in [1]. It would seem that ARMv7 no longer
includes swp/swpne instructions. It's relatively trivial to
conditionally replace this with appropriate ldrex/strex instructions
instead though. Before I produce a patch though I have a few
questions:

1) CR_KCODE___kuser_cmpxchg clearly isn't defined at build time here.
The test doesn't get run because we're only building userspace parts
at this point. Is that correct? Are we always expecting to be in the
#else oflibcr/arch/arm/cr_atomic.h when building userspace bits? I
can't help but feel I've missed something in my reading of this
somehow though.
2) Is a 2nd alternative for
cri_atomic_inc/cri_atomic_dec_and_test/cri_cmp_swap the correct way to
fix this? The comments in the source seem to suggest that using
ldrex/strex would be correct and better where they exist. They also
imply that we shouldn be using __kernel_cmpxchg on newer kernels
anyway though.

Thanks,
Alan

[1] https://bugs.launchpad.net/ubuntu/+source/blcr/+bug/503185

tags: added: armv7
Dave Martin (dave-martin-arm) wrote :
Download full text (3.6 KiB)

Hi

> [...] It would seem that ARMv7 no longer includes swp/swpne
> instructions. It's relatively trivial to conditionally
> replace this with appropriate ldrex/strex instructions

See https://wiki.ubuntu.com/ARM/Thumb2 for a bit more background on this. ARMv7 still supports SWP, but it's inefficient and not guaranteed to work correctly on all platforms in the future, especially in the case of multi-core platforms. In addition, you're getting build errors because lucid is building for the Thumb-2 instruction set by default now, which doesn't support SWP at all.

It looks like most of the functions in the header should be straightforward to implement using the GCC atomics: See lucid gcc-4.4 info pages: node Atomic Builtins. These are already supported on various architectures, not just ARM.

The best way to detect support for these is to add a test in your configure script which tries to compile and link calls to a couple of the atomic builtins.

This is the best approach, since the calling code can be kept cleaner and more portable--- also you don't need to worry too much about memory barriers because the builtins generally do the appropriate things to ensure SMP-safety.

For cri_atomic_read and cri_atomic_write, some extra care is likely to be needed for SMP-safety. Memory ordering barriers are generally needed for these operations to have the desired effects in multithreaded environments.

The existing __asm__ __volatile__("": : :"memory") statements (which provide a barrier at the C language level only) should be replaced with a call to __sync_synchronize, if the GCC atomic intrinsics are detected as available. This should work for Ubuntu and any distro based on GCC >= 4.4 generally.

> 1) CR_KCODE___kuser_cmpxchg clearly isn't defined at build
> time here. The test doesn't get run because we're only
> building userspace parts at this point. Is that correct?
> Are we always expecting to be in the #else of
> libcr/arch/arm/cr_atomic.h when building userspace bits? I
> can't help but feel I've missed something in my reading of
> this somehow though.

The kuser helpers are some functions exported by the kernel and callable by userspace programs. These are called at fixed addresses, rather than via the dynamic linker.
The helpers provied some fundamental atomic operations appropriate to the platform you're running on, in a portable way, though the kuser helpers are obscure and not well known.

The GCC atomic intrinsics are built round these, so it's better to use the intrinsics rather than calling the kuser helpers directly.

I'm not familiar with blcr--- if this code was pasted from another package, it may be that the original package knew how to define CR_KCODE___kuser_cmpxchg in its configure scripts but maybe blcr does not. The issue is kernel version --- 2.6.16 and greater have the helpers; earlier kernels do not. We can ignore this at least for Ubuntu, since all kernels are new enough.

> 2) Is a 2nd alternative for
> cri_atomic_inc/cri_atomic_dec_and_test/cri_cmp_swap the
> correct way to fix this? The comments in the source seem to
> suggest that using ldrex/strex would be correct and better
> where they exist. They als...

Read more...

Download full text (3.6 KiB)

2010/1/5 Paul H. Hargrove <email address hidden>:
> Alan,
>
>   I believe you are mistaken in reading the comments to imply "we
> shouldn['t] be using __kernel_cmpxchg on newer kernels anyway".   That
> feature became available in 2.6.12 and before that there was no way to
> perform an atomic operation from user code w/ pre-ARMv6 hardware.  Since
> 2.6.12 this has been the portable way to get an atomic C-A-S.  If there is
> specific misleading text in the comment, please show me and I'll fix/clarify
> it.

Thanks, my confusion was in how a kernel test seemed to be resulting
in something getting called directly from userspace, and why it didn't
seem to be getting used in the build despite the more recent kernel.
The comments are pretty clear now I see how __kuser_cmpxchg is
working.

>  You ARE correct that CR_KCODE___kuser_cmpxchg is not defined at build time
> IF one does separate configuration as the debian packaging apparently does.
>  That possibility was not considered when this code was first written, and
> is the root cause of your problem.  One should always be in
> CR_KCODE___kuser_cmpxchg case for a kernel >= 2.6.12.
>
>  It was my intention that the swp/swpne code only be used for kernel <
> 2.6.12, but probing the kernel for __kuser_cmpxchg was chosen to be a more
> reliable probe (a distro could back port the feature to an older base kernel
> version).  The fact that you reach this code is because the separate
> user/kernel builds have resulted in not probing for
> CR_KCODE___kuser_cmpxchg.

That makes sense.

> If one were to refuse to build for a kernel w/o this feature, then the
>  #if defined(CR_KCODE___kuser_cmpxchg)
> would/could effectively become equivalent to
>  #if 1
> since a userspace built w/o a corresponding kernel module is "harmless".
>
>  So, I believe the solution to your problem is to force use of the
> __kuser_cmpxchg path and require (via packaging magic?) that one have kernel
>>= 2.6.12.

That seems like the simplest solution to me, a 1 line patch to force
that path on arm. I'm not quite sure how to document >= 2.6.12 as a
dependency for a library package. Probably the easiest way to get a
sensible dependency would be to depend on a more modern libc6 than the
automatically generated one.

Ubuntu arm people: Shall I just do this and upload to Debian/unstable
with LP: #503185 in the changelog? It makes sense for the Debian
userspace to be using __kuser_cmpxchg on modern kernels also.

(I assume on a kernel prior to 2.6.12 this would build, but give
segfault/bus error at run time?)

>  I am open to suggestions as to how to deal in general with building the
> ARMuser space w/o configuring the kernel module, while still allowing for
> the possibility of a pre-2.6.12 kernel.  I am not certain that disallowing
> pre-2.6.12 kernels would be popular in the ARM community.

For Debian/Ubuntu I think it's safe enough to ignore that problem -
Sarge was the last Debian release to see a kernel < 2.6.12, the next
Ubuntu release looks set to have at least 2.6.32 as the default
kernel.

>  In response to your #2: one COULD write an additional #if case to use
> ldrex/strex for ARMv6 and ARMv7.  This would gain some efficiency in...

Read more...

There is another issue with the alternate code-path:

/tmp/cckH6Lu5.s: Assembler messages:
/tmp/cckH6Lu5.s:730: Error: only SUBS PC, LR, #const allowed -- `sub pc,r3,#(0xffff0fff-0xffff0fc0)'
/tmp/cckH6Lu5.s:791: Error: only SUBS PC, LR, #const allowed -- `sub pc,r3,#(0xffff0fff-0xffff0fc0)'
/tmp/cckH6Lu5.s:910: Error: only SUBS PC, LR, #const allowed -- `sub pc,r3,#(0xffff0fff-0xffff0fc0)'
/tmp/cckH6Lu5.s:937: Error: only SUBS PC, LR, #const allowed -- `sub pc,r3,#(0xff

Dave Martin (dave-martin-arm) wrote :

ARMv7 generally does not allow the PC to be used as a destination register for random operations. The code will also not work in Thumb because the "Thumb bit" in the return address (bottom bit of lr) is not set properly.

The best way to fix this is to precompute the address of the kuser helper code to call (in r3) and call it with BLX r3. This only works on v5 and above; this is reasonable because the code won't build for Thumb on ARMv4T anyway for various other reasons.

The attached patch attempts to do this when appropriate.

I will try to test it, but haven't been able to do so yet.

Dave Martin (dave-martin-arm) wrote :

I get a successful build, but I'm not sure how to run the testsuite. Any ideas?

2010/1/18 Dave Martin <email address hidden>:
> I get a successful build, but I'm not sure how to run the testsuite.
> Any ideas?

Testsuite needs to be built separately using:
./configure --with-installed-libcr --with-installed-modules && make
check IIRC. There's a bug in the current release which prevents
--with-installed-util from working correctly, which should also be
used here, but I don't think it matters too much that it's not.

(You'll also need to have the kernel module part (blcr-dkms) installed
and built for the running kernel in order for the tests to actually
work)

Alan

Dave Martin (dave-martin-arm) wrote :

I managed to build the kernel modules against the linux-fsl-imx51 headers and install them; make check then ran, using the configure args you suggest (I added --prefix=/usr --mandir=/usr/share/man for consistency with the package build proper, but I doubt this affects the test results). See the attached log for the results.

A couple of tests are skipped; the prctl test fails. The atomics tests appear to pass.

I have not currently done anything in my patch to cause R_KCODE___kuser_cmpxchg to be defined: I defined it in my build using a masquerade wrapper around GCC. This will need to be addressed in the build scripts.

Dave Martin (dave-martin-arm) wrote :

(typo in the above post: for "R_KCODE___kuser_cmpxchg", please read "CR_KCODE___kuser_cmpxchg")

Alan (awoodland) wrote :

2010/1/18 Dave Martin <email address hidden>:
> I managed to build the kernel modules against the linux-fsl-imx51
> headers and install them; make check then ran, using the configure args
> you suggest (I added --prefix=/usr --mandir=/usr/share/man for
> consistency with the package build proper, but I doubt this affects the
> test results).  See the attached log for the results.
>
> A couple of tests are skipped; the prctl test fails.  The atomics tests
> appear to pass.
That all looks good - bug2524 test is PPC specific, prctl test fails
because of known problems with configure and the test itself.

> I have not currently done anything in my patch to cause
> R_KCODE___kuser_cmpxchg to be defined: I defined it in my build using a
> masquerade wrapper around GCC.  This will need to be addressed in the
> build scripts.

I have a trivial patch I'm going to apply that assumes this will
always be supported. It's the better code path to use, and it's only
untrue pre 2.6.12, which isn't a problem for Debian or Ubuntu.

I'll upload 0.8.2-8 to Debian now, with a reference in the Changelog
to the LP bug.

Alan

Dave Martin (dave-martin-arm) wrote :

OK, sounds good :)

Dave Martin (dave-martin-arm) wrote :

alan woodland wrote (19 January 2010 09:55):

> I think there might be a problem in that patch still...
>
> On the Debian machine it built on last night __ARM_ARCH_4__
> was not defined, __ARM_ARCH_4T__ was defined, and __thumb__
> wasn't defined.
> This meant that it ended up using the ARMv5 of the #else,
> which used the blx instruction and failed to build because of
> it.
>
> I think changing it to read:
> #if defined(__ARM_ARCH_4__) || defined(__ARM_ARCH_4T__)
> should fix things. Does that seem sane to you?
>
> Thanks,
> Alan

Yes, that makes sense.

A modified, slightly tidied-up patch which I prepared yesterday is attached here. For some reason, I thought I'd already uploaded it for some reason, but it looks I didn't...

Alan (awoodland) wrote :

The fix for this in 0.8.2-9 has hit testing now, so as soon as that gets pulled across...

Loïc Minier (lool) wrote :

DIF is next week and there are no Ubuntu changes; -9 is already in testing too. So -9 should get copied the next time an Ubuntu archive admin imports Debian packages, probably this week.

Changed in blcr (Ubuntu):
status: Confirmed → Fix Committed
assignee: David Sugar (dyfet) → nobody
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package blcr - 0.8.2-9

---------------
blcr (0.8.2-9) unstable; urgency=low

  * Updated patch thanks to Dave Martin, fixes FTBFS on armel

blcr (0.8.2-8) unstable; urgency=low

  * Apply ARMv7 __kuser Thumb fix patch from Dave Martin
  * Force ARM to use __kuser_cmpxchg on ARM.
    - Will break on kernels older than 2.6.12
    - Fixes problem with newer ARM instruction sets
      LP: #503185
 -- Ubuntu Archive Auto-Sync <email address hidden> Mon, 01 Feb 2010 23:57:20 +0000

Changed in blcr (Ubuntu):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers