[armel] java fails to start with eglibc-2.12-0ubuntu4

Reported by Matthias Klose on 2010-07-13
This bug affects 2 people
Affects                          Status         Importance   Assigned to       Milestone
Linaro Toolchain Miscellanies                   Medium       Yao Qi
Release Notes for Ubuntu                        Undecided    Unassigned
eglibc (Debian)                  Fix Released   Unknown
eglibc (Ubuntu)                                 High         Matthias Klose
  Lucid                                         Undecided    Unassigned
  Maverick                                      High         Unassigned
  Natty                                         High         Matthias Klose
linux-fsl-imx51 (Ubuntu)                        Undecided    Unassigned
  Lucid                                         Critical     Jeremy Kerr
  Maverick                                      Undecided    Unassigned
  Natty                                         Undecided    Unassigned
openjdk-6 (Ubuntu)                              High         Unassigned
  Lucid                                         Undecided    Unassigned
  Maverick                                      High         Unassigned
  Natty                                         High         Unassigned

Bug Description

reverting back to eglibc-2.12-0ubuntu3 works around the problem

$ strace java -version
[...]
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/libm.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\3701\0\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=407156, ...}) = 0
mmap2(NULL, 438440, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x40424000
mprotect(0x40487000, 28672, PROT_NONE) = 0
mmap2(0x4048e000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x62) = 0x4048e000
close(3) = 0
mprotect(0x4048e000, 4096, PROT_READ) = 0
mprotect(0x40169000, 2572288, PROT_READ|PROT_WRITE) = 0
mprotect(0x40169000, 2572288, PROT_READ|PROT_EXEC) = 0
cacheflush(0x40169000, 0x403dd000, 0, 0x40169000, 0x19448 <unfinished ...>
+++ killed by SIGSEGV +++
Segmentation fault

Unable to handle kernel paging request at virtual address 401a1000
pgd = cd108000
[401a1000] *pgd=99a5b031, *pte=00000000, *ppte=00000000
Internal error: Oops: 817 [#4455]
Modules linked in: ov3640_camera v4l2_int_device uio_pdrv_genirq joydev uio
CPU: 0 Tainted: G D (2.6.31-608-imx51 #14-Ubuntu)
PC is at v7_coherent_kern_range+0x18/0x44
LR is at arm_syscall+0x2a8/0x2c4
pc : [<c003e1e8>] lr : [<c003a858>] psr: 80000013
sp : cc81be80 ip : dc172bb0 fp : cc81bfa4
r10: 4001e568 r9 : cc81a000 r8 : 00000000
r7 : 000f0002 r6 : 00000000 r5 : 40169000 r4 : 403dd000
r3 : 0000003f r2 : 00000040 r1 : 403dd000 r0 : 401a1000
Flags: Nzcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
Control: 10c5387d Table: 9d108019 DAC: 00000015
Process java (pid: 6558, stack limit = 0xcc81a2f0)
Stack: (0xcc81be80 to 0xcc81c000)
be80: cc81bed4 cc81be90 c047edc0 c005d290 c047edc0 c006e6dc c060d26c c006e6dc
bea0: c060d288 c060d270 cc81befc 00000005 cc81a000 00000005 cc81bee8 c0036ae4
bec0: cc81a000 00000005 cc81bee4 cc81bed8 c006e724 c006bd04 cc81bf84 cc81bee8
bee0: c006e7e0 c006e64c 00000005 00000000 00000005 0000199e 000009db 00000000
bf00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bf20: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bf40: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bf60: 00000000 00000000 cc81a000 cc81bfb0 000f0002 000f0002 cc81bfa4 cc81bf88
bf80: c0037a50 c006e76c 40169000 00019448 be970fa8 00000050 00000000 cc81bfa8
bfa0: c0036ac0 c003a5bc 00019448 be970fa8 40169000 403dd000 00000000 40169000
bfc0: 00019448 be970fa8 00000050 000f0002 40169000 00000cc0 4001e568 4001e8b8
bfe0: 000f0002 be970f90 400093cd 4000fd96 00000030 40169000 696c4075 2e737473
Backtrace:
[<c003a5b0>] (arm_syscall+0x0/0x2c4) from [<c0036ac0>] (__sys_trace_return+0x0/0x20)
 r6:00000050 r5:be970fa8 r4:00019448
Code: e3a02010 e1a02312 e2423001 e1c00003 (ee070f3b)
mxc_ipu mxc_ipu: Channel already disabled 9
mxc_ipu mxc_ipu: Channel already uninitialized 9
DMFC high resolution has set, will not change
---[ end trace b707ea3bd34d5698 ]---
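
For reference, the numbers in the strace output and the oops above are consistent with one another. A minimal sketch of the arithmetic, in C: the constants are copied from the trace, while the helper names and the 4096-byte page size are assumptions made for illustration.

```c
#include <stdint.h>

/* Values copied from the strace output and the oops register dump above. */
#define FLUSH_START     0x40169000u  /* first cacheflush() argument, also r5 */
#define FLUSH_END       0x403dd000u  /* second cacheflush() argument, also r1/r4 */
#define FAULT_ADDR      0x401a1000u  /* "virtual address 401a1000", also r0 */
#define PAGE_SIZE_BYTES 4096u        /* assumed page size */

/* Length of the range handed to cacheflush(). */
uint32_t flush_length(void)
{
    return FLUSH_END - FLUSH_START;
}

/* Non-zero if the faulting address lies inside the flushed range. */
int fault_inside_flush_range(void)
{
    return FAULT_ADDR >= FLUSH_START && FAULT_ADDR < FLUSH_END;
}

/* Index of the faulting page, counted from the start of the range. */
uint32_t fault_page_index(void)
{
    return (FAULT_ADDR - FLUSH_START) / PAGE_SIZE_BYTES;
}
```

The flushed range is 0x274000 (2572288) bytes, the same length as the two preceding mprotect calls, and the faulting address lies 56 pages into that range.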

Matthias Klose (doko) on 2010-07-13
Changed in eglibc (Ubuntu):
importance: Undecided → Critical
milestone: none → maverick-alpha-3
status: New → Triaged
summary: - java fails to start with eglibc-2.12-0ubuntu4
+ [armel] java fails to start with eglibc-2.12-0ubuntu4
Changed in eglibc (Ubuntu Maverick):
assignee: nobody → Matthias Klose (doko)
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package eglibc - 2.12-0ubuntu5

---------------
eglibc (2.12-0ubuntu5) maverick; urgency=high

  * Revert upstream change:
    2010-06-02 Kirill A. Shutemov <email address hidden>
        * elf/dl-reloc.c: Flush cache after solving TEXTRELs if arch
        requires it.
    Breaks the OpenJDK ARM assembler interpreter. LP: #605042.
  * expected-results-arm-linux-gnueabi-libc: Remove scanf15, scanf17
    and tst-eintr1, passing the tests on the buildds.
 -- Matthias Klose <email address hidden> Wed, 14 Jul 2010 01:06:39 +0200

Changed in eglibc (Ubuntu Maverick):
status: Triaged → Fix Released
Matthias Klose (doko) wrote :

opening openjdk-6 task; is openjdk-6 wrong, or the eglibc change?

Changed in openjdk-6 (Ubuntu Maverick):
importance: Undecided → High
milestone: none → maverick-alpha-3
status: New → Confirmed
Xerxes Rånby (xranby) wrote :

Testcase:
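// Build: gcc testcase.c -ldl
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>

int main(void){
  void *libjvm;

  // Loading the library is the step that crashes.
  libjvm = dlopen("./libjvm.so", RTLD_NOW | RTLD_GLOBAL);
  if (!libjvm) {
        fprintf (stderr, "%s\n", dlerror());
        exit(1);
  }
  // Print the handle in hex (the cast assumes 32-bit pointers, as on armel).
  printf("%X\n",(int)libjvm);
  return 0;
}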

The crash happens during dlopen of the libjvm.so file from java;
loading other libraries seems to work fine.

Xerxes Rånby (xranby) wrote :

The libjvm.so can be obtained from the armel openjdk-6-jre-headless package
https://launchpad.net/ubuntu/maverick/armel/openjdk-6-jre-headless/6b18-1.8-2ubuntu2

located in usr/lib/jvm/java-6-openjdk/jre/lib/arm/server/libjvm.so

Xerxes Rånby (xranby) wrote :

The java libjvm.so file contains an asm interpreter that defines a .init_array section in the libjvm.so
defined in the file cppInterpreter_arm.S:
http://icedtea.classpath.org/hg/icedtea6/file/0b656f7601bd/ports/hotspot/src/cpu/zero/vm/cppInterpreter_arm.S

this .init_array section makes dlopen execute initialization code for the asm interpreter that seems to trigger the crash.
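
As an illustration of the mechanism only (this is not the code from cppInterpreter_arm.S), here is a minimal C sketch of initialization code that the dynamic loader runs during dlopen. It relies on the GCC/Clang `constructor` attribute, which on ELF targets such as ARM EABI typically emits an entry in .init_array. The flag and function names are hypothetical.

```c
/* Hypothetical flag standing in for the interpreter's set-up state. */
static int interpreter_ready = 0;

/*
 * The constructor attribute asks the toolchain to register this function
 * so that it runs when the object is loaded, before main() for an
 * executable or inside dlopen() for a shared library.
 */
__attribute__((constructor))
static void init_interpreter(void)
{
    interpreter_ready = 1;
}

/* Accessor so callers can observe that the constructor already ran. */
int interpreter_is_ready(void)
{
    return interpreter_ready;
}
```

Built into a shared object, code like this would execute as a side effect of dlopen, which is why a library with such a section can fail during loading while other libraries load fine.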

Changed in linaro-toolchain-wg:
assignee: nobody → Linaro Toolchain Developers (linaro-toolchain-dev)
Yao Qi (yao-codesourcery) wrote :

There is no crash on openjdk-6-jre-lib_6b20~pre1-1ubuntu1, using test case in comment #3.

# cp /usr/lib/jvm/java-6-openjdk/jre/lib/arm/server/libjvm.so .
# gcc pr605042.c -o pr605042 -ldl
# ./pr605042
12018

My gcc is Ubuntu 4.4.4-7ubuntu1~ppa2.

Yao Qi (yao-codesourcery) wrote :

I can start java normally.

# java -version
java version "1.6.0_18"
OpenJDK Runtime Environment (IcedTea6 1.8) (6b18-1.8-2ubuntu2)
OpenJDK Zero VM (build 14.0-b16, mixed mode)

Xerxes Rånby (xranby) wrote :

Hi Yao Qi

Are you using eglibc - 2.12-0ubuntu4 or eglibc compiled from upstream while testing?

The current Ubuntu eglibc - 2.12-0ubuntu5 contains a workaround for this issue: it removes a cache flush that is part of upstream eglibc.

eglibc (2.12-0ubuntu5) maverick; urgency=high

  * Revert upstream change:
    2010-06-02 Kirill A. Shutemov <email address hidden>
        * elf/dl-reloc.c: Flush cache after solving TEXTRELs if arch
        requires it.
    Breaks the OpenJDK ARM assembler interpreter. LP: #605042.
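
The sequence visible in the strace output (mprotect to writable, mprotect back, then cacheflush) can be sketched roughly as follows. This is a hedged illustration, not the eglibc code from elf/dl-reloc.c: the function names are hypothetical, a single byte store stands in for the relocation, and the flush uses the GCC builtin `__builtin___clear_cache`, which on ARM Linux is expected to end up in the cacheflush system call.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for MAP_ANONYMOUS */
#endif
#include <stddef.h>
#include <sys/mman.h>

/*
 * Make a page-aligned segment writable, patch one byte (a stand-in for
 * resolving a text relocation), restore its protection, then flush the
 * caches over the whole segment. Returns 0 on success, -1 on error.
 */
int patch_and_flush(unsigned char *seg, size_t len, size_t offset,
                    unsigned char value, int text_prot)
{
    if (offset >= len)
        return -1;
    if (mprotect(seg, len, PROT_READ | PROT_WRITE) != 0)
        return -1;
    seg[offset] = value;
    if (mprotect(seg, len, text_prot) != 0)
        return -1;
    __builtin___clear_cache((char *)seg, (char *)seg + len);
    return 0;
}

/* Demo on an anonymous read-only mapping; returns the patched byte or -1. */
int demo_patch_and_flush(void)
{
    size_t len = 4096;
    unsigned char *seg = mmap(NULL, len, PROT_READ,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int result;

    if (seg == MAP_FAILED)
        return -1;
    result = (patch_and_flush(seg, len, 16, 0x42, PROT_READ) == 0)
                 ? seg[16] : -1;
    munmap(seg, len);
    return result;
}
```

For a real text segment the restored protection would be PROT_READ|PROT_EXEC, as in the strace output; the demo uses PROT_READ only so that it also runs in environments that refuse executable anonymous memory.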

Xerxes Rånby (xranby) wrote :

While triaging this bug I have been using i.MX51 armv7 babbage-1 hardware.

I have been able to trigger this bug using
Ubuntu Maverick eglibc - 2.12-0ubuntu4 and the testcase in comment #3
Debian Squeeze eglibc (2.11.2-2) and the testcase in comment #3

Yao Qi (yao-codesourcery) wrote :

I am using eglibc 2.12-0ubuntu5, so this bug is filed against 2.12-0ubuntu4.

Loïc Minier (lool) wrote :

Yao, we want to fix eglibc upstream so that we don't have to carry the patch reverting the elf/dl-reloc.c changes.

Would you please try after downgrading your eglibc to 2.12-0ubuntu4 or after rebuilding 2.12-0ubuntu5 without the new patch?

You can get the older binaries from https://launchpad.net/ubuntu/+source/eglibc/2.12-0ubuntu4

On 22.07.2010 12:08, Yao Qi wrote:
> I am using eglibc 2.12-0ubuntu5, so this bug is filed against
> 2.12-0ubuntu4.

correct, but 2.12-0ubuntu5 just backs out the patch triggering the crash. We
need to investigate why it crashes with upstream eglibc.

Martin Pitt (pitti) wrote :

I understand that eglibc worked around this, and it's alpha-3 time now, so moving milestone.

Changed in openjdk-6 (Ubuntu Maverick):
milestone: maverick-alpha-3 → ubuntu-10.10-beta
Loïc Minier (lool) on 2010-08-19
affects: linaro-toolchain-wg → linaro-toolchain-misc
Changed in linaro-toolchain-misc:
assignee: Linaro Toolchain Developers (linaro-toolchain-dev) → Yao Qi (yao-codesourcery)
importance: Undecided → Medium
Yao Qi (yao-codesourcery) wrote :

I can't reproduce this problem. Here are the steps:

1. Remove cvs-revert-flush-cache-textrels.diff from patch/series.
2. Rebuild eglibc.
3. dpkg -i libc6_2.12.1-0ubuntu1_armel.deb
4. cp /usr/lib/jvm/java-6-openjdk/jre/lib/arm/server/libjvm.so .
5. Run ./testcase, no seg fault.
6. strace -o 3.log java -version
.....
mmap2(0x40499000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x64) = 0x40499000
close(3) = 0
mprotect(0x40499000, 4096, PROT_READ) = 0
mprotect(0x40172000, 2572288, PROT_READ|PROT_WRITE) = 0
mprotect(0x40172000, 2572288, PROT_READ|PROT_EXEC) = 0
cacheflush(0x40172000, 0x403e6000, 0, 0x274000, 0x19448) = 0 // <---- [1]
mprotect(0x403ed000, 57344, PROT_READ) = 0
open("/proc/self/auxv", O_RDONLY) = 3
read(3, "\20\0\0\0\3278\0\0\6\0\0\0\0\20\0\0\21\0\0\0d\0\0\0\3\0\0\0004\200\0\0"..., 256) = 144
.....

In comment #1 the seg fault happened at the cacheflush call marked [1] above. In my case, cacheflush executes without a seg fault.

# java -version
java version "1.6.0_18"
OpenJDK Runtime Environment (IcedTea6 1.8) (6b18-1.8-2ubuntu2)
OpenJDK Zero VM (build 14.0-b16, mixed mode)

Yao Qi (yao-codesourcery) wrote :

https://launchpad.net/ubuntu/maverick/+source/eglibc/+changelog shows that eglibc (2.12.1-0ubuntu1) is based on eglibc svn r11211, while eglibc (2.12-0ubuntu4) is based on eglibc svn r10817. I ran a diff between r10817 and r11211 and don't see anything special that might fix this bug.

Loïc Minier (lool) wrote :

Matthias, could you revert that glibc revert patch, since apparently it's not broken anymore?

Changed in linaro-toolchain-misc:
status: New → Incomplete
Changed in openjdk-6 (Ubuntu Maverick):
status: Confirmed → Incomplete
Matthias Klose (doko) wrote :

reverted in 2.12.1-0ubuntu3. please recheck once the build is finished.

Matthias Klose (doko) wrote :

still seen in maverick on a babbage board running the lucid kernel

Changed in eglibc (Ubuntu Maverick):
status: Fix Released → Triaged
Changed in linaro-toolchain-misc:
status: Incomplete → Confirmed
Changed in eglibc (Ubuntu Maverick):
milestone: maverick-alpha-3 → ubuntu-10.10
Yao Qi (yao-codesourcery) wrote :

Matthias,
I am building latest eglibc on pavo1, which may take some hours. If you have some arm boxes at hand, and see seg fault on it, I'd like to log in to have a look.

Yao Qi (yao-codesourcery) wrote :

I can reproduce it on imx51, but can't reproduce it on other two boards. Both openjdk and eglibc are the same on these three boards,
openjdk-6-jre-headless: 6b18-1.8-2ubuntu2
eglibc: 2.12.1-0ubuntu3

Run 'java -version', and here is the result,
imx51-1 2.6.31-203-gee1fdae SEGV
2.6.33.5-l3 OK
pavo1 2.6.33.7 OK

Failure on imx51 is still the same as reported in this bug.
# java -version
Segmentation fault

# strace -o 1.log java -version
....
mmap2(0x4049a000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x64) = 0x4049a000
close(3) = 0
mprotect(0x4049a000, 4096, PROT_READ) = 0
mprotect(0x40173000, 2572288, PROT_READ|PROT_WRITE) = 0
mprotect(0x40173000, 2572288, PROT_READ|PROT_EXEC) = 0
cacheflush(0x40173000, 0x403e7000, 0, 0x274000, 0x19448 <unfinished ...>
+++ killed by SIGSEGV +++

Matthias Klose (doko) wrote :

needing a fix for lucid (developer's box and buildds (?) are running this lucid kernel). needs to be investigated with the maverick kernel.

Changed in linux-fsl-imx51 (Ubuntu Maverick):
milestone: none → ubuntu-10.10
Changed in eglibc (Ubuntu Lucid):
status: New → Invalid
Changed in openjdk-6 (Ubuntu Lucid):
status: New → Invalid
Changed in linux-fsl-imx51 (Ubuntu Lucid):
importance: Undecided → High
milestone: none → lucid-updates
status: New → Confirmed
Oliver Grawert (ogra) wrote :

we don't have an imx51 maverick kernel :(

Loïc Minier (lool) wrote :

We have a Linaro imx51 build based on upstream's imx51 support; it's very rough, but it might be good enough for buildd use.

Loïc Minier (lool) wrote :

NB: upstream only supports Babbage right now, AFAIK

Amit Kucheria (amitk) wrote :

If all you need for a buildd is a serial console and ethernet, then mainline u-boot + kernel supports Babbage 2.5 and 3.0 boards.

So the buildds could be upgraded to maverick after some testing.

Bryan Wu (cooloney) wrote :

Just after chatting with Yao Qi, I was told this issue was only on imx51 board with 2.6.31 kernel, while it doesn't show up on other 2 omap3 boards with 2.6.33 kernel.

Amit,
Could you please try the 2.6.35 mainline kernel on your Babbage board and run the testcase? If it works, we might need to backport some fixes.

-Bryan

Amit Kucheria (amitk) wrote :

Can't see this problem on a Babbage 3.0 board with a mainline kernel and nfs-root maverick filesystem. Hopefully, I did the right test.

amit@arm-ubuntu:~$ uname -a
Linux arm-ubuntu 2.6.36-rc3+ #21 Thu Sep 9 12:24:31 EEST 2010 armv7l GNU/Linux

amit@arm-ubuntu:~$ cat /proc/cpuinfo
Processor : ARMv7 Processor rev 5 (v7l)
BogoMIPS : 799.53
Features : swp half thumb fastmult vfp edsp neon vfpv3
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x2
CPU part : 0xc08
CPU revision : 5

Hardware : Freescale MX51 Babbage Board
Revision : 51130
Serial : 0000000000000000

amit@arm-ubuntu:~$ java -version
java version "1.6.0_18"
OpenJDK Runtime Environment (IcedTea6 1.8.1) (6b18-1.8.1-2ubuntu1)
OpenJDK Zero VM (build 14.0-b16, mixed mode)

As there doesn't exist an fsl-imx51 linux kernel in Maverick and this wasn't seen on 2.6.33 based kernels nor a 2.6.36-rc3 kernel, it would appear this would only affect the Lucid fsl-imx51 kernel if any. I'm therefore closing the linux-fsl-imx51 Maverick nomination for now.

Changed in linux-fsl-imx51 (Ubuntu Maverick):
status: New → Invalid
Matthias Klose (doko) wrote :

is there any progress on an updated kernel for lucid? we do *need* that fix for the buildds.

Changed in linux-fsl-imx51 (Ubuntu Lucid):
assignee: nobody → darkrevival (kernel)
importance: High → Critical
assignee: darkrevival (kernel) → Canonical Kernel Team (canonical-kernel-team)
Matthias Klose (doko) wrote :

closing the openjdk-6 task, works as long as the eglibc upstream change is not applied.

Changed in openjdk-6 (Ubuntu Maverick):
milestone: ubuntu-10.10-beta → none
status: Incomplete → Invalid
Matthias Klose (doko) wrote :

delaying the eglibc task for maverick updates

Changed in eglibc (Ubuntu Maverick):
importance: Critical → High
milestone: ubuntu-10.10 → maverick-updates
Changed in eglibc (Ubuntu Natty):
milestone: maverick-updates → natty-alpha-1
Changed in eglibc (Ubuntu Natty):
milestone: natty-alpha-1 → natty-alpha-2
Jeremy Kerr (jk-ozlabs) wrote :

Seems to be OK here, on a babbage 2.0:

$ java -version
java version "1.6.0_18"
OpenJDK Runtime Environment (IcedTea6 1.8.3) (6b18-1.8.3-0ubuntu1)
OpenJDK Zero VM (build 14.0-b16, mixed mode)

$ cat /proc/cpuinfo
Processor : ARMv7 Processor rev 1 (v7l)
BogoMIPS : 799.53
Features : swp half thumb fastmult vfp edsp thumbee vfpv3
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x2
CPU part : 0xc08
CPU revision : 1

Hardware : Freescale MX51 Babbage Board
Revision : 51020
Serial : 0000000000000000
package versions:
libc6: 2.12.1-0ubuntu4
openjdk-6-jre-headless: 6b18-1.8.3-0ubuntu1

$ uname -a
Linux babbage 2.6.31-608-imx51 #20-Ubuntu Tue Sep 28 13:29:06 UTC 2010 armv7l GNU/Linux

$ dpkg -l libc6 openjdk-6-jre-headless | tail -2
ii libc6 2.12.1-0ubuntu4 Embedded GNU C Library: Shared libraries
ii openjdk-6-jre-headless 6b18-1.8.3-0ubuntu1 OpenJDK Java runtime, using Hotspot Zero (headless)

$ /lib/libc.so.6 --version
GNU C Library (Ubuntu EGLIBC 2.12.1-0ubuntu4) stable release version 2.12.1, by Roland McGrath et al.
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 4.4.5 20100902 (prerelease).
Compiled on a Linux 2.6.35 system on 2010-09-07.
Available extensions:
        crypt add-on version 2.1 by Michael Glad and others
        GNU Libidn by Simon Josefsson
        Native POSIX Threads Library by Ulrich Drepper et al
        Support for some architectures added on, not maintained in glibc core.
        BIND-8.2.3-T5B
libc ABIs: UNIQUE
For bug reporting instructions, please see:
<http://www.debian.org/Bugs/>.

(running in a maverick nfs chroot)

Colin Watson (cjwatson) wrote :

Does this match the hardware used by the buildds?

Jeremy Kerr (jk-ozlabs) wrote :

OK, I *can* reproduce this if I use a root FS on MMC (earlier tests were with an NFS root). Same kernel & package versions.

Andy Whitcroft (apw) wrote :

@Jeremy -- as you have a reproducer for this does this occur with the v2.6.35 kernel on the same userspace ?

Martin Pitt (pitti) wrote :

Does not really block the alpha-2 release, and not realistic to get fixed by tomorrow.

Changed in eglibc (Ubuntu Natty):
milestone: natty-alpha-2 → natty-alpha-3
Andy Whitcroft (apw) wrote :

@Jeremy -- any update on this reproduce?

Jeremy Kerr (jk-ozlabs) wrote :

@Andy: v2.6.35 has no mmc driver, so no way to reproduce with this image. We don't have a maverick imx build, so can't test that. Would it help to test one of the BSP (or other) kernels instead?

Jeremy Kerr (jk-ozlabs) wrote :

Looks like I can trip this with any glibc version, using the attached testcase.

Basically, this does an anonymous mmap, then a cacheflush on the address returned from the mmap. We get an oops from the cacheflush on the actual coprocessor instruction:

 mcr p15, 0, r0, c7, c11, 1

- r0 is the start address given to cacheflush, and will be the address which we see the invalid paging operation on.

I'm unsure why this instruction is generating an access to this address.
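
The attached testcase is not reproduced in this thread. A minimal sketch of the operation described above, assuming GCC or Clang, might look like this. The function name is hypothetical, and `__builtin___clear_cache` is used in place of a direct system call; on ARM Linux it is expected to end up in cacheflush, while on other architectures it may compile to little or nothing.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for MAP_ANONYMOUS */
#endif
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map `len` bytes of anonymous memory and flush the caches over the range
 * without ever touching it, so no page has been faulted in when the flush
 * runs. Returns 0 on success, -1 if the mapping or unmapping failed.
 */
int flush_untouched_mapping(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return -1;

    /* The pages exist in the vma but have not been accessed yet. */
    __builtin___clear_cache((char *)p, (char *)p + len);

    return munmap(p, len);
}
```

Calling `flush_untouched_mapping(16 * 4096)` should return 0 on a kernel that handles the untouched range; on an affected kernel the process would be expected to die with SIGSEGV inside the flush instead.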

Jeremy Kerr (jk-ozlabs) wrote :

Amit - could you test this on your babbage board? The testcase shouldn't be sensitive to the NFS root.

Steve Langasek (vorlon) on 2011-02-15
tags: added: arm-porting-queue
Jeremy Kerr (jk-ozlabs) wrote :

Looks like the mcr instruction is causing the translation fault if the memory hasn't been accessed previously.

This is fixed by upstream commit 32cfb1b16f2b68d2296536811cadfffe26a06c1b; some initial testing shows that this change fixes the problem.

I'm building a package with this fix, will post links when the build completes.

Jeremy Kerr (jk-ozlabs) wrote :

Test kernel build up at:

 http://people.canonical.com/~jk/605042/

md5:

 ae0e1649dc85318a0ac1a888a4bf7357 linux-image-2.6.31-608-imx51_2.6.31-608.22+605042jk1_armel.deb

Please test and let me know how you go.

Jeremy Kerr (jk-ozlabs) on 2011-02-16
Changed in linux-fsl-imx51 (Ubuntu Lucid):
assignee: Canonical Kernel Team (canonical-kernel-team) → Jeremy Kerr (jk-ozlabs)
status: Confirmed → In Progress

Hi Matthias,

I'm waiting on a test restuld for 605042 - are we still tracking this one for
release? If so, could you let me know whether the test build addresses the
issue?

Cheers,

Jeremy

Martin Pitt (pitti) wrote :

Too late for a3, moving to beta-1

Changed in eglibc (Ubuntu Natty):
milestone: natty-alpha-3 → ubuntu-11.04-beta-1
Matthias Klose (doko) wrote :

assign to ogra for testing

Changed in eglibc (Ubuntu Maverick):
assignee: Matthias Klose (doko) → Oliver Grawert (ogra)
Jeremy Kerr (jk-ozlabs) wrote :

Hi Matthias,

> there is no rebuild needed for testing. The eglibc version mentioned in
> this report can be used.

sorry, restuld = result. Just to clarify - the eglibc version is fine, but I
need the previously linked kernel .debs tested.

> I don't have a working babbage board anymore. Asked Oliver to test it. If
> this is not possible, we would need to ask IS to test it in the data
> center.

OK, thanks. Let me know if you need anything to help this along.

Jeremy

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package eglibc - 2.13-0ubuntu4

---------------
eglibc (2.13-0ubuntu4) natty; urgency=low

  * Merge with Debian (r4564, 2.13 branch).
  * Merge Debian 2.11.2-12.
    - Fix a typo in debian/patches/any/local-rtld.diff. Closes: #615806.
  * Merge Debian 2.11.2-13.
    [ Aurelien Jarno ]
    - Re-enable build failure in case of testsuite regressions.
    - Add patches/any/cvs-fnmatch-alloca.patch from upstream to fix a
      memory corruption in fnmatch() that can lead to code execution.
      Closes: #615120.
    - Add patches/any/cvs-qsort-race.diff from upstream to fix race in
      qsort_r(). Closes: #614892.
    [ Samuel Thibault ]
    - patches/any/submitted-sched_h.diff: Synchronize bits/sched.h with
      sysdeps/unix/sysv/linux/bits/sched.h (Closes: #527589), rename to
      cvs-sched_h.diff.
    - patches/hurd-i386/cvs-if_freereq.diff: Fix crash when siocgifconf
      actually succeeds.
    [ Clint Adams ]
    - Patch from Nobuhiro Iwamatsu to cope with the removal of
      patch --unified-reject-files. closes: #612540.
    [ Steve Langasek ]
    - Merge parts of multiarch patch:
      - Use the correct path in the ldd script as well
      - Set default rtlddir to /lib and override it when needed.
      - Install xen library in $(libdir)/xen instead of /usr/lib/xen.
  * On ppc64, build with -O3 -fno-tree-vectorize.
  * Update to r13065 from the eglibc-2.13 branch.
    - debian/patches/any/cvs-rtld-prelink.diff: Remove, applied upstream.
    - debian/patches/ppc64/submitted-loader-no-vsx.diff: Likewise.
  * Re-enable the upstream change:
    2010-06-02 Kirill A. Shutemov <email address hidden>
        * elf/dl-reloc.c: Flush cache after solving TEXTRELs if arch
        requires it.
    Working OpenJDK ARM assembler interpreter. LP: #605042.
 -- Matthias Klose <email address hidden> Tue, 08 Mar 2011 00:47:30 +0100

Changed in eglibc (Ubuntu Natty):
status: Triaged → Fix Released
Tobin Davis (gruemaster) wrote :

I was able to reproduce the segfault on my babbage 3.0 running Lucid+updates, using a maverick chroot environment with libc6 2.12-0ubuntu4 and the attached testcase source. I then installed the kernel in comment #42 above and rebooted. Running this time, the test no longer segfaults.

Jeremy Kerr (jk-ozlabs) wrote :

Tobin: excellent, thanks for testing. I'll submit the patch for lucid fsl-imx51.

From: Catalin Marinas <email address hidden>

BugLink: http://launchpad.net/bugs/605042

This is needed because applications using the sys_cacheflush system call
can pass a memory range which isn't mapped yet even though the
corresponding vma is valid. The patch also adds unwinding annotations
for correct backtraces from the coherent_user_range() functions.

Signed-off-by: Catalin Marinas <email address hidden>
Signed-off-by: Russell King <email address hidden>

cherry-picked from upstream commit 32cfb1b16f2b68d2296536811cadfffe26a06c1b

Signed-off-by: Jeremy Kerr <email address hidden>

---
 arch/arm/mm/cache-v6.S | 20 ++++++++++++++++++--
 arch/arm/mm/cache-v7.S | 19 +++++++++++++++++--
 2 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/arch/arm/mm/cache-v6.S b/arch/arm/mm/cache-v6.S
index 8f5c13f..295e25d 100644
--- a/arch/arm/mm/cache-v6.S
+++ b/arch/arm/mm/cache-v6.S
@@ -12,6 +12,7 @@
 #include <linux/linkage.h>
 #include <linux/init.h>
 #include <asm/assembler.h>
+#include <asm/unwind.h>

 #include "proc-macros.S"

@@ -121,11 +122,13 @@ ENTRY(v6_coherent_kern_range)
  * - the Icache does not read data from the write buffer
  */
 ENTRY(v6_coherent_user_range)
-
+ UNWIND(.fnstart )
 #ifdef HARVARD_CACHE
  bic r0, r0, #CACHE_LINE_SIZE - 1
-1: mcr p15, 0, r0, c7, c10, 1 @ clean D line
+1:
+ USER( mcr p15, 0, r0, c7, c10, 1 ) @ clean D line
  add r0, r0, #CACHE_LINE_SIZE
+2:
  cmp r0, r1
  blo 1b
 #endif
@@ -143,6 +146,19 @@ ENTRY(v6_coherent_user_range)
  mov pc, lr

 /*
+ * Fault handling for the cache operation above. If the virtual address in r0
+ * isn't mapped, just try the next page.
+ */
+9001:
+ mov r0, r0, lsr #12
+ mov r0, r0, lsl #12
+ add r0, r0, #4096
+ b 2b
+ UNWIND(.fnend )
+ENDPROC(v6_coherent_user_range)
+ENDPROC(v6_coherent_kern_range)
+
+/*
  * v6_flush_kern_dcache_page(kaddr)
  *
  * Ensure that the data held in the page kaddr is written back
diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
index be93ff0..3290dac 100644
--- a/arch/arm/mm/cache-v7.S
+++ b/arch/arm/mm/cache-v7.S
@@ -13,6 +13,7 @@
 #include <linux/linkage.h>
 #include <linux/init.h>
 #include <asm/assembler.h>
+#include <asm/unwind.h>

 #include "proc-macros.S"

@@ -147,13 +148,16 @@ ENTRY(v7_coherent_kern_range)
  * - the Icache does not read data from the write buffer
  */
 ENTRY(v7_coherent_user_range)
+ UNWIND(.fnstart )
  dcache_line_size r2, r3
  sub r3, r2, #1
  bic r0, r0, r3
-1: mcr p15, 0, r0, c7, c11, 1 @ clean D line to the point of unification
+1:
+ USER( mcr p15, 0, r0, c7, c11, 1 ) @ clean D line to the point of unification
  dsb
- mcr p15, 0, r0, c7, c5, 1 @ invalidate I line
+ USER( mcr p15, 0, r0, c7, c5, 1 ) @ invalidate I line
  add r0, r0, r2
+2:
  cmp r0, r1
  blo 1b
  mov r0, #0
@@ -161,6 +165,17 @@ ENTRY(v7_coherent_user_range)
  dsb
  isb
  mov pc, lr
+
+/*
+ * Fault handling for the cache operation above. If the virtual address in r0
+ * isn't mapped, just try the next page.
+ */
+9001:
+ mov r0, r0, lsr #12
+ mov r0, r0, lsl #12
+ add r0, r0, #4096
+ b 2b
+ UNWIND(.fnend )
 ENDPROC(v7_coherent_kern_range)
 ENDPROC(v7_coherent_user_range)
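
The fixup path added by the patch (label 9001) rounds r0 down to a page boundary and advances it by one page, then rejoins the loop at label 2. The following C model of that address arithmetic is a sketch for illustration only: the function names and the `demo_is_mapped` predicate are hypothetical, and the loop assumes the range ends well below the top of the 32-bit address space.

```c
#include <stdint.h>

#define PAGE_SHIFT_BITS 12u   /* 4096-byte pages, as hard-coded in the patch */

/* Mirrors the three fixup instructions: lsr #12, lsl #12, add #4096. */
uint32_t skip_to_next_page(uint32_t addr)
{
    return ((addr >> PAGE_SHIFT_BITS) << PAGE_SHIFT_BITS)
           + (1u << PAGE_SHIFT_BITS);
}

/* Hypothetical stand-in for the MMU: pretend exactly one page is unmapped. */
int demo_is_mapped(uint32_t addr)
{
    return (addr >> PAGE_SHIFT_BITS) != 0x401a1u;
}

/* Model of the flush loop; returns how many cache lines would be cleaned. */
unsigned walk_range(uint32_t start, uint32_t end, uint32_t line_size,
                    int (*is_mapped)(uint32_t))
{
    unsigned cleaned = 0;
    uint32_t addr = start & ~(line_size - 1u);   /* bic r0, r0, r3 */

    while (addr < end) {                         /* cmp r0, r1 ; blo 1b */
        if (is_mapped(addr)) {
            cleaned++;                           /* the mcr succeeded */
            addr += line_size;                   /* add r0, r0, r2 */
        } else {
            addr = skip_to_next_page(addr);      /* fixup at label 9001 */
        }
    }
    return cleaned;
}
```

With 64-byte lines, walking 0x401a0000 to 0x401a3000 while page 0x401a1000 is unmapped cleans 128 lines: the unmapped page is skipped in one step instead of raising an oops.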


On Tue, Mar 8, 2011 at 10:03 AM, Jeremy Kerr <email address hidden> wrote:
> From: Catalin Marinas <email address hidden>
>
> BugLink: http://launchpad.net/bugs/605042
>
> This is needed because applications using the sys_cacheflush system call
> can pass a memory range which isn't mapped yet even though the
> corresponding vma is valid. The patch also adds unwinding annotations
> for correct backtraces from the coherent_user_range() functions.
>
> Signed-off-by: Catalin Marinas <email address hidden>
> Signed-off-by: Russell King <email address hidden>
>
> cherry-picked from upstream commit 32cfb1b16f2b68d2296536811cadfffe26a06c1b
>
> Signed-off-by: Jeremy Kerr <email address hidden>
>

That's great, it looks like it fixed a very old issue for fsl-imx51 in Lucid.
So I think this patch is for [Lucid] [fsl-imx51] and there is no such
issue in Maverick kernel, right?

-Bryan

> ---
>  arch/arm/mm/cache-v6.S |   20 ++++++++++++++++++--
>  arch/arm/mm/cache-v7.S |   19 +++++++++++++++++--
>  2 files changed, 35 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm/mm/cache-v6.S b/arch/arm/mm/cache-v6.S
> index 8f5c13f..295e25d 100644
> --- a/arch/arm/mm/cache-v6.S
> +++ b/arch/arm/mm/cache-v6.S
> @@ -12,6 +12,7 @@
>  #include <linux/linkage.h>
>  #include <linux/init.h>
>  #include <asm/assembler.h>
> +#include <asm/unwind.h>
>
>  #include "proc-macros.S"
>
> @@ -121,11 +122,13 @@ ENTRY(v6_coherent_kern_range)
>  *     - the Icache does not read data from the write buffer
>  */
>  ENTRY(v6_coherent_user_range)
> -
> + UNWIND(.fnstart               )
>  #ifdef HARVARD_CACHE
>        bic     r0, r0, #CACHE_LINE_SIZE - 1
> -1:     mcr     p15, 0, r0, c7, c10, 1          @ clean D line
> +1:
> + USER( mcr     p15, 0, r0, c7, c10, 1  )       @ clean D line
>        add     r0, r0, #CACHE_LINE_SIZE
> +2:
>        cmp     r0, r1
>        blo     1b
>  #endif
> @@ -143,6 +146,19 @@ ENTRY(v6_coherent_user_range)
>        mov     pc, lr
>
>  /*
> + * Fault handling for the cache operation above. If the virtual address in r0
> + * isn't mapped, just try the next page.
> + */
> +9001:
> +       mov     r0, r0, lsr #12
> +       mov     r0, r0, lsl #12
> +       add     r0, r0, #4096
> +       b       2b
> + UNWIND(.fnend         )
> +ENDPROC(v6_coherent_user_range)
> +ENDPROC(v6_coherent_kern_range)
> +
> +/*
>  *     v6_flush_kern_dcache_page(kaddr)
>  *
>  *     Ensure that the data held in the page kaddr is written back
> diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
> index be93ff0..3290dac 100644
> --- a/arch/arm/mm/cache-v7.S
> +++ b/arch/arm/mm/cache-v7.S
> @@ -13,6 +13,7 @@
>  #include <linux/linkage.h>
>  #include <linux/init.h>
>  #include <asm/assembler.h>
> +#include <asm/unwind.h>
>
>  #include "proc-macros.S"
>
> @@ -147,13 +148,16 @@ ENTRY(v7_coherent_kern_range)
>  *     - the Icache does not read data from the write buffer
>  */
>  ENTRY(v7_coherent_user_range)
> + UNWIND(.fnstart               )
>        dcache_line_size r2, r3
>        sub     r3, r2, #1
>        bic     r0, r0, r3
> -1:     mcr     p15, 0, r0, c7, c11, 1          @ clean D line to the point of unification
> +1:
> + USER( mcr     p1...


Jeremy Kerr (jk-ozlabs) wrote :

Hi Bryan,

> That's great, it looks like it fixed a very old issue for fsl-imx51 in
> Lucid.

It's only shown up with relatively recent versions of eglibc - it seems like
they added a call to cacheflush() in the mprotect() syscall wrapper.

> So I think this patch is for [Lucid] [fsl-imx51] and there is no
> such issue in Maverick kernel, right?

Yeah, this change went upstream prior to the maverick base, so anything post-
lucid already has this fix.

Cheers,

Jeremy

Tim Gardner (timg-tpi) on 2011-03-08
Changed in linux-fsl-imx51 (Ubuntu Lucid):
status: In Progress → Fix Committed
Tim Gardner (timg-tpi) wrote :

SRU Justification

Impact: Java won't start with older versions of eglibc

Patch: some assembler code

Tim Gardner (timg-tpi) wrote :

On 03/08/2011 02:03 AM, Jeremy Kerr wrote:
> From: Catalin Marinas<email address hidden>
>
> BugLink: http://launchpad.net/bugs/605042
>
> This is needed because applications using the sys_cacheflush system call
> can pass a memory range which isn't mapped yet even though the
> corresponding vma is valid. The patch also adds unwinding annotations
> for correct backtraces from the coherent_user_range() functions.
>
> Signed-off-by: Catalin Marinas<email address hidden>
> Signed-off-by: Russell King<email address hidden>
>
> cherry-picked from upstream commit 32cfb1b16f2b68d2296536811cadfffe26a06c1b
>
> Signed-off-by: Jeremy Kerr<email address hidden>
>

Applied and uploaded linux-fsl-imx51_2.6.31-608.23 to
https://launchpad.net/~canonical-kernel-team/+archive/ppa

rtg
--
Tim Gardner <email address hidden>

Tim Gardner (timg-tpi) wrote :

On 03/08/2011 10:18 PM, Matthias Klose wrote:
> [CCing Lamont and Colin]
>
> On 08.03.2011 14:51, Tim Gardner wrote:
>> On 03/08/2011 02:03 AM, Jeremy Kerr wrote:
>>> From: Catalin Marinas<email address hidden>
>>>
>>> BugLink: http://launchpad.net/bugs/605042
>>>
>>> This is needed because applications using the sys_cacheflush system call
>>> can pass a memory range which isn't mapped yet even though the
>>> corresponding vma is valid. The patch also adds unwinding annotations
>>> for correct backtraces from the coherent_user_range() functions.
>>>
>>> Signed-off-by: Catalin Marinas<email address hidden>
>>> Signed-off-by: Russell King<email address hidden>
>>>
>>> cherry-picked from upstream commit 32cfb1b16f2b68d2296536811cadfffe26a06c1b
>>>
>>> Signed-off-by: Jeremy Kerr<email address hidden>
>>>
>> Applied and uploaded linux-fsl-imx51_2.6.31-608.23 to
>> https://launchpad.net/~canonical-kernel-team/+archive/ppa
> II: Checking ABI for imx51...
> EE: Previous or current ABI file missing!
>
> /build/buildd/linux-fsl-imx51-2.6.31/debian.fsl-imx51/abi/2.6.31-608.22/armel/imx51
> make[1]: *** [abi-check-imx51] Error 1
> make: *** [binary-arch] Error 2
>
> [Colin, please could you revert the eglibc change for natty,
> if we cannot get a fixed kernel on the buildds?]
>
> Matthias, mostly away for this week.

Chill dude, I'll get it fixed tomorrow.

--
Tim Gardner <email address hidden>

Accepted linux-fsl-imx51 into lucid-proposed, the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you in advance!

Tobin Davis (gruemaster) wrote :

Changed the eglibc portion to invalid as nothing has been done to it and the new kernel patch fixes the issue.

Changed in eglibc (Ubuntu Maverick):
assignee: Oliver Grawert (ogra) → nobody
status: Triaged → Invalid
Tobin Davis (gruemaster) wrote :

I just finished installing the linux-fsl-imx51_2.6.31-608.25 kernel and rerunning the testcase, system works fine.

Tobin Davis (gruemaster) on 2011-03-16
tags: added: verification-done
Matthias Klose (doko) wrote :

reopening the eglibc task for maverick. should be re-enabled when the kernel is on the buildds

Changed in eglibc (Ubuntu Maverick):
status: Invalid → Confirmed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-fsl-imx51 - 2.6.31-608.25

---------------
linux-fsl-imx51 (2.6.31-608.25) lucid; urgency=low

linux-fsl-imx51 (2.6.31-608.24) lucid; urgency=low

  * Fix packaging FTBS

linux-fsl-imx51 (2.6.31-608.23) lucid; urgency=low

  [ Upstream Kernel Changes ]

  * ARM: 5746/1: Handle possible translation errors in ARMv6/v7
    coherent_user_range
    - LP: #605042
 -- Tim Gardner <email address hidden> Thu, 10 Mar 2011 01:53:53 -0700

Changed in linux-fsl-imx51 (Ubuntu Lucid):
status: Fix Committed → Fix Released
Michael Hope (michaelh1) wrote :

Invalid as it's caused by a kernel issue.

Changed in linaro-toolchain-misc:
status: Confirmed → Invalid
Martin Pitt (pitti) wrote :

Accepted eglibc into maverick-proposed, the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you in advance!

Changed in eglibc (Ubuntu Maverick):
status: Confirmed → Fix Committed
tags: removed: verification-done
tags: added: verification-needed
Colin Watson (cjwatson) wrote :

This doesn't look like it needs a release note (any more, anyway).

Changed in ubuntu-release-notes:
status: New → Invalid
Changed in eglibc (Debian):
status: Unknown → Fix Released
Steve Beattie (sbeattie) wrote :

An eglibc security update was released for maverick that does not include the fixes from this package in -proposed. A new package needs to be built for -proposed.

Tobin Davis (gruemaster) wrote :

Marked one task as fix-released as I was actively running java & jenkins on Maverick omap/omap4 SRU testing.

Changed in eglibc (Ubuntu Maverick):
status: Fix Committed → Fix Released