Android boot test failed on TC2

Bug #1104753 reported by Naresh Kamboju
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
Linaro Android | Fix Released | High | vishal |

Bug Description

I submitted a job on TC2 with an Android image and noticed that boot_linaro_android_image failed.
It appears to be a kernel issue, so I have reported it against the "linaro-landing-team-arm" project.
LAVA JOB:
--------------
http://validation.linaro.org/lava-server/scheduler/job/45998/log_file
http://validation.linaro.org/lava-server/scheduler/job/45969/log_file

Error log:
--------------
The system is going down for reboot NOW!
udevd[1974]: '/sbin/blkid -o udev -p /dev/mmcblk0p6' [2013] terminated by signal 15 (Terminated)

 * Asking all remaining processes to terminate...  [ OK ]
 * All processes ended within 1 seconds....  [ OK ]
 * Deconfiguring network interfaces...  [ OK ]
 * Deactivating swap...  [ OK ]
umount: /run/lock: not mounted
umount: /run/shm: not mounted
 * Will now restart
[ 2642.612250]
[ 2642.616702] ======================================================
[ 2642.635205] [ INFO: possible circular locking dependency detected ]
[ 2642.653973] 3.7.0-rc8+ #1 Not tainted
[ 2642.664923] -------------------------------------------------------
[ 2642.683688] reboot/2183 is trying to acquire lock:
[ 2642.698025] (s_active#27){++++.+}, at: [<c010174f>] sysfs_remove_dir+0x5b/0x78
[ 2642.719970]
[ 2642.719970] but task is already holding lock:
[ 2642.737433] (&per_cpu(cpu_policy_rwsem, cpu)){+.+.+.}, at: [<c02e8bb3>] lock_policy_rwsem_write+0x27/0x6c
[ 2642.766522]
[ 2642.766522] which lock already depends on the new lock.
[ 2642.766522]
[ 2642.791018]
[ 2642.791018] the existing dependency chain (in reverse order) is:
[ 2642.813430]
-> #1 (&per_cpu(cpu_policy_rwsem, cpu)){+.+.+.}:
[ 2642.830923] [<c005cc3f>] lock_acquire+0x5f/0xbc
[ 2642.846317] [<c03dea07>] down_write+0x27/0x34
[ 2642.861191] [<c02e8bb3>] lock_policy_rwsem_write+0x27/0x6c
[ 2642.879447] [<c02e8c47>] store+0x23/0x60
[ 2642.893015] [<c0100319>] sysfs_write_file+0xad/0xf8
[ 2642.909447] [<c00bb325>] vfs_write+0x69/0xcc
[ 2642.924059] [<c00bb4d3>] sys_write+0x2f/0x50
[ 2642.938667] [<c000cc81>] ret_fast_syscall+0x1/0x52
[ 2642.954843]
-> #0 (s_active#27){++++.+}:
[ 2642.967137] [<c005c493>] __lock_acquire+0x10af/0x14e4
[ 2642.984091] [<c005cc3f>] lock_acquire+0x5f/0xbc
[ 2642.999481] [<c0101399>] sysfs_addrm_finish+0xbd/0x120
[ 2643.016696] [<c010174f>] sysfs_remove_dir+0x5b/0x78
[ 2643.033131] [<c02531c3>] kobject_del+0xb/0x28
[ 2643.048003] [<c025321f>] kobject_release+0x3f/0x50
[ 2643.064177] [<c02e83bb>] cpufreq_cpu_put+0xf/0x24
[ 2643.080090] [<c02e94dd>] __cpufreq_remove_dev+0x1a9/0x1c4
[ 2643.098085] [<c03d98bd>] cpufreq_cpu_callback+0x41/0x4c
[ 2643.115560] [<c0037d21>] notifier_call_chain+0x45/0x58
[ 2643.132774] [<c001e4e3>] __cpu_notify+0x1f/0x34
[ 2643.148167] [<c03d2a99>] _cpu_down+0x65/0x1ac
[ 2643.163038] [<c001e713>] disable_nonboot_cpus+0x4b/0xa4
[ 2643.180511] [<c002b2df>] kernel_restart+0x13/0x3c
[ 2643.196425] [<c002b417>] sys_reboot+0x103/0x160
[ 2643.211940] [<c000cc81>] ret_fast_syscall+0x1/0x52
[ 2643.228162]
[ 2643.228162] other info that might help us debug this:
[ 2643.228162]
[ 2643.252136] Possible unsafe locking scenario:
[ 2643.252136]
[ 2643.269859] CPU0 CPU1
[ 2643.283414] ---- ----
[ 2643.296969] lock(&per_cpu(cpu_policy_rwsem, cpu));
[ 2643.311953] lock(s_active#27);
[ 2643.329210] lock(&per_cpu(cpu_policy_rwsem, cpu));
[ 2643.351755] lock(s_active#27);
[ 2643.361636]
[ 2643.361636] *** DEADLOCK ***
[ 2643.361636]
[ 2643.379529] 4 locks held by reboot/2183:
[ 2643.391261] #0: (reboot_mutex){+.+.+.}, at: [<c002b3a1>] sys_reboot+0x8d/0x160
[ 2643.413458] #1: (cpu_add_remove_lock){+.+.+.}, at: [<c001e6db>] disable_nonboot_cpus+0x13/0xa4
[ 2643.439821] #2: (cpu_hotplug.lock){+.+.+.}, at: [<c001e55b>] cpu_hotplug_begin+0x23/0x40
[ 2643.464621] #3: (&per_cpu(cpu_policy_rwsem, cpu)){+.+.+.}, at: [<c02e8bb3>] lock_policy_rwsem_write+0x27/0x6c
[ 2643.494890]
[ 2643.494890] stack backtrace:
[ 2643.507940] [<c0012101>] (unwind_backtrace+0x1/0x9c) from [<c03da771>] (print_circular_bug+0x1b1/0x1fc)
[ 2643.536089] [<c03da771>] (print_circular_bug+0x1b1/0x1fc) from [<c005c493>] (__lock_acquire+0x10af/0x14e4)
[ 2643.565020] [<c005c493>] (__lock_acquire+0x10af/0x14e4) from [<c005cc3f>] (lock_acquire+0x5f/0xbc)
[ 2643.591990] [<c005cc3f>] (lock_acquire+0x5f/0xbc) from [<c0101399>] (sysfs_addrm_finish+0xbd/0x120)
[ 2643.619099] [<c0101399>] (sysfs_addrm_finish+0xbd/0x120) from [<c010174f>] (sysfs_remove_dir+0x5b/0x78)
[ 2643.647249] [<c010174f>] (sysfs_remove_dir+0x5b/0x78) from [<c02531c3>] (kobject_del+0xb/0x28)
[ 2643.673055] [<c02531c3>] (kobject_del+0xb/0x28) from [<c025321f>] (kobject_release+0x3f/0x50)
[ 2643.698603] [<c025321f>] (kobject_release+0x3f/0x50) from [<c02e83bb>] (cpufreq_cpu_put+0xf/0x24)
[ 2643.725191] [<c02e83bb>] (cpufreq_cpu_put+0xf/0x24) from [<c02e94dd>] (__cpufreq_remove_dev+0x1a9/0x1c4)
[ 2643.753600] [<c02e94dd>] (__cpufreq_remove_dev+0x1a9/0x1c4) from [<c03d98bd>] (cpufreq_cpu_callback+0x41/0x4c)
[ 2643.783572] [<c03d98bd>] (cpufreq_cpu_callback+0x41/0x4c) from [<c0037d21>] (notifier_call_chain+0x45/0x58)
[ 2643.812764] [<c0037d21>] (notifier_call_chain+0x45/0x58) from [<c001e4e3>] (__cpu_notify+0x1f/0x34)
[ 2643.839873] [<c001e4e3>] (__cpu_notify+0x1f/0x34) from [<c03d2a99>] (_cpu_down+0x65/0x1ac)
[ 2643.864636] [<c03d2a99>] (_cpu_down+0x65/0x1ac) from [<c001e713>] (disable_nonboot_cpus+0x4b/0xa4)
[ 2643.891483] [<c001e713>] (disable_nonboot_cpus+0x4b/0xa4) from [<c002b2df>] (kernel_restart+0x13/0x3c)
[ 2643.919372] [<c002b2df>] (kernel_restart+0x13/0x3c) from [<c002b417>] (sys_reboot+0x103/0x160)
[ 2643.945177] [<c002b417>] (sys_reboot+0x103/0x160) from [<c000cc81>] (ret_fast_syscall+0x1/0x52)
[ 2643.972176] CPU0 packing on CPU-1
[ 2643.982117] CPU4 packing on CPU-1
[ 2643.992134] CPU0 packing on CPU-1
[ 2644.008833] CPU4: shutdown
[ 2644.017967] Restarting system.
Rebooting...
Disabling debug USB.
Switching off ATX PSU.
Board powered down, rebooting...

ARM V2M Firmware v3.1.1
Build Date: Aug 20 2012

Date: Fri 25 Jan 2013
Time: 04:04:07

Tags: qa-services
information type: Proprietary → Public
description: updated
Fathi Boudra (fboudra)
affects: lava-android-test → linaro-android
Tixy (Jon Medhurst) (tixy) wrote :

The log snippet attached to the original report shows a kernel bug in the LAVA Master Image, but this did not prevent the board from rebooting and is not the cause of the test failure.
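
For reference, the lockdep warning in that snippet describes two locks taken in opposite orders on two code paths (chain #1 and chain #0 in the log). The sketch below is only a toy model of that idea and not how the kernel's lockdep is implemented: the class LockOrderGraph and its acquire method are hypothetical names, and the lock names are copied from the log as plain strings.

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy lock-order tracker (hypothetical, for illustration only).

    Records "held -> acquired" edges and reports when a new edge
    would close a cycle, i.e. a possible AB-BA deadlock.
    """

    def __init__(self):
        self.edges = defaultdict(set)

    def _reachable(self, start, target):
        # Iterative depth-first search over the recorded ordering edges.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record that `new` is taken while `held` is held.

        Returns True if `held` was already reachable from `new`,
        meaning this acquisition creates a circular dependency.
        """
        circular = self._reachable(new, held)
        self.edges[held].add(new)
        return circular


graph = LockOrderGraph()
# Chain #1 in the log: the sysfs write path takes cpu_policy_rwsem
# while s_active is held.
first = graph.acquire("s_active#27", "cpu_policy_rwsem")
# Chain #0 in the log: the reboot path wants s_active while it
# already holds cpu_policy_rwsem.
second = graph.acquire("cpu_policy_rwsem", "s_active#27")
```

Here `first` is False and `second` is True: the second acquisition is the one that closes the cycle, which matches the point at which the log reports "possible circular locking dependency detected".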

Looking at the full logs at: http://validation.linaro.org/lava-server/scheduler/job/45998/log_file#entry20
I see:

[3] LAVA Android Test Image
 - VenHw(09831032-6FA3-4484-AF4F-0A000A8D3A82)/HD(3,MBR,0x00000000,0x200000,0x20000)/uImage
 - Initrd: VenHw(09831032-6FA3-4484-AF4F-0A000A8D3A82)/HD(3,MBR,0x00000000,0x200000,0x20000)/uInitrd
 - Arguments: console=tty0 console=ttyAMA0,38400n8 rootwait ro init=/init androidboot.console=ttyAMA0
 - FDT: VenHw(09831032-6FA3-4484-AF4F-0A000A8D3A82)/HD(3,MBR,0x00000000,0x200000,0x20000)/v2p-ca15-tc2.dtb
 - LoaderType: Linux kernel with Local FDT
-----------------------
Global FDT Config
 - VenHw(1F15DA3C-37FF-4070-B471-BB4AF12A724A)/MemoryMapped(0x0,0xE800000,0xE803000)
-----------------------
[a] Boot Manager
[b] Shell
Start: 3
ERROR: Did not find Device Tree blob.

So the tests failed because of a missing DTB. Looking in the boot.tar.bz2 file for the image it was trying to test (https://snapshots.linaro.org/android/~linaro-android-restricted/vexpress-linaro-iks/86/boot.tar.bz2), I see that the device tree has a different name than expected, namely 'vexpress-v2p-ca15-tc2.dtb'.

I therefore suspect that the test image was not produced by the normal vexpress BoardConfig.mk, which renames the device tree.
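
A name mismatch like this can be checked before submitting a job. The following is a minimal sketch, assuming the boot tarball is available as bytes. The helper names (find_dtbs, check_dtb, make_sample_tar) and the "boot/" directory layout in the sample archive are hypothetical; only the two DTB file names come from the log and the tarball described above.

```python
import io
import tarfile

# Name the boot menu entry asks for (taken from the FDT line in the log).
EXPECTED_DTB = "v2p-ca15-tc2.dtb"


def find_dtbs(tar_bytes):
    """Return the base names of all .dtb files in a tar archive given as bytes."""
    names = []
    # "r:*" lets tarfile detect the compression (bz2, gz, none) itself.
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as tar:
        for member in tar.getmembers():
            base = member.name.rsplit("/", 1)[-1]
            if member.isfile() and base.endswith(".dtb"):
                names.append(base)
    return names


def check_dtb(tar_bytes, expected=EXPECTED_DTB):
    """Return (ok, found); ok is True only if `expected` is among the DTB names."""
    found = find_dtbs(tar_bytes)
    return expected in found, found


def make_sample_tar(file_names):
    """Build a small in-memory .tar.bz2 with dummy contents (demonstration only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:bz2") as tar:
        for name in file_names:
            data = b"dummy"
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# Sample archive mimicking the mismatch; the "boot/" prefix is an assumption.
sample = make_sample_tar(
    ["boot/uImage", "boot/uInitrd", "boot/vexpress-v2p-ca15-tc2.dtb"]
)
ok, found = check_dtb(sample)
```

With the sample archive, `ok` is False and `found` contains only 'vexpress-v2p-ca15-tc2.dtb', which is the situation that leads to "ERROR: Did not find Device Tree blob." in the job log.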

I don't see that this is a bug with the ARM LT kernel or the standard vexpress Android config, so I'll mark this bug as invalid for the ARM LT.

Changed in linaro-landing-team-arm:
status: New → Invalid
vishal (vishalbhoj) wrote :
Changed in linaro-android:
importance: Undecided → High
assignee: nobody → vishal (vishalbhoj)
vishal (vishalbhoj)
Changed in linaro-android:
milestone: none → 13.02
status: New → Fix Committed
vishal (vishalbhoj)
Changed in linaro-android:
status: Fix Committed → Fix Released