ubuntu 10.04 with kernel 3.7.0-rc4 stops ssh from starting up

Bug #1077248 reported by Vincent Li
This bug affects 1 person
Affects          Status    Importance   Assigned to   Milestone
linux (Ubuntu)   Expired   Medium       Unassigned

Bug Description

Hi,

I am running Ubuntu 10.04 with the most recent kernel, built following
https://wiki.ubuntu.com/KernelTeam/GitKernelBuild. I have run into a
strange issue: with 3.7.0-rc4 the ssh service fails to start, but
with the 3.6.0 kernel it starts fine. I ran 'strace -tt -f -o
<output> service ssh start' on both kernels; the difference is here:

3.7.0-rc4 strace:

1868 16:45:06.047289 writev(3,
[{"l\1\2\1\10\0\0\0\2\0\0\0i\0\0\0\1\1o\0\34\0\0\0/com/ubu"..., 128},
{"\0\0\0\0\1\0\0\0", 8}], 2) = 136
1868 16:45:06.047441 poll([{fd=3, events=POLLIN}], 1, -1) = 1
([{fd=3, revents=POLLIN}])
1868 16:45:06.047742 read(3,
"l\4\1\1#\0\0\0\2\0\0\0o\0\0\0\1\1o\0\34\0\0\0/com/ubu"..., 2048) =
163
1868 16:45:06.047892 read(3, 0xb92eb050, 2048) = -1 EAGAIN (Resource
temporarily unavailable)
1868 16:45:06.047985 gettimeofday({1352421906, 48023}, NULL) = 0
1868 16:45:06.048083 poll([{fd=3, events=POLLIN}], 1, 2078517183) = 1
([{fd=3, revents=POLLIN}])
1868 16:45:06.048984 read(3,
"l\3\1\1\30\0\0\0\3\0\0\0?\0\0\0\4\1s\0%\0\0\0com.ubun"..., 2048) =
104
1868 16:45:06.049121 read(3, 0xb92eb050, 2048) = -1 EAGAIN (Resource
temporarily unavailable)
1868 16:45:06.049319 write(2, "start: Job failed to start\n", 27) = 27
1868 16:45:06.049514 close(3) = 0
1868 16:45:06.049837 exit_group(1) = ?

3.6.0 strace:

2125 16:41:09.150074 writev(3,
[{"l\1\2\1\10\0\0\0\2\0\0\0i\0\0\0\1\1o\0\34\0\0\0/com/ubu"..., 128},
{"\0\0\0\0\1\0\0\0", 8}], 2) = 136
2125 16:41:09.150224 poll([{fd=3, events=POLLIN}], 1, -1) = 1
([{fd=3, revents=POLLIN}])
2125 16:41:09.150527 read(3,
"l\4\1\1#\0\0\0\2\0\0\0o\0\0\0\1\1o\0\34\0\0\0/com/ubu"..., 2048) =
163
2125 16:41:09.150671 read(3, 0xb9302050, 2048) = -1 EAGAIN (Resource
temporarily unavailable)
2125 16:41:09.150760 gettimeofday({1352421669, 150798}, NULL) = 0
2125 16:41:09.150857 poll([{fd=3, events=POLLIN}], 1, -2122005215) =
1 ([{fd=3, revents=POLLIN}])
2125 16:41:09.155263 read(3,
"l\2\1\1#\0\0\0\3\0\0\0\17\0\0\0\5\1u\0\2\0\0\0\10\1g\0\1o\0\0"...,
2048) = 67
2125 16:41:09.155456 read(3, 0xb9302050, 2048) = -1 EAGAIN (Resource
temporarily unavailable)
2125 16:41:09.155716 writev(3,
[{"l\1\2\1)\0\0\0\3\0\0\0h\0\0\0\1\1o\0\34\0\0\0/com/ubu"..., 120},
{"\31\0\0\0com.ubuntu.Upstart0_6.Job\0\0\0"..., 41}], 2) = 161
2125 16:41:09.156108 gettimeofday({1352421669, 156142}, NULL) = 0
2125 16:41:09.156199 poll([{fd=3, events=POLLIN}], 1, 25000) = 1
([{fd=3, revents=POLLIN}])
2125 16:41:09.156308 read(3,
"l\2\1\1\f\0\0\0\4\0\0\0\17\0\0\0\5\1u\0\3\0\0\0\10\1g\0\1v\0\0"...,
2048) = 44
2125 16:41:09.156465 read(3, 0xb9302050, 2048) = -1 EAGAIN (Resource
temporarily unavailable)
2125 16:41:09.156641 writev(3,
[{"l\1\2\1#\0\0\0\4\0\0\0g\0\0\0\1\1o\0\36\0\0\0/com/ubu"..., 120},
{"\36\0\0\0com.ubuntu.Upstart0_6.Instan"..., 35}], 2) = 155
2125 16:41:09.157043 gettimeofday({1352421669, 157080}, NULL) = 0
2125 16:41:09.157144 poll([{fd=3, events=POLLIN}], 1, 25000) = 1
([{fd=3, revents=POLLIN}])
2125 16:41:09.157258 read(3,
"l\2\1\1\210\0\0\0\5\0\0\0\23\0\0\0\5\1u\0\4\0\0\0\10\1g\0\5a{s"...,
2048) = 176
2125 16:41:09.157412 read(3, 0xb9302050, 2048) = -1 EAGAIN (Resource
temporarily unavailable)
2125 16:41:09.157564 fstat64(1, {st_mode=S_IFCHR|0600,
st_rdev=makedev(136, 0), ...}) = 0
2125 16:41:09.157739 mmap2(NULL, 4096, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb73d8000
2125 16:41:09.157834 write(1, "ssh start/running, process 2131\n", 32) = 32
2125 16:41:09.158004 close(3) = 0
2125 16:41:09.158410 exit_group(0) = ?

I attached the kernel config and the full strace log. Is there anything
else I can provide to help diagnose the issue?
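
One way to read the two traces: the bytes returned by read(3, ...) look like D-Bus messages (an 'l' byte-order marker followed by a message-type byte), and in the D-Bus specification type 3 is ERROR while type 2 is METHOD_RETURN. Below is a minimal Python sketch that decodes the fixed 16-byte prefix of the two replies that differ. The helper name decode_dbus_header is made up for illustration, and the byte strings are only the prefixes that strace printed above, so this is a reading aid rather than a full parser.

```python
import struct

# Message type codes as listed in the D-Bus specification.
MESSAGE_TYPES = {1: "METHOD_CALL", 2: "METHOD_RETURN", 3: "ERROR", 4: "SIGNAL"}


def decode_dbus_header(data: bytes) -> dict:
    """Decode the fixed 16-byte prefix of a D-Bus message (hypothetical helper)."""
    if len(data) < 16:
        raise ValueError("need at least 16 bytes of header")
    # The first byte gives the byte order: 'l' little-endian, 'B' big-endian.
    if data[0] == ord("l"):
        order = "<"
    elif data[0] == ord("B"):
        order = ">"
    else:
        raise ValueError("not a D-Bus message")
    body_len, serial, fields_len = struct.unpack(order + "III", data[4:16])
    # The header-field array is padded to an 8-byte boundary before the body.
    padded_header = (16 + fields_len + 7) // 8 * 8
    return {
        "type": MESSAGE_TYPES.get(data[1], "UNKNOWN"),
        "serial": serial,
        "total_len": padded_header + body_len,
    }


# Prefixes copied from the strace output (strace prints octal escapes).
reply_3_7 = b"l\3\1\1\30\0\0\0\3\0\0\0?\0\0\0"
reply_3_6 = b"l\2\1\1#\0\0\0\3\0\0\0\17\0\0\0"

for label, raw in (("3.7.0-rc4", reply_3_7), ("3.6.0", reply_3_6)):
    print(label, decode_dbus_header(raw))
```

Under these assumptions the 3.7.0-rc4 reply decodes as an ERROR of 104 bytes and the 3.6.0 reply as a METHOD_RETURN of 67 bytes, which matches the read() return values in the traces. So init answered the start request with an error on 3.7.0-rc4 instead of a normal reply.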

Tags: lucid
Vincent Li (vincent-mc-li) wrote :

Another kernel developer noticed the same issue and thinks it is a user-space problem, which is why I filed the bug report here.

http://marc.info/?l=linux-netdev&m=135249445430308&w=2

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1077248

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: lucid
Vincent Li (vincent-mc-li) wrote :

Sorry for the missing log. When I run 'service ssh start', I get this error in /var/log/syslog:

Nov 12 11:24:34 <hostname> init: Failed to spawn ssh pre-start process: unable to set oom adjustment: No such file or directory
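
"No such file or directory" suggests that init tried to open a /proc file that does not exist on this kernel. A quick way to check which OOM-related files a process exposes is sketched below in Python. The function name present_oom_knobs and the proc_root parameter are hypothetical and exist only so the check can be pointed at a test directory; nothing here comes from upstart itself.

```python
import os

# File names the kernel may expose under /proc/<pid>/ for OOM tuning.
OOM_KNOBS = ("oom_adj", "oom_score", "oom_score_adj")


def present_oom_knobs(pid: int, proc_root: str = "/proc") -> dict:
    """Map each OOM-related file name to whether it exists for the given pid."""
    base = os.path.join(proc_root, str(pid))
    return {name: os.path.exists(os.path.join(base, name)) for name in OOM_KNOBS}


if __name__ == "__main__":
    # Report the knobs for pid 1 (init) on the running kernel.
    for name, present in present_oom_knobs(1).items():
        print(f"/proc/1/{name}: {'present' if present else 'missing'}")
```

Running it once under each kernel gives a side-by-side view of which knob disappeared.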

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Vincent Li (vincent-mc-li) wrote :

I did further debugging and noticed:

3.6.0:

# ls -l /proc/1/oom*
-rw-r--r-- 1 root root 0 2012-11-12 11:41 /proc/1/oom_adj
-r--r--r-- 1 root root 0 2012-11-12 11:41 /proc/1/oom_score
-rw-r--r-- 1 root root 0 2012-11-12 11:41 /proc/1/oom_score_adj

3.7.0-rc4:

# ls -l /proc/1/oom*
-r--r--r-- 1 root root 0 2012-11-12 11:31 /proc/1/oom_score
-rw-r--r-- 1 root root 0 2012-11-12 11:31 /proc/1/oom_score_adj

It looks to me like the upstream kernel removed oom_adj in the commit below:

commit 01dc52ebdf472f77cca623ca693ca24cfc0f1bbe
Author: Davidlohr Bueso <email address hidden>
Date: Mon Oct 8 16:29:30 2012 -0700

    oom: remove deprecated oom_adj

    The deprecated /proc/<pid>/oom_adj is scheduled for removal this month.
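
For a user-space program that still writes oom_adj, one possible adaptation is to try oom_score_adj first and fall back to oom_adj on older kernels. The Python sketch below is an assumption about how such a fallback could look, not upstart's actual code. The function names and the proc_root parameter are hypothetical, and the scaling from the -17..15 range to the -1000..1000 range follows the kernel's legacy conversion as far as I know, so treat it as an assumption too.

```python
import os


def oom_adj_to_score_adj(oom_adj: int) -> int:
    """Scale a legacy oom_adj value (-17..15) to oom_score_adj (-1000..1000)."""
    if not -17 <= oom_adj <= 15:
        raise ValueError("oom_adj must be between -17 and 15")
    if oom_adj == 15:
        return 1000  # assumed: the legacy maximum maps to the new maximum
    # Assumed scaling: -17 (OOM killer disabled) maps to -1000; truncate toward zero.
    return int(oom_adj * 1000 / 17)


def set_oom_adjustment(pid: int, oom_adj: int, proc_root: str = "/proc") -> str:
    """Write the adjustment to the first knob that exists and return its name."""
    base = os.path.join(proc_root, str(pid))
    candidates = (
        ("oom_score_adj", oom_adj_to_score_adj(oom_adj)),
        ("oom_adj", oom_adj),
    )
    for name, value in candidates:
        try:
            # O_WRONLY without O_CREAT: a missing knob raises FileNotFoundError.
            fd = os.open(os.path.join(base, name), os.O_WRONLY)
        except FileNotFoundError:
            continue
        try:
            os.write(fd, f"{value}\n".encode())
        finally:
            os.close(fd)
        return name
    raise FileNotFoundError(f"no OOM adjustment file under {base}")
```

Trying the newer file first means the same code keeps working whether or not the kernel still provides the deprecated one.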

Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.7 kernel[0] (Not a kernel in the daily directory) and install both the linux-image and linux-image-extra .deb packages.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.7-rc5-raring/

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
Vincent Li (vincent-mc-li) wrote :

I am already running the latest upstream kernel, which is how I came across this bug. Upstream commit 01dc52ebd removed the deprecated kernel knob oom_adj, and either upstart or init is still referencing it. In my opinion this is not a kernel bug.

Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired