Ubuntu server should not suppress conosle outputs

Bug #1245808 reported by Munehisa Kamata
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
procps (Ubuntu)
New
Undecided
Unassigned

Bug Description

Hello,

Ubuntu server(12.04 LTS, 12.10, 13.04 and 13.10) currently uses the following /proc/sys/kernel/printk configuration by default.

ubuntu@ip-10-120-14-14:~$ cat /proc/sys/kernel/printk
4 4 1 7

This configuration suppresses register dump and stacktrace outputs from lockup detector and khungtaskd on a console.

For example:

lockup detector:
[41812904.107136] BUG: soft lockup - CPU#1 stuck for 22s! [insmod:1003]
[41812904.107213] Stack:
[41812904.107230] Call Trace:
[41812904.107269] Code: 90 41 89 fe 65 44 8b 2c 25 10 da 00 00 66 66 90 0f ae e8 e8 f9 58 d0 ff 66 90 41 89 c4 eb 11 66 90 f3 90 65 8b 1c 25 10 da 00 00 <41> 39 dd 75 20 66 66 90 0f ae e8 e8 d6 58 d0 ff 66 90 89 c2 44
[41812908.227029] INFO: rcu_sched detected stall on CPU 1 (t=15000 jiffies)

khungtaskd:
[41818362.469083] INFO: task swapon:1032 blocked for more than 120 seconds.
[41818362.469096] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[41818482.409149] INFO: task swapon:1032 blocked for more than 120 seconds.
[41818482.409161] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

These lack the most important information for debugging lockup/hung-up issue and make debugging such kernel issue difficult.

On the other hand, other distros basically use the different configuration by default as shown below. I believe that Ubuntu server should use this instead of the currnet one.

[ec2-user@ip-10-132-147-86 ~]$ cat /proc/sys/kernel/printk
7 4 1 7

This will not suppress such important console outputs and we can understand where a problem happens at a glance.

For example:

[41814106.447145] BUG: soft lockup - CPU#1 stuck for 23s! [insmod:1066]
[41814106.447160] Modules linked in: stallmod(O+) isofs acpiphp
[41814106.447174] CPU 1
[41814106.447178] Modules linked in: stallmod(O+) isofs acpiphp
[41814106.447191]
[41814106.447198] Pid: 1066, comm: insmod Tainted: G O 3.2.0-52-virtual #78-Ubuntu
[41814106.447209] RIP: e030:[<ffffffff8101c216>] [<ffffffff8101c216>] native_read_tsc+0x6/0x20
[41814106.447225] RSP: e02b:ffff8801d0b4dea8 EFLAGS: 00000246
[41814106.447230] RAX: 000000004c5a957c RBX: 0000000000000001 RCX: 000000004c5a954c
...snip...
[41814106.447296] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[41814106.447304] Process insmod (pid: 1066, threadinfo ffff8801d0b4c000, task ffff8801d09b96e0)
[41814106.447311] Stack:
[41814106.447315] ffff8801d0b4ded8 ffffffff8131693a 0000000000001bbd 0000000000000000
[41814106.447326] ffffffffa000c000 0000000001351010 ffff8801d0b4dee8 ffffffff8131686c
[41814106.447336] ffff8801d0b4df08 ffffffffa000702a 0000000000000000 ffffffffa0009020
[41814106.447347] Call Trace:
[41814106.447357] [<ffffffff8131693a>] delay_tsc+0x4a/0x80
[41814106.447365] [<ffffffffa000c000>] ? 0xffffffffa000bfff
[41814106.447371] [<ffffffff8131686c>] __const_udelay+0x2c/0x30
[41814106.447380] [<ffffffffa000702a>] stall_timeout+0x2a/0x38 [stallmod]
[41814106.447387] [<ffffffffa000c02e>] init_module+0x2e/0x1000 [stallmod]
[41814106.447397] [<ffffffff81002040>] do_one_initcall+0x40/0x180
...snip...
[41814106.447541] [<ffffffff81661b42>] system_call_fastpath+0x16/0x1b
[41814112.951787] INFO: rcu_sched detected stall on CPU 1 (t=15000 jiffies)
[41814112.951801] sending NMI to all CPUs:
[41814123.412289] sched: RT throttling activated

On Ubuntu server, the default configuration is in /etc/sysctl.d/10-console-messages.conf owned by procps package.

ubuntu@ip-10-120-14-14:~$ cat /etc/sysctl.d/10-console-messages.conf

# the following stops low-level messages on console
kernel.printk = 4 4 1 7
ubuntu@ip-10-120-14-14:~$

Here is a possible fix.

--- /etc/sysctl.d/10-console-messages.conf.orig 2013-09-12 07:36:09.101819575 +0000
+++ /etc/sysctl.d/10-console-messages.conf 2013-09-12 07:40:16.517819543 +0000
@@ -1,3 +1,6 @@

-# the following stops low-level messages on console
-kernel.printk = 4 4 1 7
+# To stop low-level messages on console, use the following configuration instead.
+#
+# kernel.printk = 4 4 1 7
+#
+kernel.printk = 7 4 1 7

As far as I can see, this has been originally brought from Debian[1] but I don't think that the point of the original report is still true today. I believe that suppressing register dump and stacktrace on console by default is not good idea, especially on server-class OS.
Of course, this can be easily configured by end-user themselves, but most of them will never touch such configuration unless someone asks after a problem has happened.

Do you have any concerns about this? Any comments would be appreciated.

[1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=292834

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.