Activity log for bug #235282

Date Who What changed Old value New value Message
2008-05-27 18:04:16 Anthony Fok bug added bug
2008-05-27 18:04:16 Anthony Fok bug added attachment '20_dmesg_dropped-digits.patch' (Patch to fix klibc-utils dmesg <[0-7]> stripping algorithm)
2008-05-27 18:07:29 Anthony Fok description

Old value:

Note: An equivalent bug report is filed as Debian Bug#483186 at http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=483186

Every now and then, we come across a machine which is unable to mount the root filesystem for whatever reason and gets stuck at the busybox initrd environment, from which we can run dmesg to diagnose what went wrong. To our dismay, in recent months (or years?), the dmesg output comes out like this, with lots of missing digits. For example, from a test machine booting Ubuntu 8.04 hardy (with an upgraded kernel):

[ 0.000] Linux version 2.6.2-1-generic (buildd@iridium) (gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7)) #1 SMP Thu May 2 0:0:4 UTC 20 (Ubuntu 2.6.2-1.2ubuntu6-generic)
[ 0.000] BIOS-provided physical RAM map:
[ 0.000] BIOS-e80: 00000000 - 000000e00 (usable)
[ 0.000] BIOS-e80: 000000e00 - 000000a00 (reserved)

But it is supposed to look like this:

[ 0.000000] Linux version 2.6.25-1-generic (buildd@iridium) (gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7)) #1 SMP Thu May 22 05:01:49 UTC 2008 (Ubuntu 2.6.25-1.2ubuntu6-generic)
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009e000 (usable)
[ 0.000000] BIOS-e820: 000000000009e000 - 00000000000a0000 (reserved)

This caused quite a few problems when we were trying to diagnose kernel oopses or panics, since the addresses are all wrong. Initially, we thought it had something to do with memory corruption from the kernel oops. But later, we noticed that this phenomenon happens even in cases without a kernel oops, for example when we just got root=/dev/sda7 written wrong.

So, we decided to investigate, and eventually came to the realization that the dmesg in initrd.img in Ubuntu (and Debian) nowadays comes not from busybox but from klibc-utils, and that running /usr/lib/klibc/bin/dmesg on a fully booted system exhibits the same bug.

Checking the source code, we found that the code used to strip out the <[0-7]> that prefixes every kernel message (see klogd(8)) is somewhat incorrect. So, with a bit of hacking, we got that fixed. :-)

A patch is attached. Just drop it in debian/patches/20_dmesg_dropped-digits.patch and repackage! :-)

We have verified that the output of this fixed dmesg is identical to that of util-linux dmesg.

Further thoughts:

We checked out klibc source using:

git clone git://git.kernel.org/pub/scm/libs/klibc/klibc.git

And noticed it is an upstream bug since dmesg.c was first added on (Mon Aug 20 19:57:50 2007 +0200) commit 9c5a7acda064daa7482148b5a45ee3b7ed39356c

As to why this bug wasn't discovered sooner... I don't know. Perhaps very few people use the tiny dmesg in klibc-utils for diagnostic purposes? And before that, Ubuntu (and Debian) used the dmesg module in busybox, which exhibits no such bug?

Cheers,

Anthony Fok <anthony dot fok at thizgroup dot com>
ThizLinux Software Co., Ltd. - A member of Thiz Technology Group
Debian GNU/Linux Developer

New value:

Note: An equivalent bug report is filed as Debian Bug#483186 at http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=483186

Every now and then, we come across a machine which is unable to mount the root filesystem for whatever reason and gets stuck at the busybox initrd environment, from which we can run dmesg to diagnose what went wrong. To our dismay, in recent months (or years?), the dmesg output comes out like this, with lots of missing digits. For example, from a test machine booting Ubuntu 8.04 hardy (with an upgraded kernel):

[ 0.000] Linux version 2.6.2-1-generic (buildd@iridium) (gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7)) #1 SMP Thu May 2 0:0:4 UTC 20 (Ubuntu 2.6.2-1.2ubuntu6-generic)
[ 0.000] BIOS-provided physical RAM map:
[ 0.000] BIOS-e80: 00000000 - 000000e00 (usable)
[ 0.000] BIOS-e80: 000000e00 - 000000a00 (reserved)

But it is supposed to look like this:

[ 0.000000] Linux version 2.6.25-1-generic (buildd@iridium) (gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7)) #1 SMP Thu May 22 05:01:49 UTC 2008 (Ubuntu 2.6.25-1.2ubuntu6-generic)
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009e000 (usable)
[ 0.000000] BIOS-e820: 000000000009e000 - 00000000000a0000 (reserved)

This caused quite a few problems when we were trying to diagnose kernel oopses or panics, since the addresses are all wrong. Initially, we thought it had something to do with memory corruption from the kernel oops. But later, we noticed that this phenomenon happens even in cases without a kernel oops, for example when we just got root=/dev/sda7 written wrong.

So, we decided to investigate, and eventually came to the realization that the dmesg in initrd.img in Ubuntu (and Debian) nowadays comes not from busybox but from klibc-utils, and that running /usr/lib/klibc/bin/dmesg on a fully booted system exhibits the same bug.

Checking the source code, we found that the code used to strip out the <[0-7]> that prefixes every kernel message (see klogd(8)) is somewhat incorrect. So, with a bit of hacking, we got that fixed. :-)

A patch is attached. Just drop it in debian/patches/20_dmesg_dropped-digits.patch and repackage! :-)

We have verified that the output of this fixed dmesg is identical to that of util-linux dmesg.

Further thoughts:

We checked out klibc source using:

git clone git://git.kernel.org/pub/scm/libs/klibc/klibc.git

And noticed it is an upstream bug since dmesg.c was first added on Mon Aug 20 19:57:50 2007 +0200 as commit 9c5a7acda064daa7482148b5a45ee3b7ed39356c

As to why this bug wasn't discovered sooner... I don't know. Perhaps very few people use the tiny dmesg in klibc-utils for diagnostic purposes? And before that, Ubuntu (and Debian) used the dmesg module in busybox, which exhibits no such bug?

Cheers,

Anthony Fok <anthony dot fok at thizgroup dot com>
ThizLinux Software Co., Ltd. - A member of Thiz Technology Group
Debian GNU/Linux Developer
2008-05-27 18:24:39 Tim Gardner klibc: status New In Progress
2008-05-27 18:24:39 Tim Gardner klibc: assignee timg-tpi
2008-06-09 11:17:38 Tim Gardner bug added attachment 'klibc_1.5.7-4ubuntu4.patch' (dmesg priority code stripping.)
2008-06-09 11:19:03 Tim Gardner klibc: status In Progress Fix Committed
2008-06-09 11:19:03 Tim Gardner klibc: importance Undecided Medium
2008-06-09 11:19:03 Tim Gardner klibc: milestone ubuntu-8.04.1
2008-06-10 04:12:42 Steve Langasek klibc: milestone ubuntu-8.04.1
2008-06-10 04:14:10 Steve Langasek klibc: status New Fix Committed
2008-06-12 06:27:56 Steve Beattie description

Old value:

Note: An equivalent bug report is filed as Debian Bug#483186 at http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=483186

Every now and then, we come across a machine which is unable to mount the root filesystem for whatever reason and gets stuck at the busybox initrd environment, from which we can run dmesg to diagnose what went wrong. To our dismay, in recent months (or years?), the dmesg output comes out like this, with lots of missing digits. For example, from a test machine booting Ubuntu 8.04 hardy (with an upgraded kernel):

[ 0.000] Linux version 2.6.2-1-generic (buildd@iridium) (gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7)) #1 SMP Thu May 2 0:0:4 UTC 20 (Ubuntu 2.6.2-1.2ubuntu6-generic)
[ 0.000] BIOS-provided physical RAM map:
[ 0.000] BIOS-e80: 00000000 - 000000e00 (usable)
[ 0.000] BIOS-e80: 000000e00 - 000000a00 (reserved)

But it is supposed to look like this:

[ 0.000000] Linux version 2.6.25-1-generic (buildd@iridium) (gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7)) #1 SMP Thu May 22 05:01:49 UTC 2008 (Ubuntu 2.6.25-1.2ubuntu6-generic)
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009e000 (usable)
[ 0.000000] BIOS-e820: 000000000009e000 - 00000000000a0000 (reserved)

This caused quite a few problems when we were trying to diagnose kernel oopses or panics, since the addresses are all wrong. Initially, we thought it had something to do with memory corruption from the kernel oops. But later, we noticed that this phenomenon happens even in cases without a kernel oops, for example when we just got root=/dev/sda7 written wrong.

So, we decided to investigate, and eventually came to the realization that the dmesg in initrd.img in Ubuntu (and Debian) nowadays comes not from busybox but from klibc-utils, and that running /usr/lib/klibc/bin/dmesg on a fully booted system exhibits the same bug.

Checking the source code, we found that the code used to strip out the <[0-7]> that prefixes every kernel message (see klogd(8)) is somewhat incorrect. So, with a bit of hacking, we got that fixed. :-)

A patch is attached. Just drop it in debian/patches/20_dmesg_dropped-digits.patch and repackage! :-)

We have verified that the output of this fixed dmesg is identical to that of util-linux dmesg.

Further thoughts:

We checked out klibc source using:

git clone git://git.kernel.org/pub/scm/libs/klibc/klibc.git

And noticed it is an upstream bug since dmesg.c was first added on Mon Aug 20 19:57:50 2007 +0200 as commit 9c5a7acda064daa7482148b5a45ee3b7ed39356c

As to why this bug wasn't discovered sooner... I don't know. Perhaps very few people use the tiny dmesg in klibc-utils for diagnostic purposes? And before that, Ubuntu (and Debian) used the dmesg module in busybox, which exhibits no such bug?

Cheers,

Anthony Fok <anthony dot fok at thizgroup dot com>
ThizLinux Software Co., Ltd. - A member of Thiz Technology Group
Debian GNU/Linux Developer

New value:

Note: An equivalent bug report is filed as Debian Bug#483186 at http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=483186

Every now and then, we come across a machine which is unable to mount the root filesystem for whatever reason and gets stuck at the busybox initrd environment, from which we can run dmesg to diagnose what went wrong. To our dismay, in recent months (or years?), the dmesg output comes out like this, with lots of missing digits. For example, from a test machine booting Ubuntu 8.04 hardy (with an upgraded kernel):

[ 0.000] Linux version 2.6.2-1-generic (buildd@iridium) (gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7)) #1 SMP Thu May 2 0:0:4 UTC 20 (Ubuntu 2.6.2-1.2ubuntu6-generic)
[ 0.000] BIOS-provided physical RAM map:
[ 0.000] BIOS-e80: 00000000 - 000000e00 (usable)
[ 0.000] BIOS-e80: 000000e00 - 000000a00 (reserved)

But it is supposed to look like this:

[ 0.000000] Linux version 2.6.25-1-generic (buildd@iridium) (gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7)) #1 SMP Thu May 22 05:01:49 UTC 2008 (Ubuntu 2.6.25-1.2ubuntu6-generic)
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009e000 (usable)
[ 0.000000] BIOS-e820: 000000000009e000 - 00000000000a0000 (reserved)

This caused quite a few problems when we were trying to diagnose kernel oopses or panics, since the addresses are all wrong. Initially, we thought it had something to do with memory corruption from the kernel oops. But later, we noticed that this phenomenon happens even in cases without a kernel oops, for example when we just got root=/dev/sda7 written wrong.

So, we decided to investigate, and eventually came to the realization that the dmesg in initrd.img in Ubuntu (and Debian) nowadays comes not from busybox but from klibc-utils, and that running /usr/lib/klibc/bin/dmesg on a fully booted system exhibits the same bug.

Checking the source code, we found that the code used to strip out the <[0-7]> that prefixes every kernel message (see klogd(8)) is somewhat incorrect. So, with a bit of hacking, we got that fixed. :-)

A patch is attached. Just drop it in debian/patches/20_dmesg_dropped-digits.patch and repackage! :-)

We have verified that the output of this fixed dmesg is identical to that of util-linux dmesg.

Further thoughts:

We checked out klibc source using:

git clone git://git.kernel.org/pub/scm/libs/klibc/klibc.git

And noticed it is an upstream bug since dmesg.c was first added on Mon Aug 20 19:57:50 2007 +0200 as commit 9c5a7acda064daa7482148b5a45ee3b7ed39356c

As to why this bug wasn't discovered sooner... I don't know. Perhaps very few people use the tiny dmesg in klibc-utils for diagnostic purposes? And before that, Ubuntu (and Debian) used the dmesg module in busybox, which exhibits no such bug?

Cheers,

Anthony Fok <anthony dot fok at thizgroup dot com>
ThizLinux Software Co., Ltd. - A member of Thiz Technology Group
Debian GNU/Linux Developer

TESTCASE
- ensure klibc-utils is installed
$ /usr/lib/klibc/bin/dmesg > /tmp/klibc.dmesg
$ /bin/dmesg > /tmp/bin.dmesg
"diff -u /tmp/{klibc,bin}.dmesg" should show no differences (except for any additional messages that were emitted between the dmesg invocations)
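
Illustration: the description attributes the dropped digits to the code that strips the <[0-7]> priority prefix from each kernel message. The following is only a rough sketch of how such stripping could be done safely; it is neither the attached patch nor the klibc source. The function name strip_priority and its buffer-based interface are assumptions made for this example. The idea is to drop the three-character prefix only when it begins a line, and to copy every other character, digits included, unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from klibc): copy `len` bytes of kernel log
 * text from `in` to `out`, dropping a "<N>" priority prefix (N is one
 * digit, 0-7) only where it begins a line. `out` must have room for
 * len + 1 bytes. Returns the number of bytes written, not counting the
 * terminating NUL.
 */
size_t strip_priority(const char *in, size_t len, char *out)
{
    size_t i = 0, o = 0;
    int at_line_start = 1;

    assert(in != NULL && out != NULL);

    while (i < len) {
        /* A prefix is exactly '<', one digit 0-7, '>' at line start. */
        if (at_line_start && len - i >= 3 &&
            in[i] == '<' && in[i + 1] >= '0' && in[i + 1] <= '7' &&
            in[i + 2] == '>') {
            i += 3;             /* skip the prefix, keep what follows */
            at_line_start = 0;  /* strip at most one prefix per line */
            continue;
        }
        /* The next character starts a line only after a newline. */
        at_line_start = (in[i] == '\n');
        out[o++] = in[i++];
    }
    out[o] = '\0';
    return o;
}
```

Because the check is tied to the start of a line, digits and angle brackets that appear later in a message, such as the addresses in the BIOS-e820 lines, are passed through untouched.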
2008-06-16 07:56:39 Martin Pitt klibc: status Fix Committed Fix Released
2008-06-16 07:56:55 Martin Pitt klibc: milestone intrepid-alpha-2
2008-07-11 13:33:35 Colin Watson klibc: status Fix Committed Fix Released
2009-07-04 21:24:13 Launchpad Janitor branch linked lp:ubuntu/hardy-updates/klibc