2010-04-22 18:47:01 |
Bogdan Butnaru |
bug |
|
|
added bug |
2010-04-22 19:05:42 |
Bogdan Butnaru |
description |
bug bug bug |
Hello! I’m having a very strange problem.
I’m the proud reporter of bug #554749, and I think I found something that might explain it. The short of that bug is that I’m using SSHFS to mount some shares from my server on my desktop; randomly (a few times each day) something goes wrong, and every program using that mount-point freezes. (I have to do a complex evil ritual to re-mount it without rebooting the computer.) While trying to debug it I discovered some occasional “Corrupted MAC on input” errors. I googled a bit for it, without much success; anyway, a post somewhere suggested I check for network corruption with netcat.
So, I cat’ed together two movie files, obtaining a 1.4 GB file filled with mostly random data, and started shuttling it between the two computers using netcat (over TCP, its default). I did a dozen transfers, and exactly one of them was corrupted (the second, actually). Interestingly, the corruption was exactly 128 bytes long; the replacement data has no obvious relationship to what was there originally.
According to ifconfig,
bogdanb@mabelode:~/tests$ ifconfig eth0 |grep errors
RX packets:9487952 errors:0 dropped:0 overruns:0 frame:0
TX packets:6132714 errors:0 dropped:0 overruns:0 carrier:2
bogdanb@tanelorn:~/tests$ ifconfig eth0|grep errors
RX packets:149100044 errors:0 dropped:0 overruns:0 frame:0
TX packets:135620981 errors:0 dropped:0 overruns:0 carrier:0
there haven’t been any transmission errors, so it is _really_ unlikely that this is just random line corruption that slipped past the TCP checksum undetected. The length of the corrupted region (exactly 128 bytes) is also suspicious.
I suspect a small bug in one of the routines that copy data between the NIC’s buffer and the application’s. I have no idea how to debug this further; please help!
A few more notes:
*) all this happens via Ethernet; the two computers are both linked to a switch with short cables. Anyway, given the above, it doesn’t look like line errors.
*) the server runs Karmic, the desktop runs Lucid.
*) I’ve had similar (but not identical) problems with SSHFS ever since I got these two computers (around Feisty, I think); it’s likely that whatever is causing the corruption has been there since the beginning, but the way SSHFS handles occurrences of the bug has changed.
*) whatever it is, it’s very random. As the test showed, I got a single error after 2 GB, then no other error for the next 15 GB of transferred files. However, the SSHFS error (which I’m pretty sure is caused by this) sometimes happens after 15 minutes, and sometimes I have no problems for a full day.
*) I tried reporting this with ubuntu-bug, but Launchpad timed out on me several times in a row. Please tell me whatever information you think I should add.
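For anyone who wants to repeat the netcat test, here is a minimal sketch of how the corrupted region in a received copy could be located and measured. It is not the exact procedure used above; the function name `diff_ranges` and the file names in the usage note are hypothetical. It reads both files in chunks and reports each run of differing bytes as an (offset, length) pair.

```python
CHUNK = 1 << 20  # read 1 MiB at a time so multi-GB files never sit in memory


def diff_ranges(path_a, path_b, chunk=CHUNK):
    """Return a list of (offset, length) runs where the two files differ.

    Only the common prefix (up to the shorter file's length) is compared.
    """
    ranges = []
    start = None  # offset where the current differing run began, if any
    offset = 0    # absolute offset of the chunk being compared
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk)
            b = fb.read(chunk)
            n = min(len(a), len(b))
            if n == 0:
                break
            if a[:n] == b[:n]:
                # Fast path: an identical chunk closes any run still open.
                if start is not None:
                    ranges.append((start, offset - start))
                    start = None
            else:
                # Slow path: walk the chunk byte by byte.
                for i in range(n):
                    if a[i] != b[i]:
                        if start is None:
                            start = offset + i
                    elif start is not None:
                        ranges.append((start, offset + i - start))
                        start = None
            offset += n
    if start is not None:
        # A run that reaches the end of the compared region.
        ranges.append((start, offset - start))
    return ranges
```

Calling `diff_ranges("sent.bin", "received.bin")` on the original and the transferred copy gives the offset and length of every damaged run. Checking each offset against likely buffer sizes (for example `offset % 128` or `offset % 4096`) may hint at which layer is involved. Note that a corrupted byte which happens to equal the original byte splits one damaged region into two runs.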
|
|
2010-04-28 04:48:00 |
Jeremy Foshee |
tags |
|
needs-kernel-logs |
|
2010-04-28 04:48:05 |
Jeremy Foshee |
tags |
needs-kernel-logs |
needs-kernel-logs needs-upstream-testing |
|
2010-04-28 04:48:10 |
Jeremy Foshee |
tags |
needs-kernel-logs needs-upstream-testing |
kj-triage needs-kernel-logs needs-upstream-testing |
|
2010-04-28 04:48:19 |
Jeremy Foshee |
linux (Ubuntu): status |
New |
Incomplete |
|
2010-04-28 22:20:09 |
Bogdan Butnaru |
tags |
kj-triage needs-kernel-logs needs-upstream-testing |
apport-collected kj-triage needs-kernel-logs needs-upstream-testing |
|
2010-04-28 22:20:18 |
Bogdan Butnaru |
description |
|
---
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
Architecture: amd64
AudioDevicesInUse:
Cannot stat file /proc/19634/fd/3: Transport endpoint is not connected
USER PID ACCESS COMMAND
/dev/snd/controlC1: bogdanb 1604 F.... pulseaudio
/dev/snd/controlC0: bogdanb 1604 F.... pulseaudio
/dev/snd/pcmC0D0p: bogdanb 1604 F...m pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
Card hw:0 'Intel'/'HDA Intel at 0xf9ff8000 irq 22'
Mixer name : 'Realtek ALC1200'
Components : 'HDA:10ec0888,104382fe,00100101'
Controls : 40
Simple ctrls : 22
Card1.Amixer.info:
Card hw:1 'Headset'/'Logitech Logitech Wireless Headset at usb-0000:00:1d.0-2, full speed'
Mixer name : 'USB Mixer'
Components : 'USB046d:0a12'
Controls : 4
Simple ctrls : 2
DistroRelease: Ubuntu 10.04
EcryptfsInUse: Yes
Frequency: Once a day.
HibernationDevice: RESUME=/dev/sdb2
IwConfig:
lo no wireless extensions.
eth0 no wireless extensions.
MachineType: System manufacturer P5Q-PRO
NonfreeKernelModules: nvidia
Package: linux (not installed)
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-21-generic root=/dev/sda1 ro nomodeset
ProcEnviron:
LANGUAGE=en_US:en
PATH=(custom, user)
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.32-21.32-generic 2.6.32.11+drm33.2
Regression: Yes
RelatedPackageVersions: linux-firmware 1.34
Reproducible: No
RfKill:
Tags: lucid networking regression-potential needs-upstream-testing
Uname: Linux 2.6.32-21-generic x86_64
UserAsoundrc:
# ALSA library configuration file
# Include settings that are under the control of asoundconf(1).
# (To disable these settings, comment out this line.)
</home/bogdanb/.asoundrc.asoundconf>
UserGroups: adm admin audio cdrom dialout floppy fuse lpadmin netdev plugdev sambashare scanner staff video
WpaSupplicantLog:
dmi.bios.date: 11/04/2008
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 1501
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: P5Q-PRO
dmi.board.vendor: ASUSTeK Computer INC.
dmi.board.version: Rev 1.xx
dmi.chassis.asset.tag: Asset-1234567890
dmi.chassis.type: 3
dmi.chassis.vendor: Chassis Manufacture
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr1501:bd11/04/2008:svnSystemmanufacturer:pnP5Q-PRO:pvrSystemVersion:rvnASUSTeKComputerINC.:rnP5Q-PRO:rvrRev1.xx:cvnChassisManufacture:ct3:cvrChassisVersion:
dmi.product.name: P5Q-PRO
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer
|
|
2010-04-28 22:20:25 |
Bogdan Butnaru |
attachment added |
|
AlsaDevices.txt http://launchpadlibrarian.net/46071511/AlsaDevices.txt |
|
2010-04-28 22:20:31 |
Bogdan Butnaru |
attachment added |
|
AplayDevices.txt http://launchpadlibrarian.net/46071519/AplayDevices.txt |
|
2010-04-28 22:20:36 |
Bogdan Butnaru |
attachment added |
|
ArecordDevices.txt http://launchpadlibrarian.net/46071530/ArecordDevices.txt |
|
2010-04-28 22:20:46 |
Bogdan Butnaru |
attachment added |
|
BootDmesg.txt http://launchpadlibrarian.net/46071536/BootDmesg.txt |
|
2010-04-28 22:20:51 |
Bogdan Butnaru |
attachment added |
|
Card0.Amixer.values.txt http://launchpadlibrarian.net/46071537/Card0.Amixer.values.txt |
|
2010-04-28 22:20:56 |
Bogdan Butnaru |
attachment added |
|
Card0.Codecs.codec.0.txt http://launchpadlibrarian.net/46071541/Card0.Codecs.codec.0.txt |
|
2010-04-28 22:21:04 |
Bogdan Butnaru |
attachment added |
|
Card1.Amixer.values.txt http://launchpadlibrarian.net/46071546/Card1.Amixer.values.txt |
|
2010-04-28 22:21:12 |
Bogdan Butnaru |
attachment added |
|
CurrentDmesg.txt http://launchpadlibrarian.net/46071550/CurrentDmesg.txt |
|
2010-04-28 22:21:21 |
Bogdan Butnaru |
attachment added |
|
Lspci.txt http://launchpadlibrarian.net/46071560/Lspci.txt |
|
2010-04-28 22:21:27 |
Bogdan Butnaru |
attachment added |
|
Lsusb.txt http://launchpadlibrarian.net/46071567/Lsusb.txt |
|
2010-04-28 22:21:32 |
Bogdan Butnaru |
attachment added |
|
PciMultimedia.txt http://launchpadlibrarian.net/46071569/PciMultimedia.txt |
|
2010-04-28 22:21:42 |
Bogdan Butnaru |
attachment added |
|
ProcCpuinfo.txt http://launchpadlibrarian.net/46071573/ProcCpuinfo.txt |
|
2010-04-28 22:21:51 |
Bogdan Butnaru |
attachment added |
|
ProcInterrupts.txt http://launchpadlibrarian.net/46071575/ProcInterrupts.txt |
|
2010-04-28 22:21:57 |
Bogdan Butnaru |
attachment added |
|
ProcModules.txt http://launchpadlibrarian.net/46071579/ProcModules.txt |
|
2010-04-28 22:22:06 |
Bogdan Butnaru |
attachment added |
|
UdevDb.txt http://launchpadlibrarian.net/46071580/UdevDb.txt |
|
2010-04-28 22:22:30 |
Bogdan Butnaru |
attachment added |
|
UdevLog.txt http://launchpadlibrarian.net/46071812/UdevLog.txt |
|
2010-04-28 22:22:35 |
Bogdan Butnaru |
attachment added |
|
UserAsoundrcAsoundconf.txt http://launchpadlibrarian.net/46071815/UserAsoundrcAsoundconf.txt |
|
2010-04-28 22:22:40 |
Bogdan Butnaru |
attachment added |
|
WifiSyslog.txt http://launchpadlibrarian.net/46071864/WifiSyslog.txt |
|
2010-08-12 01:16:59 |
Jeremy Foshee |
tags |
apport-collected kj-triage needs-kernel-logs needs-upstream-testing |
apport-collected kj-expired kj-triage needs-kernel-logs needs-upstream-testing |
|
2010-08-12 01:17:02 |
Jeremy Foshee |
linux (Ubuntu): status |
Incomplete |
Expired |
|
2010-08-20 22:19:46 |
Bogdan Butnaru |
linux (Ubuntu): status |
Expired |
New |
|
2010-12-03 23:43:28 |
Brad Figg |
tags |
apport-collected kj-expired kj-triage needs-kernel-logs needs-upstream-testing |
acpi-table-checksum apport-collected kj-expired kj-triage needs-kernel-logs needs-upstream-testing |
|
2011-05-04 02:33:45 |
Brad Figg |
linux (Ubuntu): status |
New |
Confirmed |
|
2012-03-07 06:52:42 |
kripton |
bug |
|
|
added subscriber kripton |
2012-06-09 12:10:48 |
penalvch |
linux (Ubuntu): status |
Confirmed |
Incomplete |
|
2012-06-09 12:11:25 |
penalvch |
tags |
acpi-table-checksum apport-collected kj-expired kj-triage needs-kernel-logs needs-upstream-testing |
acpi-table-checksum apport-collected kj-expired kj-triage lucid needs-kernel-logs needs-upstream-testing |
|
2012-08-09 04:17:42 |
Launchpad Janitor |
linux (Ubuntu): status |
Incomplete |
Expired |
|