AR8131:webDAV corruption on Lucid: Lightning calendar not available

Bug #651004 reported by Vince O'Farrell
This bug affects 2 people
Affects                      Status  Importance  Assigned to  Milestone
Linux                        New     Undecided   Unassigned
slide-webdavclient (Ubuntu)  New     Undecided   Unassigned

Bug Description

Version info:

vince@uranus:/proc$ lsb_release -rd
Description: Ubuntu 10.04.1 LTS
Release: 10.04

vince@uranus:/proc$ uname -a
Linux uranus 2.6.32-25-generic #44-Ubuntu SMP Fri Sep 17 20:26:08 UTC 2010 i686 GNU/Linux

All apache packages: 2.2.14-5ubuntu8.2

Symptoms are that Lightning shows a warning ! against webDAV calendars and the popup message says:
The calendar xxxxx is momentarily not available
The calendar grid shows no events/actions from the webDAV calendars affected.

The Error Console shows the following message for each webDAV calendar:
Warning: There has been an error reading data for calendar: xxxxx. Error code: CAL_UTF8_DECODING_FAILED. Description: An error occured while decoding an iCalendar (ics) file as UTF-8. Check that the file, including symbols and accented letters, is encoded using the UTF-8 character encoding.

When a webDAV .ics calendar file is downloaded via HTTP (Firefox) and then viewed (with Notepad on an XP client), it appears to contain spurious extra characters at the start of the file. These are not present in the underlying file as viewed locally on the Apache server machine. gedit on a Lucid client baulks at displaying the downloaded file, reporting that it appears to contain non-UTF-8 data.

Example:

Underlying file actually starts:
---------------------------------------
BEGIN:VCALENDAR
PRODID:-//Mozilla.org/NONSGML Mozilla Calendar V1.1//EN
VERSION:2.0
X-WR-CALNAME:home
X-WR-TIMEZONE:Europe/London
BEGIN:VTIMEZONE
TZID:Europe/London
X-LIC-LOCATION:Europe/London
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:19700329T010000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=3
END:DAYLIGHT
<remainder of file snipped>
---------------------------------------

File presented via webDAV starts:
---------------------------------------------
 7‘0lðIæŸZ E
„Fƒ@ @ À¨À¨ P´4€ž»u
™óF€ |ƒ\ 
 ‰
 F}HTTP/1.1 200 OK
Date: Wed, 29 Sep 2010 09:50:02 GMT
Server: Apache/2.2.14 (Ubuntu)
Last-Modified: Wed, 15 Sep 2010 19:15:06 GMT
ETag: "12c56b-771b-4905125b1c280"
Accept-Ranges: bytes
Content-Length: 30491
Keep-Alive: timeout=15, max=99
Connection: Keep-Alive
Content-Type: text/calendar

BEGIN:VCALENDAR
PRODID:-//Mozilla.org/NONSGML Mozilla Calendar V1.1//EN
VERSION:2.0
X-WR-CALNAME:home
X-WR-TIMEZONE:Europe/London
BEGIN:VTIMEZONE
TZID:Europe/London
X-LIC-LOCATION:Europe/London
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:19700329T010000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=3
END:DAYLIGHT
<remainder of file snipped>
---------------------------------------
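
A minimal sketch of how the extra leading data could be measured, assuming the file served over webDAV has been saved locally and is read in binary mode. The function names are hypothetical, and the marker is simply the first line of the underlying file shown above. The count covers everything before the marker, so it would include any stray HTTP header text like that in the example.

```python
MARKER = b"BEGIN:VCALENDAR"  # first line of the underlying .ics file


def junk_prefix_length(data, marker=MARKER):
    """Return how many bytes precede the first marker, or None if it is missing."""
    offset = data.find(marker)
    return None if offset < 0 else offset


def describe_download(data):
    """Summarise whether a downloaded calendar starts with unexpected data."""
    count = junk_prefix_length(data)
    if count is None:
        return "no calendar data found"
    if count == 0:
        return "clean"
    return "%d unexpected bytes before calendar data" % count
```

For a saved download this could be called as describe_download(open("home.ics", "rb").read()), where home.ics is a placeholder file name.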

Now, possibly coincidentally, this problem apparently started immediately after I migrated my Apache server system from an old 600MHz Pentium II system to a new dual core Celeron E1200 system. Migration involved little more than moving the system disk from one box to the other.

I see other bugs reported on Apache2 SSL involving corruption, some of which refer to a gcc bug affecting code generation for SSE4 CPUs, so here is cpuinfo for the Apache2 server CPU:
vince@uranus:/proc$ cat cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Celeron(R) CPU E1200 @ 1.60GHz
stepping : 13
cpu MHz : 1200.000
cache size : 512 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm lahf_lm
bogomips : 3199.76
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Celeron(R) CPU E1200 @ 1.60GHz
stepping : 13
cpu MHz : 1200.000
cache size : 512 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
apicid : 1
initial apicid : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm lahf_lm
bogomips : 3199.99
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
---
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
Architecture: i386
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: vince 1826 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xfdff8000 irq 16'
   Mixer name : 'Realtek ALC883'
   Components : 'HDA:10ec0883,1458c603,00100002'
   Controls : 40
   Simple ctrls : 22
DistroRelease: Ubuntu 10.04
HibernationDevice: RESUME=
IwConfig:
 lo no wireless extensions.

 eth1 no wireless extensions.

 tap0 no wireless extensions.
Lsusb:
 Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Gigabyte Technology Co., Ltd. G31M-ES2L
Package: linux 2.6.32.28.32
PackageArchitecture: i386
ProcCmdLine: root=UUID=cba10512-6319-4ff0-bac9-3f2e8fe7b031 ro quiet splash
ProcEnviron:
 LANGUAGE=en_GB:en
 PATH=(custom, no user)
 LANG=en_GB.utf8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.32-28.55-generic 2.6.32.27+drm33.12
Regression: Yes
RelatedPackageVersions: linux-firmware 1.34.3
Reproducible: Yes
RfKill:

Tags: lucid networking regression-potential needs-upstream-testing
Uname: Linux 2.6.32-28-generic i686
UserGroups: adm admin audio cdrom dialout dip floppy lpadmin plugdev sambashare scanner video
dmi.bios.date: 02/24/2010
dmi.bios.vendor: Award Software International, Inc.
dmi.bios.version: FG
dmi.board.name: G31M-ES2L
dmi.board.vendor: Gigabyte Technology Co., Ltd.
dmi.board.version: x.x
dmi.chassis.type: 3
dmi.chassis.vendor: Gigabyte Technology Co., Ltd.
dmi.modalias: dmi:bvnAwardSoftwareInternational,Inc.:bvrFG:bd02/24/2010:svnGigabyteTechnologyCo.,Ltd.:pnG31M-ES2L:pvr:rvnGigabyteTechnologyCo.,Ltd.:rnG31M-ES2L:rvrx.x:cvnGigabyteTechnologyCo.,Ltd.:ct3:cvr:
dmi.product.name: G31M-ES2L
dmi.sys.vendor: Gigabyte Technology Co., Ltd.

Vince O'Farrell (vof) wrote :

Well, I got up this morning to find that my webDAV calendars in Lightning were working again albeit locked as read-only! Unchecking the read-only attribute for each one got me back working again.

There were a few updates - mainly avahi (0.6.25-1ubuntu6.1) packages with a new version of the libmikmod2 sound package - on the server yesterday but none that seemed immediately relevant to the problem so I just applied them without much thought. The server has not been rebooted since the update but the XP and Lucid Thunderbird/Lightning clients have.

The HTTP view of the .ics files confirms that the corruption of the start of the file is not happening now.

Can anyone suggest the connection?

Vince O'Farrell (vof) wrote :

For completeness, this is the list of packages that were updated yesterday:

libavahi-common-data (0.6.25-1ubuntu6.1)
libavahi-common3 (0.6.25-1ubuntu6.1)
libavahi-core6 (0.6.25-1ubuntu6.1)
avahi-daemon (0.6.25-1ubuntu6.1)
libavahi-client3 (0.6.25-1ubuntu6.1)
libavahi-compat-libdnssd1 (0.6.25-1ubuntu6.1)
libavahi-glib1 (0.6.25-1ubuntu6.1)
libavahi-qt3-1 (0.6.25-1ubuntu6.1)
libmikmod2 (3.1.11-a-6.1ubuntu0.1)

and this is the apt log for the update-manager run:

vince@uranus:/var/log/apt$ cat history.log

Start-Date: 2010-10-01 13:07:14
Upgrade: libavahi-compat-libdnssd1 (0.6.25-1ubuntu6, 0.6.25-1ubuntu6.1), avahi-daemon (0.6.25-1ubuntu6, 0.6.25-1ubuntu6.1), libavahi-common-data (0.6.25-1ubuntu6, 0.6.25-1ubuntu6.1), libavahi-qt3-1 (0.6.25-1ubuntu6, 0.6.25-1ubuntu6.1), libmikmod2 (3.1.11-a-6.1, 3.1.11-a-6.1ubuntu0.1), libavahi-client3 (0.6.25-1ubuntu6, 0.6.25-1ubuntu6.1), libavahi-glib1 (0.6.25-1ubuntu6, 0.6.25-1ubuntu6.1), libavahi-common3 (0.6.25-1ubuntu6, 0.6.25-1ubuntu6.1), libavahi-core6 (0.6.25-1ubuntu6, 0.6.25-1ubuntu6.1)
End-Date: 2010-10-01 13:07:22

Fabio Marconi (fabiomarconi) wrote :

Hello
Is there anything new, or can I close the report?
Thanks
Fabio

Changed in ubuntu:
status: New → Incomplete

Vince O'Farrell (vof) wrote : Re: [Bug 651004] Re: webDAV corruption on Lucid: Lightning calendar not available

TB/Lightning still working fine so I think you can close the report.

Thank you.

Vince

Fabio Marconi (fabiomarconi) wrote : Re: webDAV corruption on Lucid: Lightning calendar not available

Hello Vince
Thanks for the reply
This bug report is being closed due to your last comment regarding this being fixed with an update. For future reference you can manage the status of your own bugs by clicking on the current status in the yellow line and then choosing a new status in the revealed drop down box. You can learn more about bug statuses at https://wiki.ubuntu.com/Bugs/Status. Thank you again for taking the time to report this bug and helping to make Ubuntu better. Please submit any future bugs you may find.

Changed in ubuntu:
status: Incomplete → Invalid

Vince O'Farrell (vof) wrote :

Sorry Fabio, I spoke too soon. Problem has reappeared this morning.

I cannot see that anything has changed to cause the reappearance. The server has not been rebooted and no updates have been applied.

I have investigated further and discovered:

- the corruption only affects clients running on *other* machines on the network. When I use a browser on the server to view or download the files in localhost, all looks OK.

- when I use the files in clients from across my network, the files appear to have 300 or 301 bytes of other data added to the beginning of the file. This other data looks very similar each time to the data I originally posted.

- I have rebooted the server and restarted all the clients but the problem remains.

Changed in ubuntu:
status: Invalid → Incomplete
affects: ubuntu → slide-webdavclient (Ubuntu)
Changed in slide-webdavclient (Ubuntu):
status: Incomplete → New

Vince O'Farrell (vof) wrote :

Calendars working again this morning though set read-only again!! Weird. Nothing apparently changed overnight except client machines powered down then up again. This looks as if it may be difficult to pin down...

Now looking more closely at networking and NICs. When I changed motherboard in server, NIC changed from discrete 3Com 3C905 to integrated Atheros AR8131. On first boot after change, network was not brought up. New NIC was eth1 while old was eth0 so assuming the interface name had changed because the NIC had, I added a stanza in /etc/network/interfaces for eth1 using same details as for eth0 including same static IP address.

This approach does not seem to have caused problems since in general networking is fine.

However, looking at logs this morning, noticed two oddities:
- dmesg does not seem to have been generated since 18 July.
- in syslog at boot, there is line:
Oct 4 12:31:54 uranus kernel: [ 4.464417] udev: renamed network interface eth0 to eth1

I don't know enough to judge whether these are significant.

Vince O'Farrell (vof) wrote :

...and not working today, the day after.

More clues? Looks like it is caused by some variable condition on server since when one client (XP or Lucid) has problem, all have problem.

And since yesterday, only obvious changes have been clients restarted and daily/overnight cron jobs on server.

Vince O'Farrell (vof) wrote :

...and working again this morning after not working for 5 consecutive days. Server has been up 11 days and last time I logged in to it and did *anything* was 02:00 on Sunday 10 October.

Vince O'Farrell (vof) wrote :

Sorry, I cannot subtract this morning! Should have said "...after not working for 8 consecutive days."

Fabio Marconi (fabiomarconi) wrote :

Hello Vince
To work on this bug we need the Apache/WebDAV logs.
Can you please attach them here?
Thanks
Fabio

Vince O'Farrell (vof) wrote :

Here is the latest Apache2 access.log. Let me know if/how the log level needs raising.

Vince O'Farrell (vof) wrote :

...and the latest Apache2 error.log. Are there any other logs you want to see?

Vince O'Farrell (vof) wrote :

...and as usual, the calendars stopped working after a day, that is early on 2010-10-16. I will only report further changes in state if they are unusual.

Vince O'Farrell (vof) wrote :

Well, as I suspected all along, I think the root of the bug is in the Atheros AR8131 network chip or its driver.

The server was working fine when the network card was an old 3Com 3C905C but after moving the system to a Gigabyte GA-G31M-ES2L mobo with integrated Atheros AR8131, the problem started. There were very occasional and short periods when the webDAV calendars worked but for 99% of the time they did not when the server was running on this mobo.

About 2 weeks ago, I moved the system disk again without any explicit changes, but this time to a low power Intel Atom-based D510MO mini-ITX board based system with an integrated Realtek 8111DL and the problem immediately disappeared.

Fabio: I still have the problem system if any further evidence is required but I think webDAV has been exonerated and if the bug is to be kept open, the affected component should be changed to something like the AR8131 driver (or the underlying silicon ;>)

Fabio Marconi (fabiomarconi) wrote :

Hello Vince.
If possible, please run the following on the affected system:
apport-collect -p linux 651004
Thanks
Fabio

summary: - webDAV corruption on Lucid: Lightning calendar not available
+ AR8131:webDAV corruption on Lucid: Lightning calendar not available

Vince O'Farrell (vof) wrote :

apport information, attached as separate files:
AlsaDevices.txt
AplayDevices.txt
ArecordDevices.txt
BootDmesg.txt
Card0.Amixer.values.txt
Card0.Codecs.codec.2.txt
CurrentDmesg.txt
Dependencies.txt
Lspci.txt
PciMultimedia.txt
ProcCpuinfo.txt
ProcInterrupts.txt
ProcModules.txt
UdevDb.txt
UdevLog.txt
WifiSyslog.txt

tags: added: apport-collected
description: updated

Artur R. Czechowski (arturcz) wrote :

Hi,
Just a note: I spotted a similar problem using plain Apache on current Debian stable (squeeze); perhaps you will find this information useful. The problem appears for files longer than 2600 bytes.

Additionally, the problem did not appear with other means of transferring data (I tried transferring the file using netcat, successfully).

Also, I tried to change the offload settings using ethtool, but the result is:
Cannot set device tcp segmentation offload settings: Operation not supported.

I suspect the error comes from a strange combination of the AR8131 NIC and the way Apache uses the TCP stack (IIRC writev instead of write).

The box I spotted this problem on is a multihomed box. The problem appears only on Atheros interface.

Regards
    Artur

Artur R. Czechowski (arturcz) wrote :

Hi,
I finally nailed down the problem. The culprit is the sendfile(2) function in combination with the Atheros NIC or its driver module. The minimal code to reproduce the bug is available at http://pastebin.com/r6jjQ7JL.

Please create a file 3 kB long, compile the provided code, then run ar8131-sendfile port createdfile on the server with the Atheros card.
On a different machine, run: netcat server_address port > localfile.
Then compare the contents of localfile and createdfile; md5sum is sufficient.
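
The same check can be sketched in Python. This is not the pastebin reproducer, only an illustrative stand-in, and the names serve_file_once and md5_of are hypothetical. It assumes Linux, where os.sendfile wraps sendfile(2). Run over loopback it should show identical checksums, since the thread reports that localhost transfers were clean; corruption would only be expected when the server side runs on the affected NIC and the client is on another machine.

```python
import hashlib
import os
import socket


def md5_of(path):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def serve_file_once(listener, path):
    """Accept one connection on a listening socket and send the file with sendfile(2)."""
    conn, _ = listener.accept()
    with conn, open(path, "rb") as f:
        remaining = os.fstat(f.fileno()).st_size
        offset = 0
        while remaining > 0:
            # os.sendfile may send fewer bytes than requested, so loop until done.
            sent = os.sendfile(conn.fileno(), f.fileno(), offset, remaining)
            if sent == 0:
                break
            offset += sent
            remaining -= sent
```

On the server, serve_file_once(socket.create_server(("", port)), "createdfile") would play the role of the compiled program, with netcat on the other machine as in the steps above and md5_of used to compare the two files.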

There is also a bug I just submitted in Debian: http://bugs.debian.org/623059

Regards
    Artur

Artur R. Czechowski (arturcz) wrote :

Hi,
You may be interested to know that there is a patch which should fix the problem. For details, please see the Debian bug mentioned above.

Regards
    Artur
