Slow throughput when communicating with Windows

Bug #546649 reported by Paul Larson
This bug affects 1 person
Affects                   Status        Importance  Assigned to  Milestone
linux-fsl-imx51 (Ubuntu)  Fix Released  High        Bryan Wu
  Karmic                  Fix Released  Undecided   Unassigned
  Lucid                   Won't Fix     Undecided   Unassigned

Bug Description

A problem was recently discovered with the Karmic kernel for imx51: network throughput is slow when downloading from a Windows system. The following tests were performed with the Karmic kernel that should be going into proposed shortly.

Now that we understand the problem better, I'm retesting and getting
some mixed results.

Test 1:
 Server: http://v.imwx.com/v/wxflash/100318teenrescue.flv
 Client: x86 linux
 342 KB/s

 Server: http://v.imwx.com/v/wxflash/100318teenrescue.flv
 Client: imx51 karmic+neon enabled kernel
 339 KB/s

 This is similar enough to call it even, and MUCH better results than
what they were reporting when downloading from that site.

Test 2:
 Server: local windows box/100318teenrescue.flv (http IIS)
 Client: x86 linux
 10.9 MB/s

 Server: local windows box/100318teenrescue.flv (http IIS)
 Client: imx51 karmic+neon enabled kernel
 91.0 KB/s

 This is consistent with the throughput reported by Adobe.

Test 3:
 Server: local windows box/100318teenrescue.flv (FTP Xlight)
 Client: x86 linux
 11242.8 kB/s

 Server: local windows box/100318teenrescue.flv (FTP Xlight)
 Client: imx51 karmic+neon enabled kernel
 65.1 kB/s

Test 4:
 Server: x86 Ubuntu (http Apache2)
 Client: imx51 karmic+neon enabled kernel
 6.35 MB/s

There is no Linux-to-Linux comparison for this test, because the server here
is the same system I was using as a Linux client in the previous tests.

I'm still investigating, but would be very interested if anyone can
reproduce these results.
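
To make the figures from Tests 2 and 3 easier to compare, here is a minimal Python sketch that normalizes the reported rates to KB/s and computes how many times slower the imx51 client is than the x86 client against the same Windows server. It assumes 1 MB = 1024 KB (the report does not say which convention the download tools used), and the names to_kbps and slowdown are illustrative only.

```python
# Rates are copied from Tests 2 and 3 in this report.
# Assumption: 1 MB = 1024 KB; the tools' exact convention is not stated.
UNIT_FACTORS = {"kb/s": 1.0, "mb/s": 1024.0}


def to_kbps(rate: str) -> float:
    """Convert a string such as '10.9 MB/s' to a number of KB/s."""
    value, unit = rate.split()
    return float(value) * UNIT_FACTORS[unit.lower()]


def slowdown(fast: str, slow: str) -> float:
    """Return how many times slower `slow` is compared with `fast`."""
    return to_kbps(fast) / to_kbps(slow)


if __name__ == "__main__":
    results = {
        "Test 2 (HTTP, IIS)": ("10.9 MB/s", "91.0 KB/s"),
        "Test 3 (FTP, Xlight)": ("11242.8 kB/s", "65.1 kB/s"),
    }
    for name, (x86, imx51) in results.items():
        ratio = slowdown(x86, imx51)
        print(f"{name}: imx51 is about {ratio:.0f}x slower than x86")
```

Both ratios come out above 100x, which is far larger than the near parity seen in Test 1 against the remote server.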

Paul Larson (pwlars)
visibility: public → private
Changed in linux-fsl-imx51 (Ubuntu):
importance: Undecided → High
assignee: nobody → Bryan Wu (cooloney)
Tobin Davis (gruemaster)
Changed in linux-fsl-imx51 (Ubuntu):
status: New → Confirmed
Bryan Wu (cooloney) wrote :

I just backported the driver from lucid to karmic, please help me to test on your karmic machine:
http://people.canonical.com/~roc/kernel/karmic_fec/

It will take some time for me to set up Karmic on my board, so please help me test it first.

Thanks,
-Bryan

Paul Larson (pwlars) wrote :

Tried this, and unfortunately the driver does not seem to work at all. I see it loading in dmesg, with no version number and none of the usual messages that follow it. ifconfig -a shows no adapters. Attaching a dmesg output.

Anmar Oueja (anmar) wrote :

Gents: This is becoming critical to the client. They want access to the backported driver. Bryan, I suggest you get it up and running ASAP, then grant them access to the location where they can get the source and compile it. This way both of us can test.

Bryan Wu (cooloney) wrote :

I think I got a working kernel with backported fec.c driver now. The source code is here:
http://kernel.ubuntu.com/git?p=roc/ubuntu-karmic.git;a=shortlog;h=refs/heads/fec
For building the kernel package, we need to add 'skipabi=true skipmodule=true'.

The kernel package is here:
http://people.canonical.com/~roc/kernel/karmic_fec/
Please help me test it against a Windows server. I now get the same behavior as with the Lucid driver: phylib support, link status detection, and good transfer speed.

Changed in linux-fsl-imx51 (Ubuntu):
status: Confirmed → In Progress
Tobin Davis (gruemaster) wrote :

This new kernel works very well on karmic. My test results show an average download speed of 10.2 MB/s vs 59.4 KB/s with the current Karmic kernel when downloading from the same Windows server on my test network.

Paul Larson (pwlars) wrote :

I tried this kernel as well, and it is working much better. I'm seeing the same throughput communicating with a local Windows box that I get with other systems on my network.

Paul Larson (pwlars)
visibility: private → public
Martin Pitt (pitti) wrote : Please test proposed package

Accepted linux-fsl-imx51 into karmic-proposed, the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

Changed in linux-fsl-imx51 (Ubuntu Karmic):
status: New → Fix Committed
tags: added: verification-needed
Anmar Oueja (anmar) wrote :

Client faced some issues. We need more testing:

-- snip --
One thing we're seeing is occasional "FEC: MDIO read timeout" console messages with this fix anywhere from 1-5 minute intervals of constant use (website testing). Is this related to the fix? It's hard to tell if the Ethernet setup itself is the culprit or the driver:

Snippet of syslog:

Mar 31 09:45:02 freescale NetworkManager: <info> (eth0): carrier now OFF (device state 8, deferring action for 4 seconds)
Mar 31 09:45:03 freescale NetworkManager: <info> (eth0): carrier now ON (device state 8)
Mar 31 09:45:11 freescale kernel: FEC: MDIO read timeout
Mar 31 09:45:12 freescale NetworkManager: <info> (eth0): carrier now OFF (device state 8, deferring action for 4 seconds)
Mar 31 09:45:13 freescale NetworkManager: <info> (eth0): carrier now ON (device state 8)
Mar 31 09:47:25 freescale kernel: FEC: MDIO read timeout
Mar 31 09:47:26 freescale NetworkManager: <info> (eth0): carrier now OFF (device state 8, deferring action for 4 seconds)
Mar 31 09:47:27 freescale NetworkManager: <info> (eth0): carrier now ON (device state 8)

-- snip --
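
To check whether the "FEC: MDIO read timeout" messages line up with the carrier flaps in a longer log, a small Python sketch like the following could help. It counts the timeouts and reports how many are followed by a "carrier now OFF" event within a short window. The function names, the 2-second window, and the assumed year are all illustrative choices, and a match only shows the events are close in time, not that one causes the other.

```python
import re
from datetime import datetime, timedelta

# Sample lines copied from the syslog snippet in this report.
SAMPLE = """\
Mar 31 09:45:02 freescale NetworkManager: <info> (eth0): carrier now OFF (device state 8, deferring action for 4 seconds)
Mar 31 09:45:03 freescale NetworkManager: <info> (eth0): carrier now ON (device state 8)
Mar 31 09:45:11 freescale kernel: FEC: MDIO read timeout
Mar 31 09:45:12 freescale NetworkManager: <info> (eth0): carrier now OFF (device state 8, deferring action for 4 seconds)
Mar 31 09:45:13 freescale NetworkManager: <info> (eth0): carrier now ON (device state 8)
Mar 31 09:47:25 freescale kernel: FEC: MDIO read timeout
Mar 31 09:47:26 freescale NetworkManager: <info> (eth0): carrier now OFF (device state 8, deferring action for 4 seconds)
Mar 31 09:47:27 freescale NetworkManager: <info> (eth0): carrier now ON (device state 8)
"""

LINE_RE = re.compile(r"^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ (.*)$")


def parse(lines, year=2010):
    """Yield (timestamp, message); syslog lines carry no year, so one is assumed."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            ts = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            yield ts, m.group(2)


def timeouts_followed_by_carrier_loss(lines, window_s=2):
    """Return (timeout count, timeouts followed by carrier OFF within window_s)."""
    events = list(parse(lines))
    timeouts = [ts for ts, msg in events if "FEC: MDIO read timeout" in msg]
    drops = [ts for ts, msg in events if "carrier now OFF" in msg]
    window = timedelta(seconds=window_s)
    followed = [t for t in timeouts if any(t <= d <= t + window for d in drops)]
    return len(timeouts), len(followed)


if __name__ == "__main__":
    total, followed = timeouts_followed_by_carrier_loss(SAMPLE.splitlines())
    print(f"{followed} of {total} MDIO timeouts were followed by carrier OFF")
```

On a real system the same function could be fed the lines of the full syslog file instead of the sample.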

Bryan Wu (cooloney) wrote :

Anmar,

Actually, that is not an error; it is a warning message from our new fec driver. I lowered the printk level of this message from ERROR to WARNING, so normal users won't see it anymore. This warning does not have any impact on our Ethernet performance.

Please help to test the new kernel here:
http://people.canonical.com/~roc/kernel/phy_speed/

Thanks,

Tobin Davis (gruemaster) wrote :

Still seeing the error messages in syslog. Also, while downloading an iso file from my internal test server, I noticed that the download speed would jump between 2M/s and 8M/s. When it would go from 8 down to two, there would be a pause. I am also seeing an excessive number of overruns in ifconfig. For reference, I do not see this on other platforms.
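
For anyone who wants to track the overrun count without reading ifconfig output by hand, here is a minimal Python sketch that pulls the receive counters for one interface out of /proc/net/dev text. It assumes that the "overruns" figure shown by ifconfig corresponds to the receive "fifo" column, which is my understanding of net-tools rather than something stated in this report. The sample text uses synthetic numbers, not measurements from this bug, and rx_counters is a hypothetical helper name.

```python
# Synthetic /proc/net/dev content for illustration only; the numbers are made up.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0:  500000    4000    7    0    7     0          0         0   200000    3000    0    0    0     0       0          0
"""


def rx_counters(proc_net_dev_text, iface):
    """Return receive packets/errs/drop/fifo for `iface` from /proc/net/dev text."""
    for line in proc_net_dev_text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name.strip() == iface:
            fields = rest.split()
            # Receive columns: bytes packets errs drop fifo frame compressed multicast
            return {
                "packets": int(fields[1]),
                "errs": int(fields[2]),
                "drop": int(fields[3]),
                "fifo": int(fields[4]),  # assumed to match ifconfig "overruns"
            }
    raise KeyError(iface)


if __name__ == "__main__":
    # On the target board, the text would come from open("/proc/net/dev").read().
    print(rx_counters(SAMPLE, "eth0"))
```

Sampling these counters before and after a large download gives the number of overruns per transfer, which makes results from different boards easier to compare.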

Paul Larson (pwlars) wrote :

Updating tag based on Tobin's results - seems there may be an impact from those messages after all.

tags: added: verification-failed
removed: verification-needed
Bryan Wu (cooloney) wrote :

I also see this speed drop on Lucid and Karmic. But after transferring an image of about 600 MB to the board and back from it, I could not reproduce the overruns; the count is 0.

I will focus on this speed drop these days.

-Bryan

Bryan Wu (cooloney) wrote :

Paul and Tobin,

Could you please help me test the following packages? There are 2 packages for Lucid and 2 for Karmic.
http://people.canonical.com/~roc/kernel/fec_speed/

Thanks a lot
-Bryan

Tobin Davis (gruemaster) wrote :

Lucid:
Speed improvement clearly visible now. With this test kernel, I averaged 8.32 MB/s vs 3-4 MB/s with the current Lucid kernel. Also, the errors/warnings in the syslog have disappeared. ifconfig doesn't report any errors after 527,474 packets received, where the previous kernel would report ~160 errors/overruns for the same amount of data transferred.
I would say that this kernel is a good improvement. Now to get suspend/resume working (Bug #537083).

Karmic:
Same improvements seen here as in Lucid test kernel. Much better throughput. No more messages in syslog. No errors/overruns in ifconfig. Would be nice if we could get the video refresh issue fixed on this kernel (Bug #460419 & Bug #358956 - I think these may be related).

Overall, I would say these improvements are much better than the previous test kernels (fec_phy_speed1).

tags: added: verification-passed
removed: verification-failed
Paul Larson (pwlars) wrote :

Confirmed here as well: I'm seeing consistent throughput with the updated karmic kernel, the neon flag still exists, and there are no errors in dmesg or syslog. I never saw the overruns that Tobin reported, and I still don't see them now.

Michael Casadevall (mcasadevall) wrote :

Speed seems extremely good with test kernel to the point that NFS root gets acceptable performance (I've tried it before and never had usable performance out of it).

I ran bonnie++ as a stress test over NFS, no errors in dmesg that I can see:

mcasadevall@dusk:~$ bonnie++
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
dusk             1G   215  97  7584   5  5137   8   574 100 11709   8 564.4  63
Latency             40163us    8138ms    7328ms   41249us     310ms     153ms
Version  1.96       ------Sequential Create------ --------Random Create--------
dusk                -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   293  18  1469  33   492  37   270  19  2211  10   397  30
Latency               407ms     109ms   90387us   80360us    2311us     255ms
1.96,1.96,dusk,1,1270685767,1G,,215,97,7584,5,5137,8,574,100,11709,8,564.4,63,16,,,,,293,18,1469,33,492,37,270,19,2211,10,397,30,40163us,8138ms,7328ms,41249us,310ms,153ms,407ms,109ms,90387us,80360us,2311us,255ms
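
The CSV line at the end of the bonnie++ output is easier to post-process than the table. As a rough sketch, the Python below picks the throughput columns out of that line and converts the block rates to MB/s. The field positions were matched by hand against the table values in this comment, the field names are my own labels rather than official bonnie++ names, and 1 MB = 1024 KB is assumed.

```python
import csv

# CSV summary line copied from the bonnie++ output in this report.
CSV_LINE = (
    "1.96,1.96,dusk,1,1270685767,1G,,215,97,7584,5,5137,8,574,100,11709,8,"
    "564.4,63,16,,,,,293,18,1469,33,492,37,270,19,2211,10,397,30,"
    "40163us,8138ms,7328ms,41249us,310ms,153ms,407ms,109ms,90387us,"
    "80360us,2311us,255ms"
)

# Column index -> illustrative label (positions matched against the table above).
FIELDS = {
    7: "char_write_kps",
    9: "block_write_kps",
    11: "rewrite_kps",
    13: "char_read_kps",
    15: "block_read_kps",
    17: "seeks_per_s",
}


def throughput(csv_line):
    """Return the sequential I/O and seek figures from one bonnie++ CSV line."""
    row = next(csv.reader([csv_line]))
    return {label: float(row[index]) for index, label in FIELDS.items()}


if __name__ == "__main__":
    stats = throughput(CSV_LINE)
    for key in ("block_write_kps", "block_read_kps"):
        print(f"{key}: {stats[key] / 1024:.1f} MB/s")
```

Comparing the block read and write rates from runs on different kernels gives a quick view of how NFS root performance changes with each fec driver revision.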

Bryan Wu (cooloney) wrote :

Great, guys.

Tobin and Paul, could you confirm that suspend/resume works with the new Lucid kernel? It works on my side.

Thanks,
-Bryan

Martin Pitt (pitti)
tags: added: verification-done
removed: verification-passed
Paul Larson (pwlars) wrote :

No, suspend/resume is definitely still broken for me, on both karmic and lucid

Tobin Davis (gruemaster) wrote :

Suspend/resume fails here as well (as I noted in my test results above). Not sure how critical it is for karmic, but it should be fixed for lucid prior to release, if possible.

Andy Whitcroft (apw)
Changed in linux-fsl-imx51 (Ubuntu):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-fsl-imx51 - 2.6.31-607.13

---------------
linux-fsl-imx51 (2.6.31-607.13) lucid; urgency=low

  [ Bryan Wu ]

  * SAUCE: (upstream) netdev/fec: fix phy_speed caculating
    - LP: #546649, #457878
  * SAUCE: (upstream) netdev/fec: fix performance impact from mdio poll
    operation
    - LP: #546649, #457878
 -- Andy Whitcroft <email address hidden> Wed, 14 Apr 2010 12:08:46 +0100

Changed in linux-fsl-imx51 (Ubuntu):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-fsl-imx51 - 2.6.31-111.27

---------------
linux-fsl-imx51 (2.6.31-111.27) karmic-proposed; urgency=low

  [ Bryan Wu ]

  * [Config] built in SMSC_PHY driver for fsl-imx51
    - LP: #546649
  * SAUCE: fsl-imx51: sync karmic fec.c driver with lucid fec.c
    - LP: #546649
  * [Config]: turn off CONFIG_FIXED_PHY for fsl-imx51
    - LP: #546649

linux-fsl-imx51 (2.6.31-110.26) karmic-proposed; urgency=low

  [ Stefan Bader ]

  * Rebased to 2.6.31-21.59
  * [Config] Enable CONFIG_NEON
    - LP: #507416

  [ Ubuntu: 2.6.31-21.59 ]

  * [Config] generic-pae switch to M586TSC
    - LP: #519448
  * (pre-stable) drm/i915: Increase fb alignment to 64k
    - LP: #404064
  * Input: i8042 - bypass AUX IRQ delivery test on laptops
    - LP: #534448
  * SAUCE: Fix volume hotkeys for Dell Studio 1557
    - LP: #465250
  * SAUCE: aufs: Fix header files inclusion in debug.h
    - LP: #517151
  * [Config] Enable all CGROUP configuration options
    - LP: #480739
  * Revert "[Upstream] acerhdf: Limit modalias matching to supported
    boards"
    - LP: #509730
  * [Config] ext3 defaults to ordered mode
    - LP: #510067
  * [Config] Fix sub-flavours package conflicts
    - LP: #454827
  * PCI/cardbus: Add a fixup hook and fix powerpc
    - LP: #455723
  * fnctl: f_modown should call write_lock_irqsave/restore
    - LP: #519436
  * ACPI: enable C2 and Turbo-mode on Nehalem notebooks on A/C
    - LP: #516325
  * tg3: Add 57788, remove 57720
    - LP: #515390
  * HID: ignore all recent SoundGraph iMON devices
    - LP: #488443
  * Input: ALPS - add interleaved protocol support (Dell E6x00 series)
    - LP: #296610
  * acerhdf: limit modalias matching to supported
    - LP: #509730
  * ASoC: Do not write to invalid registers on the wm9712.
    - LP: #509730
  * cifs: NULL out tcon, pSesInfo, and srvTcp pointers when chasing DFS
    referrals
    - LP: #509730
  * clockevents: Prevent clockevent_devices list corruption on cpu hotplug
    - LP: #509730
  * dma: at_hdmac: correct incompatible type for argument 1 of
    'spin_lock_bh'
    - LP: #509730
  * drivers/net/usb: Correct code taking the size of a pointer
    - LP: #509730
  * Libertas: fix buffer overflow in lbs_get_essid()
    - LP: #509730
  * md: Fix unfortunate interaction with evms
    - LP: #509730
  * pata_cmd64x: fix overclocking of UDMA0-2 modes
    - LP: #509730
  * pata_hpt3x2n: fix clock turnaround
    - LP: #509730
  * SCSI: fc class: fix fc_transport_init error handling
    - LP: #509730
  * sound: sgio2audio/pdaudiocf/usb-audio: initialize PCM buffer
    - LP: #509730
  * USB: emi62: fix crash when trying to load EMI 6|2 firmware
    - LP: #509730
  * USB: Fix a bug on appledisplay.c regarding signedness
    - LP: #509730
  * USB: musb: gadget_ep0: avoid SetupEnd interrupt
    - LP: #509730
  * USB: option: support hi speed for modem Haier CE100
    - LP: #490068, #509730
  * x86, cpuid: Add "volatile" to asm in native_cpuid()
    - LP: #509730
  * e100: Use pci pool to work around GFP_ATOMIC order 5 memory allocation
    failure
    - LP: #509730
  * e100: Fix broken cbs accounting due to missing memset.
    - LP: #509730
  * hostap: Revert a toxic part...


Changed in linux-fsl-imx51 (Ubuntu Karmic):
status: Fix Committed → Fix Released
Bryan Wu (cooloney) wrote :

We got a correct fix from upstream, so I reverted this patch and applied the upstream version here:
http://kernel.ubuntu.com/git?p=roc/ubuntu-lucid.git;a=shortlog;h=refs/heads/fec

And the kernel package here:
http://people.canonical.com/~roc/kernel/fec0/

Tobin, could you please help me to test it again?

-Bryan

Tobin Davis (gruemaster) wrote :

Tested on Lucid. Seems ok. Only slight performance drop, but mostly negligible. Wget log files attached.

Jonathan Riddell (jr) wrote :

Waiting in lucid-proposed unapproved queue

needs bug task opened for lucid

needs approval from ubuntu-sru

Rolf Leggewie (r0lf) wrote :

lucid has seen the end of its life and is no longer receiving any updates. Marking the lucid task for this ticket as "Won't Fix".

Changed in linux-fsl-imx51 (Ubuntu Lucid):
status: New → Won't Fix