HTTP downloads get cut short in Foundation model

Bug #1196907 reported by Andrew McDermott
This bug affects 2 people
Affects: Linaro OpenEmbedded
Status: Confirmed
Importance: High
Assigned to: Riku Voipio
Milestone: none

Bug Description

I have a test file on a host that I copy using wget, but the resulting file size (and md5sum) differ from those of the file on the remote host.

On the remote host I have:

$ ls -l ~/mauve/mauve.zip
-rw-rw-r-- 1 aim aim 16242122 Jul 2 09:36 /home/aim/mauve/mauve.zip

Using wget on the fastmodel [1] I get:

# wget
BusyBox v1.20.2 (2013-06-24 12:58:02 BST) multi-call binary.

Usage: wget [-c|--continue] [-s|--spider] [-q|--quiet] [-O|--output-document FILE]
        [--header 'header: value'] [-Y|--proxy on/off] [-P DIR]
        [--no-check-certificate] [-U|--user-agent AGENT] [-T SEC] URL...

# wget http://spicy:8080/mauve/mauve.zip
# ls -l wget/
total 15895
-rw-r--r-- 1 root root 16211652 Jul 2 10:34 mauve.zip

Yet if I use scp I get the correct size:

# scp aim@spicy:mauve/mauve.zip .
# ls -l scp
total 15925
-rw-r--r-- 1 root root 16242122 Jul 2 10:37 mauve.zip

If I change my image to use the non-busybox version of wget (1.14), I get the correct size.

This was found using the leg-java image (as of Tuesday Jul 2nd 2013) and that build configuration is based off the lamp image.

[1] Also reproducible with the following build components:

linaro-image-leg-java-genericarmv8-20130630-366.rootfs.tar.gz
hwpack_linaro-vexpress64-rtsm_20130701-385_arm64_supported.tar.gz

Fathi Boudra (fboudra) wrote :

It could be related to (or a duplicate of) the busybox test failing in LAVA. See lp:#1196624

Changed in linaro-oe:
milestone: none → 13.07
Fathi Boudra (fboudra) wrote :

Andy, busybox has been updated to 1.21.1. Can you still reproduce the issue?

Changed in linaro-oe:
assignee: nobody → Andrew McDermott (frobware)
status: New → Incomplete
importance: Undecided → Medium
Andrew McDermott (frobware) wrote :

Yes.

root@genericarmv8:~# wget
BusyBox v1.21.1 (2013-07-17 13:47:15 BST) multi-call binary.

Usage: wget [-c|--continue] [-s|--spider] [-q|--quiet] [-O|--output-document FILE]
        [--header 'header: value'] [-Y|--proxy on/off] [-P DIR]
        [-U|--user-agent AGENT] [-T SEC] URL...

On my host:

$ ls -l *.gz |grep mauve-tar.gz
-rw-rw-r-- 1 aim aim 16604318 Jul 16 08:31 mauve-tar.gz

=== Running wget on the model ===

root@genericarmv8:~# wget -q http://spicy:8080/mauve-tar.gz
root@genericarmv8:~# ls -l *.gz
-rw-r--r-- 1 root root 16573732 Jul 18 17:07 mauve-tar.gz

=== Switching to curl ===

root@genericarmv8:~# curl --version
curl 7.30.0 (aarch64-oe-linux-gnu) libcurl/7.30.0 GnuTLS/2.12.23 zlib/1.2.8
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp smtp smtps telnet tftp
Features: Largefile NTLM NTLM_WB SSL libz TLS-SRP

root@genericarmv8:~# curl -s -O http://spicy:8080/mauve-tar.gz
root@genericarmv8:~# ls -l *.gz
-rw-r--r-- 1 root root 16604318 Jul 18 17:14 mauve-tar.gz

which matches the original on my host.
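
The size and md5sum comparison done by hand above can also be scripted. The following is only a minimal sketch in Python; the helper names `size_and_md5` and `matches` are hypothetical and not part of any tool mentioned in this report. It reads the file in chunks, so it should also be usable on a small target image, assuming Python is available there.

```python
import hashlib
import os


def size_and_md5(path, chunk_size=1 << 20):
    """Return (size in bytes, hex md5 digest) for the file at path."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # Read in fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return os.path.getsize(path), md5.hexdigest()


def matches(path, expected_size, expected_md5):
    """True if the local file has the size and md5 recorded on the host."""
    size, digest = size_and_md5(path)
    return size == expected_size and digest == expected_md5.lower()
```

For example, `matches("mauve-tar.gz", 16604318, host_md5)` would be called on the target, with `host_md5` being whatever `md5sum` reported on the host.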

Fathi Boudra (fboudra)
Changed in linaro-oe:
status: Incomplete → Confirmed
summary: - busybox wget v1.20.2 fails to copy file(s) properly
+ busybox wget v1.21.1 fails to copy file(s) properly
Fathi Boudra (fboudra)
Changed in linaro-oe:
milestone: 13.07 → 13.08
Changed in linaro-oe:
importance: Medium → High
assignee: Andrew McDermott (frobware) → Riku Voipio (riku-voipio)
summary: - busybox wget v1.21.1 fails to copy file(s) properly
+ HTTP downloads get cut short in Foundation model
Riku Voipio (riku-voipio) wrote :

Looking at where the strace ends:

strace -o test.log wget http://kos.to/porting-list.txt
...
read(3, "able/haskell-hashable_1.1.2.3-1."..., 4096) = 4096
write(4, "able/haskell-hashable_1.1.2.3-1."..., 4096) = 4096
read(3, "haskell-happstack-ixset_6.0.1-2."..., 4096) = 4096
write(4, "haskell-happstack-ixset_6.0.1-2."..., 4096) = 4096
read(3, "iverse/h/haskell-mersenne-random"..., 4096) = 1320
read(3, "", 4096) = 0
write(4, "iverse/h/haskell-mersenne-random"..., 1320) = 1320
read(3, "", 4096) = 0

A read() result of 0 means EOF; an error would return -1 and set errno. Errno is not set, so from busybox wget's point of view that is where the file ends. The bug is therefore not in busybox (apart from it not telling the user that the Content-Length field and the actual download size mismatched).

Using wireshark, we see the entire HTTP request. This leaves us with the following possible culprits:

- virtio networking bug in foundation model
- virtio network driver bug in kernel
- eglibc

The read value of 0 is coming from the kernel, so eglibc is probably not the source (assuming strace is not confused).
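
To illustrate the Content-Length check that busybox wget does not do, here is a rough sketch in Python. It is not busybox code; `copy_checked` and `ShortDownloadError` are hypothetical names. The loop mirrors the strace above: read 4096-byte chunks until read returns nothing, then compare the byte count with what the server announced.

```python
import io


class ShortDownloadError(Exception):
    """Raised when fewer (or more) bytes arrive than Content-Length announced."""


def copy_checked(src, dst, expected_length, chunk_size=4096):
    """Copy src to dst until EOF, then compare the total with expected_length."""
    received = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            # An empty read is EOF as far as the client can tell.
            break
        dst.write(chunk)
        received += len(chunk)
    if received != expected_length:
        raise ShortDownloadError(
            f"expected {expected_length} bytes, received {received}"
        )
    return received


if __name__ == "__main__":
    # Simulate a stream that ends early: 100 bytes announced, 60 delivered.
    try:
        copy_checked(io.BytesIO(b"x" * 60), io.BytesIO(), expected_length=100)
    except ShortDownloadError as err:
        print("download cut short:", err)
```

A client with such a check would at least report the truncation instead of exiting silently, although it would not fix the underlying early EOF.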

Riku Voipio (riku-voipio) wrote :

This is also reproducible using the oldest images we have (using 3.6 kernels):

http://releases.linaro.org/12.10/openembedded/aarch64/rc3

I will file a bug for arm landing team to ponder.

Riku Voipio (riku-voipio) wrote :

This is apparently only reproducible using --network=nat

As a workaround, use --network=bridged (note that you will have to set up bridged networking on your host PC to make it work).

Andrew McDermott (frobware) wrote :

Riku,

I'm pretty sure this is now happening for me using bridged networking.

=== Running the foundation model as:

/opt/arm/Foundation_v8pkg/Foundation_v8 --network-bridge=ARMaim --network bridged --image /scratch/oe-jdk8/jenkins-setup/build/tmp-eglibc/deploy/images/linux-system-foundation-Image--3.10+git0+2f9c206cc5-r0-genericarmv8-20130820153604.axf --block-device /scratch/oe-jdk8/jenkins-setup/build/tmp-eglibc/deploy/images/linaro-image-minimal-genericarmv8-20130820154502.rootfs.ext2

=== And on the device:

# ifconfig eth0

root@genericarmv8:~# ifconfig eth0
eth0 Link encap:Ethernet HWaddr 00:02:F7:EF:00:01
          inet addr:192.168.1.137 Bcast:192.168.1.255 Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:42083 errors:0 dropped:128 overruns:0 frame:0
          TX packets:26605 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:63391633 (60.4 MiB) TX bytes:1911712 (1.8 MiB)
          Interrupt:47 DMA chan:ff

where 192.168.1.x is my host network.

=== Doing the same as before. On my host:

$ ls -l openjdk8-aarch64-snapshot.tar.bz2
-rw-rw-r-- 1 aim aim 60668557 Aug 8 11:20 openjdk8-aarch64-snapshot.tar.bz2
$ md5sum !$
md5sum openjdk8-aarch64-snapshot.tar.bz2
9d17af7bf8b67b9b7af8b0e792c30238 openjdk8-aarch64-snapshot.tar.bz2

=== On the target:

root@genericarmv8:~# rm openjdk8-aarch64-snapshot.tar.bz2
root@genericarmv8:~# wget http://spicy:8080/net/openjdk8-aarch64-snapshot.tar.bz2
Connecting to spicy:8080 (192.168.1.2:8080)
openjdk8-aarch64-sna 100% |*******************************| 59185k 0:00:00 ETA
root@genericarmv8:~# ls -l openjdk8-aarch64-snapshot.tar.bz2
-rw-r--r-- 1 root root 60606359 Aug 20 15:52 openjdk8-aarch64-snapshot.tar.bz2
root@genericarmv8:~# md5sum openjdk8-aarch64-snapshot.tar.bz2
da71ff0f2630cfb2628c1c4dbb29086e openjdk8-aarch64-snapshot.tar.bz2

and indeed the untar generates the expected error:

I discovered this whilst trying to use bridged networking for bootstrapping the jtreg/openjdk tests (which now breaks that plan).
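
For reference, the shortfall in each busybox wget transfer reported so far can be worked out from the `ls -l` sizes quoted in this report. The small Python sketch below only does that arithmetic; the `shortfall` helper is a hypothetical name. The three downloads come up 30470, 30586 and 62198 bytes short, which is under 0.2% of the file in each case.

```python
# (size on the host, size after busybox wget on the model), from the ls -l output above
transfers = {
    "mauve.zip": (16242122, 16211652),
    "mauve-tar.gz": (16604318, 16573732),
    "openjdk8-aarch64-snapshot.tar.bz2": (60668557, 60606359),
}


def shortfall(expected, received):
    """Return (missing bytes, missing share of the file in percent)."""
    missing = expected - received
    return missing, 100.0 * missing / expected


for name, (expected, received) in transfers.items():
    missing, percent = shortfall(expected, received)
    print(f"{name}: {missing} bytes missing ({percent:.2f}%)")
```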

Fathi Boudra (fboudra)
Changed in linaro-oe:
milestone: 13.08 → 13.09
Andrew McDermott (frobware) wrote :

As I was experimenting with the Fast Model today (as opposed to the Foundation model), I thought I'd repeat this test.

I believe the issue happens on the Fast Model too. My invocation used 'user networking':

/home/aim/ARM/RTSM_AEMv8_VE/models/Linux64_GCC-4.1/RTSM_VE_AEMv8A -a /scratch/oe/jenkins-setup/build/tmp-eglibc/deploy/images/linux-system-ve-Image--3.11+git0+c4cafd3547-r0-genericarmv8-20130913101737.axf -C motherboard.mmc.p_mmc_file=/scratch/oe/jenkins-setup/build/tmp-eglibc/deploy/images/linaro-image-minimal-genericarmv8-20130913132809.rootfs.ext2 -C motherboard.smsc_91c111.enabled=1 -C motherboard.hostbridge.userNetworking=1

# On my host

$ md5sum openjdk8-aarch64-port-snapshot.tar.bz2
e349779d6478094c0c20883d483be8bd openjdk8-aarch64-port-snapshot.tar.bz2

# On the target, post a very lengthy wget:

root@genericarmv8:~# md5sum openjdk8-aarch64-port-snapshot.tar.bz2
da8e6a6188144425c1438184f3f47286 openjdk8-aarch64-port-snapshot.tar.bz2

Fathi Boudra (fboudra)
Changed in linaro-oe:
milestone: 13.09 → 13.10
Fathi Boudra (fboudra)
Changed in linaro-oe:
milestone: 13.10 → 13.11
Fathi Boudra (fboudra)
Changed in linaro-oe:
milestone: 13.11 → 13.12
Fathi Boudra (fboudra)
Changed in linaro-oe:
milestone: 13.12 → none