upstart ftbfs in quantal on arm*

Bug #1066351 reported by Matthias Klose on 2012-10-13
This bug affects 1 person
Affects                    Importance   Assigned to
upstart                    Undecided    James Hunt
linux (Ubuntu)             High         Unassigned
linux (Ubuntu Quantal)     High         Unassigned
upstart (Ubuntu)           High         James Hunt
upstart (Ubuntu Quantal)   High         James Hunt

Bug Description

Testing job_process_run()
...with simple command
...with shell command
...with small script
...with small script and trailing newlines
...with script that will fail
...with environment of unnamed instance
...with environment of named instance
...with environment for pre-stop
...with environment for post-stop
...with long script
...with non-daemon job
...with script for daemon job
...with daemon job
...with forking job
...with no such file
...ensure sane fds with no console, no script
...ensure sane fds with no console, and script
...ensure sane fds with console log, no script
...ensure sane fds with console log, and script
...ensure that no log file written for single-line no-output script
...ensure that no log file written for single-line no-output command
...ensure that no log file written for CONSOLE_NONE
...ensure that no log file written for multi-line no-output script
...with single-line script that writes 1 line to stdout
...with single-line script that is killed
BAD: wrong content in file 0x4065d168 (output), expected 'hello world
' got 'hello world'
 at tests/test_job_process.c:1683 (test_run).
/bin/bash: line 5: 17382 Aborted ${dir}$tst
FAIL: test_job_process
[...]
===============================================
1 of 14 tests failed
Please report to <email address hidden>
===============================================
make[4]: *** [check-TESTS] Error 1

Matthias Klose (doko) on 2012-10-13
Changed in upstart (Ubuntu Quantal):
assignee: nobody → James Hunt (jamesodhunt)
Steve Langasek (vorlon) wrote :

Is it possible that this bug is specific to the virtual builders used for the rebuild test? The upstart test suite seems to be very finicky with regard to kernel versions, and has never had problems building on the real buildds.

Matthias Klose (doko) wrote :

this has nothing to do with test rebuilds, it's a regular update to -proposed.

Matthias Klose (doko) wrote :

and even if the ftbfs was for a test rebuild ... all these are done on non-virtual buildds for a reason

Steve Langasek (vorlon) wrote :

ah, gotcha. build failure confirmed locally, anyway.

James Hunt (jamesodhunt) wrote :

Hi Matthias - this is the issue we have seen before where a rebuild will probably work. However, I'll see if I can identify what is going on here...

Matthias Klose (doko) wrote :

package did now build on the buildds. maybe investigate while it's locally reproducible?

Changed in upstart (Ubuntu Quantal):
milestone: ubuntu-12.10 → none
Steve Langasek (vorlon) on 2012-10-16
Changed in upstart (Ubuntu Quantal):
milestone: none → quantal-updates
James Hunt (jamesodhunt) wrote :

I have just recreated this bug on arm using a test program completely unrelated to Upstart and NIH:

loop 2926 of 9999
loop 2927 of 9999
loop 2928 of 9999
loop 2929 of 9999
ERROR: incorrect output: result_string='hello, world
', buffer='hello, world'
result_string:
6865 6c6c 6f2c 2077 6f72 6c64 0d0a
hello, world\r\n
buffer:
6865 6c6c 6f2c 2077 6f72 6c64
hello, world

So, this is looking like a kernel bug to me.
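
For reference, the two dumps above can be compared mechanically. The following is a minimal Python sketch, not part of the test program used in this report; `decode_dump` is a hypothetical helper, and the sketch assumes the dumps list the bytes in stream order.

```python
def decode_dump(dump):
    """Turn a space-separated hex dump such as '6865 6c6c' into bytes."""
    return bytes.fromhex(dump.replace(" ", ""))


# Dumps copied from the failing loop iteration above.
expected = decode_dump("6865 6c6c 6f2c 2077 6f72 6c64 0d0a")  # result_string
received = decode_dump("6865 6c6c 6f2c 2077 6f72 6c64")       # buffer

# Whatever follows the received prefix is what the reader never saw.
missing = expected[len(received):]

print(len(expected), len(received), missing)  # prints: 14 12 b'\r\n'
```

The received buffer is a strict prefix of the expected string, and the only bytes missing are the trailing carriage return and newline.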

James Hunt (jamesodhunt) wrote :

Also now seen intermittently on i386:

loop 174 of 9999
ERROR: incorrect output: result_string='hello, world
', buffer='hello, world'
result_string:
6865 6c6c 6f2c 2077 6f72 6c64 0d0a
hello, world\r\n
buffer:
6865 6c6c 6f2c 2077 6f72 6c64
hello, world

James Hunt (jamesodhunt) wrote :

After compiling (you need to link with -lutil, as shown at the top of the file), either run it with no parameters or pass the number of times it should run the test:

$ ./bug-1066351 9999

James Hunt (jamesodhunt) wrote :

Interestingly, I tried running the test program on FreeBSD, but that fails. I believe the reason is that on BSD, if the process connected to the master end of the pty has unflushed data that the slave end hasn't yet read, and the master process is killed, the kernel discards the unflushed data and read returns EOF.

I've also taken a peek at the Linux tty layer and, before I ran away screaming, I did see that '\n' is special-cased in a number of places. This isn't to say the bug is with the kernel necessarily (no evidence yet!), but it's still a possibility unless I've done something very silly in the test program :-)
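
One visible effect of that '\n' special-casing is the ONLCR output flag: when it is set, a "\n" written to the slave end of a pty arrives at the master end as "\r\n", which is why the 13-character line in the dumps above is 14 bytes long. The following is a minimal Python sketch of that translation, assuming a Linux-style pty. It is not the test program from this report, the helper names `read_exact` and `pty_roundtrip` are hypothetical, and the flag is set explicitly rather than relying on the default.

```python
import os
import pty
import termios


def read_exact(fd, want):
    """Keep calling read() until `want` bytes have arrived or the stream ends."""
    data = b""
    while len(data) < want:
        try:
            chunk = os.read(fd, want - len(data))
        except OSError:
            break  # e.g. EIO on the master once the slave side is gone
        if not chunk:
            break
        data += chunk
    return data


def pty_roundtrip(payload):
    """Write `payload` to the slave end and return what the master end reads."""
    master, slave = pty.openpty()
    try:
        attrs = termios.tcgetattr(slave)
        attrs[1] |= termios.OPOST | termios.ONLCR  # index 1 holds the output flags
        termios.tcsetattr(slave, termios.TCSANOW, attrs)
        os.write(slave, payload)
        # Every "\n" grows by one byte because ONLCR prepends "\r".
        expected_len = len(payload) + payload.count(b"\n")
        return read_exact(master, expected_len)
    finally:
        os.close(slave)
        os.close(master)


if __name__ == "__main__":
    print(pty_roundtrip(b"hello, world\n"))
```

The reader loops on purpose: nothing guarantees that the translated line is handed over in a single read().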

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1066351

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: quantal
Changed in linux (Ubuntu Quantal):
importance: Undecided → High

ApportVersion: 2.6.1-0ubuntu3
Architecture: i386
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: james 3951 F.... pulseaudio
 /dev/snd/controlC0: james 3951 F.... pulseaudio
 /dev/snd/pcmC0D0c: james 3951 F...m pulseaudio
 /dev/snd/pcmC0D0p: james 3951 F...m pulseaudio
DistroRelease: Ubuntu 12.10
HibernationDevice: RESUME=UUID=67e3cd44-242b-4bbf-918b-28fff81e0312
InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Release i386 (20101007)
MachineType: LENOVO 2516CTO
NonfreeKernelModules: nvidia
Package: linux-image-generic 3.5.0.17.19
PackageArchitecture: i386
ProcFB:

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.5.0-17-generic root=UUID=7ad192e9-7b26-49d1-8e1c-fefc7dc495cb ro quiet splash
ProcVersionSignature: Ubuntu 3.5.0-17.28-generic 3.5.5
RelatedPackageVersions:
 linux-restricted-modules-3.5.0-17-generic N/A
 linux-backports-modules-3.5.0-17-generic N/A
 linux-firmware 1.95
Tags: quantal running-unity
Uname: Linux 3.5.0-17-generic i686
UpgradeStatus: Upgraded to quantal on 2012-09-30 (16 days ago)
UserGroups: adm admin cdrom dialout kvm libvirtd lpadmin plugdev sambashare sbuild vboxusers
dmi.bios.date: 08/27/2010
dmi.bios.vendor: LENOVO
dmi.bios.version: 6IET72WW (1.32 )
dmi.board.name: 2516CTO
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr6IET72WW(1.32):bd08/27/2010:svnLENOVO:pn2516CTO:pvrThinkPadT410:rvnLENOVO:rn2516CTO:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 2516CTO
dmi.product.version: ThinkPad T410
dmi.sys.vendor: LENOVO

tags: added: apport-collected running-unity

apport information

James Hunt (jamesodhunt) wrote :

@Brad - logs attached, but note that this problem is not unique to my system: it can be seen on both i386 and ARM systems. I believe it also potentially affects all precise kernels.

James Hunt (jamesodhunt) on 2012-10-17
Changed in linux (Ubuntu Quantal):
status: Incomplete → Confirmed
Andy Whitcroft (apw) wrote :

@James -- ok, I can trigger this error from the test program on amd64 as well. However, after some poking, I think this test case is flawed. I made the read retry if it did not see a 14-byte string, and in every failure I have seen in my test runs the newline does arrive:

    loop 39225 of 99999
    ret = 14
    loop 39226 of 99999
    ret = 12
    ret = 2
    ret = -1

So I suspect that the pty newline mangling is allowing this single write at the dash end to be split into two reads at your consumer.
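
The retry described above can be sketched as follows. This is a minimal Python illustration rather than the modified test program: a pipe stands in for the pty, the names `read_full`, `fragmented_writer` and `demo` are hypothetical, and the writer delivers the line as 12 bytes followed by 2 bytes to mimic the split seen in the trace. The short sleep makes a split read likely but does not guarantee it.

```python
import os
import threading
import time

EXPECTED = b"hello, world\r\n"  # 14 bytes, as in the trace above


def read_full(fd, want):
    """Call read() until `want` bytes have arrived or EOF is reached."""
    data = b""
    sizes = []  # size of each individual read(), like the 'ret = N' lines
    while len(data) < want:
        chunk = os.read(fd, want - len(data))
        if not chunk:  # EOF: the writer closed its end
            break
        sizes.append(len(chunk))
        data += chunk
    return data, sizes


def fragmented_writer(fd):
    """Deliver the line in two pieces: 12 bytes, a pause, then 2 bytes."""
    os.write(fd, EXPECTED[:12])
    time.sleep(0.05)
    os.write(fd, EXPECTED[12:])
    os.close(fd)


def demo():
    read_end, write_end = os.pipe()
    writer = threading.Thread(target=fragmented_writer, args=(write_end,))
    writer.start()
    try:
        return read_full(read_end, len(EXPECTED))
    finally:
        writer.join()
        os.close(read_end)


if __name__ == "__main__":
    print(demo())
```

A consumer that compares the result of a single read() against the expected string fails whenever the data arrives in two pieces, while the loop reassembles the full line however it is fragmented.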

James Hunt (jamesodhunt) on 2012-11-15
Changed in upstart:
status: New → Confirmed
assignee: nobody → James Hunt (jamesodhunt)
status: Confirmed → Fix Released
Steve Langasek (vorlon) on 2013-10-02
Changed in upstart (Ubuntu):
status: Confirmed → Fix Released

This bug was nominated against a series that is no longer supported, ie quantal. The bug task representing the quantal nomination is being closed as Won't Fix.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu Quantal):
status: Confirmed → Won't Fix
Rolf Leggewie (r0lf) wrote :

quantal has seen the end of its life and is no longer receiving any updates. Marking the quantal task for this ticket as "Won't Fix".

Changed in upstart (Ubuntu Quantal):
status: Confirmed → Won't Fix