bzr, amule become stuck in sync_page

Bug #651806 reported by gpk
This bug affects 2 people
Affects: linux (Ubuntu)
Status: Expired
Importance: Undecided
Assigned to: Unassigned

Bug Description

Binary package hint: bzr

I added a binary file (a .pdf) to a bzr working directory,
ran "bzr add foo.pdf",
and then ran "bzr diff".

First of all, bzr didn't detect that it was a binary file and handle it sensibly;
instead, I got lots of binary crud on my terminal.

But more importantly, it then froze, and I couldn't kill the bzr process, even
with kill -9. Here's the bottom of the file and my attempts to kill bzr:
+0000116489 00000 n
+0000116510 00000 n
+0000116531 00000 n
+0000117114 00000 n
+0000117167 00000 n
+0000117312 00000 n
+0000117386 00000 n
+0000117517 00000 n
%%EOFxref9da1f7807f0e7c25f5a8132db545><f6279da1f7807f0e7c25f5a8132db545>]
\ No newline at end of file

^C
^Z
------------in another window-------------

$ ps -u gpk | grep bzr
19758 pts/0 00:00:00 bzr
$ kill 19758
$ ps -u gpk | grep bzr
19758 pts/0 00:00:00 bzr
$ kill -1 19758
$ kill -9 19758
$ !ps
ps -u gpk | grep bzr
19758 pts/0 00:00:00 bzr
$
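
For context: a task in uninterruptible sleep (state D) generally does not act on signals, SIGKILL included, until it wakes up, which would explain why none of the kill attempts above had any visible effect. The Python sketch below escalates through the same signals and reports whether the target actually went away. The helper names is_gone and try_kill are hypothetical, and the code assumes a Linux /proc filesystem (without /proc it treats every process as gone).

```python
import os
import signal
import time


def is_gone(pid):
    """True if pid no longer exists or is a zombie awaiting reaping."""
    try:
        with open("/proc/%d/stat" % pid) as f:
            stat = f.read()
    except OSError:
        # No /proc entry (or it vanished while reading): treat as gone.
        return True
    # The state letter follows the last ')' that closes the command name.
    return stat[stat.rindex(")") + 1:].split()[0] == "Z"


def try_kill(pid, signals=(signal.SIGTERM, signal.SIGHUP, signal.SIGKILL),
             wait=1.0):
    """Send each signal in turn; return True once the process is gone."""
    for sig in signals:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            return True
        deadline = time.time() + wait
        while time.time() < deadline:
            if is_gone(pid):
                return True
            time.sleep(0.05)
    # Still present after every signal, as with the stuck bzr process.
    return False
```

For an ordinary process this returns True after the first signal; a False result after SIGKILL is the situation described in this report.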

ProblemType: Bug
DistroRelease: Ubuntu 10.10
Package: bzr 2.2.0-1
ProcVersionSignature: Ubuntu 2.6.35-22.33-generic 2.6.35.4
Uname: Linux 2.6.35-22-generic x86_64
Architecture: amd64
Date: Thu Sep 30 07:11:36 2010
ExecutablePath: /usr/bin/bzr
InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Beta amd64 (20100901.1)
InterpreterPath: /usr/bin/python2.6
ProcEnviron:
 SHELL=/bin/bash
 LANG=en_GB.utf8
SourcePackage: bzr

Martin Pool (mbp) wrote : Re: [Bug 651806] [NEW] bzr becomes unkillable with -9

Hi,

If you can reproduce this, please run

  ps l BZR_PID

inserting the right process id there. The second word is a lower-case
l. That will tell us where it was in the kernel.

Generally speaking, a process stuck for a long time in the D state would be a
kernel bug, not an application bug.

--
Martin
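
The state and wait channel that "ps l" reports can also be read directly from /proc on Linux. The Python sketch below shows one way to do that; parse_stat and describe are hypothetical helper names, and it assumes /proc/<pid>/stat and /proc/<pid>/wchan are readable on the running kernel (some kernels report only "0" in wchan).

```python
def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so cut at the first '(' and the last ')'.
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left])
    comm = text[left + 1:right]
    state = text[right + 1:].split()[0]
    return pid, comm, state


def describe(pid):
    """Return (comm, state, wchan) for a live process (Linux only)."""
    with open("/proc/%d/stat" % pid) as f:
        _, comm, state = parse_stat(f.read())
    try:
        with open("/proc/%d/wchan" % pid) as f:
            wchan = f.read().strip() or "-"
    except OSError:
        wchan = "?"  # wchan not available or not readable
    return comm, state, wchan
```

Calling describe(BZR_PID) on a stuck process would be expected to show state "D" together with the kernel function it is waiting in.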

Changed in bzr (Ubuntu):
status: New → Incomplete
Alejandro R. Mosteo (mosteo) wrote : Re: bzr becomes unkillable with -9

I'm seeing this with another process (amule). The process ends up stuck at 100% CPU and can't be killed by root with -9. It is sometimes in the R state and sometimes in the D state. This has happened to me with an up-to-date Lucid (kernel 32-25) and with a fresh Maverick install.

Next time this happens to me I'll try the 'ps l' thing.

Martin Pool (mbp) wrote :

If you're hitting this problem on nfs4, it's very likely an Ubuntu kernel bug; please file it there. In fact, pretty much regardless of the environment, if the process is unkillable it's a kernel bug.

gpk (gpk-kochanski) wrote :

Here you go: "ps l", as requested.

kitchen:~$ ps l 7823
F   UID   PID  PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
0  1000  7823  1787  20   0 113056 21760 sync_p D+   pts/0      0:00 /usr/bin/python /usr/bin/bzr diff
kitchen:~$
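
To pick the interesting columns out of output like the above, a small parser is enough. This is a minimal Python sketch, assuming every column in the row is non-empty and that COMMAND is the last column; parse_ps_l is a hypothetical helper name, and SAMPLE is simply the output pasted from this comment.

```python
def parse_ps_l(output):
    """Parse `ps l` output into a list of dicts keyed by column header."""
    lines = [ln for ln in output.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # COMMAND is the last column and may contain spaces, so cap the split.
        fields = line.split(None, len(header) - 1)
        rows.append(dict(zip(header, fields)))
    return rows


SAMPLE = """\
F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
0 1000 7823 1787 20 0 113056 21760 sync_p D+ pts/0 0:00 /usr/bin/python /usr/bin/bzr diff
"""

for row in parse_ps_l(SAMPLE):
    if row["STAT"].startswith("D"):
        print(row["PID"], "is in uninterruptible sleep at", row["WCHAN"])
# prints: 7823 is in uninterruptible sleep at sync_p
```

Note that ps truncates WCHAN to the column width, so "sync_p" here corresponds to the sync_page named in the bug title.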

Changed in bzr (Ubuntu):
status: Incomplete → New
Martin Pool (mbp) wrote :

This is very likely a kernel bug, not a bzr bug, especially if people are getting similar problems with amule.

summary: - bzr becomes unkillable with -9
+ bzr become stuck in sync_page
summary: - bzr become stuck in sync_page
+ bzr, amule become stuck in sync_page
affects: bzr (Ubuntu) → linux (Ubuntu)
Jeremy Foshee (jeremyfoshee) wrote :

Hi gpk,

If you could also please test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

    [This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: needs-upstream-testing
tags: added: kj-triage
Changed in linux (Ubuntu):
status: New → Incomplete
gpk (gpk-kochanski) wrote :

The problem reliably goes away when I boot the server into the 2.6.37-999-generic kernel, and it comes back when I boot into the current Maverick kernel. It is the server's kernel that matters; the client's kernel makes no difference.

description: updated
tags: removed: needs-upstream-testing
Changed in linux (Ubuntu):
status: Incomplete → New
gpk (gpk-kochanski) wrote :

Still present in later Maverick:
Linux desk 2.6.35-29-generic #51-Ubuntu SMP Fri Apr 15 17:12:35 UTC 2011 x86_64 GNU/Linux

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 651806

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired