kernel hangs in xlog_grant_log_space
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| Linux | Confirmed | High | | |
| linux (Ubuntu) | | Medium | Unassigned | |
| Lucid | | Medium | Unassigned | |
| Oneiric | | Medium | Unassigned | |
| Precise | | Medium | Unassigned | |
| Quantal | | Medium | Unassigned | |
Bug Description
We're seeing the following stack traces on different production machines that are running Natty 2.6.38-8-server. The machines need to be rebooted to recover. http://
Apr 10 23:31:14 nv-aw2az1-
| Changed in linux (Ubuntu): | |
| status: | New → Incomplete |
| tags: | added: natty |
Can't run apport-collect on these machines.
| Changed in linux (Ubuntu): | |
| status: | Incomplete → Confirmed |
| Juerg Haefliger (juergh) wrote : | #3 |
On one machine, this affected the /tmp logical volume. Any command/task that touched /tmp hung and never completed.
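Tasks that hang on a filesystem like this normally sit in uninterruptible sleep (state `D`). As a minimal sketch, assuming a Linux `/proc` filesystem, something like the following could list such processes together with the kernel function they are waiting in. The helper names `parse_stat` and `blocked_tasks` are illustrative and not part of the original report, and only process-level entries are scanned, not individual threads.

```python
import os

def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = stat_line.index("(")
    right = stat_line.rindex(")")
    pid = int(stat_line[:left].strip())
    comm = stat_line[left + 1:right]
    state = stat_line[right + 1:].split()[0]
    return pid, comm, state

def blocked_tasks(proc_root="/proc"):
    """List (pid, comm, wchan) for processes in uninterruptible sleep (D)."""
    result = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
            if state != "D":
                continue
            try:
                with open(os.path.join(proc_root, entry, "wchan")) as f:
                    wchan = f.read().strip()
            except OSError:
                wchan = "?"
            result.append((pid, comm, wchan))
        except (OSError, ValueError):
            continue  # process exited while we were reading it
    return result

if __name__ == "__main__":
    for pid, comm, wchan in blocked_tasks():
        print(f"{pid}\t{comm}\t{wchan}")
```

On an affected machine one would expect the hung commands to show up here, possibly with a wait channel related to the XFS log, though the exact symbol depends on the kernel and on whether `wchan` is readable.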
| Changed in linux (Ubuntu): | |
| importance: | Undecided → High |
| Changed in linux (Ubuntu): | |
| status: | Confirmed → In Progress |
| assignee: | nobody → Brad Figg (brad-figg) |
| Brad Figg (brad-figg) wrote : | #4 |
@Juerg,
1. I have backported the indicated commit and test kernels are available at:
http://
Please test the appropriate kernel and add a comment here if it resolved this issue for you or not.
2. Be aware that we are just 6 months away from the end of support for Natty. This patch has been part of Oneiric for some time now. You may want to think about upgrading to Oneiric or possibly even Precise when it releases.
| Changed in linux (Ubuntu): | |
| status: | In Progress → Incomplete |
| Juerg Haefliger (juergh) wrote : | #5 |
Trying to find a reproducer to test the kernel. Thanks.
| Jason Yen (jasonyen) wrote : | #6 |
@Brad,
I think the link to the test kernel should be:
http://
If I am wrong, please feel free to correct me. Thanks.
| Brad Figg (brad-figg) wrote : | #7 |
@Jason,
You are correct. I apologize for the fumble fingers.
| Juerg Haefliger (juergh) wrote : | #8 |
Can I get a copy of the source so that I can check what other patches are in that kernel?
| Brad Figg (brad-figg) wrote : | #9 |
@Juerg,
You can find the git tree at:
git:
This is the Ubuntu 2.6.38-14.58 tree with just this one patch on it.
| Juerg Haefliger (juergh) wrote : | #10 |
I finally managed to create a reproducer for the XFS hang, but the provided kernel does not solve the problem. It hangs within a few seconds, just like the original 2.6.38-8-server kernel. I also tried the following Ubuntu kernels, but they both hang within a few minutes. They do run a little longer than 2.6.38-8 and -14, though.
3.0.0-17-server
3.2.0-23-lowlatency
Next I tried the upstream stable kernels but they also hang within an hour:
3.0.29
3.1.10
3.2.15
3.3.2
A typical stack trace is attached. Once the task is hanging, the directory or partition becomes unusable and only an emergency sync clears things up again. Note that I've also started the discussion on the XFS mailing list: http://
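The emergency sync mentioned above is presumably the magic SysRq `s` command. A minimal sketch, assuming root privileges and a kernel built with magic SysRq support, of how it could be triggered from a script is shown below. The `sysrq` helper and its `trigger_path` parameter are hypothetical names used only for illustration. The `w` command, which dumps blocked tasks to the kernel log, is included because it is a common way to capture stack traces of hung tasks.

```python
def sysrq(command, trigger_path="/proc/sysrq-trigger"):
    """Write a single SysRq command character to the trigger file."""
    if len(command) != 1:
        raise ValueError("SysRq commands are a single character")
    with open(trigger_path, "w") as f:
        f.write(command)

if __name__ == "__main__":
    sysrq("w")  # dump tasks in uninterruptible (blocked) state to the kernel log
    sysrq("s")  # emergency sync of all mounted filesystems
```

Running `w` before `s` keeps the blocked-task dump in the kernel log before the sync changes the state of the system.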
| Juerg Haefliger (juergh) wrote : | #11 |
| Changed in linux (Ubuntu): | |
| status: | Incomplete → In Progress |
| Brad Figg (brad-figg) wrote : | #12 |
@Juerg,
We are currently trying to reproduce the problem on three systems here.
| Brad Figg (brad-figg) wrote : | #13 |
@Juerg,
I should have mentioned that we are attempting to reproduce this using the Precise (3.2) kernel.
| Juerg Haefliger (juergh) wrote : | #14 |
I logged a case #00029027 through the HP landscape account. Do you have access to that? It contains instructions and some scripts that I use to force the hang. Can you try the Natty kernel? That one hangs within a few seconds.
| Brad Figg (brad-figg) wrote : | #15 |
@Juerg,
Yes, I have access and have been using those instructions to try to reproduce.
| Brad Figg (brad-figg) wrote : | #16 |
@Juerg,
Installed Natty and ran the scripts for 1hr 10min without hang.
| Juerg Haefliger (juergh) wrote : | #17 |
I reproduced the issue on 4 different machines with different HW configurations (SE1170/P410, SE2170/P212, SL390/P212, z400 with no RAID controller).
Do you have a machine that I can access to give it a try? Or would it help if I gave you access to one of our machines?
| tags: | added: kernel-da-key |
| Chris J Arges (arges) wrote : | #18 |
Filed an upstream bug against xfs here:
http://
| Changed in linux: | |
| importance: | Unknown → High |
| status: | Unknown → Confirmed |
| tags: | added: lucid oneiric precise quantal |
| tags: | added: exists-upstream |
| Changed in linux (Ubuntu Lucid): | |
| status: | New → In Progress |
| Changed in linux (Ubuntu Natty): | |
| status: | New → In Progress |
| Changed in linux (Ubuntu Oneiric): | |
| status: | New → In Progress |
| Changed in linux (Ubuntu Precise): | |
| status: | New → In Progress |
| importance: | Undecided → High |
| Changed in linux (Ubuntu Natty): | |
| importance: | Undecided → High |
| Changed in linux (Ubuntu Lucid): | |
| importance: | Undecided → High |
| Changed in linux (Ubuntu Oneiric): | |
| importance: | Undecided → High |
| Changed in linux (Ubuntu Precise): | |
| assignee: | nobody → Chris J Arges (christopherarges) |
| Changed in linux (Ubuntu Quantal): | |
| assignee: | Brad Figg (brad-figg) → Chris J Arges (christopherarges) |
| Changed in linux (Ubuntu Lucid): | |
| assignee: | nobody → Chris J Arges (christopherarges) |
| Changed in linux (Ubuntu Natty): | |
| assignee: | nobody → Chris J Arges (christopherarges) |
| Changed in linux (Ubuntu Oneiric): | |
| assignee: | nobody → Chris J Arges (christopherarges) |
| tags: | added: rls-q-incoming |
| Leann Ogasawara (leannogasawara) wrote : | #19 |
Removing the rls-q-incoming tag as this has properly been nominated for Quantal and has an assignee.
| tags: | removed: rls-q-incoming |
| Changed in linux (Ubuntu Quantal): | |
| milestone: | none → ubuntu-12.10 |
| no longer affects: | linux (Ubuntu Natty) |
| Changed in linux (Ubuntu): | |
| importance: | High → Medium |
| Changed in linux (Ubuntu Precise): | |
| importance: | High → Medium |
| Changed in linux (Ubuntu Quantal): | |
| milestone: | ubuntu-12.10 → none |
| importance: | High → Medium |
| Changed in linux (Ubuntu Lucid): | |
| importance: | High → Medium |
| Changed in linux (Ubuntu Oneiric): | |
| importance: | High → Medium |
| dino99 (9d9) wrote : | #21 |
| tags: | removed: natty oneiric |
| Changed in linux (Ubuntu Oneiric): | |
| status: | In Progress → Invalid |
| Changed in linux (Ubuntu): | |
| milestone: | ubuntu-12.10 → none |
| status: | In Progress → Incomplete |
| Chris J Arges (arges) wrote : | #23 |
@penalvch
That patch was already tested in the above comments and does not fix the issue. This may affect currently released versions of the Ubuntu kernel still.
| summary: |
- Critical: Natty kernel hangs in xlog_grant_log_space
+ kernel hangs in xlog_grant_log_space |
| Changed in linux (Ubuntu Oneiric): | |
| assignee: | Chris J Arges (arges) → nobody |
| Changed in linux (Ubuntu): | |
| assignee: | Chris J Arges (arges) → nobody |
| Changed in linux (Ubuntu Precise): | |
| assignee: | Chris J Arges (arges) → nobody |
| Changed in linux (Ubuntu): | |
| status: | Incomplete → Triaged |
| Changed in linux (Ubuntu Lucid): | |
| assignee: | Chris J Arges (arges) → nobody |
| Changed in linux (Ubuntu Quantal): | |
| assignee: | Chris J Arges (arges) → nobody |
| dino99 (9d9) wrote : | #24 |
EOL reached
| Changed in linux (Ubuntu Quantal): | |
| status: | In Progress → Invalid |
| Rolf Leggewie (r0lf) wrote : | #25 |
lucid has seen the end of its life and is no longer receiving any updates. Marking the lucid task for this ticket as "Won't Fix".
| Changed in linux (Ubuntu Lucid): | |
| status: | In Progress → Won't Fix |


This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:
apport-collect 979498
and then change the status of the bug to 'Confirmed'.
If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.
This change has been made by an automated script, maintained by the Ubuntu Kernel Team.