metadata operations hang after node outage

Bug #295836 reported by Evan Broder on 2008-11-09
This bug affects 1 person
Affects: lvm2 (Ubuntu)
Importance: Undecided
Assigned to: Unassigned
Nominated for Hardy by Evan Broder

Bug Description

Binary package hint: clvm

When I shut down a node and then boot it back up, LVM metadata operations hang once it has rejoined the cluster. They keep hanging unless I restart clvmd (by sending it a SIGKILL first).

When I boot all 3 nodes in my cluster fresh, LVM works fine.

When I shut down one of the nodes cleanly (so that the cluster remains quorate), LVM continues to work fine until the downed node is brought back up.

The LVM operations appear to be hanging on a connect() call - I've attached an strace. There's nothing interesting in dmesg or the syslog.

`cman_tool services` reports that all 3 nodes are running a clvmd, and `ps` on each node confirms this.

I can restore functionality by doing `pkill -9 clvmd` and then running clvmd by hand on the node that was just restarted.
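For anyone else hitting this, the workaround above can be sketched roughly as the shell function below. This is only a sketch, not a fix. The names `run`, `restart_clvmd` and the `DRY_RUN` switch are hypothetical and exist only for this example. It also assumes clvmd is started with no arguments, since the report only says it was run by hand.

```shell
#!/bin/sh
# Sketch of the workaround from this report. Run as root on the node
# that was just restarted. All helper names here are hypothetical.

# run: execute a command, or only print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

restart_clvmd() {
    # Force-kill the hung clvmd with SIGKILL, as in the report.
    # pkill exits non-zero when nothing matched; that is not fatal here.
    run pkill -9 clvmd || true

    # Start a fresh clvmd by hand (assumption: no extra arguments needed).
    run clvmd
}
```

Calling `restart_clvmd` with `DRY_RUN=1` set prints the two commands instead of running them, which is a safe way to check the sequence before touching a live cluster node.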

Evan Broder (broder) on 2008-11-09
description: updated
Evan Broder (broder) wrote :

This appears to be a deadlock of some sort or another in clvmd - it's hanging on a futex call.

Evan Broder (broder) on 2009-01-04
description: updated