metadata operations hang after node outage

Bug #295836 reported by Evan Broder
This bug affects 1 person
Affects: lvm2 (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)
Nominated for Hardy by Evan Broder

Bug Description

Binary package hint: clvm

When I shut down a node, and then boot it back up, LVM metadata operations hang after it has rejoined the cluster, unless I restart clvmd (by sending it a SIGKILL first).

When I boot all 3 nodes in my cluster fresh, LVM works fine.

When I shut down one of the nodes cleanly (so that the cluster remains quorate), LVM continues to work fine until the downed node is brought back up.

The LVM operations appear to be hanging on a connect() call - I've attached an strace. There's nothing interesting in dmesg or the syslog.
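A hang like this can be told apart from an ordinary failure by running the command under a time limit. The following is a minimal sketch, not something from the original report: `probe_hang` is a hypothetical helper name, and it assumes coreutils `timeout` is available on the node.

```shell
# probe_hang SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND under timeout(1) and prints whether it
# finished within the limit. The command's own output is discarded.
probe_hang() {
    limit=$1
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    status=$?
    if [ "$status" -eq 124 ]; then
        # timeout(1) exits with 124 when the limit expired and it had to
        # signal the command.
        echo "hung"
    elif [ "$status" -eq 0 ]; then
        echo "ok"
    else
        echo "failed ($status)"
    fi
}
```

For example, `probe_hang 30 vgs` would print `hung` if the metadata operation did not return within 30 seconds. To capture a trace at the same time, the command can be wrapped, e.g. `probe_hang 30 strace -f -o /tmp/vgs.strace vgs` (the output path is only an illustration). Note that `timeout` sends SIGTERM by default, so a process that ignores SIGTERM would still block.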

cman_tool services reports that all 3 nodes are running clvmd, and ps on each node confirms it.

I can restore functionality by doing `pkill -9 clvmd` and then running clvmd by hand on the node that was just restarted.
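The workaround above could be wrapped in a small script. This is only a sketch of the two steps as described: the `run` helper and the `DRY_RUN` variable are hypothetical additions for this example so the commands can be previewed without touching the daemon. It would need to be run as root on the node that was just restarted.

```shell
# Hypothetical helper: print the command when DRY_RUN is set, otherwise run it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# The two steps from the report: kill the stuck daemon with SIGKILL,
# then start a fresh clvmd by hand.
restart_clvmd() {
    run pkill -9 clvmd
    run clvmd
}
```

Setting `DRY_RUN=1` before calling `restart_clvmd` prints the two commands instead of executing them. This only automates the workaround; it does not address why clvmd gets stuck after the node rejoins.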

Evan Broder (broder) wrote :

This appears to be some sort of deadlock in clvmd: it's hanging on a futex call.
