Activity log for bug #295836

Date Who What changed Old value New value Message
2008-11-09 04:11:15 Evan Broder bug added bug
2008-11-09 04:11:15 Evan Broder bug added attachment 'strace-lvchange.txt' (strace of an lvchange operation on the hung cluster)
2008-11-09 07:55:51 Evan Broder description
  Old value: When I shut down a node, and then boot it back up, LVM metadata operations hang after it has rejoined the cluster. Currently, when I'm booting up all 3 nodes in my cluster fresh, LVM works fine. When I shut down one of the nodes cleanly (so that the cluster remains quorate), LVM continues to work fine until the downed node is brought back up. The LVM operations appear to be hanging on a connect() call - I've attached an strace. There's nothing interesting in dmesg or the syslog. cman_tool services reports that all 3 nodes are running a clvmd (and ps on each node confirms)
  New value: Binary package hint: clvm When I shut down a node, and then boot it back up, LVM metadata operations hang after it has rejoined the cluster. Currently, when I'm booting up all 3 nodes in my cluster fresh, LVM works fine. When I shut down one of the nodes cleanly (so that the cluster remains quorate), LVM continues to work fine until the downed node is brought back up. The LVM operations appear to be hanging on a connect() call - I've attached an strace. There's nothing interesting in dmesg or the syslog. cman_tool services reports that all 3 nodes are running a clvmd (and ps on each node confirms)
2008-11-09 09:40:00 Evan Broder description
  Old value: Binary package hint: clvm When I shut down a node, and then boot it back up, LVM metadata operations hang after it has rejoined the cluster. Currently, when I'm booting up all 3 nodes in my cluster fresh, LVM works fine. When I shut down one of the nodes cleanly (so that the cluster remains quorate), LVM continues to work fine until the downed node is brought back up. The LVM operations appear to be hanging on a connect() call - I've attached an strace. There's nothing interesting in dmesg or the syslog. cman_tool services reports that all 3 nodes are running a clvmd (and ps on each node confirms)
  New value: Binary package hint: clvm When I shut down a node, and then boot it back up, LVM metadata operations hang after it has rejoined the cluster, unless I restart clvmd (by sending it a SIGKILL first). Currently, when I'm booting up all 3 nodes in my cluster fresh, LVM works fine. When I shut down one of the nodes cleanly (so that the cluster remains quorate), LVM continues to work fine until the downed node is brought back up. The LVM operations appear to be hanging on a connect() call - I've attached an strace. There's nothing interesting in dmesg or the syslog. cman_tool services reports that all 3 nodes are running a clvmd (and ps on each node confirms) I can restore functionality by doing `pkill -9 clvmd` and then running clvmd by hand on all 3 nodes.
2008-11-09 09:53:12 Evan Broder bug added attachment 'clvmd-strace.txt' (stracing clvmd as another node comes back up)
2008-11-12 06:50:59 Evan Broder bug added subscriber XVM developers
2008-11-12 06:51:18 Greg Price bug added subscriber XVM developers
2009-01-04 04:05:36 Evan Broder description
  Old value: Binary package hint: clvm When I shut down a node, and then boot it back up, LVM metadata operations hang after it has rejoined the cluster, unless I restart clvmd (by sending it a SIGKILL first). Currently, when I'm booting up all 3 nodes in my cluster fresh, LVM works fine. When I shut down one of the nodes cleanly (so that the cluster remains quorate), LVM continues to work fine until the downed node is brought back up. The LVM operations appear to be hanging on a connect() call - I've attached an strace. There's nothing interesting in dmesg or the syslog. cman_tool services reports that all 3 nodes are running a clvmd (and ps on each node confirms) I can restore functionality by doing `pkill -9 clvmd` and then running clvmd by hand on all 3 nodes.
  New value: Binary package hint: clvm When I shut down a node, and then boot it back up, LVM metadata operations hang after it has rejoined the cluster, unless I restart clvmd (by sending it a SIGKILL first). Currently, when I'm booting up all 3 nodes in my cluster fresh, LVM works fine. When I shut down one of the nodes cleanly (so that the cluster remains quorate), LVM continues to work fine until the downed node is brought back up. The LVM operations appear to be hanging on a connect() call - I've attached an strace. There's nothing interesting in dmesg or the syslog. cman_tool services reports that all 3 nodes are running a clvmd (and ps on each node confirms) I can restore functionality by doing `pkill -9 clvmd` and then running clvmd by hand on the node that was just restarted.