problem with pack on two zeoraids
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
gocept.zeoraid | Confirmed | Critical | Christian Theune |
Bug Description
Okay, so I finally got brave enough to try packing the two zeos. So, on each server I did:
${buildout:
So, the pack is direct via zeo, as we discussed, not via zeoraid, but does have a fixed time.
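For illustration only, here is a minimal sketch of how a fixed pack time could be derived so that both servers are given exactly the same cutoff. The function name `fixed_pack_time` and the example timestamp are assumptions made for this sketch; they are not taken from the command above, which is truncated in this report.

```python
import calendar
import time


def fixed_pack_time(timestamp_utc):
    """Convert a 'YYYY-MM-DDTHH:MM:SS' string, read as UTC, to epoch seconds.

    Hypothetical helper: using one explicit value on every server avoids
    each pack computing its own "now" and ending up with a different cutoff.
    """
    parsed = time.strptime(timestamp_utc, "%Y-%m-%dT%H:%M:%S")
    # timegm interprets the struct_time as UTC, unlike time.mktime.
    return calendar.timegm(parsed)


# Example cutoff (an assumption, chosen only for illustration).
cutoff = fixed_pack_time("2009-11-18T15:00:00")
print(cutoff)
```

The resulting number is the kind of value a pack call accepts as its time argument. Whether the original command passed it this way is not visible from the report.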
I did this a few times on each server, but only noticed problems later on. zeoraid1 stopped answering for the packed storage, and zeoraid2 showed the packed storage as degraded with zeo1 failed.
Here's when I did the packs on zeo1:
2009-11-18T15:56:49 INFO ZEO.StorageServer (12276/
2009-11-18T16:14:37 INFO ZEO.StorageServer (12276/
2009-11-18T16:47:27 INFO ZEO.StorageServer (12276/
2009-11-18T16:50:08 INFO ZEO.StorageServer (12276/
2009-11-18T16:56:30 INFO ZEO.StorageServer (12276/
2009-11-18T16:57:45 INFO ZEO.StorageServer (12276/
Here's when I did the packs on zeo2:
2009-11-18T15:57:05 INFO ZEO.StorageServer (26939/
2009-11-18T16:15:44 INFO ZEO.StorageServer (26939/
2009-11-18T16:37:26 INFO ZEO.StorageServer (26939/
2009-11-18T16:39:10 INFO ZEO.StorageServer (26939/
2009-11-18T16:42:49 INFO ZEO.StorageServer (26939/
2009-11-18T16:44:06 INFO ZEO.StorageServer (26939/
2009-11-18T16:55:08 INFO ZEO.StorageServer (26939/
2009-11-18T16:57:53 INFO ZEO.StorageServer (26939/
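To make the two lists easier to compare, here is a rough sketch that pulls the leading timestamp out of each log line and counts the packs per server. The log lines are copied as they appear above, truncation included; the names `ZEO1_PACKS`, `ZEO2_PACKS` and `pack_times` are made up for this example.

```python
from datetime import datetime

# Log lines as pasted in this report (truncated after the pid).
ZEO1_PACKS = [
    "2009-11-18T15:56:49 INFO ZEO.StorageServer (12276/",
    "2009-11-18T16:14:37 INFO ZEO.StorageServer (12276/",
    "2009-11-18T16:47:27 INFO ZEO.StorageServer (12276/",
    "2009-11-18T16:50:08 INFO ZEO.StorageServer (12276/",
    "2009-11-18T16:56:30 INFO ZEO.StorageServer (12276/",
    "2009-11-18T16:57:45 INFO ZEO.StorageServer (12276/",
]
ZEO2_PACKS = [
    "2009-11-18T15:57:05 INFO ZEO.StorageServer (26939/",
    "2009-11-18T16:15:44 INFO ZEO.StorageServer (26939/",
    "2009-11-18T16:37:26 INFO ZEO.StorageServer (26939/",
    "2009-11-18T16:39:10 INFO ZEO.StorageServer (26939/",
    "2009-11-18T16:42:49 INFO ZEO.StorageServer (26939/",
    "2009-11-18T16:44:06 INFO ZEO.StorageServer (26939/",
    "2009-11-18T16:55:08 INFO ZEO.StorageServer (26939/",
    "2009-11-18T16:57:53 INFO ZEO.StorageServer (26939/",
]


def pack_times(lines):
    """Return the leading timestamp of each log line as a datetime."""
    return [
        datetime.strptime(line.split()[0], "%Y-%m-%dT%H:%M:%S")
        for line in lines
    ]


zeo1 = pack_times(ZEO1_PACKS)
zeo2 = pack_times(ZEO2_PACKS)
print(len(zeo1), "packs on zeo1, last at", zeo1[-1].isoformat())
print(len(zeo2), "packs on zeo2, last at", zeo2[-1].isoformat())
```

This only shows that the two servers were not packed the same number of times or at the same moments. It says nothing by itself about why the storages ended up with different sizes.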
From zeoraid1's event log:
2009-11-18T17:03:14 INFO ZEO.zrpc.
2009-11-18T17:03:30 INFO ZEO.zrpc.
Neither zeoraid2's event log nor its debug log indicates when zeo1 failed for the packed storage; I'll open a separate bug for that.
Now, zeo1 and zeo2 both seem to be running happily, but Packed.fs is 575 MB on zeo1 and 671 MB on zeo2.
If I try and do:
bin/zeoraid-
...on zeoraid2, I *was* getting the connection refused error, but I've just tried again and it now seems to be recovering okay...
I'm shutting down zeoraid1 until recovery is complete just to be safe...
Changed in gocept.zeoraid:
assignee: nobody → Christian Theune (ct-gocept)
importance: Undecided → Critical
Changed in gocept.zeoraid:
milestone: none → 1.0b7
Changed in gocept.zeoraid:
milestone: 1.0b7 → 1.0b8
Changed in gocept.zeoraid:
status: New → Confirmed
So, the last transaction call is very likely the monitor script trying to connect again.