maas takes a while to recover after primary database is moved
Bug #1822618 reported by
Jason Hobbs
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MAAS | Expired | Undecided | Unassigned |
Bug Description
This is with 2.5.2-7523-
We have a test where we kill the postgres master, wait for the failover to happen, then try to use maas.
We wait 75 seconds between killing the database master and trying to use maas.
However, we still get an error back from MAAS occasionally:
"500 Internal Server Error (SSL SYSCALL error: EOF detected"
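For reference, the fixed 75-second wait in the test could be replaced by a polling loop that records how long recovery actually takes. The following is only a minimal sketch: `wait_until_ok` and the `probe` callable are hypothetical names, not part of MAAS or of our test suite, and `probe` stands in for whatever request the test makes against MAAS.

```python
import time


def wait_until_ok(probe, timeout=75.0, interval=5.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Call probe() until it returns without raising, or give up.

    probe is a hypothetical callable that performs one request against
    MAAS and raises on failure (for example on a 500 response).
    Returns (elapsed_seconds, attempts). If another retry would exceed
    the timeout, the last error is re-raised.
    """
    start = clock()
    attempts = 0
    while True:
        attempts += 1
        try:
            probe()
            return clock() - start, attempts
        except Exception:
            # Stop if sleeping one more interval would pass the deadline.
            if clock() - start + interval > timeout:
                raise
            sleep(interval)
```

Logging the returned elapsed time on each run would show whether 75 seconds is simply too short or whether MAAS sometimes does not recover at all.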
Marking this as incomplete, given that:
1. There are no regiond.log files attached for any of the regions.
2. It is not clear where the error above comes from; it mentions 'SSL SYSCALL', yet MAAS neither supports nor configures SSL.
3. My understanding is that the configuration for postgresql HA has recently changed. Is there any indication that this could be a result of that change?
4. There's no information about the state of postgresql when you see that error. Can you confirm that it recovered successfully, that the database is up to date, and that postgresql reported no errors?