The server can't recover from a database failure

Bug #905257 reported by Mário Dagot on 2011-12-16
This bug affects 9 people
Affects: Odoo Server (MOVED TO GITHUB); Status: Fix Released; Importance: Wishlist; Assigned to: OpenERP's Framework R&D
Affects: Therp Backports (Deprecated); Importance: Undecided; Assigned to: Stefan Rijnhart (Opener)
Affects: Server-6.1; Importance: Undecided; Assigned to: Stefan Rijnhart (Opener)

Bug Description

This affects version 6.0; I'm using 6.0.1.
When the database stops, the server can't reconnect to it, and two bad things happen:
i) The web client doesn't show the correct message to the user. The user is told that the username/password is incorrect, which leads to frustration and confusion. The correct behavior would be to report, for example, “server or database not ready”.
ii) After that, all OpenERP servers must be restarted, which shuts down all user sessions. Imagine a cluster with hundreds of servers and thousands of users.
To reproduce: stop the database and try to use the web client, then start the database again and try to use the web client.
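
For anyone following along, here is a rough sketch of the two behaviors requested above: dropping dead pooled connections so the server can reconnect without a restart, and reporting an outage separately from a bad password. It is not the patch that was merged, and it does not use OpenERP's actual classes. `FakeConnection`, `ReconnectingPool`, `DatabaseUnavailable` and `login` are hypothetical names made up for this illustration, and the `server` dict only simulates a database that can be stopped and restarted.

```python
class DatabaseUnavailable(Exception):
    """Raised when no usable database connection can be obtained."""


class FakeConnection:
    """Stand-in for a real DB-API connection (hypothetical, for illustration)."""

    def __init__(self, server):
        if not server["up"]:
            raise ConnectionError("could not connect to server")
        self._server = server
        self._epoch = server["epoch"]

    def ping(self):
        # The connection is dead if the server is down or has been restarted.
        if not self._server["up"] or self._epoch != self._server["epoch"]:
            raise ConnectionError("server closed the connection unexpectedly")


class ReconnectingPool:
    """Hands out connections, discarding any the database has closed."""

    def __init__(self, connect):
        self._connect = connect  # zero-argument factory returning a connection
        self._idle = []

    def borrow(self):
        while self._idle:
            conn = self._idle.pop()
            try:
                conn.ping()  # liveness check before reuse
                return conn
            except ConnectionError:
                continue  # stale connection: drop it and try the next one
        try:
            return self._connect()
        except ConnectionError as exc:
            raise DatabaseUnavailable(str(exc)) from exc

    def give_back(self, conn):
        self._idle.append(conn)


def login(pool, user, password, accounts):
    """Return a user-facing message; an outage is never reported as bad credentials."""
    try:
        conn = pool.borrow()
    except DatabaseUnavailable:
        return "server or database not ready"
    try:
        ok = user in accounts and accounts[user] == password
    finally:
        pool.give_back(conn)
    return "ok" if ok else "wrong login or password"
```

With this shape, a connection that was pooled before the database stopped is thrown away on the next `borrow()`, so once the database is back a fresh connection is opened and nothing needs restarting. Pinging on every borrow costs a round trip on a real database; that trade-off is a choice of this sketch, not something taken from the merged fix.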

description: updated
no longer affects: openobject-client-web

Hello,

This is a known issue, but it is considered minor, because a reliable connection to the database is a must in a production environment. I'm sure a cluster with hundreds of servers and thousands of users would have a database connection that is available 100% of the time during business hours. So in most cases, being aware of this limitation is sufficient to properly plan maintenance operations on a production environment (e.g. you should not perform database restarts during business hours, which is simply common sense).

In any case, patches are welcome to improve this, and there's one from the community considered for inclusion at this very moment. I'll link the relevant branch to this bug.

I hope you understand,

Thanks!

Changed in openobject-server:
assignee: nobody → OpenERP's Framework R&D (openerp-dev-framework)
importance: Undecided → Wishlist
status: New → Confirmed
Changed in therp-backports:
status: New → Fix Committed
assignee: nobody → Stefan Rijnhart (Therp) (stefan-therp)

This issue should be fixed as of revision 4839 in the 7.0 server branch [1].

Thanks for your patience and thanks to everyone who contributed on getting it solved, especially Florent X. :-)

[1] revision-id: <email address hidden>

Changed in openobject-server:
milestone: none → 7.0
status: Confirmed → Fix Released
no longer affects: ocb-server
no longer affects: ocb-server/7.0

A typo remained in the patch that was merged, sorry about that. This is fixed after revision 4841 rev-id: <email address hidden>

@odony Thank you for your updates on this. It's much appreciated!

Florent (florent.x) wrote :

I replied on the initial MP to explain why I believe the patch committed in haste to the 7.0 branch is not optimal:

https://code.launchpad.net/~florent.x/openobject-server/trunk-bug-905257-fix-reconnect/+merge/132149/comments/322977

An update to this fix was merged at revision 4843[1], improving the performance of the fix as a follow-up to Florent's feedback. Sorry for the spam and for the confusion.

[1] revision-id: <email address hidden>

Another minor update to this fix was applied at revision 4849 rev-id: <email address hidden>, to remove a small side-effect spotted by Florent. It should not make any difference in most cases, just mentioning it for those tracking the bug in other branches.

Changed in therp-backports:
status: Fix Committed → Fix Released