Comment 14 for bug 713216

Revision history for this message
Olivier Dony (Odoo) (odo-openerp) wrote :

Hi Kyle,

Thanks for sending the output of the lock analysis query. For some reason the "relname" column is empty, so we don't know the name of the tables that are locked, but that does not matter very much.
The rest of the analysis is pretty clear, the login() call tries to update the last login "date" of the user trying to login, while the scheduler transaction is apparently holding a lock to that same user row.

This is most likely because the tables on with the scheduler transaction is working (stock.move, procurement_order, etc.) have foreign key references to their creator/updater user in the `create_uid` `write_uid` columns.
For any UPDATE/INSERT/DELETE PostgreSQL not only locks the row being modified, but also acquires a RowShareLock on any referenced rows (FKs), to avoid the foreign keys from changing or disappearing before the referring transaction completes.
Therefore while the scheduler is running it is likely to be holding a RowShareLock on the res_users rows corresponding to the create_uid and write_uid values of all the records it is working on. The most likely user to be locked out is the Admin user (UID 1) because the scheduler is running as admin by default.

Postgres does not keep row-level locks in memory so that won't appear directly in the lock analysis, but you can see this happening because the scheduler transaction [1] has been granted a number of RowShareLocks (on my system I can see one of them is for res_users), and the login transaction is waiting for a ShareLock on the "transaction_id" of the other transaction. This is how a transaction waiting for another to release a row-level lock will appear [2].

 transid | virtrid| relname | locktype | mode | grtd | cur_query
---------+--------+----------+---------------+---------------+------+----------
         | 7/10 | res_users| relation | RowShareLock | t | <IDLE>...
 1077956 | 7/10 | | transactionid | ExclusiveLock | t | <IDLE>...
 1077957 | 8/38 | | transactionid | ExclusiveLock | t | update re
 1077956 | 8/38 | | transactionid | ShareLock | f | update re

You could confirm this in several ways:
- logging in as a new user who has not created or updated any document should work normally even when scheduler is running
- creating a new user 'MRP Scheduler' with full access to all the relevant tables and configuring the scheduler job to run as that user may greatly reduce the chance of blocking other users (but because there are many existing rows this may not take effect for a while)
- disabling the code that updates the "last login" date in res_users or making it tolerant to locked rows should eliminate that issue altogether;

The patch applied in revision 4049[3] should solve the issue by doing the latter option: skipping the update of the login date if the user's record is currently locked. The fix is for trunk/6.1 but you could use the same code to replace the bulk of the login() method in 6.0.

There aren't many other situations where we alter the user record, but those may still be prevented after the patch, so the use of a separate "owner user" for long-running background jobs may still be a good idea.

Thanks for reporting and for your patience during the long analysis!

[1] Shown as <IDLE> in transation, because it is busy in Python code
[2] Because row-level locks are not kept in memory, see also full explanation at http://www.postgresql.org/docs/8.4/interactive/explicit-locking.html#LOCKING-ROWS
[3] trunk server r.4049 rev-id: <email address hidden>
http://bazaar.launchpad.net/~openerp/openobject-server/trunk/revision/4049