The __compute method of account_account needs optmization to improve Accounting Performance!

Bug #568537 reported by Borja López Soilán (NeoPolus)
24
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Odoo Addons (MOVED TO GITHUB)
Status tracked in Trunk
5.0
Fix Released
Undecided
Unassigned
Trunk
Fix Released
Medium
OpenERP R&D Addons Team 3

Bug Description

The current __compute method of account_account is wasting a lot of time reordering the list of accounts to compute.

We can make the next block of code about five times faster with a small one-line optimization (just adding a "ids2.reverse()" line):

--------
        brs = list(self.browse(cr, uid, ids2, context=context))
        sums = {}
        while brs:
            current = brs[0]
            can_compute = True
            for child in current.child_id:
                if child.id not in sums:
                    can_compute = False
                    try:
                        brs.insert(0, brs.pop(brs.index(child)))
                    except ValueError:
                        brs.insert(0, child)
            if can_compute:
                brs.pop(0)
                for fn in field_names:
                    sums.setdefault(current.id, {})[fn] = accounts.get(current.id, {}).get(fn, 0.0)
                    if current.child_id:
                        sums[current.id][fn] += sum(sums[child.id][fn] for child in current.child_id)
--------

That code is computing the value of each account as the sums of the account values plus its children values.

The problem is that the list of the accounts is sorted on the worst posible way! So most of time is wasted reordering that list.

The list of accounts comes from _get_children_and_consol, that returns a list of accounts in the form [parent, child1, child2, child1_of_child2, child2_of_child2, child3].
So the block of code shown above, that always tries to compute the first element of the list, but he won't be able to do it without computing the children accounts first: so it ends up poping accounts from the list and puting them back at the begining of the list in the reverse order... that is, it wastes a lot of time just to reverse the list on the most expensive way!

Adding a ids2.reverse() line before the block of code, will mean that the list of accounts will be in the form [child3, child2_of_child2, child1_of_child2, child2, child1, parent] so no poping&inserting will be necesary!

We have timed that block of code before adding the "ids2.reverse()" and after it:

                      BLOCK FULL METHOD
Original: 2.1701090335825 2.370021998875
Optimized: 0.37584179639849996 0.50867897272100004 (4.65 times faster!)

Note: Average times after several runs, getting the debit and credit of the root account on a database with 1663 accounts and 6930 account move lines.

I think this should be fixed ASAP: It will soothe the currents problems with accounting reports (like the general ledger performance problems reported on bug 514808 and bug 551630).

Related branches

Revision history for this message
Borja López Soilán (NeoPolus) (borjals) wrote :
Revision history for this message
Ferdinand (office-chricar) wrote :

account_account; 378
account_move_line; 3944
account_move; 1303
account_period; 84

all periods open - with patch mentioned on expert list regarding real time computation of all open fy

response of chart of account < 1 sec

Revision history for this message
Borja López Soilán (NeoPolus) (borjals) wrote :

Ferdinand, somehow opening the chart of accounts is faster than getting the root account values:

* Chart of accounts test:
   Without optimization:
          Elapsed to compute ['debit', 'credit', 'balance'] for accounts with code 1,2,3,4,5,6,7 => 1.08773589134
   With optimization:
         Elapsed to compute ['debit', 'credit', 'balance'] for account 1,2,3,4,5,6,7 => 0.540276050568

101% faster (original/optimized = 2.01)

* Accounts list (first 80 accounts) test:
   Without optimization:
          Elapsed to compute ['credit', 'debit', 'balance'] for account 0,1,10,100,1000000,...,4300498,4300502 => 2.84862804413
   With optimization:
          Elapsed to compute ['credit', 'debit', 'balance'] for account 0,1,10,100,1000000,...,4300498,4300502 => 0.894719839096

218% faster (original/optimized = 3.18)

* Search test (searching for accounts like '%700%'):
   Without optimization:
        Elapsed to compute ['credit', 'debit', 'balance'] for account 700,7000,7000000,7000001,7001000,7002000 => 0.0230215191841
   With optimization:
        Elapsed to compute ['credit', 'debit', 'balance'] for account 700,7000,7000000,7000001,7001000,7002000 => 0.0187527537343

22% faster (original/optimized = 1.22)

So I suppose the problem is that the performance hit grows with the number of account levels, it takes more time to get the values for the root account, that for all its childs.
(And the Spanish chart of accounts has about five levels [43000001 is child of 4300 which is child of 430 that is child of 43 child of 4 child of 0]).

Changed in openobject-addons:
importance: Undecided → High
milestone: none → 5.0.10
status: New → Confirmed
summary: - The __compute method of account_account is wasting 78% of time
+ The __compute method of account_account needs optmization to improve
+ Accounting Performance!
Changed in openobject-addons:
status: Confirmed → In Progress
Revision history for this message
Jay Vora (Serpent Consulting Services) (jayvora) wrote :

Hello Experts,

The point represented by Borja is really good.

This change would definately help to improve performance in Accounting where its CoA,Reports,etc.

Kindly apply the patch and notify us for the behaviour.

Thanks.

Revision history for this message
Borja López Soilán (NeoPolus) (borjals) wrote :

Jay, your patch works for me.

Average times (four runs) of __calculate method when showing the accounts list (first 80 accounts) on a demo database with the Spanish chart of accounts:

Original: 2.8158234953874999 seconds

Simple patch: 0.93597447872149997 seconds

New patch: 0.91453778743750003 seconds

Revision history for this message
Jay Vora (Serpent Consulting Services) (jayvora) wrote :

Hello Borja,

Thank you for the feedback.

You are right, but thinking on generic view; there is a little problem when IDS2 does not contain ordered IDS.

We will propose a good patch for trunk. Let us not disturb stable for now.

BTW, your patch is acceptable for stable.
Thanks.

Revision history for this message
Jay Vora (Serpent Consulting Services) (jayvora) wrote :

It has been improved by revision 2725 <email address hidden>.
Thanks,

Revision history for this message
Borja López Soilán (NeoPolus) (borjals) wrote :

Thanks Jay.

I also think that is better to use the "minimal" patch on the stable version, and leave a greater optimization for the trunk.

---

Anyway, when I tested your extended patch I already noticed it "might" have had caused problems if the ids where not ordered properly. So I did test the typical use cases, and even tried by using unsorted ids, but it never failed:

ipdb> self.read(cr,uid,[71,9],['code','debit','credit'])
  account codes for ids => [u'10', u'100']
  account codes for ids2 => [u'10', u'100', u'1000000', u'1010000', u'1020000', u'103', u'1030000', u'1034000', u'104', u'1040000', u'1044000', u'1080000', u'1090000']
  result => [{'credit': 87000.0, 'code': u'10', 'id': 9, 'debit': 43500.0}, {'credit': 87000.0, 'code': u'100', 'id': 71, 'debit': 43500.0}]

ipdb> self.read(cr,uid,[9,71],['code','debit','credit'])
  account codes for ids => [u'10', u'100']
  account codes for ids2 => [u'10', u'100', u'1000000', u'1010000', u'1020000', u'103', u'1030000', u'1034000', u'104', u'1040000', u'1044000', u'1080000', u'1090000']
  result => [{'credit': 87000.0, 'code': u'10', 'id': 9, 'debit': 43500.0}, {'credit': 87000.0, 'code': u'100', 'id': 71, 'debit': 43500.0}]

As you can see, the ids order doesn't matter, the ORM seems to order the ids by parent_left before calling the __compute method (btw, I tried with different account codes: no change either).

Revision history for this message
Jordi Esteve (www.zikzakmedia.com) (jesteve-zikzakmedia) wrote :

I have the same opinion as Borja, I think that is better to use the "minimal" patch on the stable version, and leave a greater optimization for the trunk. The improvement with the "minimal" patch of 1 line (just adding ids2.reverse() line) is impressive and it does no have collateral effects.

IMHO the waiting of 4-5 seconds when an account field is opened in a chart of accounts with 5 or 6 levels depth (like in the Spain chart of accounts) due to a non-optimized computation is a non-critical bug.

Revision history for this message
Jordi Esteve (www.zikzakmedia.com) (jesteve-zikzakmedia) wrote :

Sorry, I have just seen now that a fix has been released in version 5.0.10.

Revision history for this message
Serge Rudaz (sr02) wrote :

Hi,
I use latest stable 5.0.10 from Launchpad with Ubuntu 8.04.
If I create a new database with service profile and UK minimal chart (I tried also with the 2 Swiss chart, it is same) and try to list the accounts from "Financial management / Configuration / Financial Accounting / Financial Accounts / List of Accoutns" I get an error:

[2010-06-02 22:53:24,572] ERROR:web-services:[01]:
[02]: Environment Information :
[03]: System : Linux-2.6.24-27-generic-i686-with-debian-lenny-sid
[04]: OS Name : posix
[05]: Distributor ID: Ubuntu
[06]: Description: Ubuntu 8.04.4 LTS
[07]: Release: 8.04
[08]: Codename: hardy
[09]: Operating System Release : 2.6.24-27-generic
[10]: Operating System Version : #1 SMP Wed Mar 24 10:04:52 UTC 2010
[11]: Operating System Architecture : 32bit
[12]: Operating System Locale : en_GB.UTF8
[13]: Python Version : 2.5.2
[14]: OpenERP-Server Version : 5.0.10
[15]: Last revision No. & ID :
[16]: Traceback (most recent call last):
[17]: File "/usr/lib/python2.5/site-packages/openerp-server/osv/osv.py", line 58, in wrapper
[18]: return f(self, dbname, *args, **kwargs)
[19]: File "/usr/lib/python2.5/site-packages/openerp-server/osv/osv.py", line 119, in execute
[20]: res = pool.execute_cr(cr, uid, obj, method, *args, **kw)
[21]: File "/usr/lib/python2.5/site-packages/openerp-server/osv/osv.py", line 111, in execute_cr
[22]: return getattr(object, method)(cr, uid, *args, **kw)
[23]: File "/usr/lib/python2.5/site-packages/openerp-server/osv/orm.py", line 2228, in read
[24]: result = self._read_flat(cr, user, select, fields, context, load)
[25]: File "/usr/lib/python2.5/site-packages/openerp-server/osv/orm.py", line 2354, in _read_flat
[26]: res2 = self._columns[val[0]].get(cr, self, ids, val, user, context=context, values=res)
[27]: File "/usr/lib/python2.5/site-packages/openerp-server/osv/fields.py", line 659, in get
[28]: res = self._fnct(obj, cr, user, ids, name, self._arg, context)
[29]: File "/usr/lib/python2.5/site-packages/openerp-server/addons/account/account.py", line 280, in __compute
[30]: ids2.reverse()
[31]: NameError: global name 'ids2' is not defined

I didn't had such error for previous versions of OpenERP.
After reading this post, I tried to replace:

        # consolidate accounts with direct children
        ids2.reverse()
        brs = list(self.browse(cr, uid, ids2, context=context))
with
        # consolidate accounts with direct children
        ids.reverse()
        brs = list(self.browse(cr, uid, ids, context=context))

Then it works correctly (but slow).

Revision history for this message
Borja López Soilán (NeoPolus) (borjals) wrote :

Weird, pretty weird.
The "brs = list(self.browse(cr, uid, ids2, context=context))" line was like that in the original, so your problem has nothing to do with the optimization! (the optimization is the one-liner "ids2.reverse()", and it won't fail unless the __compute method did already fail).

Ok, just found the problem; somebody did a pretty bad merge recently and broke it:

They replaced:

206 ids2 = self._get_children_and_consol(cr, uid, ids, context)
207 acc_set = ",".join(map(str, ids2))

With:

243 children_and_consolidated = self._get_children_and_consol(
244 cr, uid, ids, context=context)

But didn't replace all the occurrences of "id2" with the new name "children_and_consolidated"! Ouch! Hey people, don't refactor the 'stable version' without testing!

I just filled a bug here: bug 589052

You may want to the version before the problem until it gets fixed (bzr revert -r2745) [by the way, your ids2->ids change is wrong, you should do ids2->children_and_consolidated instead)

Revision history for this message
Jay Vora (Serpent Consulting Services) (jayvora) wrote :

It has been fixed by revision 2749 <email address hidden>.
It was a regression introduced by #2746.
Thanks for being so quick.

Revision history for this message
Borja López Soilán (NeoPolus) (borjals) wrote :

Thanks to you Jay :)

Revision history for this message
Jay Vora (Serpent Consulting Services) (jayvora) wrote :

Hello Experts,

I am planning to fix this one with the patch attached in #4.

It would be a pleasure having a feedback/speed tests/ responses from you.

Thanks.

Revision history for this message
Jay Vora (Serpent Consulting Services) (jayvora) wrote :

Here is the patch compatible to trunk.
Please go through this.

Queries invited.

Revision history for this message
Purnendu Singh (OpenERP) (purnendu-singh) wrote :

Hello Borja López Soilán,

We have applied the proposed patch to improve the accounting performance in lp:~openerp-commiter/openobject-addons/trunk-dev-addons3-psi2 branch.

Revision No: 4560
Revision ID: <email address hidden>

Thanks

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.