Overall performance issue of BaseModel.fields_view_get - binary fields should not be prefetched

Bug #1177965 reported by Guewen Baconnier @ Camptocamp
28
This bug affects 6 people
Affects Status Importance Assigned to Milestone
Odoo Server (MOVED TO GITHUB)
Fix Released
Low
OpenERP Publisher's Warranty Team

Bug Description

This bug is expressed by a general slowdown of the web client through all the
application. Display any view may take several seconds.

It does not happen on all the users. Some are afflicted, some are not. More
weirdly, this happens in our production database, but not on the replication I
did on my machine.

According to my analysis, I guess that it happens only when report_webkit is
installed (however a similar issue could probably happen without)

Example of the issue:

 - I connect with lmaurer
 - I click on Sales > Customers
 - It takes 8 seconds to display the list

See the attached screenshot.

Most of the times is spent on the fields_view_get of the tree view (I attached
the request and the response). fields_view_get is called 2 times (is it normal
BTW?) and each times it takes ~3.10 seconds to achieve.

I dived deep in the code to find out what actually happens.
I will try to summarize that below.

Finding from where comes the slowness:
======================================

-> BaseModel.fields_view_get()
Most of the time is spent when it builds the toolbar, precisely at the line
2274 when it gets the print actions:

    resprint = ir_values_obj.get(cr, user, 'action',
            'client_print_multi', [(self._name, False)], False,
            context)

-> ir_values.get()
Forward to ir_values.get_actions()

-> ir_values.get_actions()
Most of the time is spent when it reads the action at line 404:

    action_def = self.pool.get(action_model).read(cr, uid, int(id), fields, context)

Here, `fields` contains all the columns of the action, excepted a list of
fields which are blacklisted in
openerp.addons.base.ir.ir_values.EXCLUDED_FIELDS

It appears that the field `webkit_header` is in the list of the fields to read,
but is not filtered out. This is a fields.property and it takes nearly 1.5
seconds to compute. (as we have few actions, it is multiplied by the number of
actions).

An easy fix would be to add this field in the `EXCLUDED_FIELDS` list,
but that's only a part of the solution, it would mask the root cause.

The real problem is the read on ir_actions_report_xml.webkit_header which takes
1.5 seconds to achieve.

Finding why the read of the fields.property is so slow:
=======================================================

Note that I was now able to know that the slowness does not comes from the
client nor from fields_view_get (directly). Just by doing a
ir_actions_report_xml.read() (for instance with `erppeek`) on the field
`webkit_header`, I'm able to reproduce the issue, it takes between 1.5 to 3
seconds to compute.

So I dived more deeply in the call stack

-> fields.property._fnct_read()
Most of the time here is spent when it gets the property, at line 1447:

    value = prop.get(cr, uid, prop_name, obj._name, res_id=obj_reference, context=context)

-> ir_property.get()
Most of the time is spent when it builds the search domain at line 159:

    domain = self._get_domain(cr, uid, name, model, context=context)

-> ir_property._get_domain()
Most of the time is spent when it gets the default company at line 185:

    cid = company._company_default_get(cr, uid, model, res[0], context=context)

-> res_company._company_default_get()

Here is an extract of this method:

    user = self.pool.get('res.users').browse(cr, SUPERUSER_ID, uid, context=context)
    for rule in proxy.browse(cr, uid, ids, context):
        if eval(rule.expression, {'context': context, 'user': user}):
            return rule.company_dest_id.id
    return user.company_id.id

It appears that the returned value is the thing which consumes all that time.
It takes ~0.7s, but seems to be called a couple of time.
I replaced it by a read like that:

    user_obj = self.pool.get('res.users')
    user = user_obj.browse(cr, SUPERUSER_ID, uid, context=context)
    for rule in proxy.browse(cr, uid, ids, context):
        if eval(rule.expression, {'context': context, 'user': user}):
            return rule.company_dest_id.id
    return user_obj.read(cr, SUPERUSER_ID,
                         uid,
                         ['company_id'],
                         context=context)['company_id'][0]

And it was fast again.

I did not dived more deeply to see why the browse of the company take as much
time because I was running out of time. Could it be related to function fields
or _inherits? I cannot explain neither why it happens on a server and not on
another. I cannot figure out neither why for some users that's really slower
than others (the same fields_view_get on my user takes ~500 ms, lmaurer 3s,
some users 80ms).

I didn't check if the search_read which lasts 1.17 seconds is normal, but it
seems less shocking as it returns all the data of the records, including the
function fields.

Conclusion
==========

* I think that `webkit_header` should be added to openerp.addons.base.ir.ir_values.EXCLUDED_FIELDS
  when the report_webkit addon is installed because it is probably not necessary to compute it, however
  that's not a fix for the root cause.

* I think that we should replace the browse by a read in res_company._company_default_get() because
  it takes way too much time. Alternatively this method could be cached but I could be a problem
  if the context is necessary.

* I think that further analysis should be done on the browse because it seems not normal to me that
  user.company_id.id takes 0.7s.
b

Tags: maintenance

Related branches

Revision history for this message
Guewen Baconnier @ Camptocamp (gbaconnier-c2c) wrote :
Revision history for this message
Guewen Baconnier @ Camptocamp (gbaconnier-c2c) wrote :
Revision history for this message
Guewen Baconnier @ Camptocamp (gbaconnier-c2c) wrote :
Revision history for this message
Guewen Baconnier @ Camptocamp (gbaconnier-c2c) wrote :

I did some investigations again, and it appears that the real cause is the browse of the `_inherits'ed` partner of the user.

The partner ID of the user lmaurer has the ID 649.

In openerp.osv.orm.BaseModel._read_flat, at line 3657, there is a simple query on the res_partner:

            for sub_ids in cr.split_for_in_conditions(ids):
                cr.execute(query, [tuple(sub_ids)] + rule_params)
                results = cr.dictfetchall()

The query being executed:

SELECT res_partner."ean13",res_partner."zip_id",res_partner."tz",res_partner."opt_out",res_partner."title",res_partner."parent_id",res_partner."last_reconciliation_date",res_partner."employee",res_partner."fax",res_partner."image_medium",res_partner."name",res_partner."debit_limit",res_partner."signup_token",res_partner."street",res_partner."bank_statement_label",res_partner."country_id",res_partner."notification_email_send",res_partner."supplier",res_partner."ref",res_partner."email",res_partner."street2",res_partner."signup_expiration",res_partner."zip",res_partner."comment",res_partner."color",res_partner."image",res_partner."city",res_partner."user_id",res_partner."type",res_partner."function",res_partner."phone",res_partner."customer",res_partner."image_small",res_partner."birthdate",res_partner."vat",res_partner."state_id",res_partner."website",res_partner."use_parent_address",res_partner. "signup_type",res_partner."date",res_partner."is_company",res_partner."vat_subjected",res_partner."section_id",res_partner."lang",res_partner."credit_limit",res_partner."mobile",res_partner.id FROM "res_partner" WHERE res_partner.id IN (649) ORDER BY name

And the time it takes to do the cr.execute:
cr.execute: 0:00:00.716974

I had to delve way too far in the server to find the cause of this issue, but I finally found the root cause.
The query took a lot of time to execute because a very big image was uploaded on the partner.

As the partner is inherits'ed by the user, each time OpenERP needed to do a browse on the current user, it took nearly 1 second to read the image from the database.

I removed the image and OpenERP is fast again.

The work around is simple.

However, that revealed an issue which I think is really important: the images should not be prefetched, they should be read from the database only if they are really necessary (display on the views, ...), because:
 - images are used in more and more objects (user, partner, products, ...)
 - we do browse on the records much more often than we need their image
 - this is a true but insidious performance killer, even if we limit the size of the images, each browse will take significantly more time, and this will be more noticeable on loops

Example, this simple loop:

    for partner in self.pool.get('res.partner').browse(cr, uid, ids, context=context):
        print partner.name

Would take a lot of time to read all the images from the database just to print the name.

Revision history for this message
Raphaël Valyi - http://www.akretion.com (rvalyi) wrote : Re: [Bug 1177965] Re: Overall performance issue of BaseModel.fields_view_get
Download full text (9.6 KiB)

Hello,

I absolutely agree with Guewen: binary fields should absolutely not be
pre-fetched! Only lazily when explicitly required. BTW, this is something I
implemented in OOOR a couple of month ago: avoid to pre-fetch images at
least on the RPC layer and just doing that kind of doubled the perceived
performance in general. But in any case, even in the ORM itself,
pre-fetching binaries should totally be avoided!

On Fri, May 10, 2013 at 10:49 AM, Guewen Baconnier @ Camptocamp <
<email address hidden>> wrote:

> I did some investigations again, and it appears that the real cause is
> the browse of the `_inherits'ed` partner of the user.
>
> The partner ID of the user lmaurer has the ID 649.
>
> In openerp.osv.orm.BaseModel._read_flat, at line 3657, there is a simple
> query on the res_partner:
>
> for sub_ids in cr.split_for_in_conditions(ids):
> cr.execute(query, [tuple(sub_ids)] + rule_params)
> results = cr.dictfetchall()
>
> The query being executed:
>
> SELECT
>
> res_partner."ean13",res_partner."zip_id",res_partner."tz",res_partner."opt_out",res_partner."title",res_partner."parent_id",res_partner."last_reconciliation_date",res_partner."employee",res_partner."fax",res_partner."image_medium",res_partner."name",res_partner."debit_limit",res_partner."signup_token",res_partner."street",res_partner."bank_statement_label",res_partner."country_id",res_partner."notification_email_send",res_partner."supplier",res_partner."ref",res_partner."email",res_partner."street2",res_partner."signup_expiration",res_partner."zip",res_partner."comment",res_partner."color",res_partner."image",res_partner."city",res_partner."user_id",res_partner."type",res_partner."function",res_partner."phone",res_partner."customer",res_partner."image_small",res_partner."birthdate",res_partner."vat",res_partner."state_id",res_partner."website",res_partner."use_parent_address",res_partner.
>
> "signup_type",res_partner."date",res_partner."is_company",res_partner."vat_subjected",res_partner."section_id",res_partner."lang",res_partner."credit_limit",res_partner."mobile",
> res_partner.id
> FROM "res_partner" WHERE res_partner.id IN (649) ORDER BY name
>
> And the time it takes to do the cr.execute:
> cr.execute: 0:00:00.716974
>
>
> I had to delve way too far in the server to find the cause of this issue,
> but I finally found the root cause.
> The query took a lot of time to execute because a very big image was
> uploaded on the partner.
>
> As the partner is inherits'ed by the user, each time OpenERP needed to
> do a browse on the current user, it took nearly 1 second to read the
> image from the database.
>
> I removed the image and OpenERP is fast again.
>
> The work around is simple.
>
> However, that revealed an issue which I think is really important: the
> images should not be prefetched, they should be read from the database only
> if they are really necessary (display on the views, ...), because:
> - images are used in more and more objects (user, partner, products, ...)
> - we do browse on the records much more often than we need their image
> - this is a true but insidious performance killer, even if we limit the
> size ...

Read more...

Changed in openobject-server:
assignee: nobody → OpenERP Publisher's Warranty Team (openerp-opw)
tags: added: maintenance
Revision history for this message
Rifakat Husen (OpenERP) (rha-openerp) wrote : Re: Overall performance issue of BaseModel.fields_view_get

Hi Guewen, Raphael,

Thanks a lot for your detail analysis. I agree with your point, browsing on the name would also fetch the image and will take significant time. In my test I uploaded around 2 MB images on around 50 partner and it took 8 seconds to display them.
I think for stable version we should avoid uploading images with very big size. We could have a mechanism to restrict pre-fetching
images and that can be done in the trunk version only. IMHO we should discuss and think over it for trunk and for stable larger images should be avoided.

Let me know your suggestions.
Regards,
Rifakat

Revision history for this message
Guewen Baconnier @ Camptocamp (gbaconnier-c2c) wrote :

Hi Rikafat,

Thanks for your confirmation and acknowledgement.

I'm very conscious that it is by no means a trivial fix and it could be hard / risky to change in stable version.

However, while the loading of 50 partners with a 2mb image taking 8 seconds is bad, I think we can live with it because the chances to have such a configuration seems low, at worse we'll have a mitigation of small and big images.

Mine concern here is more on the upload of a big image on a user account. If I upload a 2mb image on my own user, each time I click on a menu, it will take ~8 seconds to load. This is discouraging and makes the usage of OpenERP almost unusable for this user.

The thing is that, you know it as well as I do, we cannot prevent users to upload big images.

All that being said, I think that 2 actions could be done in the stable version while doing a mechanism to restrict pre-fetching in trunk:

 - Replace the `browse` by a `read` as I did in `res_company._company_default_get()` -> this method is called several times each time we display a view and this single change drastically reduces the loading time when a big image is used (even if a small image is used, it will improve the performance here). Note that other `browse` of the user may have the same issue but this is the one I spotted and come the more often.

 - Prevent the upload of a image too large or resize it if it is too large.

In my opinion, this is an acceptable mitigation of the issue for the stable version while a better one is done in trunk.

What do you think?
Thanks

Guewen

Revision history for this message
Rifakat Husen (OpenERP) (rha-openerp) wrote :

Hi Guewen,

>- Replace the `browse` by a `read` as I did in `res_company._company_default_get()`
I agree this method is used widely in all the modules and specifically setting default value for the company_id into _defaults.
However, I think the actual problem lies in the return value of this method(return user.comapny_id.id) as per your first
comment in the bug report and we can simply use read() instead of this return value.

There is not need to replace `browse` by `read` as this should not impact on the performance. Doing user.company_id.id calls
`browse_record` and reads all the m2o field of company_id and that becomes company_id > partner_id > user_id and that should
be prevented. What do you think?

Doing `return user.company_id.id` actually reads value for these fields in class `browse_record` along with company_id,
['menu_id', 'alias_id', 'share', 'active', 'password', 'partner_id', 'id', 'login_date', 'company_id', 'signature', 'login', 'action_id', 'comment', 'ean13', 'color', 'image', 'alias_model_id', 'use_parent_address', 'street', 'birthdate', 'signup_type', 'display_name', 'supplier', 'city', 'user_id', 'opt_out', 'title', 'alias_force_thread_id', 'alias_defaults', 'country_id', 'company_id', 'notification_email_send', 'last_reconciliation_date', 'parent_id', 'employee', 'is_company', 'alias_user_id', 'type', 'email', 'vat', 'function', 'lang', 'fax', 'zip', 'street2', 'phone', 'credit_limit', 'ref', 'date', 'active', 'tz', 'signup_expiration', 'customer', 'image_medium', 'name', 'mobile', 'debit_limit', 'image_small', 'website', 'signup_token', 'alias_name', 'state_id']

Please share your suggestion on this and let me know if I have missed anything.
Warm regards,
Rifakat

Revision history for this message
Guewen Baconnier @ Camptocamp (gbaconnier-c2c) wrote :

Hi,

My feeling is that as soon as we read any field (not only company_id) of the browse_record, it reads all the other fields along and then read the large image. However, the call to browse() itself is not time consuming. But I don't have actually the structure to test that again.

If my feeling is right and if we:
 - replace the return value by a read of the company_id instead of the browse (like my proposition and your patch)
 - leave the browse given to the rules

->
In most cases, we won't have a penalty of performance.
If we create a rule, as soon as an attribute of the user is read, it will read the image and get a penalty. However, I don't think we have any choice because we absolutely need to leave the browse here as it can be used in the rules (this case can only be resolved by removing the images from the pre-fetching, so for trunk only). I don't know whether these rules are widely used or not (my impression is not).

Summarizing: I think that replacing the return value by a read() will improve most of the cases, while the lasting cases (when using rules) will be resolved only in trunk when the images will no longer be pre-fetched in the browse().

Thanks
Guewen

Revision history for this message
Rifakat Husen (OpenERP) (rha-openerp) wrote :

Guewen,
Fixed this in below branch authored to you,
lp:~openerp-dev/openobject-server/7.0-opw-592482-rha
r4998, <email address hidden>

Thanks a lot for reporting this bug,
Rifakat

no longer affects: openobject-server/trunk
summary: - Overall performance issue of BaseModel.fields_view_get
+ Overall performance issue of BaseModel.fields_view_get - binary fields
+ should not be prefetched
Revision history for this message
Olivier Dony (Odoo) (odo-openerp) wrote :

The implementation of the prefetching system of browse_record was indeed incorrect. The _prefetch attribute on osv.fields classes is meant to exclude columns during the prefetching phase, but it was only used to determine whether to trigger the prefetching phase.

This has been fixed in the following revisions:

- server 7.0: 5014 rev-id: <email address hidden>
- server 6.1: 4355 rev-id: <email address hidden>
- server 6.0: 3664 rev-id: <email address hidden>

(automatic forward-port to trunk will follow)

The fix is safe for all stable versions because it only affects the internals of browse_record, but not its API. It may cause an extra SQL query when a binary field is accessed explicitly after other fields, but will save time in all the cases where it is not.
It may theoretically break custom code that was strongly coupled with the internals of browse_records, and accessing the browse cache directly while assuming binary values would be present in it due to the bug. It is believed that the overall performance improvement of the fix definitely outweighs possibly breaking these flawed use cases.

Thanks for reporting and tracking the root cause for your symptoms :-)

Changed in openobject-server:
assignee: nobody → OpenERP Publisher's Warranty Team (openerp-opw)
importance: Undecided → Low
milestone: none → 7.0
status: New → Fix Released
no longer affects: openobject-server/7.0
Revision history for this message
Guewen Baconnier @ Camptocamp (gbaconnier-c2c) wrote :

Whoopee! Excellent news!

Thanks Olivier

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.