Activity log for bug #1234845

Date Who What changed Old value New value Message
2013-10-03 17:43:08 Jason Stephenson bug added bug
2013-10-03 17:43:59 Jason Stephenson description From depesz's email to the developer list: my name is Hubert Lubaczewski (a.k.a. depesz). I'm PostgreSQL DBA from Poland, that was hired to do review for Evergreen installations, and suggest changes in database "things" (queries, structure). For the client I analyzed logs from production Pg instance for ~ 10 days. During this time, the single most time consuming query (summarized time for all instances of the query, with different parameters) was: SELECT * FROM unapi.bre ( ... ) AS "unapi.bre"; I profiled this function, and found that in my test case most of the time (2.04s out of 2.06s, so ~ 99%) was spent in call to unapi.holdings_xml() function. When I profiled this function, I found that most of the time (sorry, don't have the number now with me) was spent in call to evergreen.ranked_volumes() function. At this moment in my research something changed on the server I was testing on, and all subsequent times were ~ 4-5 times lower, but the ratios were more or less the same. Anyway - call to evergreen.ranked_volumes() showed repeatable time (with full caches/buffers) of ~ 380ms. I modified the function by: 1. inlining actor.org_unit_descendants(?, ?) 2. inlining evergreen.rank_ou(?, ?, ?) 3. extracting depth calculation to separate call 4. switched to plpgsql (which gives me ability to use variables) 5. removed evergreen.rank_ou() and evergreen.rank_cp_status() from select clause - these are still in WINDOW definition, but they weren't used in the SELECT, so it's better to remove from there. 6. in passing renamed arguments to avoid name clash (argument depth vs. field depth) 7. in passing changed usage of $* to access parameters to using named parameters, for readability. New function did the same work in ~ 18ms. Now - after I finished my tests, I was unfortunately not able to repeat slow performance of unapi.bre(), but I think that the optimization of evergreen.ranked_volumes() is good on its own. If you'll agree, I would appreciate modifying the function in project. If I should do/provide something else, please let me know. Best regards, depesz NOTE: I (Jason Stepheson) have depesz's permission to create a git branch of his function changes with his name and email as the author. We both think this will get more attention if there is a branch and an associated Launchpad bug. From depesz's email to the developer list: my name is Hubert Lubaczewski (a.k.a. depesz). I'm PostgreSQL DBA from Poland, that was hired to do review for Evergreen installations, and suggest changes in database "things" (queries, structure). For the client I analyzed logs from production Pg instance for ~ 10 days. During this time, the single most time consuming query (summarized time for all instances of the query, with different parameters) was: SELECT * FROM unapi.bre ( ... ) AS "unapi.bre"; I profiled this function, and found that in my test case most of the time (2.04s out of 2.06s, so ~ 99%) was spent in call to unapi.holdings_xml() function. When I profiled this function, I found that most of the time (sorry, don't have the number now with me) was spent in call to evergreen.ranked_volumes() function. At this moment in my research something changed on the server I was testing on, and all subsequent times were ~ 4-5 times lower, but the ratios were more or less the same. Anyway - call to evergreen.ranked_volumes() showed repeatable time (with full caches/buffers) of ~ 380ms. I modified the function by: 1. inlining actor.org_unit_descendants(?, ?) 2. inlining evergreen.rank_ou(?, ?, ?) 3. extracting depth calculation to separate call 4. switched to plpgsql (which gives me ability to use variables) 5. removed evergreen.rank_ou() and evergreen.rank_cp_status() from    select clause - these are still in WINDOW definition, but they    weren't used in the SELECT, so it's better to remove from there. 6. in passing renamed arguments to avoid name clash (argument depth vs.    field depth) 7. in passing changed usage of $* to access parameters to using named    parameters, for readability. New function did the same work in ~ 18ms. Now - after I finished my tests, I was unfortunately not able to repeat slow performance of unapi.bre(), but I think that the optimization of evergreen.ranked_volumes() is good on its own. If you'll agree, I would appreciate modifying the function in project. If I should do/provide something else, please let me know. Best regards, depesz NOTE: I (Jason Stephenson) have depesz's permission to create a git branch of his function changes with his name and email as the author. We both think this will get more attention if there is a branch and an associated Launchpad bug.
2013-10-08 16:07:14 Yamil bug added subscriber Yamil
2013-11-06 20:23:38 Ben Shum evergreen: status New Triaged
2013-11-06 20:23:41 Ben Shum evergreen: importance Undecided Medium
2013-11-06 20:23:51 Ben Shum evergreen: milestone 2.next
2014-02-05 15:14:31 Dan Wells tags database performance ranked unapi volumes database performance pullrequest ranked unapi volumes
2014-02-05 15:14:36 Dan Wells evergreen: milestone 2.6.0-alpha1 2.6.0-beta1
2014-02-27 19:00:57 Dan Wells evergreen: milestone 2.6.0-beta1 2.6.0-rc1
2014-03-11 16:56:07 Kathy Lussier tags database performance pullrequest ranked unapi volumes database performance pullrequest ranked signedoff unapi volumes
2014-04-07 15:53:18 Evergreen Bug Maintenance evergreen: milestone 2.6.0-rc1 2.next
2014-07-10 20:39:31 Ben Shum evergreen: milestone 2.next 2.7.0-alpha1
2014-07-10 20:39:33 Ben Shum evergreen: status Triaged Fix Committed
2014-11-06 17:37:30 Evergreen Bug Maintenance evergreen: status Fix Committed Fix Released