performance regression when running sysbench with inno log turned on

Bug #715174 reported by Joe Daly
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Monty Taylor

Bug Description

When running sysbench with inno replication enabled, each subsequent run seems to get slower. In the results below, the first run at each concurrency level is significantly faster than the last. This looks to me like a possible memory leak, but our tests don't show one.

drizzle> select concurrency, tps from sysbench_run_iterations where run_id = 675;
| concurrency | tps     |
|          16 | 1021.27 |
|          16 |  776.30 |
|          16 |  644.29 |
|          32 | 1134.56 |
|          32 |  846.17 |
|          32 |  685.41 |
|          64 | 1203.83 |
|          64 |  875.96 |
|          64 |  715.35 |
|         128 | 1247.30 |
|         128 |  938.70 |
|         128 |  746.51 |
|         256 | 1258.96 |
|         256 |  938.50 |
|         256 |  726.87 |
|         512 | 1110.94 |
|         512 |  850.06 |
|         512 |  684.00 |
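
To put a number on the slowdown, here is a minimal sketch that computes the percentage drop from the first to the last run at each concurrency level. The tps values are copied from the query output above; the names RUNS and slowdown_by_concurrency are made up for illustration and are not part of drizzle-automation.

```python
from collections import defaultdict

# (concurrency, tps) pairs copied from the query output above, in run order.
RUNS = [
    (16, 1021.27), (16, 776.30), (16, 644.29),
    (32, 1134.56), (32, 846.17), (32, 685.41),
    (64, 1203.83), (64, 875.96), (64, 715.35),
    (128, 1247.30), (128, 938.70), (128, 746.51),
    (256, 1258.96), (256, 938.50), (256, 726.87),
    (512, 1110.94), (512, 850.06), (512, 684.00),
]


def slowdown_by_concurrency(runs):
    """Return {concurrency: percent drop in tps from first run to last run}."""
    by_level = defaultdict(list)
    for concurrency, tps in runs:
        by_level[concurrency].append(tps)
    return {
        level: 100.0 * (tps[0] - tps[-1]) / tps[0]
        for level, tps in by_level.items()
    }


if __name__ == "__main__":
    for level, drop in slowdown_by_concurrency(RUNS).items():
        print(f"concurrency {level:>3}: last run is {drop:.1f}% slower than the first")
```

On these numbers the last run comes out somewhere between roughly 35% and 45% slower than the first at every concurrency level.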

Joe Daly (skinny.moey) wrote :

Monty, Brian thought you might know why this is happening. If that's not the case, move it to unassigned for now.

Changed in drizzle:
milestone: none → 2011-02-28
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Monty Taylor (mordred)
Monty Taylor (mordred)
Changed in drizzle:
milestone: 2011-02-28 → 2011-03-14
Patrick Crews (patrick-crews) wrote :

Part of this could be tied to us not re-setting the server between the iterations for a concurrency level:
  # Run the benchmarks for the specified config
  for concurrency in concurrency_levels:
    for iteration in range(iterations):

So what happens is that on the second iteration at a given concurrency level, we still have the data in the logs from the first run, and so on.
We do reset the server between concurrency levels.
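
A rough sketch of what resetting between iterations could look like, as opposed to only between concurrency levels. This is not the actual drizzle-automation code: run_benchmarks, run_once, reset_server and the reset_between_iterations flag are all hypothetical names used for illustration.

```python
def run_benchmarks(concurrency_levels, iterations, run_once, reset_server,
                   reset_between_iterations=False):
    """Run `iterations` benchmark runs per concurrency level.

    run_once(concurrency) -> tps and reset_server() are caller-supplied
    callables (hypothetical stand-ins for the real harness hooks).
    """
    results = []
    for concurrency in concurrency_levels:
        # Existing behaviour as described above: reset once per concurrency level.
        reset_server()
        for iteration in range(iterations):
            # Optional extra reset so later iterations do not start with
            # log data left over from earlier iterations at this level.
            if reset_between_iterations and iteration > 0:
                reset_server()
            results.append((concurrency, iteration, run_once(concurrency)))
    return results
```

With the flag off, iterations 2..N at each level start with whatever the earlier iterations left in the log; with it on, every iteration starts from a freshly reset server, which would separate log growth from any other cause of the slowdown.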

However, this is still looking problematic as we are seeing a fair bit of slowdown as the log ages / grows.

Working on integrating dbqp / sysbench so we can use valgrind and whatnot more effectively to see wtf is up

Patrick Crews (patrick-crews) wrote :
Patrick Crews (patrick-crews) wrote :

Ran the benchmark for several iterations with the following settings via drizzle-automation for sysbench readwrite.
1) w/o the log turned on and no reset between iterations of a concurrency level.
2) with the log turned on and no reset
3) log on with a reset between iterations

Could not duplicate the drop off in performance noted above on my machine. The only difference noted was between the runs with and without the log enabled (which was expected). It should be noted that there is not *that* much of a performance hit when using the log, which is very nice news.

Posted the data above for others to check out.

Will be running the same tests with a profiler soon and will post anything of note when complete.

Joe Daly (skinny.moey) wrote :

Thanks for running this. It may be hard to tell on a slower machine, as 10% of 500 tps isn't as much as 10% of 2000 tps on a faster machine. It does look like the first run is always faster than the later runs at higher concurrency levels, though nothing like the original numbers. This may tell us that cleaning out the inno replication table didn't have much impact. Is there any way to run this on the sysbench machines without goofing up every other sysbench run? Maybe there was some anomaly when it ran before; the code has also changed a bit since the original test. Thanks again for running this!

Changed in drizzle:
milestone: 2011-03-14 → 2011-04-11
Joe Daly (skinny.moey) wrote :

I ran the GA release through sysbench with the inno replication log turned on (results below); it's about a 10% hit. It did not exhibit the behavior described in this bug; in fact, the performance numbers were significantly better. I wonder if the reorg of this table, removing the GPB string, made a difference, although I don't know why it would.

Joe Daly (skinny.moey) wrote :

Moved to Invalid. I suspect the problem may have been related to the GPB being transposed to a string and stored in sys_replication_log, but it's gone now.

Changed in drizzle:
status: Confirmed → Invalid