PercentileComputation sometimes gives an incorrect result

Bug #1569416 reported by Alexander Maretskiy
This bug affects 1 person
Affects: Rally
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

See the code below and the result of running it.
In some cases the streaming result differs from the exact percentile by up to 80%.
This happens with very long lists that mix small and large values.

$ cat percentile_issue.py
from rally.common import streaming_algorithms as st

data = (
    list(range(10)),
    list(range(10)) * 10,
    list(range(10)) * 100,
    list(range(10)) * 1000,
    list(range(10)) * 10000,
    [1, 2, 3, 4, 99999] * 10000,
)

for lst in data:
    p = st.PercentileComputation(.95, len(lst))
    for i in lst:
        p.add(i)

    streaming = p.result()
    real = st.utils.percentile(lst, .95)

    diff = float(abs(real - streaming)) / max(real, streaming) * 100
    if diff > 5:
        print("%-8.2f %-8.2f Differs by %.1f%%" % (real, streaming, diff))
    else:
        print("%-8.2f %.2f" % (real, streaming))

$ python percentile_issue.py
8.55 8.55
9.00 9.00
9.00 9.00
9.00 9.00
9.00 4.50 Differs by 50.0%
99999.00 20001.80 Differs by 80.0%
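
For reference, the "real" column above can be reproduced without Rally. The sketch below is not Rally's code: `exact_percentile` and `relative_diff` are hypothetical helper names, and it assumes the exact percentile is computed by linear interpolation between the two closest ranks of the sorted list, which is consistent with the 8.55 reported for list(range(10)). The `relative_diff` helper mirrors the `diff` expression from the script.

```python
import math


def exact_percentile(values, percent):
    """Percentile by linear interpolation between the two closest ranks.

    Hypothetical helper for illustration; not part of Rally.
    """
    if not values:
        return None
    ordered = sorted(values)  # sort a copy; the caller's list is untouched
    k = (len(ordered) - 1) * percent  # fractional index of the percentile
    lo = int(math.floor(k))
    hi = int(math.ceil(k))
    if lo == hi:
        return float(ordered[lo])
    # Interpolate between the two neighbouring ranks.
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (k - lo)


def relative_diff(real, streaming):
    """Difference as a percentage of the larger value, as in the script."""
    return abs(real - streaming) / float(max(real, streaming)) * 100


print("%.2f" % exact_percentile(list(range(10)), .95))
print("%.2f" % exact_percentile([1, 2, 3, 4, 99999] * 10000, .95))
print("%.1f%%" % relative_diff(9.0, 4.5))
print("%.1f%%" % relative_diff(99999.0, 20001.8))
```

Under these assumptions the last two lines give the same 50.0% and 80.0% differences as the output above, so the discrepancy comes from the streaming values (4.50 and 20001.80), not from the way the difference is measured.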

Andriy Kurilin (andreykurilin) wrote :
Changed in rally:
status: New → Fix Released