Rally

PercentileComputation sometimes gives incorrect result

Bug #1569416 reported by Alexander Maretskiy on 2016-04-12

This bug affects 1 person

Affects  Status        Importance  Assigned to  Milestone
Rally    Fix Released  Undecided   Unassigned

Bug Description

See the code below and the result of its execution.
In some cases the streaming result differs from the real percentile by up to 80%.
This happens with very long lists that mix small and big values.

$ cat percentile_issue.py
from rally.common import streaming_algorithms as st

data = (
    list(range(10)),
    list(range(10)) * 10,
    list(range(10)) * 100,
    list(range(10)) * 1000,
    list(range(10)) * 10000,
    [1, 2, 3, 4, 99999] * 10000,
)

for lst in data:
    p = st.PercentileComputation(.95, len(lst))
    for i in lst:
        p.add(i)

    streaming = p.result()
    real = st.utils.percentile(lst, .95)

    diff = float(abs(real - streaming)) / max(real, streaming) * 100
    if diff > 5:
        print("%-8.2f %-8.2f Differs by %.1f%%" % (real, streaming, diff))
    else:
        print("%-8.2f %.2f" % (real, streaming))

$ python percentile_issue.py
8.55 8.55
9.00 9.00
9.00 9.00
9.00 9.00
9.00 4.50 Differs by 50.0%
99999.00 20001.80 Differs by 80.0%
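
For cross-checking the first ("real") column without Rally installed, here is a minimal sketch of an exact percentile computed over the fully sorted list, with linear interpolation between the two closest ranks. The helper name exact_percentile is hypothetical and is not part of Rally's API. It is an assumption that this matches what st.utils.percentile does, although it is consistent with the 8.55 reported above for list(range(10)).

```python
import math


def exact_percentile(values, percent):
    """Exact percentile over the whole list (hypothetical helper, not Rally code).

    Sorts a copy of the data and interpolates linearly between the two
    closest ranks. `percent` is a fraction in the range [0, 1].
    """
    if not values:
        return None
    ordered = sorted(values)
    # Fractional rank of the requested percentile in the sorted list.
    k = (len(ordered) - 1) * percent
    lower = int(math.floor(k))
    upper = int(math.ceil(k))
    # When k is an integer, lower == upper and the interpolation term is zero.
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (k - lower)


if __name__ == "__main__":
    samples = (
        list(range(10)),
        list(range(10)) * 10000,
        [1, 2, 3, 4, 99999] * 10000,
    )
    for sample in samples:
        print("%.2f" % exact_percentile(sample, .95))
```

Because the whole list is sorted, the result does not depend on the order in which values arrive, which makes it a usable baseline for judging the streaming estimate. The cost is that all values must be held in memory.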

Alexander Maretskiy (maretskiy) on 2016-04-21: description updated

Andriy Kurilin (andreykurilin) wrote on 2020-02-28:

fixed by https://github.com/openstack/rally/commit/5d0d48a3f2180a4ebf9d7847aed7f7f4b8ad0187

Changed in rally:
status:	New → Fix Released
