Comment 8 for bug 894468

Scott Ritchie (scottritchie) wrote :

I had thought I proposed a dampening algorithm above, but maybe it was on the Cross Validated post. Anyway, to be clear:

1) Compute the median and count 3 numbers: ratings less than the median (x), equal to the median (y), and above the median (z)
2) For large sample sizes, the score approaches the median plus the probability that a voter rates above the median, minus the probability that a voter rates below it. Estimating those probabilities by the sample proportions gives median + (z/(x+y+z)) - (x/(x+y+z)), which equals median + (z-x)/(x+y+z).
3) Since these are estimates of probabilities, rather than using the raw proportion as the estimator we use the bounds of the Wilson score interval. If we use the lower bound for the probability of rating above the median and the upper bound for the probability of rating below it, we get something analogous to what we do now: new apps with small sample sizes are punished slightly, but as votes accrue their scores approach the value in 2.
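
To make the three steps concrete, here is a rough Python sketch. It is only an illustration, not the code the ratings system uses: the function names, the 95% critical value (1.96), and the choice of statistics.median_low (so that the median is always an actual rating value) are my assumptions.

```python
import math
import statistics


def wilson_bounds(successes, n, z_crit=1.96):
    """Return (lower, upper) Wilson score bounds for a proportion.

    z_crit=1.96 (roughly 95% confidence) is an assumed default.
    """
    if n == 0:
        return 0.0, 1.0
    p_hat = successes / n
    denom = 1 + z_crit ** 2 / n
    center = (p_hat + z_crit ** 2 / (2 * n)) / denom
    half_width = (
        z_crit
        * math.sqrt(p_hat * (1 - p_hat) / n + z_crit ** 2 / (4 * n * n))
        / denom
    )
    # Clamp to [0, 1] to guard against floating-point drift.
    return max(0.0, center - half_width), min(1.0, center + half_width)


def dampened_score(ratings, z_crit=1.96):
    """Median plus a pessimistic estimate of P(above) minus P(below)."""
    # Step 1: median and the counts on either side of it.
    # median_low is an assumption; it keeps the median on a real rating value.
    median = statistics.median_low(ratings)
    below = sum(1 for r in ratings if r < median)  # x
    above = sum(1 for r in ratings if r > median)  # z
    n = len(ratings)                               # x + y + z

    # Step 3: lower bound on the positive side, upper bound on the negative side.
    above_lower, _ = wilson_bounds(above, n, z_crit)
    _, below_upper = wilson_bounds(below, n, z_crit)
    return median + above_lower - below_upper


few_votes = [2] * 1 + [3] * 7 + [4] * 2   # 10 votes
many_votes = few_votes * 1000             # same proportions, 10,000 votes
print(dampened_score(few_votes), dampened_score(many_votes))
```

With the same proportions of votes, the 10-vote app scores lower than the 10,000-vote app, and the latter lands close to the limit from step 2.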

Note that the probabilities in question cannot exceed 0.5 (even when very few people rate at the actual median), which means we never add or subtract more than half a point from the median. This makes the sample median the dominant sorting criterion, while the other ratings still matter.

Some quick examples this would produce:
 - If an app received 70% median votes of 3, 10% below median, and 20% above median, it would rate about 3.1.
 - If an app had just 10 votes in the same proportion as above, it would receive a somewhat lower rating, since the Wilson score interval would be wider.
 - If an app received some votes of 4 and about equal numbers of 5 and 3, it would be rated about 4.0, since the positive and negative portions would cancel each other out.
 - If an app received almost entirely votes of 4, it would also receive about a 4.0 rating.
 - If an app received many votes, split about equally between 3's and 4's, it would be rated about 3.5 -- either because its median was 3 and there was about a .5 chance someone would rate higher, or because its median was 4 and there was about a .5 chance someone would rate lower.
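
The large-sample figures in these examples can be checked with a few lines of Python. This sketch only computes the limit median + (z-x)/(x+y+z), without the Wilson adjustment; the function name and the use of statistics.median_low are my own choices for illustration.

```python
import statistics


def limiting_score(ratings):
    """Large-sample limit: median + (above - below) / total."""
    # median_low is an assumed tie-break so the median is a real rating value.
    median = statistics.median_low(ratings)
    below = sum(1 for r in ratings if r < median)
    above = sum(1 for r in ratings if r > median)
    return median + (above - below) / len(ratings)


# 10% below, 70% at a median of 3, 20% above
print(limiting_score([2] * 10 + [3] * 70 + [4] * 20))
# Some 4's with equal numbers of 3's and 5's
print(limiting_score([3] * 30 + [4] * 40 + [5] * 30))
# Almost entirely 4's
print(limiting_score([4] * 100))
# Equal numbers of 3's and 4's
print(limiting_score([3] * 50 + [4] * 50))
```

In the last case the result is the same whichever of 3 or 4 is taken as the median, which is the point of that example.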