Activity metric score should include lack of badge as a factor in calculation

Bug #1796176 reported by Kathy Lussier
This bug affects 6 people
Affects: Evergreen
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

Here's the use case.

A system has created three activity metric badges to identify popular materials in their collection: one badge for titles owned by the most org units, one for most current holds, and one for most circs over time.

The video game, Harry Potter Lego, gets a score of five for the most org units badge, but does not get a badge for holds or circs.

The book, Harry Potter & the Deathly Hallows, gets all three badges. It scores a 5 on the org units badge, a 1 on most current holds, and a 4 on circs over time.

Clearly, the Harry Potter book is the title that should be getting the highest score. However, the total activity score is currently calculated by averaging only the badges that were earned and ignoring the badges that were not earned.

Therefore, Harry Potter Lego gets a score of 5 and ends up getting bumped ahead of Harry Potter & the Deathly Hallows, which gets a score of 3.3. Note: I do know the calculation is not just averaging these scores, but I'm saying 'average' for the sake of simplicity.

After using the activity metric for over a year now, we've found that this method of calculating the total score is frequently pushing up materials with less activity above those that are truly popular. It would be better if the system assigned a zero to any badge that did not get earned by the record, so that the final score looked more like:

Lego Harry Potter: 1.6
Harry Potter and the Deathly Hallows: 3.3
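
The difference between the two approaches in the use case above can be sketched in a few lines of Python. This is only an illustration using plain averages, as in the description; it is not Evergreen code, and the badge names, function names, and dictionary layout are assumptions made for the example.

```python
# Simplified sketch: plain averages, not Evergreen's actual weighted calculation.
# Badge names below are hypothetical labels for the three badges in the use case.
BADGES = ["org_units", "current_holds", "circs_over_time"]


def average_earned_only(scores):
    """Average only the badges a record earned (current behavior, simplified)."""
    earned = [scores[b] for b in BADGES if b in scores]
    return sum(earned) / len(earned) if earned else 0.0


def average_with_zeros(scores):
    """Count every unearned badge as a score of zero (proposed behavior)."""
    return sum(scores.get(b, 0) for b in BADGES) / len(BADGES)


lego = {"org_units": 5}
hallows = {"org_units": 5, "current_holds": 1, "circs_over_time": 4}

for title, scores in [("Lego Harry Potter", lego), ("Deathly Hallows", hallows)]:
    print(title,
          round(average_earned_only(scores), 2),
          round(average_with_zeros(scores), 2))
```

With earned-only averaging the Lego title outranks the book (5 versus about 3.3); with zeros counted for unearned badges the order reverses (about 1.7 versus about 3.3).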

Tags: search badges
Michele Morgan (mmorgan)
Changed in evergreen:
assignee: nobody → Michele Morgan (mmorgan)
Michele Morgan (mmorgan) wrote :

Here is a patch that changes the badge score calculation to include all applicable badges for the search when averaging the scores.

http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/mmorgan/lp1796176_use_all_badges_in_scope_to_calculate_total_score

The current behavior calculates the total score by taking the sum of (badge weight * badge score) for all earned badges and dividing it by the sum of the badge weights for *all earned badges*.

The patch changes the calculation to take the sum of (badge weight * badge score) for all earned badges and divide it by the sum of the badge weights for *all badges in the scope*.

Without this patch, for a scenario similar to Kathy's use case above, total scores would be calculated as follows:

Assuming the following badges and weights:

Badge: Circs over time, weight: 1
Badge: Ownership, weight: 2
Badge: Holds over time, weight: 3

For Harry Potter Lego, with badge and score:

Ownership:5

The calculation for total score would be:

5*2/2 = 5

For Harry Potter and the Deathly Hallows with badges and scores:

Circs over time:3
Ownership:5
Holds over time:1

The calculation for total score would be:

(3*1+5*2+1*3)/(1+2+3) = 2.7

After applying the patch, the total score for Harry Potter and the Deathly Hallows would still be 2.7, but Harry Potter Lego's score would calculate as follows:

5*2/(1+2+3) = 1.67

So the clearly more popular item now gets the higher total score.
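
For anyone who wants to check the arithmetic, the two calculations above can be reproduced with a short Python sketch. It is not the code from the branch; the function name, the `all_in_scope` flag, and the dictionary keys are assumptions chosen for illustration.

```python
# Sketch of the two weighted calculations described in this comment.
# Names are illustrative only and do not come from the Evergreen source.
WEIGHTS = {"circs_over_time": 1, "ownership": 2, "holds_over_time": 3}


def total_score(scores, weights, all_in_scope):
    """Weighted total score for one record.

    scores: badge -> score for the badges the record earned.
    all_in_scope=False divides by the weights of earned badges only (current).
    all_in_scope=True divides by the weights of every badge in scope (patch).
    """
    numerator = sum(weights[badge] * score for badge, score in scores.items())
    if all_in_scope:
        denominator = sum(weights.values())
    else:
        denominator = sum(weights[badge] for badge in scores)
    return numerator / denominator if denominator else 0.0


lego = {"ownership": 5}
hallows = {"circs_over_time": 3, "ownership": 5, "holds_over_time": 1}

for title, scores in [("Harry Potter Lego", lego), ("Deathly Hallows", hallows)]:
    print(title,
          round(total_score(scores, WEIGHTS, all_in_scope=False), 2),
          round(total_score(scores, WEIGHTS, all_in_scope=True), 2))
```

The book's score is the same under both calculations because it earned every badge in scope, so both denominators are 1+2+3. Only the Lego title's score changes.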

Changed in evergreen:
assignee: Michele Morgan (mmorgan) → nobody
tags: added: pullrequest
Mike Rylander (mrylander) wrote :

Hi Michele,

There's a problem calculating the weighted average that way, because not all records are in the original population for every badge that is in scope. There are record-, location- and copy-related filters that preclude records from even having a chance to earn a given badge, and dinging them for not earning impossible badges will only become more of an issue as new, cool badges are invented. Then there are the static badges, and eventually patron and staff rating badges, that won't require a calculated population, but would ding every unrated record.

From a mathematical point of view, we'll need to figure out some way of recording original population inclusion for each badge+record combination. And I think we'll further need to create a concept of badge groups, where the members of the group are averaged together separately from other groups to model different facets of popularity (usage vs curation vs "reader attachment", etc). There is the naive way of just recording a 0 for population members that fall outside the thresholds, but there are probably more compact representations that would be faster at search time, especially for the set of unearned badges, which will be the large majority of badge+record combinations.
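
One possible reading of the population-inclusion and badge-group ideas is sketched below in Python. It uses the naive representation (an explicit 0 for a record that was in a badge's population but did not earn it), and everything in it is hypothetical: the function names, the `staff_pick` badge, the group names, and the choice to return `None` when a record was in no population for a group. How group scores would be combined into one ranking value is left open, as it is in the comment.

```python
# Hypothetical sketch, not Evergreen code. A badge present in `scores` means the
# record was in that badge's original population (0 = eligible but unearned).
# A badge absent from `scores` means the record never had a chance to earn it.
WEIGHTS = {"circs_over_time": 1, "ownership": 2, "holds_over_time": 3, "staff_pick": 2}
GROUPS = {
    "usage": ["circs_over_time", "ownership", "holds_over_time"],
    "curation": ["staff_pick"],
}


def population_aware_score(scores, weights):
    """Weighted average over only the badges whose population included the record."""
    denominator = sum(weights[badge] for badge in scores)
    if denominator == 0:
        return None  # no applicable badges, so the record is not penalized
    return sum(weights[badge] * score for badge, score in scores.items()) / denominator


def group_scores(scores, weights, groups):
    """Average each badge group separately from the others."""
    return {
        name: population_aware_score(
            {badge: scores[badge] for badge in members if badge in scores}, weights
        )
        for name, members in groups.items()
    }


# The game was eligible for ownership and holds but, by assumption, filtered out
# of the circs badge population; it has never been considered for staff_pick.
lego = {"ownership": 5, "holds_over_time": 0}
print(population_aware_score(lego, WEIGHTS))
print(group_scores(lego, WEIGHTS, GROUPS))
```

In this sketch the unearned holds badge lowers the game's score, while the circs badge it could never earn and the rating badge it was never considered for do not.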

Remington Steed (rjs7) wrote :

Based on Mike's feedback in #2, I'm removing the pullrequest tag and milestones.

tags: removed: pullrequest
Andrea Neiman (aneiman)
tags: added: badges
removed: activity
no longer affects: evergreen/3.1
no longer affects: evergreen/master
no longer affects: evergreen/3.2
John Amundson (jamundson) wrote :

The current behavior of this is really annoying. I'm spending a ton of time fiddling with percentile, weight, etc, to try to work in this framework, when ideally if a badge isn't earned, it should just be counted as 0.

Changed in evergreen:
status: New → Confirmed
Kathy Lussier (klussier) wrote :

Since this was nearly the last bug I filed before leaving MassLNC, I guess it makes sense that it is also one of the first I comment on now that I'm back in the community. IMO, the activity metric feature is the best method available for Evergreen sites to improve their search relevance, and I worry that it is underutilized, possibly because it doesn't work as well as it could.

Since this bug has been gathering dust for the last five years, I'm assuming nobody has taken up the very large project to create badge groups that would provide a calculation that will meet all anticipated use cases.

Therefore, I would like to propose that the current calculation should reflect how the majority of Evergreen sites are currently using their badges. My reasoning is that if we are sticking with just one method of calculating these badges, somebody is going to get inferior search results. With the current method of calculation, sites like CW MARS and NOBLE are getting inferior search results because we apply scores to the full set of bibliographic records, which was the intention behind the original development project.

It's quite possible that if we reach out to other sites, we'll find the vast majority of Evergreen sites using the activity metric badges are using it in the same way that CW MARS and NOBLE are. If that's the case, wouldn't it be better to change the calculation to meet the needs of the predominant use case? Then, if a site decides to develop the patron and staff rating badges as suggested by Mike, the onus will be on that group to develop the more flexible calculation that can meet a greater number of use cases.

NOBLE has been using the code from Michele's branch for five years now and is very happy with the way it calculates scores. We are willing to do the legwork to reach out to the community to see how it's being used at other sites.

Jessica Woolford (jwoolford) wrote :

I can chime in and say Bibliomation is using badges the way NOBLE and CW MARS do. We have one badge that is supposed to give more weight to print titles, because we have received feedback that libraries would like to see ebook records farther down in the search results. Unless the lack of a badge is a factor in the calculation, that badge doesn't work as intended.
