One popularity ranking method should be used by default

Bug #1615600 reported by Kathy Lussier
This bug affects 1 person
Affects: Evergreen
Status: Won't Fix
Importance: Wishlist
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

The new statistically generated record ratings feature from bug 1549505 introduces two new ranking methods to the catalog.

The Most Popular ranking method first retrieves the records based on their popularity badge score. If several records in the result set have the same score, they are then retrieved in order of bib relevance.
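As a rough illustration of the ordering described above, here is a minimal Python sketch. It is not Evergreen's code: the `Record` type, its field names, and the sample values are hypothetical, and it only assumes that a higher badge score and a higher relevance value both mean "better".

```python
from typing import NamedTuple


class Record(NamedTuple):
    """Hypothetical stand-in for a bib record in a result set."""
    title: str
    badge_score: int   # popularity badge score (assumed: higher = more popular)
    relevance: float   # bib relevance (assumed: higher = more relevant)


def most_popular_order(records):
    """Sort by badge score first, then break ties by bib relevance."""
    return sorted(records, key=lambda r: (-r.badge_score, -r.relevance))


# Made-up sample data: B and C tie on badge score, so relevance decides.
records = [
    Record("A", badge_score=3, relevance=0.9),
    Record("B", badge_score=5, relevance=0.2),
    Record("C", badge_score=5, relevance=0.7),
]
ordered_titles = [r.title for r in most_popular_order(records)]
```

In this sketch record A is the most relevant but sorts last, because relevance only matters among records with the same badge score.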

The Popularity-Adjusted Relevance ranking method attempts to consider both bib relevance and popularity badge scores in the sort order. There is a global flag that gives you some control over the weight popularity badge scores receive in this ranking method. When using this sort method, even if the global flag weight is fairly low, popular materials will still float high in the list, but if there is a record that scores high on bib relevance, it may appear before those popular results.
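To make the idea of a weighted blend concrete, here is a small Python sketch. The formula is an assumption for illustration only; this report does not state how Evergreen actually combines the two values. The sketch assumes the flag acts as a maximum multiplier (`max_multiplier`, a hypothetical name) applied to relevance at the top badge score and scaled linearly below it, with a hypothetical top score of 5.

```python
def adjusted_relevance(relevance, badge_score, max_multiplier=1.1, max_badge=5):
    """Boost relevance by popularity (assumed formula, not Evergreen's).

    A badge score of 0 leaves relevance unchanged; the top badge score
    multiplies it by max_multiplier; scores in between scale linearly.
    """
    multiplier = 1.0 + (max_multiplier - 1.0) * (badge_score / max_badge)
    return relevance * multiplier


# Made-up (title, relevance, badge_score) rows.
rows = [
    ("X", 1.00, 0),   # very relevant, not popular
    ("Y", 0.95, 5),   # slightly less relevant, very popular
    ("Z", 0.50, 5),   # weakly relevant, very popular
]
ranked = sorted(rows, key=lambda row: adjusted_relevance(row[1], row[2]), reverse=True)
ranked_titles = [title for title, _, _ in ranked]
```

Under these assumptions the popular record Y edges ahead of X, while the highly relevant but unpopular X still appears before the popular but weakly relevant Z.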

Both sort methods have value, and different Evergreen sites may opt to use different sort methods. However, including both sort methods in the dropdown will be confusing to users, who are unlikely to understand the fine differences between the two. We should choose one of these methods to display in the sort dropdown by default, while making the other available to those sites that prefer it.

After extensive testing on our part, MassLNC recommends going with the popularity-adjusted relevance sort option as the default. It's imperfect, and we would like to see it improved in the future, but we think it's the one that will be the most likely to retrieve good results for the user, particularly for general keyword searches.

I can create a branch to do so if we have a consensus that this is the best direction to go.

Terran McCanna (tmccanna) wrote :

+1

Mike Rylander (mrylander) wrote :

Kathy,

My concern with removing one or the other from the default interface is two-fold:

First, they are useful in different situations, but the paths to those situations are presented in the same UI. Specifically, popularity-adjusted relevance has been shown to be very effective in keyword search results, and I suspect that title searches will show similar, if less pronounced, behavior. However, for searches on low-cardinality values, such as subjects, series, or authors, "relevance" doesn't mean much, especially where a link is clicked but even for a directly typed search, yet the variance in the ranking value (because of string duplication, etc.) can be high. For those search axes, bare popularity is a better sort because relevance is a false qualifier. Also, for curated lists (via containers, say) that follow some theme, relevance will likely be of much less use than popularity, and including relevance will just make sorting harder.

Second, if we effectively remove one option (particularly the bare popularity option), it will most likely end up as dead code because of momentum, be "reimplemented" by some sites via forced, manual manipulation of the relevance+popularity sort axis, or both. That would be unfortunate when the thing that's needed actually exists but has been hidden.

I think we need to consider more carefully how to present the various options, and whether we need to look at a more sophisticated (though, ideally, simplified for the user) interface that understands the user's context better and presents the most useful option where appropriate. Also, I don't think we have enough data or input from users yet to decide which option to effectively do away with.

Thoughts? Thanks!

Kathy Lussier (klussier) wrote :

Thanks for the feedback, Mike. I understand your concern that choosing one path could turn the other into dead code, as I had similar concerns. At the same time, I think showing both options will leave many users scratching their heads over what the difference is between the two.

In our case, if it is left as is, we will most likely customize it to leave just one choice for the user or, perhaps, rename 'Popularity-Adjusted Relevance' as 'Relevance' and use it as the default sort option. Other sites may make similar customizations, but I suspect many will leave the default sort options. So while we gather more data and input from users, we will also be presenting them with something that I think is likely to lead to confusion. When I was recently showing this functionality to people who weren't closely tied to the project, the first question I heard from two people was 'why are there two?'.

I also wanted to talk more about the results we've seen in our own testing.

I know my testing cannot replicate what users will ultimately experience, but, from what I have seen, the popularity-adjusted relevance applied to an author or subject search tends to end up being very close to a "Most Popular" ranked search, with the possible exception of records for materials in a language that is not the default language. The popularity component of that ranking, even when set at 1.1, is so strong that I haven't really experienced a case where these searches are much different from the "Most Popular." I'm sure there will be many cases where I'm proven wrong once it's on a production site, but that's what I've experienced so far.

Something I forgot to mention in my original description is that we actually found the same behavior, where popularity-adjusted relevance is nearly the same as Most Popular, in keyword searches on a database that only uses the default keyword blob. We really only saw a big difference between those two ranking methods 1) for keyword searches and 2) on databases that had added additional keyword indexes for weighting purposes.

Therefore, when implemented in a database that 1) has not added any field weighting to the keyword index and 2) leaves both ranking methods in the sort dropdown, I think users will find that there are two similar-sounding sort methods that produce very similar results, further leading to the question of why there are two available.

In response to "I think we need to consider more carefully how to present the various options, and whether we need to look at a more sophisticated (though, ideally, simplified for the user) interface that understand the user's context better, and presents the most useful option where appropriate."

YES, ABSOLUTELY! One way we would love to see "Most Popular" implemented is, not as an upfront sort option, but as a way to present the "most popular results" to the user when they are sorting by another method, perhaps in a slider along the top of the screen. See http://www.screencast.com/t/MwRawsYze1

I think there is so much we can do with these badges now that we have the data compiled in them.

If you feel strongly about this, I'm not going to push it...


Dan Wells (dbw2) wrote :

Given that we are still early in the life of this feature, I wonder if we should just simplify the labeling to be less exact, but more understandable. Something like:

- Sort by Relevance
- Sort by Popularity
- Sort by Relevance + Popularity

(or "Relevance and Popularity")

I recognize it glosses over some details, but something like that seems understandable and accurate enough, I think.

Changed in evergreen:
status: New → Won't Fix