Title browse should index 245s with nonfiling characters twice

Bug #1321780 reported by Kathy Lussier on 2014-05-21
This bug affects 5 people

Bug Description

Evergreen version: 2.5 and up

The first indicator of the 740 field is used to identify non-filing characters, but the title browse currently isn't honoring the non-filing characters when sorting.

For example, we need to type "the heart sutra" in a browse search (http://catalog.mvlc.org/eg/opac/browse?blimit=10&qtype=title&bterm=the+heart+sutra&locg=1) to find http://catalog.mvlc.org/eg/opac/record/180882. When looking at the 740 field for that record, the first indicator properly identifies 4 non-filing characters.

Looking at the MODS XSL stylesheet, it doesn't appear that the non-filing character handling applied to the 245 fields is also applied to the 740 field.

Ben Shum (bshum) wrote :

Actually, from the IRC logs (around http://irc.evergreen-ils.org/evergreen/2014-05-21#i_100042) we found that the current config.metabib_field definition for alternative title might be too inclusive and is indexing these titles twice, once without the non-filing characters and once with them. So both "the heart sutra" and "heart sutra" from the listed example are indexed and put into browse search.

Ideally, we would want these to be separated and the nfi version used for browse, but the full version indexed for search purposes.

Marking confirmed.

Changed in evergreen:
status: New → Confirmed
milestone: none → 2.next
Kathy Lussier (klussier) wrote :

Here's a radical idea. Let's forget cataloging correctness for a moment and think about what's easiest for the user.

How about if we keep the 740 browse indexing the way it is AND do the same for the other title browse indexes?

The downside is we would add more entries to the table. The upside is that users would find the correct title whether or not they enter initial articles in the search. We could then get rid of that prompt reminding them to enter initial articles and it would be one less thing to train patrons on when they use the catalog.

Mike Rylander (mrylander) wrote :

Indexing both versions was my suggestion regarding all fields with NFI capabilities at the time of development for exactly the reasons that Kathy lists, but it was rejected because the value would be listed twice, though normally in far-separated locations, in the browse output. I am still in favor of this approach.

Elizabeth Thomsen (et-8) wrote :

NOBLE strongly supports this approach -- what difference does it make if the value is listed twice if it means that the user always finds it where they logically would expect to find it? Few of our users have any experience (at least any recent experience) using print resources where this "drop the article" convention made sense.

There are also issues with articles in other languages. The student checking the library for a recording of "Die Fledermaus" doesn't necessarily know that "Die" in this context is a German article.

Jane Sandberg (sandbej) wrote :

So, just to clarify, the 740s are being added to the title browse both with and without the nonfiling characters, which seems pretty helpful both for catalogers and patron users. The 245s are only being added without the nonfiling characters.

So, should this bug shift a bit and change to "titles from 245 fields should be included in the title browse both with and without the nonfiling characters"?
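
To illustrate the proposal, here is a minimal sketch of indexing a title both with and without its nonfiling characters. It is not Evergreen's actual indexing code: the function name `browse_entries` and the lowercase normalization are assumptions made for the example. The `nonfiling` argument stands for the indicator value (4 for "The heart sutra").

```python
def browse_entries(title, nonfiling):
    """Return browse entries for a title: the full form first, then the
    form with the nonfiling characters skipped (if it differs).

    Hypothetical helper for illustration; not part of Evergreen.
    """
    full = title.strip().lower()
    # Clamp the indicator value so a bad value cannot run past the title.
    skip = max(0, min(nonfiling, len(title)))
    filed = title[skip:].strip().lower()
    entries = [full]
    # Add the second entry only when skipping characters changes the value.
    if filed and filed != full:
        entries.append(filed)
    return entries


print(browse_entries("The heart sutra", 4))  # ['the heart sutra', 'heart sutra']
print(browse_entries("Die Fledermaus", 4))   # ['die fledermaus', 'fledermaus']
print(browse_entries("Heart sutra", 0))      # ['heart sutra']
```

With both entries in the browse list, a patron finds the record whether or not they type the initial article.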

tags: added: cataloging
Elaine Hardy (ehardy) wrote :

I do think this is a good idea for user discovery. My only concern is whether having so many more entries in the table would negatively impact search retrieval times.

Kathy Lussier (klussier) wrote :

+1 to Jane's question. I also share Elaine's concern. In my testing, browse searches are already much slower than keyword searches. I'm concerned about making response times even worse.

Mike Rylander (mrylander) wrote :

Regarding performance, browse is slower than search because of authority-proxied visibility checking and linked-record counting, two things that search does not have to do. Both of those are performed after the procedure of actually finding the pivot "closest match" value and gathering the "before" and "after" values, which are themselves very fast. In short, more entries in the browse table do not mean a slower browse.
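
As a rough analogy for why the pivot step stays fast as the table grows, the sketch below finds the "closest match" in a sorted list by binary search and then takes a few neighbors on each side. It is an in-memory illustration only and says nothing about how Evergreen stores or queries its browse entries; `browse_page` and its `before`/`after` parameters are hypothetical names.

```python
import bisect


def browse_page(sorted_entries, term, before=2, after=3):
    """Return the entries around the pivot for a browse term.

    The pivot is the first entry >= term, located by binary search,
    so the lookup cost grows logarithmically with the list size.
    """
    pivot = bisect.bisect_left(sorted_entries, term.lower())
    start = max(0, pivot - before)
    return sorted_entries[start:pivot + after]


entries = sorted([
    "die fledermaus", "fledermaus", "heart sutra",
    "the heart sutra", "hamlet", "macbeth",
])
print(browse_page(entries, "heart"))
# ['fledermaus', 'hamlet', 'heart sutra', 'macbeth', 'the heart sutra']
```

Doubling the number of entries adds only one more comparison to the pivot search in this sketch; the per-result visibility and counting work described above is a separate cost.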

Elaine Hardy (ehardy) wrote :

If it isn't going to slow down the search, then I am +1

Jane Sandberg (sandbej) wrote :

I just threw a different title on this bug -- feel free to change it to something that better represents the content of this bug discussion.

summary: - Title browse isn't honoring non-filing characters in 740 field
+ Title browse should index 245s with nonfiling characters twice