Some letters break the category splitting

Bug #1422116 reported by Jellby
This bug affects 1 person
Affects: calibre | Status: Fix Released | Importance: Undecided | Assigned to: Charles Haley

Bug Description

I have authors whose names start with "A", among them "Aesop". If I change the spelling of the author-sort field for Aesop to "Æsop", the author categories are split into "A", "Æ" and "A" again. The first "A" contains the names before Æsop, the second "A" contains the names after Æsop. I think the "Æ" category should be either inside or outside of "A", but the "A"s should not be split.

Kovid Goyal (kovid) wrote : Re: calibre bug 1422116

Changing the component for this bug.

 assignee cbhaley
 status triaged

Changed in calibre:
assignee: nobody → Charles Haley (cbhaley)
status: New → Triaged
Charles Haley (cbhaley) wrote :

@kovid: I can fix this, but I am not convinced that I should. The problem arises because, when sorting, ICU considers Æ to be the two characters AE, but when comparing, Æ is a single letter. This means that Æ will always sort into its proper place in a group of items beginning with A.

Example: if the authors are displayed without categorization, the correct order is Aaa, Æa, Afa. If they are displayed with first-letter categorization, then the correct order is Aaa, Afa, Æa, so that Æ is separated from the other As. I believe that get_categories should return the list in the non-categorized order, which means that if first-letter categorization is on, the tag browser must re-sort the list. Is it worth the performance penalty for a case that almost never happens?
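
To make the example concrete, here is a minimal sketch of how the split arises. It does not use calibre or ICU: approx_sort_key and first_letter are hypothetical stand-ins that only mimic the behaviour described above (Æ sorts as AE, but counts as its own first letter).

```python
from itertools import groupby

def approx_sort_key(name):
    # Hypothetical stand-in for an ICU-backed sort key:
    # the ligature is expanded to the two letters AE for sorting.
    return name.upper().replace("Æ", "AE")

def first_letter(name):
    # For categorization, Æ is treated as a single, distinct letter.
    return name[0].upper()

authors = ["Afa", "Æa", "Aaa"]

# Non-categorized order: Æa lands between Aaa and Afa.
flat_order = sorted(authors, key=approx_sort_key)

# Grouping adjacent items by first letter now yields "A" twice,
# because groupby only merges neighbours that share a key.
split_groups = [(letter, list(items))
                for letter, items in groupby(flat_order, key=first_letter)]

print(flat_order)
print(split_groups)
```

The second "A" group in split_groups corresponds to the duplicated category seen in the tag browser.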

Charles Haley (cbhaley) wrote :

@kovid: another possibility might be to do the "correct" sort in db get_categories. This would entail adding a parameter to get_categories telling it whether first letter grouping is enabled. This parameter must make it all the way to db.categories.sort_categories. Line 126 would become something like

    if first_letter_sorting:
        key=lambda x:(collation_order(x.sort), sort_key(x.sort))
    else:
        key=lambda x:sort_key(x.sort or x.name)

It isn't clear to me what would have to change to do the above. I think it would be sufficient to add a keyword parameter to db.cache.get_categories then change the tag browser model to use db.new_api.get_categories().

Charles Haley (cbhaley) wrote :

Ooops. The code would look something like

    if first_letter_sorting:
        key=lambda x:(collation_order(x.sort or x.name), sort_key(x.sort or x.name))
    else:
        key=lambda x:sort_key(x.sort or x.name)
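
For illustration, a self-contained sketch of the two-part key follows. It is not calibre's code: Item, approx_sort_key, approx_collation_order and category_key are hypothetical names, and the two approx_* helpers are only assumed to behave roughly like sort_key and collation_order as described in this thread (the real functions are ICU-backed and may return different values).

```python
from collections import namedtuple
from itertools import groupby

# Hypothetical stand-in for a category item; only .name and .sort are used.
Item = namedtuple("Item", "name sort")

def approx_sort_key(text):
    # Stand-in for sort_key: sorts Æ as the two letters AE.
    return text.upper().replace("Æ", "AE")

def approx_collation_order(text):
    # Stand-in for collation_order: ranks by the first character only,
    # keeping Æ distinct from A while placing it directly after A.
    first = text[:1].upper()
    return (approx_sort_key(first), first)

def category_key(item, first_letter_sorting):
    text = item.sort or item.name
    if first_letter_sorting:
        # Order by first-letter group first, then by full sort key.
        return (approx_collation_order(text), approx_sort_key(text))
    return approx_sort_key(text)

items = [Item("Afa", None), Item("Aesop", "Æa"), Item("Aaa", None)]

grouped = [x.sort or x.name
           for x in sorted(items, key=lambda x: category_key(x, True))]
flat = [x.sort or x.name
        for x in sorted(items, key=lambda x: category_key(x, False))]
letters = [k for k, _ in groupby(grouped, key=lambda t: t[:1].upper())]

print(grouped, flat, letters)
```

With the first-letter group as the leading component of the key, every item in a group is adjacent, so each letter appears only once when the sorted list is partitioned.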

Kovid Goyal (kovid) wrote :

Adding a keyword to new_api.get_categories is OK by me.

Charles Haley (cbhaley)
Changed in calibre:
status: Triaged → Fix Committed
Kovid Goyal (kovid) wrote : Fixed in master

Fixed in branch master. The fix will be in the next release. calibre is usually released every Friday.

 status fixreleased

Changed in calibre:
status: Fix Committed → Fix Released