Problems with hyphen in spell checking

Bug #1656319 reported by leastcommonancestor
This bug affects 1 person
Affects   Status         Importance   Assigned to   Milestone
calibre   Fix Released   Undecided    Unassigned

Bug Description

Some types of hyphen are treated differently in spell checking:

hyphen-minus (U+002D) works as expected (e.g. "one-half" is ok).
hyphen (U+2010) is shown as spelling error.
non-breaking-hyphen (U+2011) again is ok.
soft-hyphen (U+00AD) entity input is automatically converted, i.e. it is no longer visible: e.g. "ar&shy;tistic" becomes "artistic", which appears correct but is shown as a spelling error.

The combination hyphen-minus + soft-hyphen (does not make much sense, but appeared in text pasted from a PDF) is also shown as an error. However, on right-click, the context menu items for spell checking are not shown.

hyphen-minus, hyphen and non-breaking hyphen should be treated consistently. The inconsistent handling of hyphen is a problem, since for good typography hyphen should be used instead of hyphen-minus.
soft-hyphen should be ignored by the spell-checker.
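
The requested behaviour can be illustrated with a minimal Python sketch. It is not calibre's code: the helper name `normalize_for_lookup` and the idea of rewriting the word before the dictionary lookup are assumptions made purely for illustration. It maps hyphen and non-breaking hyphen to hyphen-minus and drops soft hyphens.

```python
# Hypothetical pre-lookup normalization; not calibre's actual implementation.
HYPHEN_MINUS = "\u002d"
HYPHEN = "\u2010"
NON_BREAKING_HYPHEN = "\u2011"
SOFT_HYPHEN = "\u00ad"

# Map the typographic hyphens to hyphen-minus and delete soft hyphens.
_HYPHEN_TABLE = str.maketrans({
    HYPHEN: HYPHEN_MINUS,
    NON_BREAKING_HYPHEN: HYPHEN_MINUS,
    SOFT_HYPHEN: None,
})


def normalize_for_lookup(word: str) -> str:
    """Return the form of `word` that would be handed to the dictionary."""
    return word.translate(_HYPHEN_TABLE)


if __name__ == "__main__":
    for sample in ("one\u2010half", "one\u2011half", "ar\u00adtistic"):
        print(repr(sample), "->", repr(normalize_for_lookup(sample)))
```

With a table like this, all three hyphen variants of "one-half" reach the dictionary as the same string, and "artistic" containing a soft hyphen is looked up as plain "artistic".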

A way to make soft-hyphens visible would be useful. If there is one, I did not find it.

Environment: Calibre 2.77 with Ubuntu 12.04 on 64-bit system.

Kovid Goyal (kovid) wrote : Re: calibre bug 1656319

The treatment of hyphens comes from the ICU library. If you disagree
with the rules ICU uses you should ask them to change it. As for making
soft-hyphens visible, simply use a search and replace to replace them
with some visible character, perform whatever operations you want and
then search and replace the character back with a soft-hyphen. Trying to
make the text editing widget display invisible characters, is waaay too
much work, as it requires changes to Qt code.
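
The search-and-replace round trip described above can be sketched in Python. The placeholder character (a pilcrow, U+00B6) and both function names are assumptions chosen for illustration; any visible character that does not otherwise occur in the text would do.

```python
SOFT_HYPHEN = "\u00ad"
PLACEHOLDER = "\u00b6"  # pilcrow; assumed not to occur in the book text


def show_soft_hyphens(text: str, placeholder: str = PLACEHOLDER) -> str:
    """Replace every soft hyphen with a visible placeholder."""
    if placeholder in text:
        # Otherwise the reverse step would turn genuine placeholders
        # into soft hyphens.
        raise ValueError("placeholder already occurs in the text")
    return text.replace(SOFT_HYPHEN, placeholder)


def restore_soft_hyphens(text: str, placeholder: str = PLACEHOLDER) -> str:
    """Turn the placeholders back into soft hyphens after editing."""
    return text.replace(placeholder, SOFT_HYPHEN)


if __name__ == "__main__":
    original = "ar\u00adtistic hy\u00adphen\u00adation"
    visible = show_soft_hyphens(original)
    print(visible)
    print(restore_soft_hyphens(visible) == original)
```

The check for an existing placeholder matters: without it, restoring would silently convert characters that were never soft hyphens.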

 status wontfix

Changed in calibre:
status: New → Won't Fix
Kovid Goyal (kovid) wrote :

Actually, on second thoughts, I can work around the ICU behavior fairly easily.

Changed in calibre:
status: Won't Fix → New
Kovid Goyal (kovid) wrote : Fixed in master

Fixed in branch master. The fix will be in the next release. calibre is usually released every Friday.

 status fixreleased

Changed in calibre:
status: New → Fix Released
leastcommonancestor (leastcommonancestor) wrote : Re: [Bug 1656319] Re: calibre bug 1656319

Hello Mr. Goyal!

Thank you for your quick response.
However, the problem is not clear to me.
I take it that Calibre is using the ICU C-Library to normalize words
prior to lookup.
But the normalization chart for Punctuation-Dash
<http://www.unicode.org/charts/normalization/> suggests that U+2011
under compatibility normalization would be changed to U+2010, so in both
cases the spell checker lookup should fail.
I'm quite willing to file a bug report at
http://bugs.icu-project.org/trac/, but for now, I do not think I have
enough information to specify what happens.
Of course, if the library changes U+2011 to U+002D (hyphen-minus) and
leaves U+2010 unchanged, this would be a bug.
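
The compatibility-normalization claim can be checked with Python's standard `unicodedata` module. This is only a sketch of Unicode NFKC normalization as implemented in Python; it says nothing about which ICU facility calibre actually calls. The helper name `nfkc_report` is made up for illustration.

```python
import unicodedata

# The four characters discussed in this report.
CANDIDATES = ["\u002d", "\u2010", "\u2011", "\u00ad"]


def nfkc_report(chars):
    """Return (code point, character name, NFKC result) for each character."""
    rows = []
    for ch in chars:
        normalized = unicodedata.normalize("NFKC", ch)
        rows.append((
            f"U+{ord(ch):04X}",
            unicodedata.name(ch),
            " ".join(f"U+{ord(c):04X}" for c in normalized),
        ))
    return rows


if __name__ == "__main__":
    for code_point, name, result in nfkc_report(CANDIDATES):
        print(f"{code_point}  {name:<20} NFKC -> {result}")
```

Under NFKC, U+2011 maps to U+2010, while U+002D, U+2010 and U+00AD are left unchanged, which matches the chart cited above.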

Greetings + thanks again for your – minor problems aside – superb software

LCA

