aspell's British English dictionary has spelling mistakes.

Bug #18438 reported by Ralph Corderoy
This bug affects 4 people
Affects: aspell-en (Ubuntu)
Status: Fix Released
Importance: Medium
Assigned to: Unassigned

Bug Description

aspell's British English dictionary is perturbed by my British English.

    $ echo offensive/advisor/panellist/practice | tr / \\012 | aspell pipe
    @(#) International Ispell Version 3.1.20 (but really Aspell 0.50.5)
    & offensive 6 0: offencive, offensives, offensively, inoffensive, unoffensive, offence

    & advisor 9 0: adviser, advise, ad visor, ad-visor, advisory, advisee, divisor, advice, advisor's

    & panellist 8 0: panelist, panel list, panel-list, panellists, panelists, panelised, panelist's, panels

    & practice 6 0: practise, practiser, proactive, practised, practises, Prentice

    $ locale
    LANG=en_GB.UTF-8
    LC_CTYPE="en_GB.UTF-8"
    LC_NUMERIC="en_GB.UTF-8"
    LC_TIME="en_GB.UTF-8"
    LC_COLLATE="en_GB.UTF-8"
    LC_MONETARY="en_GB.UTF-8"
    LC_MESSAGES="en_GB.UTF-8"
    LC_PAPER="en_GB.UTF-8"
    LC_NAME="en_GB.UTF-8"
    LC_ADDRESS="en_GB.UTF-8"
    LC_TELEPHONE="en_GB.UTF-8"
    LC_MEASUREMENT="en_GB.UTF-8"
    LC_IDENTIFICATION="en_GB.UTF-8"
    LC_ALL=
    $ dpkg -l | grep aspell
    ii aspell 0.50.5-5 GNU Aspell spell-checker
    ii aspell-bin 0.50.5-5 GNU Aspell standalone spell-check utilities
    ii aspell-en 0.51-1-1 English dictionary for GNU Aspell
    ii libaspell15 0.50.5-5 The GNU Aspell spell-checker runtime toolkit
    $
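
The `aspell pipe` output above follows the Ispell-compatible format: a line beginning with `&` names a flagged word, followed by the number of suggestions, an offset, and the suggestions themselves; a line beginning with `#` names a flagged word for which no suggestions were offered. As a minimal sketch for summarising such a session (the function name `flagged_words` is made up for illustration and is not part of aspell):

```shell
# Summarise the output of `aspell pipe` read from standard input.
# Prints "word suggestion-count" for every word the checker flagged.
# flagged_words is a hypothetical helper name, not an aspell command.
flagged_words() {
    awk '
        $1 == "&" { print $2, $3 }   # flagged, with suggestions
        $1 == "#" { print $2, 0 }    # flagged, no suggestions offered
    '
}
```

Appending `| flagged_words` to the `aspell pipe` command above should print one `word count` line per flagged word, which makes it easier to compare results between aspell versions.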

Here's the Compact Oxford English Dictionary's opinion.

    http://www.askoxford.com/concise_oed/offensive?view=uk
    http://www.askoxford.com/concise_oed/advise?view=uk
    http://www.askoxford.com/results/?view=dict&freesearch=panellist&branch=13842570&textsearchtype=exact
    http://www.askoxford.com/concise_oed/practice?view=uk

Revision history for this message
Mark Florian (markrian) wrote :

There's more:

$ echo proven/racquet | tr / \\012 | aspell pipe
@(#) International Ispell Version 3.1.20 (but really Aspell 0.50.5)
& proven 29 0: pr oven, pr-oven, prov en, prov-en, prove, proving, prover,
proved, proves, profane, prone, proverb, riven, Pren, Provence, pron, prov,
provenly, prevent, driven, privet, progeny, protean, protein, provers, Provo,
Raven, preen, raven

& racquet 19 0: racket, rackety, racquet's, Raquel, acquit, racked, ragout,
rocket, parquet, react, requite, acute, Jacquetta, Jacquette, racker, rackets,
request, Raquela, ratchet

$ locale
LANG=en_GB
LC_CTYPE="en_GB"
LC_NUMERIC="en_GB"
LC_TIME="en_GB"
LC_COLLATE="en_GB"
LC_MONETARY="en_GB"
LC_MESSAGES="en_GB"
LC_PAPER="en_GB"
LC_NAME="en_GB"
LC_ADDRESS="en_GB"
LC_TELEPHONE="en_GB"
LC_MEASUREMENT="en_GB"
LC_IDENTIFICATION="en_GB"
LC_ALL=

$ dpkg -l | grep aspell
ii aspell 0.50.5-5 GNU Aspell spell-checker
ii aspell-bin 0.50.5-5 GNU Aspell standalone spell-check utilities
ii aspell-en 0.51-1-1 English dictionary for GNU Aspell
ii libaspell15 0.50.5-5 The GNU Aspell spell-checker runtime toolkit

Revision history for this message
Ralph Corderoy (ralph-inputplus) wrote :

(In reply to comment #1)
> $ echo proven/racquet | tr / \\012 | aspell pipe

I agree with Mark's two.

    http://www.askoxford.com/concise_oed/prove?view=uk
    http://www.askoxford.com/concise_oed/racket_1?view=uk

Revision history for this message
Nick McMahon (nmcmahon) wrote :

Tested; I am in en_GB for all.
offensive is correct.
advisor should be adviser according to the Oxford dictionary: advise, adviser,
advisory.
panellist isn't in my dictionary _at all_; however, panel only has one L, so I
would spell it 'panelist', but as far as I can help you, it's not a word at all.
It's not in *my* dictionary; it may be in others.
Practise is a verb, like learning the drums over and over; practice is a noun,
like a doctor's practice, the place where he works.

Revision history for this message
Nick McMahon (nmcmahon) wrote :

I should clarify: when I say dictionary, I cross-checked the words in question
with an actual dictionary, one you can prop a sofa leg up with, not a website or
something.

Revision history for this message
Mark Florian (markrian) wrote :

*** Bug 15095 has been marked as a duplicate of this bug. ***

Revision history for this message
Mark Florian (markrian) wrote :

Nick, may I point you to

http://www.askoxford.com/results/?view=dict&freesearch=advisor&branch=13842570&textsearchtype=exact

which shows that 'advisor' is acceptable also. Additionally:

http://www.askoxford.com/results/?view=dict&freesearch=panellist&branch=13842570&textsearchtype=exact

regarding 'panellist'. Honestly, I'm not sure why a printed Oxford English
dictionary should be any more authoritative than the Oxford English
Dictionary's online edition.

I get the feeling that aspell's whole method of spell checking is wrong/awful.
There's more:

$ echo comic\'s/something\'s/defensive/offensive | tr / \\012 | aspell pipe
@(#) International Ispell Version 3.1.20 (but really Aspell 0.60.3)
& comic's 38 0: comics, colic's, conics's, Como's, Cormack's, conics, Com's,
Mic's, comic, commie's, comity's, Cami's, coma's, comb's, comer's, comes's,
comma's, Combs's, Comte's, combo's, comet's, comical, comings, cumin's, mics,
commies, commits, mimics, comas, comes, Combs, combs, comps, comers, commas,
combos, comets, compos

& something's 7 0: some thing's, some-thing's, somethings, something, seethings,
soothings, smoothing

& defensive 9 0: defencive, defensively, defence, definitive, defenced, deafens,
diffusive, defencing, offencive

& offensive 6 0: offencive, offensives, offensively, inoffensive, unoffensive,
offence

Corresponding pages on askoxford:

http://www.askoxford.com/concise_oed/defensive?view=uk
http://www.askoxford.com/concise_oed/offensive?view=uk

I think something's and comic's are obviously correct; they wouldn't appear in a
dictionary normally anyway.

The latter two are from the duplicate bug #15095. There are plenty of examples of
words being considered misspelled when suffixed by 's, to indicate possession.
How exactly does aspell deal with words like this? Does it rely on them existing
as separate words in its dictionary, or is it somewhat aware of punctuation?

Matt Zimmerman (mdz)
Changed in aspell-en:
assignee: nobody → dsilvers
Revision history for this message
Mark Florian (markrian) wrote :

This bug is quite important if you ask me. The spell-checking library is quite simply wrong on a large number of words. Marking confirmed.

Changed in aspell-en:
status: Unconfirmed → Confirmed
Revision history for this message
Chris Lord (cwiiis) wrote :

More:

http://www.askoxford.com/results/?view=dict&freesearch=juvenility&branch=13842570&textsearchtype=exact

$ echo juvenility | aspell pipe
@(#) International Ispell Version 3.1.20 (but really Aspell 0.60.4)
& juvenility 6 0: juvenile, juveniles, geniality, joviality, Juvenal, Juvenal's

Revision history for this message
Mark Florian (markrian) wrote :

And more!

$ echo shaven | aspell pipe
@(#) International Ispell Version 3.1.20 (but really Aspell 0.60.4)
& shaven 34 0: Shavian, shave, shaving, Haven, haven, shaver, shaken, shaved, shaves, Shane, Shawn, sheave, Daven, Gaven, heaven, Shaine, Shayne, Shalne, Sven, shriven, Shannen, sharpen, shavers, sheaves, Shaun, sheen, shove, shoving, Raven, maven, raven, seven, shiver, shaver's

Seriously, either the British dictionary for aspell is total rubbish or aspell itself is. I cannot rely on aspell at all for spell-checking other than for catching typos. This is pathetic.

Revision history for this message
Mark Florian (markrian) wrote :

More...

$ echo misspelt | aspell pipe
@(#) International Ispell Version 3.1.20 (but really Aspell 0.60.4)
& misspelt 9 0: miss pelt, miss-pelt, misspell, misspent, misspelled, misspells, misspend, misdealt, spelt

http://www.askoxford.com/concise_oed/misspell?view=uk

Is anyone watching this bug? I'm going to mark this bug's importance higher since I've literally been laughed at whilst showing off Ubuntu to friends. "The spell-checker doesn't even work! God knows what the rest of it is like!" they teased.

Okay, I lie, that didn't happen. They're all very impressed with Ubuntu. But can no one else see a problem here? Even IF this terrible example of spell-checking is limited to the British English dictionary, that's still a large user base that is seeing poor quality.

If it were not a programming issue, which bug #36227 (et al.) suggests it is, I'd love to fix whatever plain text files are responsible. How would I do this?

I've added comments to http://sourceforge.net/tracker/?group_id=245&atid=100245 (aspell's SourceForge bug tracker [I hate SourceForge's bug tracker.]) and had no response at all.

Is there perhaps another spell-checker that can replace aspell? Some kind of drop-in replacement for it? Which is what aspell was supposed to be for ispell, if memory serves me?

Revision history for this message
Mark Florian (markrian) wrote :

I don't have enough oomph around here to change the importance. Hopefully someone who does, and who agrees with me, will do it?

Revision history for this message
Chris Lord (cwiiis) wrote :

Another word aspell can't spell: 'sweated':

$ echo sweated | aspell pipe
@(#) International Ispell Version 3.1.20 (but really Aspell 0.60.4)
& sweated 23 0: sweat ed, sweat-ed, swatted, seated, sweater, swotted, sweaters, wested, sedated, swathed, swayed, sweats, sated, sweat, sweatier, sweat's, sweaty, skated, slated, stated, sweater's, sweeter, swifted

http://www.askoxford.com/concise_oed/sweated?view=uk

Changed in aspell-en:
assignee: dsilvers → nobody
Revision history for this message
Mark Florian (markrian) wrote :

After a word with a wordlist/aspell developer, it seems this problem could largely be solved by making effective use of the other dictionaries included in aspell-en, such as the variant dictionaries, and by switching to a larger dictionary size (say, all the way up to 95).

To understand what I'm talking about, read /usr/share/doc/aspell-en/README.gz .

For instance, the word racquet is included in one of the variant dictionaries.
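
To see which of these dictionaries are actually installed, `aspell dump dicts` lists their names one per line. The sketch below filters such a list for one language prefix. `dicts_for` is a hypothetical helper name, and which variant dictionaries exist (and what they are called) depends on the installed aspell-en version, so check the README rather than trusting any name assumed here.

```shell
# List dictionary names for one language prefix.
# Reads names (one per line, as printed by `aspell dump dicts`) on stdin.
# dicts_for is a hypothetical helper name, not part of aspell.
dicts_for() {
    prefix=$1
    # Match the bare name (e.g. en_GB) or the name followed by a -suffix.
    grep -E "^${prefix}(-|\$)"
}
```

Typical use would be `aspell dump dicts | dicts_for en_GB`. A name from that list can then be tried with the `-d` option, as in `echo racquet | aspell -d NAME pipe`, where NAME is a placeholder for whichever dictionary is being tested.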

Revision history for this message
Mark Florian (markrian) wrote :

Another one for the list:

$ echo homophobe | aspell pipe
@(#) International Ispell Version 3.1.20 (but really Aspell 0.60.5)
& homophobe 5 0: homophobes, homophone, homophobia, homophobic, homophony

How can the plural exist in the dictionary, but not the singular? Bizarre.

Revision history for this message
Samuel Dennis (sjdennis3) wrote :

My issue with the dictionaries is not that some words are missing, but that words are IN the dictionary that should not be. When spell-checking my thesis I found that "feces" was in the British English dictionary (despite being the American spelling), and "faeces" was not.

A missing word isn't really a problem; I can readily add "faeces" to a custom dictionary. However, if I have written "feces" anywhere, I need to find it and correct it, but aspell would miss it because it believes the spelling is correct. This is very serious.
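
Because aspell accepts such a word, a separate search is needed to catch it. A minimal sketch, assuming a grep that supports `-w` (GNU and BSD grep do); `find_words` is a hypothetical helper name, and the list of unwanted spellings is whatever the writer supplies:

```shell
# Report lines containing any of the given whole words, with line numbers.
# Reads text on stdin; the words to hunt for are the arguments.
# find_words is a hypothetical helper name for illustration only.
find_words() {
    # One fixed-string pattern per line; grep treats each line as a pattern.
    patterns=$(printf '%s\n' "$@")
    grep -n -i -w -F -e "$patterns"
}
```

For example, `find_words feces < thesis.txt` (with `thesis.txt` standing in for the real file) would list every line that still contains the American spelling, while leaving "faeces" alone.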

Was this dictionary written by simply taking the US dictionary and adding a few words? It really needs to start from a clean slate.

http://www.askoxford.com/concise_oed/faeces?view=uk
http://ubuntuforums.org/showthread.php?t=1170003

Revision history for this message
Samuel Dennis (sjdennis3) wrote :

It is rather concerning that such a major bug is ranked only "Medium" in importance and is currently unassigned.

Revision history for this message
rusivi2 (rusivi2-deactivatedaccount) wrote :

Thank you for posting this issue.

Does this issue occur in Lucid?

Changed in aspell-en (Ubuntu):
status: Confirmed → Incomplete
Revision history for this message
James Troup (elmo) wrote :

Yes, it's still an issue in both lucid and maverick. (There are trivial reproducers in the many comments in the bug.)

Changed in aspell-en (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
Ralph Corderoy (ralph-inputplus) wrote :

"prizes" is the plural of the noun "prize" in British English.

    $ aspell pipe <<<prizes
    @(#) International Ispell Version 3.1.20 (but really Aspell 0.60.6)
    & prizes 40 0: prices, prises, Price's, praises, price's, pries, prise's, proses, prezzies, prides, primes, preses, prise, Perice's, praise's, priers, prose's, priest, Pres, Pris, pres, Prinz's, princes, pride's, prissies, Perez's, Price, Pryce's, price, razes, rices, rises, Prue's, Roze's, prier's, Prince's, prince's, Prissie's, Rice's, rice's

    $

Revision history for this message
Kevin Atkinson (kevin-ubuntu) wrote :

Aspell and English wordlist author here:

Most of these are now fixed upstream. Please see:
  https://sourceforge.net/tracker/index.php?func=detail&aid=1834890&group_id=10079&atid=1014602

Please direct non-Ubuntu specific comments to the bug report above.

Changed in aspell-en (Ubuntu):
status: Confirmed → Fix Released