Add keyboard layout for Scottish Gaelic (gd)
Bug #1367210 reported by GunChleoc
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
ubuntu-keyboard | Fix Released | Medium | GunChleoc |
ubuntu-keyboard (Ubuntu) | Fix Released | Undecided | Unassigned |
Bug Description
I am working on creating a keyboard layout for Scottish Gaelic, and I have a question:
Looking at the other languages, the database for predictive texting seems to be filled from a sample text (e.g. Buddenbrooks for de, Les trois mousquetaires for fr). We actually have a lexical database at our disposal that we already used for predictive texting in the Adaptxt keyboard for Android. How do you recommend we proceed for Ubuntu? Should we turn the database data into a plain text file? How is the database for Ubuntu Keyboard then generated?
Related branches
lp:~gunchleoc/ubuntu-keyboard/gd-keyboard
- GunChleoc (community): Needs Resubmitting
- Michael Sheldon (community): Needs Fixing
Diff: 13984 lines (+13902/-0), 13 files modified
debian/control (+9/-0)
debian/ubuntu-keyboard-scottish-gaelic.install (+1/-0)
plugins/gd/gd.pro (+9/-0)
plugins/gd/qml/Keyboard_gd.qml (+92/-0)
plugins/gd/qml/Keyboard_gd_email.qml (+93/-0)
plugins/gd/qml/Keyboard_gd_url.qml (+92/-0)
plugins/gd/qml/Keyboard_gd_url_search.qml (+93/-0)
plugins/gd/qml/qml.pro (+20/-0)
plugins/gd/src/gaelicplugin.h (+27/-0)
plugins/gd/src/gaelicplugin.json (+7/-0)
plugins/gd/src/src.pro (+48/-0)
plugins/gd/src/teacsa.txt (+13410/-0)
plugins/plugins.pro (+1/-0)
lp:~michael-sheldon/ubuntu-keyboard/layout-improvements
- PS Jenkins bot: Approve (continuous-integration)
- Ken VanDine: Approve (packaging)
- Ubuntu Phablet Team: Pending requested
Diff: 121990 lines (+46234/-74992), 92 files modified
debian/control (+86/-0)
debian/server.conf (+2/-1)
debian/ubuntu-keyboard-greek.install (+1/-0)
debian/ubuntu-keyboard-icelandic.install (+1/-0)
debian/ubuntu-keyboard-norwegian-bokmal.install (+1/-0)
debian/ubuntu-keyboard-romanian.install (+1/-0)
debian/ubuntu-keyboard-scottish-gaelic.install (+1/-0)
debian/ubuntu-keyboard-slovenian.install (+1/-0)
debian/ubuntu-keyboard-ukrainian.install (+1/-0)
plugins/de/src/buddenbrooks.txt (+0/-18794)
plugins/el/el.pro (+9/-0)
plugins/el/qml/Keyboard_el.qml (+91/-0)
plugins/el/qml/Keyboard_el_email.qml (+92/-0)
plugins/el/qml/Keyboard_el_url.qml (+92/-0)
plugins/el/qml/Keyboard_el_url_search.qml (+92/-0)
plugins/el/qml/qml.pro (+20/-0)
plugins/el/src/grazia_deledda-christos_alexandridis.txt (+4913/-0)
plugins/el/src/greekplugin.h (+25/-0)
plugins/el/src/greekplugin.json (+7/-0)
plugins/el/src/src.pro (+47/-0)
plugins/es/src/el_quijote.txt (+0/-28021)
plugins/fr/src/les_trois_mousquetaires.txt (+0/-23429)
plugins/gd/gd.pro (+9/-0)
plugins/gd/qml/Keyboard_gd.qml (+91/-0)
plugins/gd/qml/Keyboard_gd_email.qml (+92/-0)
plugins/gd/qml/Keyboard_gd_url.qml (+91/-0)
plugins/gd/qml/Keyboard_gd_url_search.qml (+91/-0)
plugins/gd/qml/qml.pro (+20/-0)
plugins/gd/src/gaelicplugin.h (+27/-0)
plugins/gd/src/gaelicplugin.json (+7/-0)
plugins/gd/src/src.pro (+48/-0)
plugins/gd/src/teacsa.txt (+13410/-0)
plugins/hr/qml/Keyboard_hr_url.qml (+1/-1)
plugins/hr/qml/Keyboard_hr_url_search.qml (+1/-1)
plugins/is/is.pro (+9/-0)
plugins/is/qml/Keyboard_is.qml (+93/-0)
plugins/is/qml/Keyboard_is_email.qml (+94/-0)
plugins/is/qml/Keyboard_is_url.qml (+95/-0)
plugins/is/qml/Keyboard_is_url_search.qml (+95/-0)
plugins/is/qml/qml.pro (+20/-0)
plugins/is/src/althingi_umraedur_2004_2005.txt (+12486/-0)
plugins/is/src/icelandicplugin.h (+25/-0)
plugins/is/src/icelandicplugin.json (+7/-0)
plugins/is/src/src.pro (+47/-0)
plugins/nb/nb.pro (+9/-0)
plugins/nb/qml/Keyboard_nb.qml (+94/-0)
plugins/nb/qml/Keyboard_nb_email.qml (+98/-0)
plugins/nb/qml/Keyboard_nb_url.qml (+94/-0)
plugins/nb/qml/Keyboard_nb_url_search.qml (+95/-0)
plugins/nb/qml/qml.pro (+20/-0)
plugins/nb/src/free_ebook.txt (+7227/-0)
plugins/nb/src/norwegianplugin.h (+25/-0)
plugins/nb/src/norwegianplugin.json (+7/-0)
plugins/nb/src/src.pro (+47/-0)
plugins/pl/src/ziemia_obiecana_tom_pierwszy_4.txt (+0/-4707)
plugins/plugins.pro (+7/-0)
plugins/ro/qml/Keyboard_ro.qml (+91/-0)
plugins/ro/qml/Keyboard_ro_email.qml (+92/-0)
plugins/ro/qml/Keyboard_ro_url.qml (+91/-0)
plugins/ro/qml/Keyboard_ro_url_search.qml (+92/-0)
plugins/ro/qml/qml.pro (+20/-0)
plugins/ro/ro.pro (+9/-0)
plugins/ro/src/amintiri_din_copilarie.txt (+946/-0)
plugins/ro/src/romanianplugin.h (+26/-0)
plugins/ro/src/romanianplugin.json (+7/-0)
plugins/ro/src/src.pro (+50/-0)
plugins/sl/qml/Keyboard_sl.qml (+91/-0)
plugins/sl/qml/Keyboard_sl_email.qml (+92/-0)
plugins/sl/qml/Keyboard_sl_url.qml (+91/-0)
plugins/sl/qml/Keyboard_sl_url_search.qml (+92/-0)
plugins/sl/qml/qml.pro (+20/-0)
plugins/sl/sl.pro (+9/-0)
plugins/sl/src/free_ebook.txt (+3596/-0)
plugins/sl/src/slovenianplugin.h (+25/-0)
plugins/sl/src/slovenianplugin.json (+7/-0)
plugins/sl/src/src.pro (+47/-0)
plugins/sv/qml/Keyboard_sv.qml (+2/-2)
plugins/sv/qml/Keyboard_sv_email.qml (+2/-2)
plugins/sv/qml/Keyboard_sv_url.qml (+2/-2)
plugins/sv/qml/Keyboard_sv_url_search.qml (+2/-3)
plugins/uk/qml/Keyboard_uk.qml (+97/-0)
plugins/uk/qml/Keyboard_uk_email.qml (+97/-0)
plugins/uk/qml/Keyboard_uk_url.qml (+96/-0)
plugins/uk/qml/Keyboard_uk_url_search.qml (+97/-0)
plugins/uk/qml/qml.pro (+20/-0)
plugins/uk/src/src.pro (+47/-0)
plugins/uk/src/ukrainianplugin.h (+25/-0)
plugins/uk/src/ukrainianplugin.json (+7/-0)
plugins/uk/uk.pro (+9/-0)
po/ubuntu-keyboard.pot (+56/-28)
qml/keys/LanguageMenu.qml (+7/-0)
tests/autopilot/ubuntu_keyboard/tests/test_keyboard.py (+109/-1)
Changed in ubuntu-keyboard:
assignee: nobody → GunChleoc (gunchleoc)
Changed in ubuntu-keyboard:
status: New → Confirmed
importance: Undecided → Medium
Changed in ubuntu-keyboard:
status: Confirmed → Fix Released
Hi! It's awesome that you're developing a Scottish Gaelic keyboard for Ubuntu, thanks!
The predictive text data is stored in SQLite databases, each containing three tables of n-grams (1-, 2- and 3-grams) named "_1_gram", "_2_gram" and "_3_gram". Each table has one column per word in the n-gram, plus a count of how many times that n-gram was encountered.
So the _1_gram table is of the structure:
word | count
_2_gram has the structure:
word_1 | word | count
and _3_gram has the structure:
word_2 | word_1 | word | count
The word columns are all text and the count column is an integer. A bit confusingly, the highest-numbered "word_" column holds the first word of the n-gram and "word" always holds the last one. So an example from the _3_gram table would be:
seemed | to | him | 27
Meaning that in training it has seen the phrase "seemed to him" 27 times.
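To make the layout concrete, here is a minimal sketch in Python using the standard sqlite3 module. It only assumes the table and column names described above; the helper names (create_ngram_tables, predict_next), the in-memory database and the second sample row are made up for illustration, and the keyboard's own lookup logic may well work differently.

```python
import sqlite3


def create_ngram_tables(conn):
    """Create the three n-gram tables with the column layout described above."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS _1_gram (word TEXT, count INTEGER);
        CREATE TABLE IF NOT EXISTS _2_gram (word_1 TEXT, word TEXT, count INTEGER);
        CREATE TABLE IF NOT EXISTS _3_gram (word_2 TEXT, word_1 TEXT, word TEXT, count INTEGER);
    """)


def predict_next(conn, first, second, limit=3):
    """Hypothetical helper: most frequent words seen after the pair (first, second)."""
    rows = conn.execute(
        "SELECT word FROM _3_gram "
        "WHERE word_2 = ? AND word_1 = ? "
        "ORDER BY count DESC LIMIT ?",
        (first, second, limit),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
create_ngram_tables(conn)
# The example row from above: "seemed to him" was seen 27 times.
conn.execute("INSERT INTO _3_gram VALUES ('seemed', 'to', 'him', 27)")
# A second row with an invented count, just to show the ordering.
conn.execute("INSERT INTO _3_gram VALUES ('seemed', 'to', 'be', 12)")

print(predict_next(conn, "seemed", "to"))  # ['him', 'be']
```

Note that word_2 holds the first word of the trigram, so a lookup for what follows "seemed to" filters on word_2 = 'seemed' and word_1 = 'to'.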
We use the text2ngram utility provided by the presage project (http://presage.sourceforge.net/) to generate these databases from ebooks (which isn't ideal, since ebooks don't fully represent more conversational writing styles), but you might find it easier to convert your database directly, depending on how it's formatted.
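If converting directly turns out to be easier, a rough sketch of filling the three tables from a list of tokens could look like the following. This is not text2ngram and does not reproduce its tokenization; the function names, the whitespace split and the English sample sentence are assumptions made purely for illustration.

```python
import sqlite3
from collections import Counter


def count_ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def write_ngrams(conn, tokens, max_n=3):
    """Write 1- to max_n-gram counts into tables named _1_gram, _2_gram, ..."""
    for n in range(1, max_n + 1):
        # Columns run word_{n-1}, ..., word_1, word (first word to last word).
        cols = [f"word_{i}" for i in range(n - 1, 0, -1)] + ["word"]
        table = f"_{n}_gram"
        col_defs = ", ".join(f"{c} TEXT" for c in cols)
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {table} ({col_defs}, count INTEGER)"
        )
        placeholders = ", ".join("?" * (n + 1))
        conn.executemany(
            f"INSERT INTO {table} VALUES ({placeholders})",
            [gram + (c,) for gram, c in count_ngrams(tokens, n).items()],
        )
    conn.commit()


# Invented sample text; a real corpus would need proper tokenization.
tokens = "it seemed to him that it seemed to him".lower().split()
conn = sqlite3.connect(":memory:")
write_ngrams(conn, tokens)

row = conn.execute(
    "SELECT count FROM _3_gram "
    "WHERE word_2 = 'seemed' AND word_1 = 'to' AND word = 'him'"
).fetchone()
print(row)  # (2,)
```

A lexical database that already stores frequencies could skip the counting step and insert its rows straight into the matching table, as long as the columns follow the same order.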
Any further questions just let me know :)
Cheers,
Mike.