Project to Create Free Software in Quechua
Help Develop the Spell-Checker
Because the majority of Quechua speakers are only trained to read and write in Spanish, it is important that they have tools which train them to write in the Spanish alphabet. We are going to start by copying the words from the Diccionario Quechua-Español, Qheswa-Español Simi Taqe of the Academia Mayor de la Lengua Quechua (AMLQ), published by the Municipality of Cusco in 1995 (2005). After adding our changes, we are going to create an affix file in the Hunspell format, which is used by OpenOffice, Mozilla and AbiWord.
We are organizing a schedule here at UNAMBA to copy the AMLQ dictionary. If you want to help or want to organize a group to copy dictionaries in your area, read these instructions. Choose which letters you are going to copy and inform the email list firstname.lastname@example.org before beginning.
After copying all the words in the dictionary, we need to go through the word list and update the spelling to the alphabet of the Primer Congreso Mundial de Quechua (2000). We are also looking for volunteers who want to go through the list of verbs and indicate which infixes can be joined with which verbs.
It isn't easy to create a spell-checker for an agglutinative language like Quechua, because traditional spell-checking programs like ispell and aspell were designed for Indo-European languages, which they treat as having only one suffix per word. In Quechua, however, 5 or 6 suffixes can be added to the root word, and it is almost impossible to calculate all the possible combinations beforehand. Hunspell is a spell-checker created for the Hungarian language which permits two levels of suffixes. We hope that we will be able to represent Quechua verbs with compound words and two levels of suffixes in Hunspell, but we aren't sure whether it will be possible to check them in real time. See the hunspell page for more information about the technical details.
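To make the "levels of suffixes" limit concrete, here is a minimal Python sketch of suffix stripping with a fixed depth. It is not how Hunspell is implemented; the function name, the one-word root list and the three suffixes are toy assumptions chosen for illustration, and the three-suffix test string is only a depth test, not a claim about valid Quechua.

```python
def is_known_form(word, roots, suffixes, max_levels):
    """Return True if word is a known root followed by at most max_levels suffixes.

    Toy model only: it ignores suffix order rules and sound changes.
    """
    if word in roots:
        return True
    if max_levels == 0:
        return False
    for suffix in suffixes:
        # Strip one suffix from the end and check what remains, one level shallower.
        if word.endswith(suffix) and len(word) > len(suffix):
            stem = word[:-len(suffix)]
            if is_known_form(stem, roots, suffixes, max_levels - 1):
                return True
    return False


# Hypothetical toy data, not taken from the project's dictionaries.
ROOTS = {"wasi"}
SUFFIXES = ["kuna", "pi", "ta"]

two_suffixes_ok = is_known_form("wasikunapi", ROOTS, SUFFIXES, max_levels=2)
three_suffixes_at_depth_2 = is_known_form("wasikunapita", ROOTS, SUFFIXES, max_levels=2)
three_suffixes_at_depth_3 = is_known_form("wasikunapita", ROOTS, SUFFIXES, max_levels=3)
```

With a depth of two, the string carrying three suffixes is rejected even though every piece is known, which is the kind of gap that a word with 5 or 6 suffixes would fall into.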
The order of affixes in Quechua is a special problem. We are looking for linguists to help us list the order of every possible combination of infixes and suffixes. Contact Amos Batto <amosbatto AT yahoo.com> if you would like to help us with this.
We have plans to copy an Ayacuchan Quechua dictionary, but we need to find linguists who want to update the spelling and help create the affix file.
We have created a spell checking dictionary for Bolivian Quechua, but it is still very rough. We copied all the words from Jesus Lara's dictionary, but his orthography is not standard. Jesus Lara didn't use the letter "L" or "CHH." He refused to include loan words from Spanish like "necesitay", but Quechua speakers use these words in their daily speech and we likewise ought to include them. We are seeking Quechua speakers who are willing to go through the word list and correct the spelling and add to the list. Right now we only have a version with 5 vowels, but we have a script to convert it to 3 vowels if needed.
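The project's conversion script is not reproduced here. As a rough sketch of what a 5-vowel to 3-vowel conversion could look like, the Python below assumes the conversion is a plain replacement of "e" with "i" and "o" with "u", and that Spanish loan words are listed separately so they can be left unchanged. The function name, the `keep_as_is` parameter and the sample words are hypothetical.

```python
# Map the two extra vowels of the 5-vowel spelling onto the 3-vowel set.
VOWEL_MAP = str.maketrans({"e": "i", "o": "u", "E": "I", "O": "U"})


def to_three_vowels(word, keep_as_is=frozenset()):
    """Rewrite a 5-vowel spelling with only a, i, u.

    Words listed in keep_as_is (for example Spanish loans) are returned untouched.
    """
    if word.lower() in keep_as_is:
        return word
    return word.translate(VOWEL_MAP)


# Hypothetical loan-word list; a real one would have to be compiled by speakers.
LOAN_WORDS = frozenset({"necesitay"})

converted = to_three_vowels("orqo", LOAN_WORDS)
unchanged = to_three_vowels("necesitay", LOAN_WORDS)
```

Keeping the loan-word list outside the function means reviewers can extend it without touching the conversion logic.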
Unfortunately, we don't know which combinations of suffixes are possible. There are no grammar books which explain how to combine many suffixes. For example, is it possible to combine -taj, -raj, -wan, and -pi in the same word? If so, in what order? We already have over 500 pages of suffix combinations, but we need someone to review the list and eliminate the incorrect combinations.
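To give a sense of how quickly the review work grows, the following Python sketch enumerates every ordering of the four suffixes from the example above. It only counts candidates; it says nothing about which orderings are grammatical, and the function name is an assumption made for this illustration.

```python
from itertools import permutations

# The four suffixes from the example question above.
SUFFIXES = ["taj", "raj", "wan", "pi"]


def candidate_orderings(suffixes, length):
    """List every ordering of `length` distinct suffixes, written like -taj-raj."""
    return ["-" + "-".join(combo) for combo in permutations(suffixes, length)]


# All orderings that use the four suffixes together.
all_four = candidate_orderings(SUFFIXES, 4)

# Orderings of one, two, three or four of the suffixes.
total_candidates = sum(len(candidate_orderings(SUFFIXES, n)) for n in range(1, 5))
```

Four suffixes already give 24 orderings when all are used together and 64 candidates when shorter chains are counted too, and each one has to be judged by someone who knows the language.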
We also need someone to add the most commonly used suffix combinations which we forgot to include. We hope that Quechua speakers will review our list and add the proper infixes to the verbs. For example, the infix -rpari is not yet included in the dictionary, because we don't know which verbs use it.
It is difficult to create a complete dictionary because the available spell checking programs (ispell, aspell, and hunspell) weren't designed for extremely agglutinative languages like Quechua. We need someone to convert the ispell/aspell dictionary to hunspell, because it is better at handling agglutinative languages and will allow us to cover more of the possible combinations of infixes and suffixes in Quechua. If we can convert the dictionary to myspell or hunspell, we can get it included in the next version of OpenOffice. Read these instructions to convert it to myspell.
In order to help out, you need to learn the flag system that we are using to attach the suffixes to the root words. It's very simple to learn. To begin, download the dictionary, decompress it, and read the introductions to the files qu-BO-predict-0.02-0.txt and qu-BO-affix-ispell-0.02-0.txt.
Using the words found in the Diccionario Inga-Castellano by Francisco Tandioy <ftandioy AT indiana DOT edu> and an anthropologist, we have begun to create an Inga dictionary for spell checking. We already have the pre-dictionary file, but we need to eliminate all the duplicate words and add infixes to the verbs. We also need to adapt the Affix file for Bolivian Quechua to use with Inga.
Unfortunately, we still do not have permission from the authors to use the words from the Diccionario Inga-Castellano. Until we get permission, we ask that you don't distribute the Inga spell checking dictionary in a complete form that can be used by the general public. Don't distribute it as a hash file for ispell or as an aspell dictionary.
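For the duplicate removal mentioned above, a minimal Python sketch follows. It assumes the pre-dictionary file holds one plain word per line with no flags attached, and that entries differing only in capitalization count as duplicates; neither assumption has been checked against the actual file, and the function name is hypothetical.

```python
def unique_words(lines):
    """Return the words in their original order, dropping blanks and repeats.

    Comparison ignores case, and the first spelling seen is the one kept.
    """
    seen = set()
    result = []
    for line in lines:
        word = line.strip()
        if not word:
            continue  # skip empty lines
        key = word.casefold()
        if key not in seen:
            seen.add(key)
            result.append(word)
    return result


# Made-up sample lines standing in for the contents of a word-list file.
sample = ["wasi\n", "Wasi\n", "\n", "runa\n", "wasi"]
deduplicated = unique_words(sample)
```

If entries carry flags, duplicates would need their flags merged rather than dropped, which this sketch does not attempt.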
Other Quechua Dialects
Kevin Patrick Scannell has created a web crawler that collects words from web pages with Quechua text. Kevin has already gathered a list of 100,000 words, but we need someone to review the list and separate out the words from different dialects. With this list and an adapted Affix file, it's possible to create a dictionary for your particular dialect of Quechua. Write to Amos Batto if you are interested in working on this.
Last Updated: Tue, 16 Aug 2011