Author: Ms Mmasibidi Setaka (SADiLaR Sesotho Researcher)
I was recently invited to join a webinar hosted by the Indigenous Language Action Forum, or ILAF for short (https://ilaf.co.za/). ILAF is an organisation that promotes indigenous languages, with the aim of ensuring their active use in important sectors such as education, criminal justice and healthcare. The webinar was titled “Using the indigenous languages at universities: Why do it and can it work?”. It was the first of its kind for the organisation, and it brought together in one setting different people who advocate for the use of indigenous languages in higher education. What touched me about this organisation and the webinar itself was the concept of a positive narrative for the use of indigenous languages. The idea was to have a conversation about languages without putting others down.
As a SADiLaR representative, my role was to discuss the technologies and resources developed for the African languages that can help advance the use of indigenous languages in higher education, and to give input on the technologies that have been devised to meet this mandate. Considering that SADiLaR has an enabling function focused on development and research in human language technology, it was quite an honour to give a short overview of what is currently available. My topic was Technologies for Indigenous Languages, and I briefly walked the participants through the following tools (I had to pick a few relevant tools due to limited time):
- The corpus portal, available on the SADiLaR resource centre, allows searches of the corpora held in the SADiLaR RMA. Students can run keyword-in-context searches, view frequency lists, and search by part of speech or lemma. https://corpus.sadilar.org/corpusportal/search/simple
- The National Centre for Human Language Technology (NCHLT) web services: a collection of 61 text technologies that automatically process textual input, including optical character recognition engines, language identifiers, tokenisers, part-of-speech taggers, phrase chunkers and named entity recognisers. https://hlt.nwu.ac.za/
- Autshumato Machine Translation: a system that can automatically translate sentences, documents and web pages. It currently covers 6 languages (including Afrikaans, Sesotho, Sepedi, Xitsonga and isiZulu). https://mt.nwu.ac.za/
- Spelling checkers by SPEL: these are available in the South African official languages and also include hyphenators. They work with Microsoft® Office 365, Microsoft® Office 2019 or Microsoft® Office 2016, and they are easy to install. They check spelling within the parameters of the standard written variant of each language and have comprehensive word lists. See https://spel.co.za/en/product/african_spelling_checkers/
- Aweza app: a speech-to-speech translation app used to bridge language barriers in communication.
– Aweza med aims to bridge the language barriers between health care practitioners and patients.
– A further Aweza offering enables access to quality learning resources for learners of English as a second language. https://aweza.co.za/
- Online writing support: the SADiLaR-KU Leuven platform. The aim of this platform is to give students the opportunity to receive academic writing support. The main purpose of the project is to design, implement and refine an online writing tool and a repository of texts (corpus) and other resources in both Afrikaans and English, with the aim of extending this to the African languages as well. https://www.sadilar.org/index.php/en/about/sadilar-nodes/icelda-node
- Open Education Resource Term Bank (OERTB): a project that aims to support the collaborative development and dissemination of terminological resources, and thereby promote the use of African languages in teaching and learning at higher education institutions. More specifically, the project aims to promote the African languages as vehicles for comprehending threshold concepts in academic disciplines through practices such as translanguaging, the process and practice of shuttling between students’ home languages and English to facilitate comprehension, foster social cohesion and recognise students’ linguistic identities. http://oertb.tlterm.com/about/
In conclusion, various technological tools are being developed to improve and foster the use of, and research into, indigenous languages. However, it is worth noting that these tools require data (both text and speech corpora) in order to handle the languages for which they are developed. The tools need to be trained on those languages, and data plays an essential role in helping them function at their best. With that said, there is hope for the indigenous languages, and efforts are being made towards enriching them.