The South African Centre for Digital Language Resources (SADiLaR) hosted a successful 3rd Resources for African Indigenous Languages (RAIL) workshop at North-West University, Potchefstroom campus on the 30th of November 2022.
RAIL workshops are an interdisciplinary platform for researchers working on resources, such as data collections and tools, that specifically target African indigenous languages. The aim is to create the conditions for the emergence of a scientific community of practice that focuses on data and tools designed for or applied to indigenous languages found in Africa.
RAIL workshops bring together researchers who are interested in showcasing their research, thereby boosting the field of African indigenous languages. This provides an overview of the current state of the art and emphasizes the availability of African indigenous language resources, including both data and tools. Additionally, RAIL allows for information sharing among researchers interested in African indigenous languages and prompts discussions on improving the quality and availability of resources.
“Many African indigenous languages currently have no or very limited resources available and, additionally, they are often structurally quite different from more well-resourced languages, requiring the development and use of specialized techniques. By bringing together researchers from different fields (e.g., (computational) linguistics, sociolinguistics, language technology) to discuss the development of language resources for African indigenous languages, we hope to boost research in this field,” stated Research Professor at SADiLaR, Menno van Zaanen.
Febe de Wet presented “Localising the Mozilla Common Voice platform for South Africa’s official languages.” Digital humanities researcher at SADiLaR, Ms Mmasibidi Setaka, and Mr Johannes Sibeko presented “An Overview of Sesotho BLARK Content.”
Elsabé Taljard, Danie Prinsloo and Michelle Goosen gave the third presentation, titled “On Creating electronic resources for African languages through digitisation: a technical report.”
The last presentation, titled “On Exploring Afrikaans word embeddings with analogies and nearest neighbours,” was presented by Tanja Gaustad and Roald Eiselen.
“The 3rd RAIL workshop was a huge success, and it is growing. We had interesting talks and presentations that were related to Resources for African Indigenous Languages. From overviews of available resources, topic modelling, Afrikaans word embeddings, lexical statistics for psycholinguistics, code-switching, Mozilla Common Voice, Vernacular Language Archive and so much more. There was something for everyone, and we hope that the upcoming 4th RAIL workshop in 2023 will be even bigger and better,” said digital humanities researcher at SADiLaR, Ms Mmasibidi Setaka.
The organising committee members of this workshop were chief administrative officer Ms Jessica Mabaso and SADiLaR researchers Ms Rooweither Mabuya, Dr Muzi Matfunjwa, and Ms Mmasibidi Setaka.