Resources for African Indigenous Languages (RAIL) online workshop

Author: Mmasibidi Setaka (SADiLaR Sesotho Researcher)

The South African Centre for Digital Language Resources (SADiLaR) organised a workshop on African indigenous language resources, originally expected to be held at the LREC 2020 conference in Marseille, France. The workshop aimed to bring together researchers interested in showcasing their research and thereby to boost the field of African indigenous languages. It provided an overview of the current state of the art and emphasised the availability of African indigenous language resources, including both data and tools. It also allowed for information sharing among researchers interested in African indigenous languages and started discussions on improving the quality and availability of the resources. Many African indigenous languages currently have no or very limited resources available. They are also often structurally quite different from better-resourced languages, which requires the development and use of specialised techniques. By bringing together researchers from different fields (e.g., (computational) linguistics, sociolinguistics, language technology) to discuss the development of language resources for African indigenous languages, we hoped the workshop would boost research in this field.

The Resources for African Indigenous Languages (RAIL) workshop is an interdisciplinary platform for researchers working on resources (data collections, tools, etc.) specifically targeted towards African indigenous languages.  It aims to create the conditions for the emergence of a scientific community of practice that focuses on data, as well as tools, specifically designed for or applied to indigenous languages found in Africa. With the UNESCO-supported International Year of Indigenous Languages, there is currently much interest in indigenous languages.  The Permanent Forum on Indigenous Issues mentioned that “40 percent of the estimated 6,700 languages spoken around the world were in danger of disappearing” and the “languages represent complex systems of knowledge and communication and should be recognised as a strategic national resource for development, peace building and reconciliation.” As such, the workshop falls within one of the hot topic areas of this year’s conference: “Less Resourced and Endangered Languages”.

Organising this workshop was both fun and challenging, and many aspects had to be prepared. To start, a proposal had to be sent to the LREC organisers indicating our intent to organise a workshop. Once that was approved, the actual organisation could take place. This involved sending out calls for papers, finding reviewers for the papers and asking people to be on the programme committee. In our case this was not easy, because we had to find people who were interested in language-related fields not just in South Africa but in Africa as a whole. Next, the reviewing process took place. This meant sending the submissions to reviewers and then collecting and combining their findings, which served as the basis for the decision to accept or reject each submission. Many high-quality submissions were received, but only limited time was available for the actual presentations.

Once this was done, we heard that the originally planned LREC conference was not going to take place in France. We decided to still hold the workshop, but in virtual form. This meant that we had to find an online platform that was secure and suitable for hosting a large audience. As an additional requirement, we tried to find a platform that works well with limited internet access. In the end, we decided to use Zoom. Before the meeting, we sent the details of the workshop to the people who had registered. Before participants were allowed into the meeting room, they were placed in a waiting room to be verified. We asked the presenters to prepare a presentation (online) and send it to us beforehand, with the aim of reducing the number of technical problems. We could then play the presentations and afterwards let the participants interact.

On a personal note, we were a little nervous on the day of the workshop, particularly because it was our first time organising a workshop together. The fact that this was also our first online workshop added to the nerves.

Overall, it was a very positive experience. The registration of people at the actual workshop went well, the presentations all played perfectly and the participants made the workshop a very interactive event. It was a beautiful experience seeing the workshop unfold as it did. We received positive feedback from many of the participants, and many requested that we organise a second RAIL workshop. We are now considering organising another RAIL event.

The workshop featured presenters from different backgrounds and language projects working on African indigenous languages. A list of the presentations and presenters can be found below. The presentations were all recorded and are available on the website:  The accepted publications can be found in the proceedings, which are available on the LREC website:


  1. Endangered African Languages Featured in a Digital Collection: The Case of the ‡Khomani San | Hugh Brody Collection
     Kerry Jones and Sanjin Muftic
  2. Usability and Accessibility of Bantu Language Dictionaries in the Digital Age: Mobile Access in an Open Environment
     Thomas Eckart, Sonja Bosch, Uwe Quasthoff, Erik Körner, Dirk Goldhahn and Simon Kaleschke
  3. Investigating an approach for low resource language dataset creation, curation and classification: Setswana and Sepedi
     Vukosi Marivate, Tshephisho Sefara and Abiodun Modupe
  4. Comparing Neural Network Parsers for a Less-resourced and Morphologically-rich Language: Amharic Dependency Parser
     Binyam Ephrem Seyoum, Yusuke Miyao and Baye Yimam Mekonnen
  5. Mobilizing Metadata: Open Data Kit (ODK) for Language Resource Development in East Africa
     Richard Griscom
  6. A computational grammar of Ga
     Lars Hellan
  7. Navigating Challenges of Multilingual Resource Development for Under-Resourced Languages: The Case of the African Wordnet Project
     Marissa Griesel and Sonja Bosch
  8. Building Collaboration-based Resources In Endowed African Languages: Case Of NTeALan Dictionaries Platform
     MBONING TCHIAZE Elvis, BASSAHAK Jean Marc, BALEBA Daniel, WANDJI Ornella and Assoumou Jules

We have learned several things. Firstly, we learned the importance of teamwork. Organising the workshop was definitely a team effort, and teamwork was just as important during the workshop itself. To make a virtual workshop run smoothly, at least three or four people are needed to handle the chairing, the registration, playing the presentations, and helping people with technical difficulties. Secondly, we realised that we should test the technical setup before going live, so that we can find solutions to potential problems early. In the end the workshop went well, but initially we had some problems playing the presentations using a shared screen. Thirdly, we experienced the entire process of organising a workshop. This included learning how to develop a call for papers and how to use the conference management system to allocate papers to our reviewers, send information to authors, and collect videos, amongst other things. We also learned the importance of having a task list and updating it each time a milestone was achieved or when we thought of other tasks that needed to be done en route to the workshop. The list we have now can serve as a template for the next workshop. Finally, we have met (virtually) many people interested in the field of African indigenous languages. We hope that this group of friendly and interested people will become part of the community working on resources for African indigenous languages.