
Co-located with LREC 2026
RAIL Workshop date: 12 May 2026
RAIL website: https://sadilar.org/en/seventh-workshop-on-resources-for-african-indigenous-languages-rail-2026/
Submission link for the RAIL workshop: https://softconf.com/lrec2026/RAIL2026/
LREC Conference dates: 11-16 May 2026
LREC website: https://www.elra.info/lrec2026/
Venue: Palau de Congressos de Palma, Palma de Mallorca (Spain)
The Resources for African Indigenous Languages (RAIL) workshop provides an interdisciplinary platform for researchers working on resources such as data collections and annotations, Human Language Technologies (HLT) and Natural Language Processing (NLP) tools, and their applications, specifically targeted towards African indigenous languages. In particular, it aims to create the conditions for the emergence of a scientific community of practice that focuses on data and computational linguistic tools specifically designed for or applied to indigenous languages found in Africa. The seventh RAIL workshop will be co-located with the Language Resources and Evaluation Conference (LREC) 2026 at the Palau de Congressos de Palma, Palma de Mallorca (Spain).
Many African languages are under-resourced, while only a few are considered to be somewhat better resourced. These languages often share interesting properties, such as writing systems, that make them different from most high-resourced languages. From a computational perspective, these languages lack sufficient corpora to support high-level development of NLP and HLT tools, which in turn impedes the development of African languages in these areas. During previous workshops, it was noted that the problems and solutions presented were not only applicable to African languages but were also relevant to many other low-resource languages across the world. Because these languages share similar challenges, this workshop provides researchers with opportunities to work collaboratively on issues of language resource development and learn from each other.
The RAIL workshop has several aims. First, the workshop brings together researchers who work on African indigenous languages, forming a community of practice for people working on indigenous languages. Second, the workshop aims to reveal currently unknown or unpublished existing resources (corpora, NLP tools, and applications), resulting in a better overview of the current state of the art, and to allow for discussions on novel, desired resources for future research in this area. Third, it enhances sharing of knowledge on the development of low-resource languages. Finally, it enables discussions on how to improve the quality as well as availability of the resources.
The workshop theme is “Creating resources for less-resourced African languages”, but submissions on any topic related to properties of African indigenous languages (including related non-African languages) may be accepted. Suggested topics include (but are not limited to) the following:
- Digital representations of linguistic structures
- Descriptions of corpora or other data sets of African indigenous languages
- Building resources for (under-resourced) African indigenous languages
- Developing and using African indigenous languages in the digital age
- Effectiveness of digital technologies for the development of African indigenous languages
- Revealing unknown or unpublished existing resources for African indigenous languages
- Developing desired resources for African indigenous languages
- Improving quality, availability and accessibility of African indigenous language resources
- Applications that make use of data collections of African indigenous languages
Submission requirements:
We invite papers on original, unpublished work related to the topics of the workshop. Submissions should present completed work and adhere to the LREC conference requirements, which are described in LREC’s authors kit: https://lrec2026.info/authors-kit/. Submissions should be double-blind and between four and eight pages long. Only oral papers should be submitted. The page limit excludes a compulsory ethics statement, a discussion of limitations, references, optional acknowledgements, and, where applicable, data and code availability statements. Appendices or supplementary material are allowed, but this information will not necessarily be taken into account during the review process.
The submission link for the RAIL workshop: https://softconf.com/lrec2026/RAIL2026/
Authors are encouraged to upload their datasets to the SADiLaR repository: https://repo.sadilar.org/. In case of difficulties uploading the datasets, please reach out to Benito Trollip (benito.trollip@nwu.ac.za).
Important dates:
Submission deadline: 1 March 2026 AoE
Date of notification: 18 March 2026 AoE
Camera ready copy deadline: 30 March 2026 AoE
Workshop: 12 May 2026
Organising Committee:
Muzi Matfunjwa, South African Centre for Digital Language Resources (SADiLaR), South Africa
Mmasibidi Setaka, South African Centre for Digital Language Resources (SADiLaR), South Africa
Rooweither Mabuya, South African Centre for Digital Language Resources (SADiLaR), South Africa
Menno van Zaanen, South African Centre for Digital Language Resources (SADiLaR), South Africa
Note that the RAIL workshop is part of a series of workshops. You can find information on the other workshops at https://sadilar.org/en/rail/.
RAIL 2026: Program
| Tuesday, May 12, 2026 | |
| 14:00 – 18:00 |
Session RAIL –
The Seventh Workshop on Resources for African Indigenous Languages 2026
– Room #2
Chair: Muzi Matfunjwa, South African Centre for Digital Language Resources
Co-Chairs: Mmasibidi Setaka and Menno van Zaanen, South African Centre for Digital Language Resources |
| 14:00 – 14:15 |
Opening
Menno van Zaanen |
| 14:15 – 14:30 |
A Morpho-Syntactically Annotated Corpus of Ògè Folk Narratives with a Focus on Nominal Structure
Priscilla Adenuga
Independent Researcher |
| 14:30 – 14:45 |
Extension of Linguistic Resources for South African Languages: Part-of-Speech Annotated Domain-Specific Data
Tanja Gaustad¹, Roald Eiselen², Cindy Arlene McKellar³
¹Centre for Text Technology (CTexT), North-West University; ²Centre for Text Technology, North-West University; ³Centre for Text Technology, North-West University, Potchefstroom Campus |
| 14:45 – 15:00 |
Mining Large Language Models for Low-Resource Language Data: Comparing Elicitation Strategies for Hausa and Fongbe
Pericles Adjovi¹, Prasenjit Mitra¹, Roald Eiselen²
¹Carnegie Mellon University Africa; ²Northwestern University |
| 15:00 – 15:15 |
Comparing Source Language Selection Strategies for Multi-Source Cross-Lingual Transfer to African Languages
Tewodros Kederalah Idris¹, Prasenjit Mitra², Roald Eiselen³
¹Carnegie Mellon University Africa; ²Leibniz University of Hannover; ³Centre for Text Technology, North-West University |
| 15:15 – 15:30 |
Benchmarking text embedding models for South African languages
Ockert de Villiers and Roald Eiselen
Centre for Text Technology, North-West University |
| 15:30 – 15:45 |
Improving Amharic Information Retrieval with Translative and Multi-Agent Debate Retrieval Augmented Generation
Abel Alemu Jotie and Prasenjit Mitra
Carnegie Mellon University Africa |
| 15:45 – 16:00 |
Less can be More: Towards a Parameter-Efficient Fine-Tuning of Wav2Vec2 XLSR for Low-Resource Cape Verdean Creole ASR
Mateus Neves Andrade¹, Mouhamadou Lamine BA², Idy Diop², Arlindo Oliveira da Veiga³
¹University of Cape Verde; ²Université Cheikh Anta Diop, Senegal; ³University of Cape Verde, Cape Verde |
| 16:00 – 16:30 | Afternoon Coffee Break |
| 16:30 – 16:45 |
From Script to Semantics: Prompting Strategies for African NLI
Anuj Tiwari¹, Terry Oko-odion², Hannah Nwokocha²
¹Noida Institute of Engineering and Technology; ²ML Collective |
| 16:45 – 17:00 |
HaYo: Repurposing DiaSafety Dataset for Dialogue Safety Evaluation in Hausa and Yoruba
Tunde Oluwaseyi Ajayi¹, Bolade Deborah Ashaolu², Falalu Ibrahim Lawan³, Daud Olamide Abolade⁴, Amina Imam Abubakar⁵, Oluwatosin Ayomide Akinrinde⁶, Murja Sani Gadanya⁷, Omodolapo Dorcas Ashaolu⁴, Abubakar Khalid Auwal⁷, Adewumi Awujoola², Shamsuddeen Umaru Adamu⁸, Israel Olawole Ashaolu², Mihael Arcan⁹, Paul Buitelaar¹⁰
¹Insight Research Ireland Centre for Data Analytics, Data Science Institute, University of Galway; ²University of Ilorin; ³Federal University of Technology Babura; ⁴Masakhane; ⁵University of Abuja; ⁶Ladoke Akintola University of Technology at Ogbomosho; ⁷Bayero University Kano; ⁸Kaduna State University; ⁹Lua Health; ¹⁰University of Galway |
| 17:00 – 17:15 |
Reclaiming African Voices: Surveying Indigenous Writing Systems for Inclusive NLP
Mamady Traore¹, Ngoc Tan Le², Fatiha Sadat¹
¹UQAM; ²Universite du Quebec a Montreal |
| 17:15 – 17:30 |
Getting Close to Cloze: Towards Readability Resources for Afrikaans
Susan Lotz, Rik van Noord, Gertjan van Noord
University of Groningen |
| 17:30 – 17:45 |
The Hundzula Retreat-Based Infrastructure Model for African Natural Language Processing
Johannes Sibeko¹, Seani Rananga², Neo N Putini³, Dan Masethe⁴
¹Nelson Mandela University; ²University of Pretoria; ³University of KwaZulu-Natal; ⁴Tshwane University of Technology |
| 17:45 – 18:00 |
Open but Incompatible: A License Compatibility Analysis of Corpora for Low-Resource African Languages
Ernst A.P. van Gassen
Arktos Applied Intelligence |