2nd workshop on Resources for African Indigenous Languages (RAIL)
– PROGRAMME NOW AVAILABLE BELOW
The South African Centre for Digital Language Resources (SADiLaR) is organising the second RAIL workshop in the field of African Indigenous Language Resources. This workshop aims to bring together researchers who are interested in showcasing their research and thereby boosting the field of African indigenous languages. It provides an overview of the current state of the art and emphasises the availability of African indigenous language resources, including both data and tools. Additionally, it will allow for information sharing among researchers interested in African indigenous languages and start discussions on improving the quality and availability of these resources. Many African indigenous languages currently have no or very limited resources available and are often structurally quite different from better-resourced languages, which requires the development and use of specialised techniques. By bringing together researchers from different fields (e.g., (computational) linguistics, sociolinguistics, language technology) to discuss the development of language resources for African indigenous languages, we hope to boost research in this field.
The Resources for African Indigenous Languages (RAIL) workshop is an interdisciplinary platform for researchers working on resources (data collections, tools, etc.) specifically targeted towards African indigenous languages. It aims to create the conditions for the emergence of a scientific community of practice that focuses on data, as well as tools, specifically designed for or applied to indigenous languages found in Africa.
Suggested topics include the following:
- Computational linguistics for African indigenous languages
- Descriptions of corpora or other data sets of African indigenous languages
- Building resources for (under-resourced) African indigenous languages
- Developing and using African indigenous languages in the digital age
- Effectiveness of digital technologies for the development of African indigenous languages
- Revealing unknown or unpublished existing resources for African indigenous languages
- Developing desired resources for African indigenous languages
- Improving quality, availability and accessibility of African indigenous language resources
Submission Guidelines
Link for submissions: DHASA Conference – ConfTool – Login
RAIL 2021 invites the following type of submission:
- Full papers of 4 to 8 pages (plus additional pages for references if needed). Papers must strictly follow the DHASA style guide, which will be available on the conference website (Style guides | DHASA 2021).
- Papers must be submitted through the DHASA submission platform (ConfTool) and will be peer-reviewed.
When sending in your submission, be sure to select RAIL 2021 Submissions.
Important dates:
- Submission deadline: 13 September 2021
- Extension on submission deadline: 20 September 2021
- Final extension on submission deadline: 30 September 2021
- Date of notification: 15 October 2021
- Extension on date of notification: 25 October 2021
- Camera ready copy deadline: 10 November 2021
- RAIL Workshop: 29 November 2021, 08:30 – 13:00 SAST
Programme: Session Chair – Benito Trollip
08:30 – 08:40 | Opening and Welcoming | Mmasibidi Setaka |
08:40 – 09:00 | Development of linguistically annotated parallel language resources for four South African languages | Tanja Gaustad, Martin J. Puttkammer |
09:00 – 09:20 | New uses for old books: Description of digitised corpora based Setswana language collection at WITS Cullen Africana Collection | Malebogo Thabong, Nina Lewin, Taariq Surtee |
09:20 – 09:40 | Digitising Afrikaans: Establishing a protocol for digitalizing historical sources for Early Afrikaans (1675-1925) as a possible template for indigenous South African languages | Roné Wierenga, Wannie Carstens |
09:40 – 10:00 | Investigating the feasibility of harvesting broadcast speech data to develop resources for South African languages | Jaco Badenhorst, Febe de Wet |
10:00 – 10:20 | A novel method for redefining language ecology and endangerment in Nigeria – towards a geospatial solution | Imelda Udoh, Moses Ekpenyong, Eno-Abasi Urua, Harrison Adeniyi, Gregory Obiamalu, Ayo Yusuff, Ogbonna Anyanwu, Ebitare Obikudo |
10:20 – 10:25 | Masakhane: Bridging the gap between NLP practitioners and linguists | Olanrewaju Samuel |
10:25 – 10:30 | Carpentries session | Mmasibidi Setaka |
10:30 – 11:00 | BREAK | |
11:00 – 11:20 | An Open Source System for Crowd Sourcing an African Language Short Story Corpus | Benson Muite |
11:20 – 11:40 | Training Cross-Lingual embeddings for Setswana and Sepedi | Mack Makgatho, Vukosi Marivate, Tshephisho Sefara, Valencia Wagner |
11:40 – 12:00 | Wordsmith Tools as an Enabler for Text Analysis | Rooweither Mabuya |
12:00 – 12:20 | Canonical Segmentation and Syntactic Morpheme Tagging of Four Resource-scarce Nguni Languages | Jakobus S. du Toit, Martin J. Puttkammer |
12:20 – 12:40 | Using MonoConc Pro to teach and learn lexical collocations in Xitsonga | Respect Mlambo, Muzi Matfunjwa |
12:40 – 13:00 | CLOSING |
Registration
The RAIL workshop will be co-located with the DHASA conference, and therefore registration will run through the DHASA website.
Participants will have to register for the conference and choose to attend the RAIL workshop during the registration process.
Organising committee
Rooweither Mabuya
Mmasibidi Setaka
Deon du Plessis
Dimakatso Mathe
Respect Mlambo
Liané Van Den Bergh
Cascious Mofokeng
Muzi Matfunjwa
South African Centre for Digital Language Resources (SADiLaR), South Africa
Programme committee
Ayodele James Akinola, Michigan Technological University, USA
Sonja Bosch, University of South Africa, South Africa
Elias Malete, University of the Free State, South Africa
Emmanuel Ngue Um, University of Yaoundé I, Cameroon
Pule Phindane, Central University of Technology, South Africa
Felix Ameka, Leiden University, Netherlands
Elsabé Taljard, University of Pretoria, South Africa
Mpho Raborife, University of Johannesburg, South Africa
Marissa Griesel, University of South Africa, South Africa
Roald Eiselen, North-West University, South Africa
Sree Thottempudi, South African Centre for Digital Language Resources, South Africa
Deon du Plessis, South African Centre for Digital Language Resources, South Africa
Dimakatso Mathe, South African Centre for Digital Language Resources, South Africa
Benito Trollip, South African Centre for Digital Language Resources, South Africa
Muzi Matfunjwa, South African Centre for Digital Language Resources, South Africa