First workshop on Resources for African Indigenous Languages (RAIL)
Free, online workshop
The South African Centre for Digital Language Resources (SADiLaR) is organizing a workshop (originally expected to be held at the LREC 2020 conference in Marseille, France) in the field of African Indigenous Language Resources. The workshop aims to bring together researchers who want to showcase their research and thereby boost the field of African indigenous languages. It provides an overview of the current state of the art and emphasizes the availability of African indigenous language resources, including both data and tools. It also allows researchers interested in African indigenous languages to share information and to start discussions on improving the quality and availability of these resources. Many African indigenous languages currently have no or very limited resources available, and they are often structurally quite different from better-resourced languages, which requires the development and use of specialized techniques. By bringing together researchers from different fields (e.g., (computational) linguistics, sociolinguistics, language technology) to discuss the development of language resources for African indigenous languages, we hope to boost research in this field.
The Resources for African Indigenous Languages (RAIL) workshop is an interdisciplinary platform for researchers working on resources (data collections, tools, etc.) specifically targeted towards African indigenous languages. It aims to create the conditions for the emergence of a scientific community of practice that focuses on data, as well as tools, specifically designed for or applied to indigenous languages found in Africa. With the UNESCO-supported International Year of Indigenous Languages, there is currently much interest in indigenous languages. The Permanent Forum on Indigenous Issues noted that "40 percent of the estimated 6,700 languages spoken around the world were in danger of disappearing" and that these "languages represent complex systems of knowledge and communication and should be recognized as a strategic national resource for development, peace building and reconciliation." As such, the workshop falls within one of the hot topic areas of this year's conference: "Less Resourced and Endangered Languages".
Topics include the following:
Language collections (description and creation)
Computational linguistic tools
Unfortunately, due to the COVID-19 pandemic, the RAIL workshop will not be held in Marseille, France this year. Instead, the workshop will take place online. Participation is free, but you need to register on Eventbrite to take part. Details on how to join the workshop will be sent to registered participants. The workshop will take place on Saturday 16 May 2020 from 9:00 until 13:00 SAST.
Free virtual workshop on African indigenous languages
09:10-09:30 Endangered African Languages Featured in a Digital Collection: The Case of the ‡Khomani San | Hugh Brody Collection
Kerry Jones and Sanjin Muftic
09:30-09:50 Usability and Accessibility of Bantu Language Dictionaries in the Digital Age: Mobile Access in an Open Environment
Thomas Eckart, Sonja Bosch, Uwe Quasthoff, Erik Körner, Dirk Goldhahn and Simon Kaleschke
09:50-10:10 Investigating an Approach for Low Resource Language Dataset Creation, Curation and Classification: Setswana and Sepedi
Vukosi Marivate, Tshephisho Sefara and Abiodun Modupe
10:10-10:30 Comparing Neural Network Parsers for a Less-resourced and Morphologically-rich Language: Amharic Dependency Parser
Binyam Ephrem Seyoum, Yusuke Miyao and Baye Yimam Mekonnen
10:30-10:50 Mobilizing Metadata: Open Data Kit (ODK) for Language Resource Development in East Africa
10:50-11:20 Coffee break
11:20-11:40 A Computational Grammar of Ga
11:40-12:00 Navigating Challenges of Multilingual Resource Development for Under-Resourced Languages: The Case of the African Wordnet Project
Marissa Griesel and Sonja Bosch
12:00-12:20 Building Collaboration-based Resources in Endowed African Languages: Case of NTeALan Dictionaries Platform
Elvis Mboning Tchiaze, Jean Marc Bassahak, Daniel Baleba, Ornella Wandji and Jules Assoumou
Identify, Describe and Share your LRs!
Describing your language resources (LRs) in the LRE Map is now a normal practice in the submission procedure of LREC (introduced in 2010 and adopted by other conferences). To continue the efforts initiated at LREC 2014 on “Sharing LRs” (data, tools, web-services, etc.), authors will have the possibility, when submitting a paper, to upload LRs to a special LREC repository. This effort of sharing LRs, linked to the LRE Map for their description, may become a new “regular” feature for conferences in our field, thus contributing to a common repository where everyone can deposit and share data.
Scientific work requires accurate citation of referenced work so that the community can understand the whole context and replicate the experiments conducted by other researchers. LREC 2020 therefore endorses the need to uniquely identify LRs through the use of the International Standard Language Resource Number (ISLRN, www.islrn.org), a Persistent Unique Identifier to be assigned to each Language Resource. The assignment of ISLRNs to LRs cited in LREC papers will be offered at submission time.