Latest News
The one-day SWiP event comprised a panel discussion by experts in language preservation, culture, and the digitisation of information, and an introductory mini-workshop on Wikipedia focusing on editing, translating, and making content available online.
-
Communicative Development Inventories for all eleven of South Africa’s official languages
Project Type: Node
Project Start Date: 1 January 2018
Project Status: Phase 1 complete; Phase 2 in progress
Project Aims: The aim of this project is to collect and digitise data on children’s language development from 8 to 30 months and, from these data, to construct and validate Communicative Development Inventories (CDIs), which are parent-completed questionnaires (for…
-
Development of a multi-level, multi-genre learner corpus of academic writing
Project Type: Node
Project Start Date: 1 March 2017
Project Status: Completed
Project Aims: Development of a multi-genre, multi-level learner corpus of academic writing in order to develop, refine and implement an online academic writing tool; Collect additional data from various universities in South Africa to grow the multi-level, multi-genre learner corpus of academic writing; Redevelopment…
-
Enabling localised language technology applications: A computational wide-coverage resource grammar for isiZulu
Project Type: Node
Project Start Date: 1 April 2020
Project Status: Completed
Project Aims: The CSIR node of SADiLaR recently completed a project whose main aim was to deliver to the research community a high-quality, computational, wide-coverage resource grammar (WCRG) for isiZulu. WCRGs unlock opportunities for the South African languages to participate in multilingual research,…
-
Digitisation of Language resources
Project Type: Node
Project Start Date: 1 April 2017
Project Status: Ongoing
Project Aims and Motivation: The UP digitisation node focuses on the preservation of invaluable language (and cultural) resources for the African languages by digitising textual, video and audio material, and providing language communities with access to these digital resources via the SADiLaR repository. Digitised content delivered by…
-
Linguistic corpus enrichment for conjunctively written South African languages
Project Type: Node
Start Date: 1 October 2017
Project Status: Completed and delivered
Project Aims: This project, developed under the Nodes Specialisation Project, makes linguistically enriched corpora available for the four official South African languages with a conjunctive orthography, i.e. isiNdebele, isiXhosa, isiZulu, and Siswati. The parallel corpora consist of approximately 50,000 tokens each, aligned…
-
Mobile Dictionary application framework
Project Type: Node
Project Start Date: 1 August 2020
Project Status: In progress
Project Aims: The project aims to develop an open-source hybrid mobile application framework that will allow for online access to a TMS and dictionary API, managed through a TMS API manager (TAM), and offline access to local dictionary content. The framework will create a…
-
Spoken data corpus for Afrikaans, Setswana, Sesotho sa Leboa
Project Type: Open Call
Project Start Date: 1 January 2020
Project Status: Completed
Project Aims: The phonetics and phonology of Coloured Afrikaans have as yet barely received any serious attention. This is largely due to the lack of adequate spoken data corpora. Without such corpora, no complete and reliable acoustic descriptions are possible. In relation to this, satisfactory…
-
Corpus and system development for automatic captioning of official speeches
Project Type: SADiLaR Node – CSIR Speech Node
Project Start Date: 1 April 2020
Project Status: In progress
Project Aims: The primary aim of the proposed project is to create a corpus of automatically transcribed government speeches. The CSIR proposes to start with the current president (Mr Cyril Ramaphosa) and then expand the corpus with speeches made…
-
SADiLaR Publications
List of published and/or submitted research output (conference or journal papers, book chapters and other academic dissemination):
Bosch, S. and Griesel, M. 2018. African Wordnet: facilitating language learning in African languages. 9th Global Wordnet Conference, Singapore: Nanyang Technological University (NTU).
Type: Conference Paper
Link: (1) https://aclanthology.org/2018.gwc-1.36/ (2) http://compling.hss.ntu.edu.sg/events/2018-gwc/pdfs/GWC2018_paper_22.pdf
Baumann, A. and Wissing, D.P. 2018. Stabilising…