Project Type: Node
Project Start Date: 1 April 2020
Project Status: Completed
Project Aims:
The CSIR node of SADiLaR recently completed a project whose main aim was to deliver a high-quality, computational, wide-coverage resource grammar (WCRG) for isiZulu to the research community. WCRGs unlock opportunities for the South African languages to participate in multilingual research, nationally and internationally.
The project focused on developing the foundational components of the WCRG: the isiZulu resource grammar (RG) itself, a lexicon aimed at enabling wide coverage, and a framework for development and evaluation based on a manually curated treebank. In addition, an extension module was developed to enable chunk parsing via the grammar, and a web service was built to provide parsing and linearisation functionality. A web user interface showcases the isiZulu RG and makes it available to end users in the Natural Language Processing (NLP) community.
Project Deliverables:
1. Resource Grammar for isiZulu
Implementation of isiZulu RG functions, merged into the official GF RGL repository
Access at: https://github.com/GrammaticalFramework/gf-rgl
2. GF Lexicon modules
Monolingual and multilingual GF concrete and abstract syntax modules
Access at: https://github.com/GrammaticalFramework/gf-rgl
Phrase-level adjectival qualificative GF concrete and abstract syntax modules
Access at: https://github.com/LauretteM/gf-afwn
3. Treebanks
A manually curated treebank of 1000 sentences and a set of treebanks for regression testing were developed
Access at: https://github.com/LauretteM/gf-zulu-resources
Automatically generated treebanks: VulaBula Graded Reader treebank, isiZulu Wordnet usage examples treebank
Access at: https://github.com/LauretteM/gf-zulu-resources
4. GF chunk extension module
GF modules PChunk.gf and PChunkZul.gf, merged into the official GF RGL repository
Access at: https://github.com/GrammaticalFramework/gf-rgl
5. REST API web service and a web user interface
A web service for parsing of isiZulu sentences and linearisation of abstract parse trees.
Access at: https://rhonda.qfrency.com/api/v1/mt/zulurg/v1
A web user interface to serve end users of the RG.
Access at: https://grammar.qfrency.com/
6. Capacity development and research outputs
Slides presented at GF Summer School
Access at: https://github.com/LauretteM/gf-zulu-resources
Slides presented at GF online seminars
Access at: https://github.com/LauretteM/gf-zulu-resources
International workshop
Title: Approximating a Zulu GF concrete syntax with a neural network for natural language understanding
Presented at CNL 2021
Access at: https://sadilar.org/wp-content/uploads/2021/11/2021.cnl-1.4.pdf
Title: Extending the Usage of Adjectives in the Zulu AfWN
Presented at GWC 2023
Access at: https://sadilar.org/wp-content/uploads/2021/11/GWC2023_paper_5400.pdf
Title: Parsing Zulu text using Grammatical Framework
Submitted to CLIRAI (special session) 2023
Not available yet
Title: Leveraging a resource grammar for developing language resources for Zulu
Submitted to Language Resources and Evaluation
Not available yet
Contact Person:
Dr Laurette Marais, node manager: LMarais@csir.co.za