The Unisa Node of the South African Centre for Digital Language Resources (SADiLaR) recently had the privilege of hosting Dr Gertrud Faaß, lecturer and computational linguist at the University of Hildesheim, Germany. Dr Faaß has a special interest in developing digital language resources and natural language processing tools for South African languages, and she has been a valued collaborator on several local research projects at Unisa since 2013. The goal of her most recent visit was to present two workshops. One focused on digitising a historic German-Sotho dictionary, Karl Endemann’s Wörterbuch der Sothosprache, while the other provided members of the African Wordnet team with advanced corpus linguistic skills.
In the first workshop, held on 20 February 2025 at the University of Pretoria, members of the project to digitise Karl Endemann’s Wörterbuch der Sothosprache met to discuss recent progress towards releasing parts of the dictionary in a more accessible format. Published in 1911, the dictionary is a unique multilingual resource that provides translations of Sotho languages and dialects into German. It is invaluable for linguistic and cultural research but remains largely inaccessible, since only three copies are known to exist.
Prior to this workshop, the data were converted into a machine-readable format using open-source optical character recognition software. About 300 articles were then selected from these scanned documents, translated into English and quality assured by experts in Sesotho sa Leboa and Sesotho. Efforts are currently focused on developing a database and designing a user-friendly website that will ultimately showcase the team’s work. More information can be found on the project’s website.
The second stop on Dr Faaß’s research visit was the Department of African Languages at Unisa, where she shared her extensive knowledge of corpus linguistics with the African Wordnet team and other guests. This workshop, held on 21 February 2025, had more than 40 participants and focused largely on the principles and practices to follow in resource development. Dr Faaß shared insights from her own research on South African languages (particularly isiZulu and Sesotho sa Leboa) and provided a valuable tutorial on the use of specialised tools such as AntConc to perform searches on large corpora.
The African Wordnet team and guests who attended this workshop came away with new knowledge and skills to apply to their own research, development and teaching, in addition to a broader view of current trends in the field of corpus linguistics. Participants noted that this workshop was very valuable to them and that they were eager to put this knowledge to good use in further developing South African languages as scientific fields of interest.
Dr Faaß’s visit was made possible by funding through SADiLaR to the African Wordnet project. We would like to express our sincere thanks to the team at SADiLaR, who also assisted with travel arrangements and workshop preparations. Our thanks also go to the University of Pretoria and Unisa nodes of SADiLaR for hosting the two workshops.

Team members not pictured here: Dr Elias Malete (University of the Free State), Dr Johannes Sibeko (Nelson Mandela University) & Prof Sonja Bosch (Unisa).