A former high school teacher’s quest to improve the poor reading ability of learners in low-resource languages has resulted in a groundbreaking PhD centred on developing a way to measure text readability while building digital language resources for Sesotho. This research is the first of its kind and will also serve as a model for other Bantu languages.
Dr Johannes Sibeko, who left high school teaching to become a lecturer in digital humanities at Nelson Mandela University (NMU), where he coordinates the Digital Humanities Hub, was recently awarded a PhD in Languages and Literature by North-West University. He conducted his innovative research under the supervision of Menno van Zaanen, Professor of Digital Humanities at the South African Centre for Digital Language Resources (SADiLaR).
“My PhD study titled ‘Measuring Text Readability in Sesotho’ aimed to address the reading challenge within the South African context of low-resource languages by proposing methods for selecting texts suitable for learners’ reading at the levels of specific grades,” says Sibeko. “Text readability indicates how difficult or easy a text is to read. For my research, I surveyed readability measures (formulas that measure how easy or difficult a text is to read) for high-resource languages and adapted them for Sesotho, using data from examination papers, and translations such as the Bible. I also developed four basic language resources for Sesotho including two syllabification systems (to chop words into their syllables) which are the first of their kind for Southern African languages,” he explains. “Such resources are essential for the creation of the readability measures, with text features from Sesotho examination texts used to compute readability values and generate specific models.”
Laying the groundwork for machine learning applications
According to Sibeko, this work resulted in nine readability measures for Sesotho, providing an objective ranking of Sesotho text readability. “The adaptation of traditional English readability formulas addresses a gap in South African indigenous languages, with the models laying the groundwork for machine learning applications in Sesotho readability assessment and beyond to other low-resource languages. The findings also address the reading ability gaps by addressing the readability levels of the texts thereby ensuring that texts are correctly aligned to the target readers.”
Sibeko’s educational journey started with a BA degree in Language Practice with Sesotho, followed by a Master of Arts in Applied Language Studies and a Postgraduate Certificate in Education. After a few years of teaching Sesotho and English at a high school, he returned to academia by enrolling for a PhD. His research interests initially focused on language policy and linguistic landscapes, later shifting to language teaching and assessment upon completing his MA, before he decided to transition to digital humanities and the development of resources for low-resource languages.
From humanities to digital humanities
“Johannes has done very well during his PhD,” says Van Zaanen. “He was not really computational when he started out, but quite a bit of work in his PhD is now computational. In that sense, he is a wonderful example of someone from the field of humanities who has moved into the field of digital humanities.
“It was an exciting journey. Johannes did his thesis by publication, which means that he published his research at workshops, conferences, and in journals. He is very good at writing and has been very productive. Though his topic is most certainly related to digital humanities, his PhD is officially in Languages and Literature as there is no curriculum yet for Digital Humanities in South Africa,” Van Zaanen adds.
Sibeko has received several accolades for his research, both locally and abroad. For example, he received the Best Paper Award at the Digital Humanities Association of Southern Africa (DHASA) conference in 2021. In 2022, he was the only student presenter from Africa invited to the CLARIN (Common Language Resources and Technology Infrastructure) Annual Conference, and his PhD research paper was included in that conference’s post-conference proceedings. In the same year, Sibeko’s research outputs led to him being named the Faculty of Humanities Emerging Researcher of the Year at Nelson Mandela University.
Digital humanities presents endless possibilities
Despite his success, Sibeko remains humble as he remembers his early days in digital humanities.
“The field of digital humanities is still relatively new in South Africa. To be honest, I didn’t know much about digital humanities before landing my job at NMU. Shortly after joining, I learned that SADiLaR was hiring a professor in digital humanities, Menno van Zaanen, who would start in August that year. I made sure to contact him and secure myself a supervisor. He understood that I was clueless at the start, so he began by familiarising me with the possibilities in the field,” he recalls.
“I believe many of us are intrigued by digital humanities, but uncertain about its potential. I’m interested in this field primarily for the possibilities it presents. There is so much pleasure in being able to manipulate big data and to develop something that other people can use too. I encourage others to consider digital humanities as a field of research.”
Asked how he feels about his new status, Sibeko says it still feels unreal. “Being capped as a PhD graduate at North-West University was incredible. At first, I wasn’t planning to go, but the Deanery of the Faculty of Humanities at NMU insisted, and I’m so glad I attended – it was worth it. My PhD journey was long and lonely at times, but here I am, and the end is quite sweet. The scary part is now moving on from the PhD – real life has begun, and I am worried whether I am ready for it,” he adds.
His advice for other prospective PhD students is to do their thesis by publication. “It gives you a handful of achievements, and keeps you motivated, while you work towards the bigger picture. There is no feeling that compares to seeing one of your submissions in publication state.”
(Written by Birgit Ottermann)