PhD research paper puts SADiLaR in the global spotlight

The South African Centre for Digital Language Resources (SADiLaR) is enjoying international bragging rights, thanks to a PhD research paper that was included in the Post Conference Proceedings publication of the 2022 CLARIN Annual Conference.

Johannes Sibeko, a PhD student supervised by SADiLaR's Prof Menno van Zaanen, was one of twelve doctoral scholars selected to present an abstract of their research in the PhD Student Session of the 11th annual CLARIN conference, which took place from 10 to 12 October 2022 in Prague, Czech Republic. His presentation was so well received that he was invited to submit a full paper for inclusion in the Post Conference Proceedings, an open-access publication that presents the highlights of the entire conference.

CLARIN, which is short for Common Language Resources and Technology Infrastructure, is a digital infrastructure that offers data, tools and services to support research based on language resources. Sibeko's paper presents his doctoral research project, which explores the development of resources for measuring text readability in Sesotho, a Bantu language spoken by more than 10 million people across Southern Africa.

The only student presenter from Africa

“The acceptance letter meant the world to me,” says Sibeko, who is a lecturer in Digital Humanities at the Nelson Mandela University in Gqeberha. “I was really nervous to present at the conference, but at the same time excited for the opportunity to be on an international stage. Being chosen to represent SADiLaR, as well as being the only student presenter from Africa, was a great honour for me. I felt very proud.”

Unfortunately, Sibeko was unable to attend the conference in person because of visa issues. “I ended up attending online only. It was very disappointing because I couldn’t participate in the interesting activities organised by CLARIN and I also missed out on networking opportunities. However, I remain positive that there will be more travel opportunities in the future.”

Reflecting on his achievement as an early-career researcher, Sibeko says he should start believing more in himself and approach international publication outlets with more confidence. “I feel like I am just arriving where I wished to be in my research journey.”

Addressing learners’ poor reading ability

Asked about the topic for his PhD research, Sibeko explains that South African learners are lacking in reading skills. “In education, teachers are expected to choose and adapt texts to their learners’ levels. However, these processes are intuitive and subjective. As a result, there is no objective way of assuring that texts administered for learning, teaching and assessment are of the correct readability levels,” he says.

“An objective measure of text readability in Sesotho will help in the selection and adaptation of texts for different purposes and expected levels. My study, therefore, aims to develop metrics for measuring text readability that can benefit researchers, authors, teachers, and readers. The aim is to adapt nine existing readability metrics into Sesotho using English as a higher-resourced helper language. All of the modules will be published open access on SADiLaR’s repository.”

Sibeko also hopes to develop a web-based application to provide access to automated text readability analysis that will allow the user to paste texts and receive a readability analysis report.

According to Menno van Zaanen, Professor of Digital Humanities at SADiLaR and North-West University, Sibeko’s research illustrates the importance and applicability of digital language resources for South African languages. “Being able to measure the readability of Sesotho texts allows lecturers to select suitable texts for learners, and professional writers to adjust their texts to their relevant audiences. Not only is Johannes’ work interesting from an academic perspective, it illustrates how these resources can be used to boost the South African languages, like Sesotho, for the general public,” he comments.

Former high school teacher

Sibeko is a former high school teacher of Sesotho and English, and his research is driven by his desire to maximise opportunities for language learning. “The challenge of selecting and aligning reading texts with readers goes beyond Sesotho. Nonetheless, Sesotho is selected as the initial approach to tackle this overarching issue. I am hopeful that readability metrics can be developed for other indigenous languages too.”

Sibeko was exposed to Digital Humanities for the first time when he landed a post at Nelson Mandela University and was required to explore this new field. “I had no idea what it was when I first started out,” he recalls. “Now I have immersed myself in this field. I am having a really good time doing my PhD and the research is becoming that much more interesting.”

Caption: Mr Johannes Sibeko

(Written by Birgit Ottermann)