{"id":6663,"date":"2023-04-11T14:26:06","date_gmt":"2023-04-11T14:26:06","guid":{"rendered":"https:\/\/sadilar.org\/emnlp-2022\/"},"modified":"2023-04-11T14:26:06","modified_gmt":"2023-04-11T14:26:06","slug":"emnlp-2022","status":"publish","type":"post","link":"https:\/\/sadilar.org\/en\/emnlp-2022\/","title":{"rendered":"SADiLaR ambassadors for Nguni languages at international conference"},"content":{"rendered":"<div class=\"googlefontscall\"><\/div>\n<div class=\"pagebuilderckparams\" data-colorpalettefromtemplate=\"\" data-colorpalettefromsettings=\",,,,\" data-styles=\"\"><\/div>\n<div class=\"rowck ckstack3 ckstack2 ckstack1 uick-sortable\" id=\"row_ID1681222971942\" data-gutter=\"2%\" data-nb=\"1\" style=\"position: relative;\">\n<style class=\"ckcolumnwidth\">[data-gutter=\"2%\"][data-nb=\"1\"]:not(.ckadvancedlayout) [data-width=\"100\"] {width:100%;}[data-gutter=\"2%\"][data-nb=\"1\"].ckadvancedlayout [data-width=\"100\"] {width:100%;}<\/style>\n<div class=\"inner animate clearfix\">\n<div class=\"blockck\" id=\"block_ID1681222971942\" data-real-width=\"100%\" data-width=\"100\" style=\"position: relative;\">\n<div class=\"ckstyle\"><\/div>\n<div class=\"inner animate resizable\">\n<div class=\"innercontent uick-sortable\">\n<div id=\"ID1681223010545\" class=\"cktype\" data-type=\"text\" style=\"position: relative;\">\n<div class=\"tab_effects ckprops\" fieldslist=\"\"><\/div>\n<div class=\"tab_blocstyles ckprops\" blocbackgroundpositionend=\"100\" blocbackgrounddirection=\"topbottom\" blocbackgroundimageattachment=\"scroll\" blocbackgroundimagerepeat=\"no-repeat\" blocbackgroundimagesize=\"auto\" blocbordertopstyle=\"solid\" blocborderrightstyle=\"solid\" blocborderbottomstyle=\"solid\" blocborderleftstyle=\"solid\" blocbordersstyle=\"solid\" blocshadowinset=\"0\" 
fieldslist=\"blocbackgroundpositionend,blocbackgrounddirection,blocbackgroundimageattachment,blocbackgroundimagerepeat,blocbackgroundimagesize,blocalignementleft,blocalignementcenter,blocalignementright,blocalignementjustify,blocbordertopstyle,blocborderrightstyle,blocborderbottomstyle,blocborderleftstyle,blocbordersstyle,blocshadowinset\"><\/div>\n<div class=\"tab_edition ckprops\" fieldslist=\"\"><\/div>\n<div class=\"ckstyle\">\n<style><\/style>\n<\/div>\n<div class=\"cktext inner\" style=\"position: relative;\" spellcheck=\"false\">\n<p><span style=\"font-family: 'trebuchet ms', geneva, sans-serif; font-size: 10pt;\">Two digital humanities researchers from the South African Centre for Digital Language Resources (SADiLaR) attended the 2022 Conference on Empirical Methods in Natural Language Processing (<a href=\"https:\/\/2022.emnlp.org\/\">EMNLP 2022<\/a>) in Abu Dhabi from 7 to 11 December 2022.<\/span><\/p>\n<\/p>\n<p><span style=\"font-family: 'trebuchet ms', geneva, sans-serif; font-size: 10pt;\">It was the first time that Andiswa Bukula and Rooweither Mabuya, SADiLaR language researchers for isiXhosa and isiZulu respectively, had the opportunity to attend a Natural Language Processing (NLP) conference. 
They attended many interesting sessions and presented a poster on MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition, a paper that introduces the largest human-annotated named entity recognition (NER) dataset to date, covering 20 African languages.<\/span><\/p>\n<\/p>\n<p><span style=\"font-family: 'trebuchet ms', geneva, sans-serif; font-size: 10pt;\">The conference was hosted at the Abu Dhabi National Exhibition Centre by New York University Abu Dhabi (NYUAD), in partnership with Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), and featured keynote speeches by leading voices in artificial intelligence, including Mona Diab, lead AI research scientist at Meta and Professor of Computer Science at the George Washington University; Neil Cohn, an American cognitive scientist best known for his research on the overlap in structure and cognition between language and graphic communication, including comics and emojis; Gary Marcus, a scientist, best-selling author, and serial entrepreneur; and Nazneen Rajani, research lead at Hugging Face, a startup with a mission to democratise machine learning. The conference also comprised 24 workshops and six tutorials. About 2500 participants attended the five-day conference, either virtually or in person.<\/span><\/p>\n<\/p>\n<p><span style=\"font-family: 'trebuchet ms', geneva, sans-serif; font-size: 10pt;\"><strong>Shining on the global stage<\/strong><\/span><\/p>\n<\/p>\n<p><span style=\"font-family: 'trebuchet ms', geneva, sans-serif; font-size: 10pt;\">\u201cIt was very intriguing to be part of such an experience,\u201d says Andiswa Bukula. \u201cNatural Language Processing (NLP) is a field we are slowly being introduced to as budding South African researchers.\u201d<\/span><\/p>\n<\/p>\n<p><span style=\"font-family: 'trebuchet ms', geneva, sans-serif; font-size: 10pt;\">What stood out to Bukula immediately was the beauty of Abu Dhabi and of its culture and cultural practices. 
The conference itself gave her the opportunity to learn more about the NLP research being done on a global scale. \u201cAnd, to be able to share what we are doing within the South African context on a global stage was the highlight of the entire conference for me, especially speaking about some of our indigenous languages in South Africa and the larger contribution we did in the creation of the largest human-annotated Named Entity Recognition (NER) dataset for African languages,\u201d she comments.<\/span><\/p>\n<\/p>\n<p><span style=\"font-family: 'trebuchet ms', geneva, sans-serif; font-size: 10pt;\">According to Rooweither Mabuya, it was a real privilege to attend one of the best events in the field. \u201cAs a result of attending the conference, I was able to expand my network of contacts within the field, having the opportunity to meet and interact with scholars coming from diverse geographical backgrounds, including Africa and beyond. This experience has created possibilities for future collaborations in future research.\u201d<\/span><\/p>\n<\/p>\n<p><span style=\"font-family: 'trebuchet ms', geneva, sans-serif; font-size: 10pt;\"><strong><\/strong><\/span><\/p>\n<\/div><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"ckstyle\"><\/div>\n<\/div>\n<div class=\"rowck ckstack3 ckstack2 ckstack1 uick-sortable\" id=\"row_ID1681223083577\" data-gutter=\"2%\" data-nb=\"3\" style=\"position: relative;\">\n<style class=\"ckcolumnwidth\">[data-gutter=\"2%\"][data-nb=\"3\"]:not(.ckadvancedlayout) [data-width=\"33.333333333333336\"] {width:32%;}[data-gutter=\"2%\"][data-nb=\"3\"].ckadvancedlayout [data-width=\"33.333333333333336\"] {width:33.333333333333336%;}[data-gutter=\"2%\"][data-nb=\"3\"]:not(.ckadvancedlayout) [data-width=\"33.333333333333336\"] {width:32%;}[data-gutter=\"2%\"][data-nb=\"3\"].ckadvancedlayout [data-width=\"33.333333333333336\"] {width:33.333333333333336%;}[data-gutter=\"2%\"][data-nb=\"3\"]:not(.ckadvancedlayout) 
[data-width=\"33.333333333333336\"] {width:32%;}[data-gutter=\"2%\"][data-nb=\"3\"].ckadvancedlayout [data-width=\"33.333333333333336\"] {width:33.333333333333336%;}<\/style>\n<div class=\"inner animate clearfix\">\n<div class=\"blockck\" id=\"block_ID1681223083577\" data-real-width=\"32%\" data-width=\"33.333333333333336\" style=\"position: relative;\">\n<div class=\"ckstyle\"><\/div>\n<div class=\"inner animate resizable\">\n<div class=\"innercontent uick-sortable\">\n<div id=\"ID1681223088216\" class=\"cktype\" data-type=\"image\" onshow=\"ckAddDndForImageUpload(jQuery('#ID1681223088216')[0]);\" style=\"position: relative;\">\n<div class=\"ckstyle\">\n\t\t\t<\/div>\n<div class=\"imageck\">\n\t\t\t\t<img decoding=\"async\" width=\"100%\" height=\"auto\" src=\"https:\/\/sadilar.org\/wp-content\/uploads\/2023\/04\/AR-3-scaled.jpeg\" data-src=\"https:\/\/sadilar.org\/wp-content\/uploads\/2023\/04\/AR-3-scaled.jpeg\">\n\t\t\t<\/div>\n<\/p><\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"blockck\" id=\"block_ID1681223085165\" data-real-width=\"32%\" data-width=\"33.333333333333336\" style=\"position: relative;\">\n<div class=\"ckstyle\"><\/div>\n<div class=\"inner animate resizable\">\n<div class=\"innercontent uick-sortable\"><\/div>\n<\/div>\n<\/div>\n<div class=\"blockck\" id=\"block_ID1681223085461\" data-real-width=\"32%\" data-width=\"33.333333333333336\" style=\"position: relative;\">\n<div class=\"ckstyle\"><\/div>\n<div class=\"inner animate resizable\">\n<div class=\"innercontent uick-sortable\">\n<div id=\"ID1681223092736\" class=\"cktype\" data-type=\"image\" onshow=\"ckAddDndForImageUpload(jQuery('#ID1681223092736')[0]);\" style=\"position: relative;\">\n<div class=\"ckstyle\">\n\t\t\t<\/div>\n<div class=\"imageck\">\n\t\t\t\t<img decoding=\"async\" width=\"100%\" height=\"auto\" src=\"https:\/\/sadilar.org\/wp-content\/uploads\/2023\/04\/AR-1.jpeg\" 
data-src=\"https:\/\/sadilar.org\/wp-content\/uploads\/2023\/04\/AR-1.jpeg\">\n\t\t\t<\/div>\n<\/p><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"ckstyle\"><\/div>\n<\/div>\n<div class=\"rowck ckstack3 ckstack2 ckstack1 uick-sortable\" id=\"row_ID1681223046717\" data-gutter=\"2%\" data-nb=\"1\" style=\"position: relative;\">\n<style class=\"ckcolumnwidth\">[data-gutter=\"2%\"][data-nb=\"1\"]:not(.ckadvancedlayout) [data-width=\"100\"] {width:100%;}[data-gutter=\"2%\"][data-nb=\"1\"].ckadvancedlayout [data-width=\"100\"] {width:100%;}<\/style>\n<div class=\"inner animate clearfix\">\n<div class=\"blockck\" id=\"block_ID1681223046728\" data-real-width=\"100%\" data-width=\"100\" style=\"position: relative;\">\n<div class=\"ckstyle\"><\/div>\n<div class=\"inner animate resizable\">\n<div class=\"innercontent uick-sortable\">\n<div id=\"ID1681223046728\" class=\"cktype\" data-type=\"text\" style=\"position: relative;\">\n<div class=\"tab_effects ckprops\" fieldslist=\"\"><\/div>\n<div class=\"tab_blocstyles ckprops\" blocbackgroundpositionend=\"100\" blocbackgrounddirection=\"topbottom\" blocbackgroundimageattachment=\"scroll\" blocbackgroundimagerepeat=\"no-repeat\" blocbackgroundimagesize=\"auto\" blocbordertopstyle=\"solid\" blocborderrightstyle=\"solid\" blocborderbottomstyle=\"solid\" blocborderleftstyle=\"solid\" blocbordersstyle=\"solid\" blocshadowinset=\"0\" fieldslist=\"blocbackgroundpositionend,blocbackgrounddirection,blocbackgroundimageattachment,blocbackgroundimagerepeat,blocbackgroundimagesize,blocalignementleft,blocalignementcenter,blocalignementright,blocalignementjustify,blocbordertopstyle,blocborderrightstyle,blocborderbottomstyle,blocborderleftstyle,blocbordersstyle,blocshadowinset\"><\/div>\n<div class=\"tab_edition ckprops\" fieldslist=\"\"><\/div>\n<div class=\"ckstyle\">\n<style><\/style>\n<\/div>\n<div class=\"cktext inner\" style=\"position: relative;\" spellcheck=\"false\">\n<p><span style=\"font-family: 'trebuchet ms', geneva, 
sans-serif; font-size: 10pt;\" data-mce-style=\"font-family: 'trebuchet ms', geneva, sans-serif; font-size: 10pt;\"><\/span><\/p>\n<p><span style=\"font-family: 'trebuchet ms', geneva, sans-serif; font-size: 10pt;\" data-mce-style=\"font-family: 'trebuchet ms', geneva, sans-serif; font-size: 10pt;\"><strong>African languages under-represented in NLP research and development<\/strong><\/span><\/p>\n<p><span style=\"font-family: 'trebuchet ms', geneva, sans-serif; font-size: 10pt;\" data-mce-style=\"font-family: 'trebuchet ms', geneva, sans-serif; font-size: 10pt;\">Both Bukula and Mabuya are part of the research team behind MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition. According to the research abstract, African languages are spoken by over a billion people, but they are under-represented in NLP research and development. Multiple challenges exist, including the limited availability of annotated training and evaluation datasets, as well as a lack of understanding of which settings, languages, and recently proposed methods like cross-lingual transfer will be effective. 
<a href=\"https:\/\/sadilar.org\/wp-content\/uploads\/2023\/04\/MasakhaNER_2_0-1.pdf\" data-mce-href=\"https:\/\/sadilar.org\/wp-content\/uploads\/2023\/04\/MasakhaNER_2_0-1.pdf\">In their research paper<\/a>, the research team describes steps towards addressing these challenges, focusing on the task of named entity recognition (NER) and creating the largest human-annotated NER dataset to date, covering 20 African languages.<\/span><\/p>\n<p><span style=\"font-family: 'trebuchet ms', geneva, sans-serif; font-size: 10pt;\" data-mce-style=\"font-family: 'trebuchet ms', geneva, sans-serif; font-size: 10pt;\">In their poster presentation, Bukula and Mabuya discussed the research findings, which examine the behaviour of state-of-the-art cross-lingual transfer methods in an Africa-centric setting and demonstrate empirically that the choice of source transfer language significantly affects performance. \u201cWhile much previous work defaults to using English as the source language, the research team\u2019s results show that choosing the best transfer language improves zero-shot F1 scores by an average of 14 points across 20 languages compared to using English.\u201d<\/span><\/p>\n<p><span style=\"font-family: 'trebuchet ms', geneva, sans-serif; font-size: 10pt;\" data-mce-style=\"font-family: 'trebuchet ms', geneva, sans-serif; font-size: 10pt;\">Both Bukula and Mabuya found it very rewarding to explain the complexities of isiZulu and isiXhosa whenever people interested in the Nguni languages approached them with questions.<\/span><\/p>\n<p><span style=\"font-family: 'trebuchet ms', geneva, sans-serif; font-size: 10pt;\" data-mce-style=\"font-family: 'trebuchet ms', geneva, sans-serif; font-size: 10pt;\">An absolute highlight for them was being invited onto Episode #205 of the Practical AI podcast to talk about their work at SADiLaR and their individual interests. 
Listen to the podcast here: <a href=\"https:\/\/changelog.com\/practicalai\/205\" data-mce-href=\"https:\/\/changelog.com\/practicalai\/205\">https:\/\/changelog.com\/practicalai\/205<\/a><\/span><\/p>\n<p><span style=\"font-family: 'trebuchet ms', geneva, sans-serif; font-size: 10pt;\" data-mce-style=\"font-family: 'trebuchet ms', geneva, sans-serif; font-size: 10pt;\"><em>(Written by Birgit Ottermann)<\/em><\/span><\/p>\n<\/div><\/div>\n<div id=\"ID1681223046729\" class=\"cktype\" data-type=\"text\" style=\"position: relative;\">\n<div class=\"ckstyle\"><\/div>\n<div class=\"cktext inner\" style=\"position: relative;\" spellcheck=\"false\">\n<p>&nbsp;<\/p>\n<\/div><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"ckstyle\"><\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Two digital humanities researchers from the South African Centre for Digital Language Resources (SADiLaR) attended the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022) in Abu Dhabi from 7 to 11 December 2022. 
It was the first time that Andiswa Bukula and Rooweither Mabuya, SADiLaR language researchers for isiXhosa and isiZulu respectively, had the [&hellip;]<\/p>\n","protected":false},"author":246,"featured_media":6660,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[730],"tags":[],"class_list":["post-6663","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-general"],"acf":[],"_links":{"self":[{"href":"https:\/\/sadilar.org\/en\/wp-json\/wp\/v2\/posts\/6663","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sadilar.org\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sadilar.org\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sadilar.org\/en\/wp-json\/wp\/v2\/users\/246"}],"replies":[{"embeddable":true,"href":"https:\/\/sadilar.org\/en\/wp-json\/wp\/v2\/comments?post=6663"}],"version-history":[{"count":0,"href":"https:\/\/sadilar.org\/en\/wp-json\/wp\/v2\/posts\/6663\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/sadilar.org\/en\/wp-json\/wp\/v2\/media\/6660"}],"wp:attachment":[{"href":"https:\/\/sadilar.org\/en\/wp-json\/wp\/v2\/media?parent=6663"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sadilar.org\/en\/wp-json\/wp\/v2\/categories?post=6663"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sadilar.org\/en\/wp-json\/wp\/v2\/tags?post=6663"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}