Deep learning has advanced the field of machine translation immensely. However, these advances have not been fully realised for all South African languages, because these languages are low-resourced and lack sufficient training data. Additionally, the Nguni languages of South Africa (isiXhosa, isiZulu, isiNdebele, and Siswati) are highly agglutinative (long words are formed by stringing together subwords), which renders standard subword techniques in NLP inadequate. In this talk, I will present our work on improving machine translation for the Nguni languages by explicitly modelling subword structure during training. We have developed a Transformer-based model that simultaneously learns translation and subword segmentation, leading to improved translation performance and subwords that are more linguistically plausible.
Francois Meyer: I am a PhD student at UCT, working under Jan Buys in the Computer Science Department. My project is about text generation for South African languages. I am developing neural network architectures that model the complex subword structure of the Nguni languages. My broader research interests lie in linguistically informed modelling and interpretability. I am interested in how deep neural networks learn and use aspects of language like morphology, syntax, and compositionality. Previously, I completed my master's in AI at the University of Amsterdam and my undergraduate studies at Stellenbosch University.
8 November 2023