**********************************************************************
*                                                                    *
*                             Invitation                             *
*                                                                    *
*                 Computer Science Advanced Seminar                  *
*                                                                    *
**********************************************************************

Time:       Monday, February 7, 2022, 14:00
Zoom:       https://rwth.zoom.us/j/99299040714?pwd=amY1QldwUTdPUUNwN0FrNW9wRkNMQT09
Meeting ID: 992 9904 0714
Passcode:   146526

Speaker:    Yunsu Kim, M.Sc.

Topic:      Neural Machine Translation for Low-Resource Scenarios

Abstract:

Machine translation has been tackled for decades mainly by statistical
learning on bilingual text data. In the most recent paradigm, based on
neural networks, building a machine translation system requires more
data than ever to make the best use of the modeling capacity and yield
reasonable performance. However, sufficient bilingual corpora are not
available for many language pairs and domains.

To expand the coverage of neural machine translation, this talk
investigates state-of-the-art methods to enhance the modeling, training,
or data for such low-resource scenarios. These are categorized into two
emerging paradigms in machine learning. First, in semi-supervised
learning, we review language model integration, monolingual
pre-training, and back-translation. Second, in transfer learning, we
study the end-to-end cascading of translation models and a series of
sequential cross-lingual transfer techniques.

These methods are empirically verified, compared, and combined in
extensive experiments, yielding best practices for both English-centric
and non-English language pairs when data is scarce.

You are invited by: the lecturers of Computer Science

-- 
Stephanie Jansen

Faculty of Mathematics, Computer Science and Natural Sciences
HLTPR - Human Language Technology and Pattern Recognition
RWTH Aachen University
Ahornstraße 55
D-52074 Aachen

Tel: +49 241 80-216-06
Fax: +49 241 80-22219
sek@hltpr.rwth-aachen.de
www.hltpr.rwth-aachen.de