Invitation: Informatik-Oberseminar (Computer Science Advanced Seminar), Jan Rosendahl, 22 February 2024
**********************************************************************
*
*
*                          Invitation
*
*
*
*                    Informatik-Oberseminar
*
*
**********************************************************************

Time: Thursday, 22 February 2024, 14:00
Location: Room 9222, E3, Informatikzentrum
Zoom: https://rwth.zoom-x.de/j/67896121061?pwd=RlJTNUw5RFJYU0NwMVRWNEhvQ0EwZz09
Meeting ID: 678 9612 1061
Passcode: 493665

Speaker: Jan Rosendahl, M.Sc.; Lehrstuhl Informatik 6
Topic: Attention-Based Machine Translation Using Monolingual Data

Abstract:

Neural networks represent a major advance in modeling for statistical machine translation systems. In this dissertation, we focus on two central aspects of neural machine translation systems: the training data and the attention layer that connects the encoder and the decoder.

The parameters of a neural machine translation system are determined by minimizing the cross-entropy loss on a corpus of bilingual training data, i.e. a set of sentence pairs in which one sentence is the translation of the other. Since such sentence-aligned bilingual data is a scarce resource and its availability depends on the language pair, we investigate using monolingual data to improve the performance of the machine translation system. The methods used are language model integration, monolingual pre-training, and back-translation.

Inspired by existing work on alignment models, we also incorporate a first-order dependency in the encoder-decoder attention layer. In contrast to previous machine translation models, the transformer is a pure feed-forward model without any recurrent layers. This means that no information about the previous attention decision is input to the computation of the attention layer. Modeling attention with first-order dependencies allows the attention layer to access previous attention decisions, which is a prerequisite for expressing, for example, source coverage.
You are invited by: the lecturers of the Department of Computer Science

--
Stephanie Jansen
Faculty of Mathematics, Computer Science and Natural Sciences
Chair of Computer Science 6
ML - Machine Learning and Reasoning
RWTH Aachen University
Theaterstraße 35-39
D-52062 Aachen
Tel: +49 241 80-21601
sek@ml.rwth-aachen.de
www.hltpr.rwth-aachen.de