**********************************************************************
*
*
*                          Invitation
*
*
*
*               Computer Science Advanced Seminar
*
*
*
***********************************************************************

Time: Thursday, 22 February 2024, 14:00

Location: Room 9222, E3, Informatikzentrum (Computer Science Center)

Zoom: https://rwth.zoom-x.de/j/67896121061?pwd=RlJTNUw5RFJYU0NwMVRWNEhvQ0EwZz09

Meeting ID: 678 9612 1061
Passcode: 493665


Speaker: Jan Rosendahl, M.Sc.; Chair of Computer Science 6 (Lehrstuhl Informatik 6)


Topic: Attention-Based Machine Translation Using Monolingual Data


Abstract:

Neural networks represent a major advance in modeling for statistical machine translation systems. In this dissertation, we focus on two central aspects of neural machine translation systems: the training data and the attention layer that connects the encoder and decoder. The parameters of a neural machine translation system are determined by minimizing the cross-entropy loss on a corpus of bilingual training data, i.e., a set of sentence pairs in which one sentence is the translation of the other. Since such sentence-aligned bilingual data is a scarce resource whose availability depends on the language pair, we investigate using monolingual data to improve the performance of the machine translation system. The methods used are language model integration, monolingual pre-training, and back-translation. Inspired by existing work on alignment models, we also incorporate a first-order dependency into the encoder-decoder attention layer. In contrast with previous machine translation models, the transformer is a pure feed-forward model without any recurrent layers. Consequently, no information about the previous attention decision enters the computation of the attention layer. Modeling attention with first-order dependencies allows the attention layer to access previous attention decisions, which is a prerequisite for expressing, e.g., source coverage.

Hosts: the lecturers of the Computer Science department


-- 
Stephanie Jansen

Faculty of Mathematics, Computer Science and Natural Sciences
Chair of Computer Science 6
ML - Machine Learning and Reasoning
RWTH Aachen University
Theaterstraße 35-39
D-52062 Aachen
Tel: +49 241 80-21601
sek@ml.rwth-aachen.de
www.hltpr.rwth-aachen.de