**********************************************************************
*
*
*                          Invitation
*
*
*
*              Computer Science Advanced Seminar
*                  (Informatik-Oberseminar)
*
*
*
***********************************************************************

Time: Wednesday, June 12, 2024, 9:30 a.m.

Location: Room 9222, E3, Informatikzentrum

Zoom: https://rwth.zoom-x.de/j/68743953886?pwd=HMUlqnO8qakacCpfazBAzKy8b222EK.1
Meeting ID: 687 4395 3886
Passcode: 183143


Speaker: Christian Herold, M.Sc.; Chair of Computer Science 6

Topic: Context-Aware Neural Machine Translation


Abstract: 

Despite the known limitations, most automatic machine translation (MT) systems today still operate at the sentence level, ignoring cross-sentence context information. This is because considering cross-sentence context (i) leads to exponentially increasing complexity, (ii) limits the available training data, and (iii) sometimes even reduces translation quality on general MT benchmarks. In this talk, we discuss our efforts to address these issues and to improve context-aware MT systems.
First, we discuss different decoding strategies for document-level MT and explain how constraining the model's attention can result in a more efficient translation system. Second, to tackle the problem of scarce document-level training data, we elaborate on our efforts to utilize monolingual document-level data for MT. Finally, we present our work on data filtering for MT, which can benefit both sentence-level and document-level systems.


Hosts: the lecturers of the Computer Science department

-- 
Stephanie Jansen

Faculty of Mathematics, Computer Science and Natural Sciences
Chair of Computer Science 6
ML - Machine Learning and Reasoning
RWTH Aachen University
Theaterstraße 35-39
D-52062 Aachen
Tel: +49 241 80-21601
sek@ml.rwth-aachen.de
www.hltpr.rwth-aachen.de