**********************************************************************
*
*
* Invitation
*
*
*
* Computer Science Oberseminar
*
*
*
***********************************************************************
Time: Wednesday, March 15, 2023, 2:00 p.m.
Zoom: https://rwth.zoom.us/j/93749104715?pwd=bEFSR0lsR1o1Tzl2enVMSzM0aTE2Zz09
Meeting ID: 937 4910 4715
Passcode: 743829
Speaker: Weiyue Wang, M.Sc.
Topic: Neural Hidden Markov Model for Machine Translation
Abstract:
Recently, neural machine translation systems have shown promising
performance. One of the key components that almost all modern
neural machine translation systems contain is the attention
mechanism, which helps an encoder-decoder model attend to specific
positions on the source side to produce a translation. However,
recent studies have found that using attention weights straight
out of the box to align words results in poor alignment quality.
This inspires us to introduce an explicit alignment model into the
neural architecture in order to improve the alignment and thus
also the translation quality of the overall system. To this end,
we propose a novel neural hidden Markov model consisting of neural
network-based lexicon and alignment models trained jointly with
the forward-backward algorithm.
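As an illustration of the training procedure mentioned above, the following is a minimal sketch of the forward-backward algorithm for computing posterior state (alignment) probabilities in a generic HMM. All names and the matrix layout are illustrative assumptions, not taken from the talk:

```python
import numpy as np

def forward_backward(emit, trans, init):
    """Posterior state probabilities of an HMM via forward-backward.

    emit  : (T, S) array, emit[t, s]  = p(observation_t | state s)
    trans : (S, S) array, trans[s, s'] = p(state s' | state s)
    init  : (S,)   array, initial state distribution
    Returns (T, S) array gamma, gamma[t, s] = p(state_t = s | observations).
    """
    T, S = emit.shape
    alpha = np.zeros((T, S))  # forward probabilities
    beta = np.zeros((T, S))   # backward probabilities

    # Forward pass: alpha[t, s'] = emit[t, s'] * sum_s alpha[t-1, s] * trans[s, s']
    alpha[0] = init * emit[0]
    for t in range(1, T):
        alpha[t] = emit[t] * (alpha[t - 1] @ trans)

    # Backward pass: beta[t, s] = sum_s' trans[s, s'] * emit[t+1, s'] * beta[t+1, s']
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = trans @ (emit[t + 1] * beta[t + 1])

    # Posteriors, normalized per time step
    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)
```

In the neural variant described in the abstract, the emission and transition tables would be replaced by the outputs of the neural lexicon and alignment models; this sketch only shows the classical dynamic program that the gradients are propagated through.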
Various neural network architectures are used to model the lexicon
and the alignment probabilities. We start with feedforward neural
networks, applying our first model to re-rank n-best lists
generated by phrase-based systems, and observe significant
improvements. To build a monolithic neural hidden Markov model, we
then apply more powerful recurrent neural networks to the
architecture and implement a standalone decoder. By replacing the
attention mechanism with an alignment model, we achieve translation
performance comparable to the baseline attention model while
significantly improving the alignment quality. We also apply the
state-of-the-art transformer architecture to the neural hidden
Markov model, and the experimental results show that the
transformer-based hidden Markov model outperforms the standard
self-attentive transformer model in terms of TER scores.
In addition to the work on the neural hidden Markov model, we
propose two novel metrics for machine translation evaluation,
called CharacTER and EED. Both are easy to use and have performed
promisingly in the annual WMT metrics shared tasks.
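Metrics of this family build on character-level edit operations between hypothesis and reference. As background, here is a minimal sketch of plain character-level Levenshtein distance, the core dynamic program such metrics extend; this is not the actual definition of either metric, and the function name is an illustrative assumption:

```python
def char_edit_distance(hyp, ref):
    """Levenshtein distance over characters, via dynamic programming.

    d[i][j] = minimum number of insertions, deletions, and
    substitutions to turn hyp[:i] into ref[:j].
    """
    m, n = len(hyp), len(ref)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete all i characters
    for j in range(n + 1):
        d[0][j] = j  # insert all j characters
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution / match
    return d[m][n]
```

Normalizing such a distance by a sentence length yields an error-rate-style score; the proposed metrics add further refinements on top of this basic recurrence.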
Invitation extended by: the lecturers of the Department of Computer Science
--
Stephanie Jansen
Faculty of Mathematics, Computer Science and Natural Sciences
HLTPR - Human Language Technology and Pattern Recognition
RWTH Aachen University
Theaterstraße 35-39
D-52062 Aachen
Tel: +49 241 80-21601
sek@hltpr.rwth-aachen.de
www.hltpr.rwth-aachen.de