**********************************************************************
*
*
* Invitation
*
*
*
* Computer Science Oberseminar
*
*
*
***********************************************************************
Time: Wednesday, March 15, 2023, 2:00 p.m.
Zoom: https://rwth.zoom.us/j/93749104715?pwd=bEFSR0lsR1o1Tzl2enVMSzM0aTE2Zz09
Meeting ID: 937 4910 4715
Passcode: 743829
Speaker: Weiyue Wang, M.Sc.
Topic: Neural Hidden Markov Model for Machine Translation
Abstract:
Recently, neural machine translation systems have shown promising
performance. One of the key components that almost all modern
neural machine translation systems contain is the attention
mechanism, which helps an encoder-decoder model attend to specific
positions on the source side to produce a translation. However,
recent studies have found that using attention weights straight
out of the box to align words results in poor alignment quality.
This inspires us to introduce an explicit alignment model into the
neural architecture in order to improve the alignment and thus
also the translation quality of the overall system. To this end,
we propose a novel neural hidden Markov model consisting of neural
network-based lexicon and alignment models trained jointly with
the forward-backward algorithm.
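As an illustration of the training procedure mentioned above, the following is a minimal sketch of the forward-backward algorithm for computing posterior state (alignment) probabilities in a generic HMM. All names and the matrix layout are illustrative assumptions, not taken from the talk:

```python
import numpy as np

def forward_backward(emit, trans, init):
    """Posterior state probabilities of an HMM via forward-backward.

    emit  : (T, S) array, emit[t, s]  = p(observation_t | state s)
    trans : (S, S) array, trans[s, s'] = p(state s' | state s)
    init  : (S,)   array, initial state distribution
    Returns (T, S) array gamma, gamma[t, s] = p(state_t = s | observations).
    """
    T, S = emit.shape
    alpha = np.zeros((T, S))  # forward probabilities
    beta = np.zeros((T, S))   # backward probabilities

    # Forward pass: alpha[t, s'] = emit[t, s'] * sum_s alpha[t-1, s] * trans[s, s']
    alpha[0] = init * emit[0]
    for t in range(1, T):
        alpha[t] = emit[t] * (alpha[t - 1] @ trans)

    # Backward pass: beta[t, s] = sum_s' trans[s, s'] * emit[t+1, s'] * beta[t+1, s']
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = trans @ (emit[t + 1] * beta[t + 1])

    # Posteriors, normalized per time step
    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)
```

In the neural variant described in the abstract, the emission and transition tables would be replaced by the outputs of the neural lexicon and alignment models; this sketch only shows the classical dynamic program that the gradients are propagated through.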
Various neural network architectures are used to model the lexicon
and the alignment probabilities. We start with feedforward neural
networks, applying our first model to re-rank n-best lists
generated by phrase-based systems, and observe significant
improvements. To build a monolithic neural hidden Markov model, we
then apply more powerful recurrent neural networks to the
architecture and implement a standalone decoder. By replacing the
attention mechanism with an alignment model, we achieve translation
performance comparable to the baseline attention model while
significantly improving the alignment quality. We also apply the
state-of-the-art transformer architecture to the neural hidden
Markov model, and the experimental results show that the
transformer-based hidden Markov model outperforms the standard
self-attentive transformer model in terms of TER scores.
In addition to the work on the neural hidden Markov model, we
propose two novel metrics for machine translation evaluation,
called CharacTER and EED. Both are easy to use and have performed
promisingly in the annual WMT metrics shared tasks.
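Metrics of this family build on character-level edit operations between hypothesis and reference. As background, here is a minimal sketch of plain character-level Levenshtein distance, the core dynamic program such metrics extend; this is not the actual definition of either metric, and the function name is an illustrative assumption:

```python
def char_edit_distance(hyp, ref):
    """Levenshtein distance over characters, via dynamic programming.

    d[i][j] = minimum number of insertions, deletions, and
    substitutions to turn hyp[:i] into ref[:j].
    """
    m, n = len(hyp), len(ref)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete all i characters
    for j in range(n + 1):
        d[0][j] = j  # insert all j characters
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution / match
    return d[m][n]
```

Normalizing such a distance by a sentence length yields an error-rate-style score; the proposed metrics add further refinements on top of this basic recurrence.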
Invitation extended by: the lecturers of the Department of Computer Science
--
Stephanie Jansen
Faculty of Mathematics, Computer Science and Natural Sciences
HLTPR - Human Language Technology and Pattern Recognition
RWTH Aachen University
Theaterstraße 35-39
D-52062 Aachen
Tel: +49 241 80-21601
sek@hltpr.rwth-aachen.de
www.hltpr.rwth-aachen.de