**********************************************************************
*
*
Invitation
*
*
*
Computer Science Oberseminar
*
*
*
***********************************************************************
Time: Wednesday, September 23, 2020, 1:00 p.m.
Zoom:
https://us02web.zoom.us/j/88115756486?pwd=TktPZkFRVCtxSVIydWtLS0Z5WEJNQT09
Speaker: Diplom-Informatiker Vlad Andreas Guta
Topic: Search and Training with Joint Translation and Reordering
Models for Statistical Machine Translation
Abstract:
Statistical machine translation describes the task of automatically translating a written text from one natural language into another. This is done by means of statistical models, which entails defining suitable models, searching for the most likely translation of a given text using them, and training their parameters on bilingual sentence pairs. Phrase-based machine translation emerged two decades ago and became the state of the art throughout the following years. Nevertheless, the breakthrough of neural machine translation in 2014 triggered an abrupt shift towards neural models.
A fundamental drawback of the traditional approach lies in the phrases themselves. They are extracted from word-aligned bilingual data via hand-crafted heuristics, and the phrase translation models are estimated from the resulting extraction counts. Moreover, the translation models exclude any phrase-external information, which in turn limits the context used to generate a target word during search. To complement the restricted models, a variety of additional models and heuristics is used. The potentially largest downside, however, is that the word alignments required for phrase extraction are trained with IBM and hidden Markov models. This results in a discrepancy between the models applied in training and those actually used in search.
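To make the count-based estimation concrete, here is a minimal sketch of relative-frequency estimation over phrase extraction counts, p(e | f) = N(f, e) / N(f). The function name and the toy phrase pairs are invented for illustration; a real system would derive the pairs from word-aligned data via the extraction heuristics mentioned above.

```python
from collections import Counter

# Toy phrase pairs as a heuristic extractor might emit them (invented data).
extracted = [
    ("das haus", "the house"),
    ("das haus", "the house"),
    ("das haus", "the building"),
]

pair_counts = Counter(extracted)                   # N(f, e)
source_counts = Counter(f for f, _ in extracted)   # N(f)

def p_e_given_f(f, e):
    """Relative-frequency estimate of translating source phrase f as e."""
    return pair_counts[(f, e)] / source_counts[f]
```

With the toy counts above, p_e_given_f("das haus", "the house") yields 2/3 and p_e_given_f("das haus", "the building") yields 1/3.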
Although the neural approach clearly outperforms the phrasal one, it remains open whether the superior performance of neural machine translation stems from the complexity of neural models, which capture dependencies between whole source sentences and their translations, or from the coherent application of the same models in both training and decoding. We aim to answer this research question by developing a coherent modelling pipeline that improves over the phrasal approach by relying on fewer but stronger models, discarding dependencies on phrasal heuristics, and applying the same word-level models in training and search.
First, we investigate two different types of word-based translation models: extended translation models and joint translation and reordering models. Both are enhanced with extended context information and estimate lexical and reordering probabilities. They are integrated directly into the phrase-based search and evaluated against state-of-the-art phrasal baselines to assess their benefit on top of phrasal models.
In the second part, we develop a novel beam-search decoder that generates the translation word by word, thus discarding any dependencies on heuristic phrases, and incorporates a joint translation and reordering model. It includes far fewer features than its phrasal counterparts, and its performance is analyzed in comparison to the above-mentioned phrasal baseline systems.
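The word-wise generation idea can be sketched as a generic beam search. Everything here is invented for illustration: the `score(word, history)` interface stands in for the joint translation and reordering model, and a real decoder would additionally track source coverage and reordering state.

```python
import heapq

def beam_search(score, vocab, beam_size=2, max_len=5, eos="</s>"):
    """Word-wise beam search over a toy scoring interface (a sketch)."""
    beam = [(0.0, [])]          # (cumulative log-probability, target words)
    finished = []
    for _ in range(max_len):
        # Expand every hypothesis by every vocabulary word.
        candidates = [(logp + score(w, hyp), hyp + [w])
                      for logp, hyp in beam for w in vocab]
        beam = heapq.nlargest(beam_size, candidates, key=lambda c: c[0])
        # Move hypotheses that produced the sentence-end token out of the beam.
        finished += [c for c in beam if c[1][-1] == eos]
        beam = [c for c in beam if c[1][-1] != eos]
        if not beam:
            break
    best = max(finished or beam, key=lambda c: c[0])
    return best[1]

# Toy bigram-style log-scores (all values invented).
def toy_score(word, hyp):
    prev = hyp[-1] if hyp else "<s>"
    table = {("<s>", "the"): -0.1, ("the", "house"): -0.2, ("house", "</s>"): -0.1}
    return table.get((prev, word), -5.0)
```

On the toy model, `beam_search(toy_score, ["the", "house", "</s>"])` recovers the highest-scoring sequence ending in the sentence-end token.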
The final goal is to achieve a sound and coherent end-to-end machine translation framework. For this purpose, we apply the same models and search algorithm that are employed in word-based translation also in training. To this end, we develop an algorithm that alternatingly optimizes word alignments and model parameters, performed iteratively with increasing model complexity.
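The flavour of such alternating training can be illustrated with IBM Model 1 style EM, the classical baseline for alignment training: the E-step computes expected word-alignment counts under the current lexical model, and the M-step re-estimates the model from those counts. This is a minimal sketch, not the thesis's algorithm; the toy corpus and uniform initialization are invented.

```python
from collections import defaultdict

def em_train(corpus, iterations=10):
    """IBM Model 1 style EM over (source_words, target_words) pairs (a sketch)."""
    source_vocab = {f for fs, _ in corpus for f in fs}
    t = defaultdict(lambda: 1.0 / len(source_vocab))   # t[(f, e)] ~ p(f | e)
    for _ in range(iterations):
        counts = defaultdict(float)   # expected alignment counts
        totals = defaultdict(float)
        for fs, es in corpus:
            for f in fs:
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    c = t[(f, e)] / z        # soft alignment of f to e
                    counts[(f, e)] += c
                    totals[e] += c
        t = defaultdict(float)               # M-step: relative frequencies
        for (f, e), c in counts.items():
            t[(f, e)] = c / totals[e]
    return t

# Invented two-sentence toy corpus.
corpus = [(["das", "haus"], ["the", "house"]),
          (["das", "buch"], ["the", "book"])]
t = em_train(corpus)
```

After a few iterations the co-occurrence statistics disambiguate the alignments: "das" associates with "the", while "haus" and "buch" associate with "house" and "book".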
Stephanie Jansen
Faculty of Mathematics, Computer Science and Natural Sciences
HLTPR - Human Language Technology and Pattern Recognition
RWTH Aachen University
Ahornstraße 55
D-52074 Aachen
Tel. Stephanie Jansen: +49 241 80-216 06
Tel. Luisa Wingerath: +49 241 80-216 01
Fax: +49 241 80-22219
sek@i6.informatik.rwth-aachen.de
www.hltpr.rwth-aachen.de