[Informatik-Vortraege] Invitation: Informatik-Oberseminar, Patrick Doetsch, October 8, 2020

22 Sep 2020

***********************************************************************
*
*
*                          Invitation
*
*
*
*                     Informatik-Oberseminar
*
*
*
***********************************************************************

Time: Thursday, October 8, 2020, 14:00

Zoom: 
https://us02web.zoom.us/j/83164563245?pwd=THlJNTdtd3ZBK3Z5TDh4RWpWbXhlQT09

Speaker: Diplom-Informatiker Patrick Doetsch

Topic: Alignment models for recurrent neural networks

Abstract:

Over the last decade, a new standard for automatic speech recognition
(ASR) and handwriting recognition (HWR) systems has been established by
combining hidden Markov models (HMMs) with recurrent neural network
(RNN) observation models. While earlier approaches with feed-forward
neural networks require a fine-grained time-synchronous alignment
between the input data and the output transcription, RNNs are capable
of modeling the sequential nature of the speech signal or text line
image directly. The aim of this thesis is to investigate how these
sequential modeling properties affect the training of ASR and HWR
observation models on large-scale corpora.

In the first part of the thesis, we investigate the training procedure
of several RNN topologies. We focus on variants of the long short-term
memory (LSTM) and measure their performance on different corpora. For
this purpose, we introduce a software package for large-scale RNN
training, which was developed as part of this thesis. We discuss
different methods to improve training performance and demonstrate
their effectiveness on several large tasks.

In the second part of the thesis, we study the effects of the temporal
modeling capabilities of RNNs on the time-synchronous alignment
approach, which has been used in combination with HMMs over the last
decade. Our focus here is on variants of the connectionist temporal
classification (CTC) HMM topology. Based on the insights gained from
this study, we investigate label-synchronous alignment approaches for
HWR and ASR. These alignment methods do not rely on time alignments,
but generate the output transcription label by label while taking
specific parts of the input signal into account. First, we describe an
encoder-decoder system with an attention mechanism for HWR. We then
combine this idea with the classical approach by deriving so-called
inverted alignments, which allow label-synchronous alignments to be
formalized in the context of HMMs. We evaluate our novel approach in
different experimental settings and present results on a large ASR
corpus.

Invitation extended by: the lecturers of the Computer Science department

-- 
Stephanie Jansen

Faculty of Mathematics, Computer Science and Natural Sciences
HLTPR - Human Language Technology and Pattern Recognition
RWTH Aachen University
Ahornstraße 55
D-52074 Aachen
Tel. Stephanie Jansen:  +49 241 80-216 06
Tel. Luisa Wingerath: +49 241 80-216 01
Fax: +49 241 80-22219
sek@i6.informatik.rwth-aachen.de
www.hltpr.rwth-aachen.de

Sekretariat I6