[Informatik-Vortraege] Invitation: Informatik-Oberseminar, Pavel Golik, August 14, 2020

3 Aug 2020

***********************************************************************
*
*
*                          Invitation
*
*
*
*                     Informatik-Oberseminar
*
*
*
***********************************************************************

Time: Friday, August 14, 2020, 14:00

Zoom: 
https://us02web.zoom.us/j/83272559800?pwd=Nk5yU1c3anRZeE9yYU5GMU0yaHQ3Zz09

Speaker: Diplom-Informatiker Pavel Golik

Topic: Data-Driven Deep Modeling and Training for Automatic Speech
Recognition

Abstract:

Many of today's state-of-the-art automatic speech recognition (ASR) 
systems are based on hybrid hidden Markov models (HMMs) that rely on 
neural networks to provide acoustic and language model probabilities. 
The training of the acoustic model will be the main focus of this thesis.

In the first part of this thesis we will address the question of to 
what extent the extraction of acoustic features can be learned by the 
acoustic model. We will show that a neural network can not only learn 
to classify the HMM states from the raw time signal, but also learn to 
perform the time-frequency decomposition in its input layer. Inspired 
by this finding, we will replace the fully-connected input layer with a 
convolutional layer and demonstrate that such models show competitive 
performance on real data.

In the second part we will investigate the objective function that is 
optimized during the supervised acoustic training. In principle, both 
cross entropy and squared error can be used in frame-wise training. We 
will compare the objective functions and demonstrate that it is possible 
to train a hybrid acoustic model using the squared error criterion.

In the third part of this study we will investigate how i-vectors can be 
used for acoustic adaptation. We will show that i-vectors can help to 
obtain a consistent reduction in word error rate on multiple tasks, and 
we will carefully analyze different integration strategies.

In the fourth and final part of this thesis we will apply these and 
other methods to the task of speech recognition and keyword search on 
low-resource languages. The limited amount of available resources makes 
the acoustic training extremely challenging. We will present a series of 
experiments performed in the scope of the IARPA Babel project that make 
heavy use of multilingual bottleneck features.

Hosted by: the lecturers of Computer Science

-- 
Stephanie Jansen

Faculty of Mathematics, Computer Science and Natural Sciences
HLTPR - Human Language Technology and Pattern Recognition
RWTH Aachen University
Ahornstraße 55
D-52074 Aachen
Tel. Stephanie Jansen:  +49 241 80-216 06
Tel. Luisa Wingerath: +49 241 80-216 01
Fax: +49 241 80-22219
sek@i6.informatik.rwth-aachen.de
www.hltpr.rwth-aachen.de


Secretariat I6