Invitation: Informatik-Oberseminar Kazuki Irie
******************************************************
*                                                    *
*                     Invitation                     *
*                                                    *
*               Informatik-Oberseminar               *
*                                                    *
******************************************************

Time:    Tuesday, 05 May 2020, 14:00
Zoom:    https://us02web.zoom.us/j/84813327259?pwd=Y2NydlRMRzE1dkpkcmpERkFwMWZYZz09
Speaker: Kazuki Irie, M.Sc.
Topic:   Advancing Neural Language Modeling in Automatic Speech Recognition

Abstract:

Statistical language modeling is one of the fundamental problems in natural language processing. In recent years, language modeling has seen great advances through active research and engineering efforts in applying artificial neural networks, especially recurrent ones. The application of neural language models to speech recognition is now well established and ubiquitous. Despite this impression of maturity, we claim that the full potential of neural network based language modeling is yet to be explored. In this thesis, we further advance neural language modeling in automatic speech recognition by investigating a number of new perspectives.

From the architectural viewpoint, we investigate the recently proposed Transformer neural networks for language modeling. The original model architecture, proposed for machine translation, is studied and modified to accommodate the specific task of language modeling. Particularly deep models with about one hundred layers are developed. We present an in-depth comparison with state-of-the-art recurrent neural network language models based on the long short-term memory.

When scaling language modeling up to larger datasets, the diversity of the data emerges as both an opportunity and a challenge. Current state-of-the-art neural language modeling lacks a mechanism for handling diverse data from different domains that would allow a single model to perform well across domains. In this context, we introduce domain-robust language modeling with neural networks and propose two solutions. As a first solution, we propose a new type of adaptive mixture-of-experts model which is fully based on neural networks. In the second approach, we investigate knowledge distillation from multiple domain expert models as a solution to the large model size of the first approach. Methods for the practical application of knowledge distillation to large-vocabulary language modeling are proposed and studied extensively.

Finally, we investigate the potential of neural language models to leverage long-span cross-sentence contexts for cross-utterance speech recognition. The appropriate training method for this scenario is under-explored in existing work. We carry out systematic comparisons of training methods, allowing us to achieve improvements in cross-utterance speech recognition. In the same context, we study sequence-length robustness for both recurrent neural networks based on the long short-term memory and Transformers, since such robustness is one of the fundamental properties we wish neural networks that handle variable-length contexts to have.

Throughout the thesis, we tackle these problems from novel perspectives on neural language modeling, while keeping the traditional spirit of language modeling in speech recognition.
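For readers who want a concrete picture of the second, distillation-based solution mentioned in the abstract, below is a minimal sketch in PyTorch: a student language model is trained to match a weighted interpolation of several domain expert models' output distributions. The tiny LSTM stand-in models, the uniform teacher weights, and the 50/50 interpolation of soft and hard losses are illustrative assumptions, not the configuration studied in the thesis.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    VOCAB, EMBED, HIDDEN = 1000, 64, 128

    class TinyLSTMLM(nn.Module):
        """A small LSTM language model standing in for the expert/student LMs."""
        def __init__(self):
            super().__init__()
            self.embed = nn.Embedding(VOCAB, EMBED)
            self.lstm = nn.LSTM(EMBED, HIDDEN, batch_first=True)
            self.out = nn.Linear(HIDDEN, VOCAB)

        def forward(self, x):                     # x: (batch, time) token ids
            h, _ = self.lstm(self.embed(x))
            return self.out(h)                    # logits: (batch, time, vocab)

    # One expert per domain (freshly initialized stand-ins here; in practice
    # each would be pre-trained on its own domain).
    teachers = [TinyLSTMLM().eval() for _ in range(3)]
    student = TinyLSTMLM()
    optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

    def distill_step(x, y, teacher_weights):
        """One step: fit the student to the weighted mixture of the teachers."""
        with torch.no_grad():
            # Interpolate the teachers' output distributions in probability space.
            mix = sum(w * F.softmax(t(x), dim=-1)
                      for w, t in zip(teacher_weights, teachers))
        logits = student(x)
        log_p = F.log_softmax(logits, dim=-1)
        # Cross-entropy against the soft teacher targets ...
        soft_loss = -(mix * log_p).sum(dim=-1).mean()
        # ... interpolated with the usual hard-label loss (weights are assumed).
        hard_loss = F.cross_entropy(logits.reshape(-1, VOCAB), y.reshape(-1))
        loss = 0.5 * soft_loss + 0.5 * hard_loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

    # Toy usage: a random batch (batch=4, length=16) with next-token targets.
    tokens = torch.randint(0, VOCAB, (4, 17))
    print(distill_step(tokens[:, :-1], tokens[:, 1:], [1/3, 1/3, 1/3]))

In practice the teacher weights need not be fixed; they could themselves be predicted from the context, in the spirit of the adaptive mixture-of-experts approach that the abstract names as the first solution.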
The lecturers of Computer Science cordially invite you to attend.

--
Stephanie Jansen
Faculty of Mathematics, Computer Science and Natural Sciences
HLTPR - Human Language Technology and Pattern Recognition
RWTH Aachen University
Ahornstraße 55
D-52074 Aachen
Tel. Ms. Jansen: +49 241 80-216 06
Tel. Ms. Andersen: +49 241 80-216 01
Fax: +49 241 80-22219
sek@i6.informatik.rwth-aachen.de
www.hltpr.rwth-aachen.de