Communication Technology Colloquium at IKS
Dear subscribers of the colloquium newsletter,

we are pleased to inform you about the next date of our Communication Technology Colloquium.

*Thursday, 2 May 2019*
*Speaker:* Patrick von Platen
*Location:* Lecture hall 4G, IKS
*Time:* 14:00
*Master's thesis talk:* Speech Recognition with Deep Neural Networks for Raw Multichannel Signals

Traditional automatic speech recognition (ASR) systems often use an acoustic model (AM) built on handcrafted acoustic features, such as log Mel-filter bank (FBANK) values. Recent studies found that AMs with convolutional neural networks (CNNs) can directly use the raw waveform signal as input. Given sufficient training data, these AMs can yield a word error rate (WER) competitive with that of AMs built on FBANK features.

This thesis proposes a novel multi-span structure for acoustic modelling of both single- and multi-channel raw waveform signals. The structure uses multiple streams of CNN input layers, each processing a different span of the raw waveform signal. Experiments on both CHiME4 and AMI single-channel data show that the multi-span structure can significantly outperform conventional AMs based on FBANKs. Furthermore, it is shown that the WER of a widely used single-span raw waveform AM can be improved significantly by using a smaller CNN kernel size and an increased stride.

Experiments on CHiME4 multi-channel data show that CNN input layer kernels can learn to filter frequencies synchronously on multiple channel inputs. While the WERs obtained for multi-channel raw waveform acoustic modelling are encouraging, they still lag behind the WERs obtained by AMs built on more robust log-Mel filterbank acoustic features that are preprocessed by beamforming. Analysis reveals that the AM's increased set of parameters for multi-channel raw waveform signal input makes it harder for its CNN input layer kernels to learn robust feature representations.
In future work, more sophisticated regularization techniques and additional experiments for multi-channel raw waveform acoustic modelling can be investigated.

Everyone interested is warmly invited; registration is not required.

General information about the colloquium and a current list of the dates of the Communication Technology Colloquium can be found at: http://www.iks.rwth-aachen.de/aktuelles/kolloquium/

--
Irina Ronkartz
Institute of Communication Systems (IKS)
RWTH Aachen University
Muffeter Weg 3a, 52074 Aachen, Germany
+49 241 80 26958 (phone)
ronkartz@iks.rwth-aachen.de
http://www.iks.rwth-aachen.de/