Dear subscribers of the colloquium newsletter,

We are pleased to announce the next session of our Communication Technology Colloquium.

Monday, November 16, 2020
Speaker: Lorenz Schmidt
Time: 11:00 a.m.
Location: https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09

Meeting ID: 979 0415 7921
Password: 481650

Master Lecture: Noise Reduction Combining Conventional Approaches and Artificial Neural Networks

The suppression of noise for single-channel speech enhancement is one of the most prominent challenges in signal processing and has been addressed for decades. In recent years, the popularization of machine learning algorithms and advances in deep neural network (DNN) architectures have opened new perspectives and approaches in this field, yielding impressive results. Many of these algorithms, however, require a computational effort that exceeds the resources available to a real-time application. One approach, called RNNoise, combines the methods of classical signal processing with DNNs. Within a common noise reduction architecture, a small weighting mask is sufficient to achieve impressive results. The mask is estimated by a very small neural network with low computational complexity.

In this thesis, RNNoise is subject to several modifications that are intended to improve its denoising performance while maintaining its affordable complexity. In a first step, the gated recurrent units (GRUs) of the RNNoise architecture are replaced by simple recurrent units (SRUs), which improve its performance while speeding up the training process. The DNN is expanded to estimate the pitch frequency, which is used in the reconstruction of the harmonics with a comb filter. A new binary IIR comb filter is developed and added to the signal processing of RNNoise. Besides the modifications of RNNoise itself, a pitch estimator, based on ordinary regression, and a mutual information metric are developed. The evaluation shows good performance for pitch estimation and voice activity detection (VAD). A preliminary study analyzes the upper limits that can be achieved by the reduced spectral weighting mask. With Bark scaling, 22 gains are a reasonable tradeoff between performance and complexity. Then, a theoretical evaluation shows that the new network architecture improves the estimation considerably, especially in non-stationary noise situations. A final evaluation compares a classical noise suppression method, an end-to-end neural network approach, classical RNNoise and the improved model by means of their cepstral distances, speech-to-noise enhancement and perceptual measures. The results show that the modifications give the new architecture an edge over classical RNNoise. On the other hand, the developed binary IIR comb filter falls short of expectations and does not improve noise suppression performance.
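
For readers unfamiliar with band-wise spectral weighting, the idea of a reduced weighting mask can be illustrated with a minimal sketch. It is not taken from RNNoise or from the thesis: the function names, the band edges and the toy values are assumptions made purely for illustration, and each band gain is simply held constant across its bins for brevity.

```python
def expand_band_gains(band_edges, band_gains, n_bins):
    """Expand one gain per band to one gain per frequency bin.

    band_edges holds ascending bin indices with len(band_gains) + 1
    entries; the values are hypothetical and chosen by the caller.
    """
    if len(band_edges) != len(band_gains) + 1:
        raise ValueError("need exactly one more edge than gains")
    bin_gains = [1.0] * n_bins  # bins outside all bands pass unchanged
    for b, gain in enumerate(band_gains):
        lo, hi = band_edges[b], min(band_edges[b + 1], n_bins)
        for k in range(lo, hi):
            bin_gains[k] = gain
    return bin_gains


def apply_mask(spectrum, bin_gains):
    """Scale each complex spectral bin by its real-valued gain."""
    return [g * x for g, x in zip(bin_gains, spectrum)]


# Toy example: 8 bins grouped into 3 bands (edges are illustrative only).
edges = [0, 2, 5, 8]
gains = [1.0, 0.5, 0.0]
noisy = [complex(1, 1)] * 8
enhanced = apply_mask(noisy, expand_band_gains(edges, gains, len(noisy)))
```

In a setup like the one described in the abstract, the 22 gains would be estimated by the network for every frame, with band edges following the Bark scale instead of the toy values used here.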

All interested parties are cordially invited; registration is not required.

General information on the colloquium, as well as a current list of dates of the Communication Technology Colloquium, can be found at:
http://www.iks.rwth-aachen.de/aktuelles/kolloquium

-- 
Irina Ronkartz
Institute of Communication Systems (IKS)
RWTH Aachen University
Muffeter Weg 3a, 52074 Aachen, Germany
+49 241 80 26958 (phone)
ronkartz@iks.rwth-aachen.de
http://www.iks.rwth-aachen.de/