Dear subscribers of the colloquium newsletter,
We are happy to inform you about the next date
of our Communication Technology Colloquium.
Monday, November 16, 2020
Speaker: Lorenz Schmidt
Time: 11:00 a.m.
Location: https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
Master Lecture: Noise Reduction Combining Conventional Approaches and Artificial Neural Networks
Noise suppression
for single-channel speech enhancement is one of the most
prominent challenges in signal processing and has been
addressed for decades. In recent years, the popularization of
machine learning algorithms and advances in deep neural
network (DNN) architectures have opened new perspectives and
approaches to this field, yielding impressive results. Many of
these algorithms, however, require a computational effort that
exceeds the available resources of a real-time application.
One approach, called RNNoise, combines the methods of
classical signal processing with DNNs. Within a common noise
reduction architecture, a small weighting mask is sufficient
to achieve impressive results. The mask is estimated by a very
small neural network with a low computational complexity.
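
To illustrate the idea of a small weighting mask: the following is a minimal sketch, not the RNNoise implementation. It assumes that a handful of per-band gains are linearly interpolated to per-bin gains and then multiplied onto the noisy spectrum. The function names, the band edges and the interpolation scheme are assumptions chosen for illustration only.

```python
def interpolate_band_gains(band_edges, band_gains, num_bins):
    """Linearly interpolate a few per-band gains to one gain per frequency bin.

    band_edges: increasing bin indices where each band gain applies exactly
                (hypothetical values, not the band layout used by RNNoise).
    band_gains: one gain in [0, 1] per band edge.
    """
    if len(band_edges) != len(band_gains):
        raise ValueError("need exactly one gain per band edge")
    gains = []
    for k in range(num_bins):
        if k <= band_edges[0]:
            gains.append(band_gains[0])
        elif k >= band_edges[-1]:
            gains.append(band_gains[-1])
        else:
            # Find the two neighbouring band edges and blend their gains.
            for b in range(len(band_edges) - 1):
                lo, hi = band_edges[b], band_edges[b + 1]
                if lo <= k <= hi:
                    t = (k - lo) / (hi - lo)
                    gains.append((1 - t) * band_gains[b] + t * band_gains[b + 1])
                    break
    return gains


def apply_mask(spectrum, gains):
    """Scale each (possibly complex) spectral bin by its real-valued gain."""
    return [g * x for g, x in zip(gains, spectrum)]


# Usage: three band gains control nine frequency bins.
bin_gains = interpolate_band_gains([0, 4, 8], [1.0, 0.0, 1.0], 9)
enhanced = apply_mask([1 + 1j] * 9, bin_gains)
```

In this sketch the network would only have to estimate the few band gains per frame, while the per-bin mask follows from interpolation. That is what keeps the estimation task small.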
In this thesis, RNNoise is
subjected to several modifications that are intended to improve
its denoising performance, while maintaining its affordable
complexity. In a first step, the gated recurrent units (GRUs)
of the RNNoise architecture are replaced by simple recurrent
units (SRUs), which improve its performance while speeding up
the training process. The DNN is expanded to estimate the
pitch frequency, which is used in the reconstruction of the
harmonics with a comb filter. A new binary IIR comb filter is
developed and added to the signal processing of RNNoise.
Besides the modifications of RNNoise itself, a pitch
estimator, based on ordinary regression, and a mutual
information metric are developed. The evaluation shows a good
performance for pitch estimation and voice activity detection
(VAD). A preliminary study analyzes the upper limits that
can be achieved with the reduced spectral weighting mask. With
Bark scaling, 22 gains are a reasonable tradeoff between
performance and complexity. Then, a theoretical evaluation
shows that the new network architecture improves the
estimation considerably, especially in non-stationary noise
situations. A final evaluation compares a classical noise
suppression method, an end-to-end neural network approach,
classical RNNoise and the improved model by means of their
cepstral distances, speech-to-noise enhancement and perceptual
measures. The results show that the modifications give the new
architecture an edge over classical RNNoise. On the other
hand, the newly developed binary IIR comb filter falls short of
expectations and does not improve noise suppression
performance.
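
For readers unfamiliar with comb filtering of harmonics, here is a minimal sketch of a generic textbook feedback (IIR) comb filter. It is not the binary IIR comb filter developed in the thesis, whose design is not described in this abstract. It assumes the pitch period is already known as an integer number of samples. The function name and the parameters `period` and `feedback` are hypothetical.

```python
def feedback_comb(x, period, feedback):
    """Generic feedback comb filter: y[n] = x[n] + feedback * y[n - period].

    period:   assumed pitch period in samples (integer, >= 1).
    feedback: feedback coefficient; |feedback| < 1 keeps the filter stable.
    """
    if period < 1:
        raise ValueError("period must be at least one sample")
    if not -1.0 < feedback < 1.0:
        raise ValueError("feedback must lie strictly between -1 and 1")
    y = []
    for n, xn in enumerate(x):
        # Output from one pitch period ago; zero before the filter has history.
        delayed = y[n - period] if n >= period else 0.0
        y.append(xn + feedback * delayed)
    return y


# Usage: the impulse response repeats every `period` samples with decaying
# amplitude, which corresponds to peaks at multiples of the pitch frequency.
impulse_response = feedback_comb([1.0, 0, 0, 0, 0, 0, 0], period=3, feedback=0.5)
```

Such a filter emphasizes components at multiples of the pitch frequency, which is the general motivation for using a comb filter to reconstruct harmonics.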
All interested parties are
cordially invited; registration is not required.
General information on the
colloquium, as well as a current list of dates of the
Communication Technology Colloquium, can be found at:
http://www.iks.rwth-aachen.de/aktuelles/kolloquium
--
Irina Ronkartz
Institute of Communication Systems (IKS)
RWTH Aachen University
Muffeter Weg 3a, 52074 Aachen, Germany
+49 241 80 26958 (phone)
ronkartz@iks.rwth-aachen.de
http://www.iks.rwth-aachen.de/