Dear subscribers of the colloquium newsletter,
we are happy to inform you about the next date of our Communication
Technology Colloquium.
*Wednesday, January 18, 2022*
*Speaker*: Konstantin Wehmeyer
*Time:* 2:00 p.m.
*Location*: hybrid - Lecture room 4G and
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Bachelor-Lecture*: Deep Learning-Based Speech Synthesis as
Post-Processing of a Noise Reduction
/Audio and speech signals are often disturbed by noise signals in
frequency- and/or time-limited parts. To attenuate or remove these
distortions, several methods, including deep learning-based approaches,
are known. Often, however, only the magnitude spectrum is processed and
the phase spectrum is taken over unchanged due to its comparatively
lower relevance. Consequently, the noisy phase is reused when
synthesizing the waveform from the processed magnitude spectrum.
Therefore, distortions in the magnitude spectrum can be reduced, but not
in the phase spectrum, which inevitably leads to a deterioration in
speech quality and intelligibility./
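The reuse of the noisy phase described above can be illustrated with a minimal sketch. It is not the thesis's method: the function name `enhance_with_noisy_phase`, the random stand-in spectra, and the magnitude-restoring gain are all assumptions made purely for illustration. The sketch applies a real-valued gain to the magnitude of a noisy spectrum and recombines it with the unchanged noisy phase.

```python
import numpy as np

def enhance_with_noisy_phase(noisy_stft, gain):
    """Apply a real-valued gain to the magnitude and reuse the noisy phase."""
    magnitude = np.abs(noisy_stft)
    phase = np.angle(noisy_stft)
    return (gain * magnitude) * np.exp(1j * phase)

# Random complex values stand in for STFT bins (hypothetical data, not speech).
rng = np.random.default_rng(0)
shape = (4, 5)
clean = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
noise = 0.5 * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))
noisy = clean + noise

# Assumed best-case gain: it restores the clean magnitude exactly.
ideal_gain = np.abs(clean) / np.abs(noisy)
enhanced = enhance_with_noisy_phase(noisy, ideal_gain)

# The magnitude is now correct, but the phase still equals the noisy phase.
magnitude_error = np.max(np.abs(np.abs(enhanced) - np.abs(clean)))
phase_error = np.angle(enhanced * np.conj(clean))  # wrapped phase difference
print("max magnitude error:", magnitude_error)
print("max absolute phase error (rad):", np.max(np.abs(phase_error)))
```

Even with a perfectly restored magnitude, the phase error in this sketch stays nonzero, which is the gap that a phase reconstruction stage would aim to close.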
/This thesis presents methods that allow a reconstruction of the phase
spectrum of speech signals based on noise-reduced magnitude spectra. A
phase reconstruction algorithm was developed at the Institute of
Communication Systems at RWTH Aachen University and has already been
evaluated in a previous study for the case of smoothed magnitude
spectra. It was shown that the deep neural network (DNN) used can
benefit from targeted training on the smoothed magnitude spectra even
without further modification of the network structures. However, even
slight smearing of the magnitude spectra already leads to a significant
loss in performance compared to the use of perfect magnitude spectra. In
this work, therefore, the DNNs used are optimized for the case of
noise-reduced magnitude spectra./
/Several deep learning-based models are introduced and compared with
each other and with the models already developed. Their properties and
aspects such as causality are addressed. Moreover, a new loss function
and a new assessment measure, specifically designed to estimate and
assess the phase spectrum of speech signals, are developed and tested.
In order to be
able to evaluate the results as independently as possible of a specific
type of noise reduction, ideal masks are developed, used, and discussed./
All interested parties are cordially invited; registration is not required.
General information on the colloquium, as well as a current list of
dates of the Communication Technology Colloquium can be found at:
https://www.iks.rwth-aachen.de/aktuelles/kolloquium
--
Simone Sedgwick
Institute of Communication Systems (IKS)
Prof. Dr.-Ing. Peter Jax
RWTH Aachen University
Muffeter Weg 3a, 52074 Aachen, Germany
+49 241 80 26956 (phone)
+49 241 80 22254 (fax)
sedgwick(a)iks.rwth-aachen.de
https://www.iks.rwth-aachen.de/
Dear subscribers of the Colloquium Newsletter,
we are happy to inform you about the next date of our Communication
Technology Colloquium.
*Wednesday, November 30, 2022*
*Speaker*: Frederick Pietschmann
*Time*: 14:00
*Location:* Lecture room 4G and
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Master-Lecture*: Perceptual Optimization and Evaluation of a Binaural
Signal Modification Algorithm
With individualized binaural signals, it is possible to reproduce
auditory scenes such that the signal is perceived as similar to the real
scene. However, perceptual similarity is no longer achieved when the
binaural signal doesn’t fully adapt to different listeners and different
orientations of the listener’s head. To address these problems, a
perceptually motivated algorithm referred to as the Binaural Cue
Adaptation (BCA) system has been developed at the Institute of
Communication Systems. The BCA system is capable of adding both
interactivity and individualization to existing binaural signals,
thereby achieving a higher degree of perceptual similarity to a
corresponding real auditory scene.
In this thesis, a perceptual optimization of the existing BCA system is
conducted in that new approaches for some components of the algorithm
are proposed, all parametrization options are identified and the overall
best parametrization is chosen. To identify the best parametrization,
both an isolated analysis of individual components is conducted and a
perceptually motivated optimization procedure for a full system analysis
is proposed and implemented.
Finally, a perceptual evaluation based on the result of the perceptual
optimization is realized. For this, two listening tests with a total
number of 17 participants are conducted – one for a normal and one for a
highly reverberant scenario. The results of these listening tests
suggest that signals produced by the optimized BCA system achieve a high
degree of perceptual plausibility for both reverberation scenarios, with
an averaged 2AFC probability to detect a BCA-generated signal of 0.563
for the normal scenario and 0.604 for the highly reverberant scenario.
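For readers who want to relate the reported 2AFC detection probabilities to chance level (0.5), one common reading converts a 2AFC proportion into a sensitivity index d' under an equal-variance Gaussian signal detection model. The sketch below is an assumption for illustration only: the announcement does not state that the thesis used d', and the function name `dprime_from_2afc` is hypothetical.

```python
from statistics import NormalDist

def dprime_from_2afc(p_correct):
    """Convert a 2AFC proportion correct into d' via d' = sqrt(2) * z(p)."""
    if not 0.0 < p_correct < 1.0:
        raise ValueError("p_correct must lie strictly between 0 and 1")
    return 2 ** 0.5 * NormalDist().inv_cdf(p_correct)

# Detection probabilities as reported in the abstract above.
reported = {"normal": 0.563, "highly reverberant": 0.604}
dprimes = {name: dprime_from_2afc(p) for name, p in reported.items()}

for name, d in dprimes.items():
    print(f"{name}: p = {reported[name]:.3f}, d' = {d:.3f}")
```

A d' of 0 corresponds to pure guessing, so small positive values are consistent with signals that listeners can rarely tell apart from the reference.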
All interested parties are cordially invited; registration is not required.
General information on the colloquium, as well as a current list of
dates of the Communication Technology Colloquium can be found at:
https://www.iks.rwth-aachen.de/aktuelles/kolloquium
--
Irina Esser
Institute of Communication Systems (IKS)
RWTH Aachen University
Muffeter Weg 3a, 52074 Aachen, Germany
+49 241 80 26958 (phone)
esser(a)iks.rwth-aachen.de
http://www.iks.rwth-aachen.de/