Dear subscribers of the Colloquium Newsletter,
We are pleased to inform you about the upcoming dates of our Communication Technology Colloquium (Kommunikationstechnisches Kolloquium).
Wednesday, June 5, 2019
Speaker: Andreas Behler
Location: Lecture Hall 4G, IKS
Time: 11:15 a.m.
Master's thesis talk: Investigations on the Translational Displacement within a Higher Order Ambisonics Sound Field Representation
As video recording advances toward six degrees of freedom, audio recording has to adapt. The availability of spherical microphone arrays and growing research interest make Ambisonics an attractive tool, and its potential to facilitate sound field navigation is intriguing.
This thesis explored a way to enable a position change within a higher order Ambisonics recording made with a single spherical microphone array. For this, a parametric decomposition of the recorded sound field was used, based on the coding and multidirectional parametrisation of Ambisonic sound scenes method. With it, an algorithm was developed that adjusts the loudness of primary signal parts and rotates them to compute a translational shift. As a prerequisite, the algorithm needs distance information about the recorded scene. Furthermore, a way to adapt to differently conditioned signals was developed. It was evaluated along with the proposed algorithm in a realistic environment using objective measurements.
For a subjective impression, a listening test was conducted with 31 participants using a spherical loudspeaker setup. The objective and subjective tests included a comparison to higher order Ambisonics warping, a zooming method for Ambisonics. The evaluations of both tests returned positive results.
and
Thursday, June 6, 2019
Speaker: Marcel Czaplinski
Location: Lecture Hall 4G, IKS
Time: 11:15 a.m.
Master's thesis talk: Machine Learning Techniques to Reconstruct Lost Parts of Speech Signals
Applications for speech transmission and mobile communication place high demands on speech intelligibility and authenticity. Errors and corruptions of different types commonly occur. Often, parts of the speech signal are missing completely.
A major task in speech processing and transmission is enhancing a speech signal and minimizing the distortions that result from corruptions. If the distorted parts of the speech signal cannot be restored, they can be dropped completely and reconstructed to a certain degree from the uncorrupted parts of the signal. This technique is the core idea of the well-known packet loss concealment (PLC) and bandwidth extension (BWE). However, these tools assume that the missing parts have a time-limited or frequency-limited shape.
Speech inpainting, the task of reconstructing lost parts of a speech signal of any shape, extends BWE and PLC to a generalized concept. Some dictionary-based speech inpainters have been proposed by various researchers in the past. Despite the progress and promising results of recent machine learning research on related topics such as image inpainting, few attempts have been made to use signal processing and machine learning jointly to build speech inpainters.
A general framework and overview of a machine-learning-assisted speech inpainter will be provided. A selection of preprocessing tools and algorithms will be analyzed in the context of different corruption types, and the results will be compared to a simple interpolation algorithm. Furthermore, the time and frequency interpolation capabilities of algorithms and speech features will be made interpretable, and new insights into existing problems will be offered.
Everyone interested is cordially invited; registration is not required.
General information about the colloquium and a current list of the dates of the Communication Technology Colloquium can be found at:
http://www.iks.rwth-aachen.de/aktuelles/kolloquium/
--
Irina Ronkartz
Institute of Communication Systems (IKS)
RWTH Aachen University
Muffeter Weg 3a, 52074 Aachen, Germany
+49 241 80 26958 (phone)
ronkartz@iks.rwth-aachen.de
http://www.iks.rwth-aachen.de/