Dear subscribers to the colloquium newsletter,
We are pleased to announce the next date
of our Communication Technology Colloquium.
Monday, June 23, 2025
Speaker: Christian Schulz
Time: 2:00 p.m.
Location: hybrid - Lecture room 4G and
https://rwth.zoom.us/j/61215027648?pwd=MTJvayt5bkdka04raWZVempPZGE0Zz09
Meeting-ID: 612 1502 7648
Password: 380386
Master's Lecture:
Spatial Upscaling of
Higher-Order Ambisonics
Signals Using Machine Learning
Ambisonics is a
widely adopted spatial audio format that enables the capture,
processing, and playback of three-dimensional sound fields. It has
a wide array of applications such as immersive virtual reality
(VR) and advanced teleconferencing. Recent technological
advancements in multi-channel audio systems have made the
transition from Ambisonics to higher-order Ambisonics (HOA)
possible, which provides improved spatial resolution and a more
immersive listening experience. However, the HOA order, which
determines spatial accuracy, is often limited by hardware
constraints, such as the number of available microphones or
loudspeakers in the recording or reproduction setup, respectively.
As a result, much of the existing Ambisonics-based sound field
information is available only at lower orders, which motivates
methods that aim to enhance the spatial detail of these
signals. This thesis investigates the use of machine learning
techniques for artificially increasing the HOA signal order to
enhance the spatial resolution of Ambisonics signals. The primary
objective is to develop data-driven models that are capable of
inferring higher-order spatial information from lower-order
Ambisonics signals, thereby improving spatial fidelity without
requiring additional recording equipment. For this purpose,
several neural network architectures are explored and trained. In
the time domain, both fully connected (FC) networks and gated
recurrent units (GRUs) are tested. In the time-frequency domain,
the concept of sparse subband networks that process one subband at
a time is introduced. The proposed neural network architectures
are evaluated using two quantitative performance metrics. Spatial
similarity, a well-established metric, is employed to evaluate the
spatial fidelity between different HOA signals. In addition, this
thesis introduces a novel approach for estimating the effective
HOA order based on the normalized reconstruction error (NRE).
Simulation results demonstrate that adopting a sparse network
structure enhances model performance. The resulting networks
exhibit reduced complexity and require less training data, while
simultaneously surpassing the performance of dense network
counterparts. Among the evaluated models, the time-frequency
domain sparse subband networks achieve superior overall
performance, including enhanced generalization capabilities. These
findings provide insight into both the potential and current
limitations of data-driven upscaling techniques, as determined by
appropriate evaluation criteria. Overall, the proposed models
demonstrate significant utility in estimating HOA signals in
scenarios where obtaining actual HOA recordings is impractical.
All interested parties are
cordially invited; registration is not required.
General information on the
colloquium, as well as a current list of dates of the
Communication Technology Colloquium, can be found at:
https://www.iks.rwth-aachen.de/aktuelles/kolloquium
Simone Sedgwick
Secretariat
Institute of Communication Systems (IKS)
Prof. Dr.-Ing. Peter Jax
RWTH Aachen University
Muffeter Weg 3a, 52074 Aachen, Germany
+49 241 80 26956 (phone)
+49 241 80 22254 (fax)
sedgwick@iks.rwth-aachen.de
https://www.iks.rwth-aachen.de/