Dear subscribers of the colloquium newsletter,

We are happy to announce the next date of our Communication Technology Colloquium.

Monday, June 23, 2025
Speaker: Christian Schulz
Time: 2:00 p.m.
Location: hybrid - lecture room 4G and via Zoom:

https://rwth.zoom.us/j/61215027648?pwd=MTJvayt5bkdka04raWZVempPZGE0Zz09

Meeting ID: 612 1502 7648
Password: 380386

Master's Lecture:

Spatial Upscaling of Higher-Order Ambisonics Signals Using Machine Learning

Ambisonics is a widely adopted spatial audio format that enables the capture, processing, and playback of three-dimensional sound fields. It has a wide array of applications such as immersive virtual reality (VR) and advanced teleconferencing. Recent technological advancements in multi-channel audio systems have made the transition from Ambisonics to higher-order Ambisonics (HOA) possible, which provides improved spatial resolution and a more immersive listening experience. However, the HOA order, which determines spatial accuracy, is often limited by hardware constraints, such as the number of available microphones or loudspeakers in the recording or reproduction setup, respectively. As a result, much of the existing Ambisonics-based sound field information can only be obtained in lower orders, which motivates many methods that aim to enhance the spatial detail of these signals. This thesis investigates the use of machine learning techniques for artificially increasing HOA signal orders to enhance the spatial resolution of Ambisonics signals. The primary objective is to develop data-driven models that are capable of inferring higher-order spatial information from lower-order Ambisonics signals, thereby improving spatial fidelity without requiring additional recording equipment. For this purpose, several neural network architectures are explored and trained. In the time domain, both fully connected (FC) networks and gated recurrent units (GRUs) are tested. In the time-frequency domain, the concept of sparse subband networks that process one subband at a time is introduced. The proposed neural network architectures are evaluated using two quantitative performance metrics. Spatial similarity, a well-established metric, is employed to evaluate the spatial fidelity between different HOA signals. In addition, this thesis introduces a novel approach for estimating the effective HOA order based on the normalized reconstruction error (NRE).
Simulation results demonstrate that adopting a sparse network structure enhances model performance. The resulting networks exhibit reduced complexity and require less training data, while simultaneously surpassing the performance of dense network counterparts. Among the evaluated models, the time-frequency domain sparse subband networks achieve superior overall performance, including enhanced generalization capabilities. These findings provide insight into both the potential and current limitations of data-driven upscaling techniques, as determined by appropriate evaluation criteria. Overall, the proposed models demonstrate significant utility in estimating HOA signals in scenarios where obtaining actual HOA recordings is impractical.

All interested parties are cordially invited; registration is not required.

General information on the colloquium, as well as a current list of its dates, can be found at:
https://www.iks.rwth-aachen.de/aktuelles/kolloquium


Simone Sedgwick
Secretariat
Institute of Communication Systems (IKS)
Prof. Dr.-Ing. Peter Jax
RWTH Aachen University
Muffeter Weg 3a, 52074 Aachen, Germany
+49 241 80 26956 (phone)
+49 241 80 22254 (fax)
sedgwick@iks.rwth-aachen.de
https://www.iks.rwth-aachen.de/