Dear subscribers of the colloquium newsletter,
we are happy to inform you about the next dates of our Communication Technology Colloquium.
*Wednesday, November 18, 2020* *Speaker:* Leonie Geyer *Time:* 2:00 p.m. *Location*: https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting ID: 979 0415 7921 Password: 481650
*Master Lecture*: Sound Field Conversion Using Machine Learning Methods
The reproduction of realistic sound fields is necessary for the efficient evaluation of modern communication devices. Depending on the application, different microphone arrangements are used to record the sound fields, so not all desired signals are directly available in the required microphone configuration. The goal of sound field conversion is to convert signals between different recording systems within the same sound field.
This thesis compares different approaches to sound field conversion. It investigates the conversion between the signals of two concrete measurement systems: a binaural artificial head and a microphone array with eight channels. Three new approaches based on artificial neural networks are developed. The first is a convolutional approach with a simple end-to-end structure. The second is a time filter approach, in which the network outputs an FIR filter that is applied to the input signal. The third is a variant of WaveNet that is divided into two subnetworks: one analyses a section of the time signal and outputs a feature vector, which is then available to the conversion network when it converts the time signal.
The neural networks are trained using data from real and defined acoustic environments. Performance is measured by metrics in the time and frequency domains. For comparison, sound field conversion by equalisation and measurement with the target recording device are also performed. Different experiments are conducted on the structure and parametrisation of the neural networks, as well as on their training process, and their performance is observed and optimised. In all experiments, the WaveNet variant achieves the best results among the neural approaches. Training with a loss function that combines the mean square error with frequency metrics can reduce the error in the frequency domain, although the error in the time domain is higher than with the pure mean square error as loss function.
and
*Wednesday, November 18, 2020* *Speaker:* Shahd Al Hares *Time*: 3:00 p.m. *Location*: https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting ID: 979 0415 7921 Password: 481650
*Master Lecture*: Signal-Adaptive Approaches to Sound Field Translation
With the growing demand for spatial audio, Ambisonics has begun to attract more attention in both the recording and the reproduction of sound fields. This has increased the demand for sound field translation, which allows the user to move freely in different directions in the acoustic scene. In Higher Order Ambisonics (HOA), the sound field incident at the reference point is described by a set of mathematical functions known as Spherical Harmonics (SH). However, the reproduction is restricted to the area surrounding the reference point due to the underlying recording hardware, which can be observed in the Ambisonics domain as a bandwidth limitation. In this thesis, Ambisonics sound field translation is investigated. Recently proposed approaches allow the listener to move a few centimeters away from the reference point (3DoF+). Another approach provides further translation of the sound field (6DoF) but requires multiple Spherical Microphone Arrays (SMA) to be used during the recording process.
In this thesis, an enhanced method for sound field translation is proposed. It is based on upscaling the HOA signal to a higher SH order using Compressed Sensing (CS), a framework used to recover a signal from an under-determined linear system. Different aspects of HOA upscaling and sound field translation are studied theoretically and practically. Noise reduction with CS for HOA signals is discussed, and the influence of source distance on the translated signal is investigated in the Near-Field-Compensated Higher Order Ambisonics (NFC-HOA) domain. A systematic comparison between multiple translation approaches, namely plane wave translation and space warping, is performed based on two Monte Carlo experiments. Moreover, a new formula is derived that defines the limits of the upscaling SH order as a function of the normalized translation distance and the initial SH order. Finally, a method is introduced for sound field translation of realistic signals based on upscaling in the frequency domain. The method is evaluated in the frequency domain for multiple Bark bands.
All interested parties are cordially invited; registration is not required.
General information on the colloquium, as well as the current list of dates of the Communication Technology Colloquium, can be found at: http://www.iks.rwth-aachen.de/aktuelles/kolloquium
kommunikationstechnik-kolloquium@lists.rwth-aachen.de