Dear subscribers of the colloquium newsletter,
We are happy to announce the next session of our Communication Technology Colloquium.
Topic: Interactive Reproduction of Binaurally Recorded Signals
Location: hybrid - Lecture room 4G
Abstract of the Dissertation
Interactive Reproduction of Binaurally Recorded Signals
by
Sebastian Nagel, M.Sc.
The goal of acoustic augmented or virtual reality is to artificially invoke a realistic auditory impression in the listener. One technique to achieve this goal is binaural reproduction, which refers to the reproduction of certain audio signals at the two ears of the listener. The ear signals can be obtained by binaural recording or binaural rendering, that is, by spatially sampling a real or virtual sound field with two microphones, one at each ear of a recording head. Such signals are referred to as binaurally recorded signals in the following.
For a moving listener to perceive sound sources as fixed in the environment, the reproduced signals need to match the listener’s movements. State-of-the-art methods for this interactive binaural reproduction generate such signals based on denser spatial samplings of sound fields (i.e., more than two microphones), or they perform real-time binaural rendering. The goal of this thesis is to achieve interactive binaural reproduction based on binaurally recorded signals, that is, on two ear signals originally intended for non-interactive binaural reproduction for a non-moving listener.
This is desirable for two major reasons. First, it makes binaural recordings usable for immersive playback, which potentially improves the user experience in applications such as telecommunications or consumer-generated content, where greater technical effort for recording may not be feasible. Second, the resulting methods seamlessly interoperate with established technologies: technically, binaurally recorded signals are ordinary stereo signals, and their spatial relationships are defined by human anatomy. This eliminates the coordination and standardization efforts that would be required to make the state-of-the-art methods widely usable.
The task of this dissertation is to develop algorithms that interact with a complex biological system, the human auditory system. The ultimate evaluation criterion is the subjective quality of the listening experience as perceived by the human listener, which can only be assessed through listening experiments. Therefore, to validate the results, a listening experiment is presented at the end of the dissertation. Before that, algorithms are developed and evaluated on a theoretical basis: derivations are based on signal models, and evaluations are based on the interpretation of signal properties, both of which are rooted in knowledge of human auditory perception.
The model-based algorithms are developed in two steps. First, binaurally recorded signals containing only coherent sound from a single source are considered. The derived algorithms act as a time-variant filter that modifies the spatial properties of the binaurally recorded signals. The filter is parameterized by the measured listener head motion and by the source direction estimated from the recorded signals. This novel principle allows the listener to perceive the source at a stable position in the environment. It was first proposed by the author in [NJ18] and has since been patented [NJ23]. The dissertation provides analyses of different filter design methods and architectures.
[NJ18] S. Nagel and P. Jax, “Dynamic Binaural Cue Adaptation,” in Proceedings of International Workshop on Acoustic Signal Enhancement (IWAENC), September 2018.
[NHJ20] S. Nagel, D. Haupt, and P. Jax, “Coherence-Adaptive Binaural Cue Adaptation,” in Proceedings of International Conference on Audio for Virtual and Augmented Reality (AVAR), August 2020.
[NJ21] S. Nagel and P. Jax, “On the Use of Additional Microphones in Binaural Cue Adaptation,” in Proceedings of ITG Conference on Speech Communication, September 2021.
[NJ23] S. Nagel and P. Jax, “Methods for obtaining and reproducing a binaural recording,” U.S. patent US 11,546,703 B2, January 2023.
All interested parties are cordially invited; registration is not required.