Dear subscribers of the colloquium newsletter,
We are happy to announce the next session of our Communication Technology Colloquium.
Topic: Interactive Reproduction of Binaurally Recorded Signals
Location: hybrid - Lecture room 4G
Abstract of the Dissertation
Interactive Reproduction of Binaurally Recorded Signals
by
Sebastian Nagel, M.Sc.
The goal of acoustic augmented or virtual reality is to artificially invoke a realistic auditory impression in the listener. One technique to achieve this goal is binaural reproduction, which refers to the reproduction of certain audio signals at the two ears of the listener. The ear signals can be obtained by binaural recording or binaural rendering, that is, by spatially sampling a real or virtual sound field with two microphones, one at each ear of a recording head. Such signals are referred to as binaurally recorded signals in the following.
For a moving listener to perceive sound sources as fixed in the environment, the reproduced signals need to match the listener’s movements. State-of-the-art methods for this interactive binaural reproduction generate such signals based on denser spatial samplings of sound fields (i.e., more than two microphones), or they perform real-time binaural rendering. The goal of this thesis is to achieve interactive binaural reproduction based on binaurally recorded signals, that is, on two ear signals originally intended for non-interactive binaural reproduction for a non-moving listener.
This is desirable for two major reasons. First, it makes binaural recordings usable for immersive playback, which potentially improves the user experience in applications such as telecommunications or consumer-generated content, where greater technical effort for recording may not be feasible. Second, the resulting methods seamlessly interoperate with established technologies: technically, binaurally recorded signals are ordinary stereo signals, and their spatial relationships are defined by human anatomy. This eliminates the coordination and standardization efforts that would be required to make the state-of-the-art methods widely usable.
The task of this dissertation is to develop algorithms that interact with a complex biological system, the human auditory system. The ultimate evaluation criterion is the subjective quality of the listening experience as perceived by the human listener, which can only be assessed through listening experiments. Therefore, to validate the results, a listening experiment is presented at the end of the dissertation. Before that, algorithms are developed and evaluated on a theoretical basis: derivations are based on signal models, and evaluations are based on the interpretation of signal properties, both of which are rooted in knowledge of human auditory perception.
The model-based algorithms are developed in two steps. First, binaurally recorded signals containing only coherent sound from a single source are considered. The derived algorithms act as a time-variant filter that modifies the spatial properties of the binaurally recorded signals. The filter is parameterized by the measured listener head motion and by the source direction estimated from the recorded signals. This novel principle allows the listener to perceive the source at a stable position in the environment. It was first proposed by the author in [NJ18] and has since been patented [NJ23]. The dissertation provides analyses of different filter design methods and architectures.
[NJ18] S. Nagel and P. Jax, “Dynamic Binaural Cue Adaptation,” in Proceedings of International Workshop on Acoustic Signal Enhancement (IWAENC), September 2018.
[NHJ20] S. Nagel, D. Haupt, and P. Jax, “Coherence-Adaptive Binaural Cue Adaptation,” in Proceedings of International Conference on Audio for Virtual and Augmented Reality (AVAR), August 2020.
[NJ21] S. Nagel and P. Jax, “On the Use of Additional Microphones in Binaural Cue Adaptation,” in Proceedings of ITG Conference on Speech Communication, September 2021.
[NJ23] S. Nagel and P. Jax, “Methods for obtaining and reproducing a binaural recording,” U.S. patent US 11,546,703 B2, January 2023.
All interested parties are cordially invited; registration is not required.