Dear subscribers of the colloquium newsletter,
we are happy to inform you about the next date of our Communication
Technology Colloquium.
*Monday, April 25, 2022*
*Speaker*: Alexander Sobolew
*Time*: 03:00 p.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Master-Lecture*: Investigation of Specialized Recurrent Units for
Acoustic Echo Cancellation
In today's communication, hands-free devices, e.g., for remote
communication, are widely used. Without further measures, these would suffer from an
acoustic echo that arises from the coupling between speaker and
microphone. To minimize these disturbances, acoustic echo cancellation
is indispensable. Model-based adaptive algorithms exist to solve this
issue. However, they require careful tuning of parameters whose optimum
differs between devices and acoustic situations.
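To illustrate the kind of model-based adaptive algorithm meant here, a minimal NLMS echo canceller can be sketched as follows. This is a generic textbook baseline, not the algorithm evaluated in the thesis; the filter length and the step size (the tuning-sensitive parameter the abstract refers to) are illustrative:

```python
import numpy as np

def nlms_echo_canceller(far_end, mic, filter_len=128, step=0.5, eps=1e-8):
    """Classic NLMS adaptive filter: estimates the echo path from the
    far-end (loudspeaker) signal and subtracts the synthesized echo
    from the microphone signal. `step` is the step size whose optimum
    differs between devices and acoustic situations."""
    w = np.zeros(filter_len)          # echo-path estimate
    x_buf = np.zeros(filter_len)      # most recent far-end samples
    out = np.zeros_like(mic)
    for n in range(len(mic)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = far_end[n]
        echo_hat = w @ x_buf          # synthesized echo
        e = mic[n] - echo_hat         # error = echo-cancelled output
        out[n] = e
        w += step * e * x_buf / (x_buf @ x_buf + eps)  # normalized update
    return out
```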
In this thesis, a new data-driven approach for acoustic echo
cancellation is developed and investigated. In contrast to the purely
model-based approach, the algorithm is supposed to learn the optimal
performance from data without the need for manual tuning. In new
situations, the unknown parameters should be estimated. At its core, the
novel structure is similar to a frequency adaptive filter. However, it
is extended by the gating mechanism known from recurrent neural
networks. The development also includes the determination of optimal
training paradigms. When choosing the model structure, attention is paid
to a reasonable training complexity. Major challenges in this thesis
include the investigation of the gating mechanism, which is represented
by a learn gate and a reset gate. The former is used to estimate a
time-varying step size of the iterative algorithm. Gated Recurrent Units
provide an internal memory to accommodate the sequential information in
speech, while skip connections optimize the gradient flow during
training. Independent use of a reset gate to reset the impulse response
estimate when the acoustic situation changes is outperformed by weight
sharing: with shared weights, the learn and reset gates have direct
information about each other's behavior through a shared partial
network. It was also shown that, when using backpropagation through
time, the truncation order can be reduced to a certain extent, which
lowers the training complexity without decreasing performance. The developed
model outperforms the tuned Fast Block Normalized Least-Mean-Square
algorithm in reconvergence speed and steady-state performance in far-end
single talk and double talk. Furthermore, our model repeatedly
outperforms the tuned diagonalized Kalman filter in certain scenarios
and offers significantly improved overall performance in single talk.
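For readers unfamiliar with the gating mechanism referred to above, a single step of a standard Gated Recurrent Unit can be sketched as follows (textbook form with bias terms omitted; the thesis's adapted filter structure differs in its details):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One step of a standard GRU (biases omitted for brevity).
    The update gate z acts like a time-varying step size (the 'learn'
    gate); the reset gate r can discard the previous state when the
    situation changes."""
    z = sigmoid(Wz @ x + Uz @ h)             # learn/update gate
    r = sigmoid(Wr @ x + Ur @ h)             # reset gate
    h_cand = np.tanh(Wh @ x + Uh @ (r * h))  # candidate state
    return (1.0 - z) * h + z * h_cand        # gated interpolation
```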
and
*Monday, April 25, 2022*
*Speaker*: Alexej Sobolew
*Time*: 04:00 p.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Master-Lecture*: Investigation of Generative Neural Networks for Speech
Enhancement
Speech enhancement aims to reduce noise in speech signals and is widely
used in hearing aids and mobile speech communication. Speech synthesis,
on the other hand, aims to generate high-quality human speech and is
used, e.g., in text-to-speech generation. Noise reduction and speech
synthesis can be combined since conventional noise reduction methods
often only improve the magnitude spectrum and keep the noisy phase.
However, the phase has an important influence on speech quality and
intelligibility. In addition, training neural networks with complex
spectrograms is more difficult, so it is reasonable to first denoise the
magnitude spectrum and subsequently synthesize the waveform of the
speech based on it. The applications mentioned often require low
execution times and low computational overhead. This is achievable by
exploiting parallel processors, enabled by the non-autoregressive
property, and by reducing the number of parameters in the neural
network. Hence, this thesis aims to investigate noise reduction, speech
synthesis, and their joint interaction.
In this thesis, the first use case considered is phase reconstruction
and speech synthesis based on clean data. A non-autoregressive
three-stage speech enhancement system is developed for the second use
case of combined noise reduction and speech synthesis based on magnitude
spectra. For speech synthesis on clean magnitude spectra such as
mel-spectrograms, the neural network called WaveGlow from Nvidia is
taken as a basis. WaveGlow achieves subjective performance similar to
the Griffin-Lim algorithm but is better suited for fast applications.
For the reduction of parameters in WaveGlow, the SqueezeWave is used,
resulting in a decrease in the number of parameters and the inference
time by up to 70%. When moving to additional noise reduction alongside
speech synthesis, it is shown that WaveGlow alone is not suitable for
performing both tasks simultaneously. Consequently, the
problem is divided into three stages: masking, inpainting, and
synthesis. The models for masking and inpainting are adapted to
mel-spectrograms and studied in detail for noise reduction. As a
result, they are able to reduce the noise significantly. It is worth
noting that in this thesis the mel filterbank is used as a non-linear
downsampling adapted to human perception, which reduces the number of
computations in the first two models.
Subsequently, the performance of the entire three-stage speech
enhancement system is investigated. It improves the speech quality and
intelligibility of noisy data while exploiting parallel processors and
can compete with existing state-of-the-art methods. The system achieves
better noise reduction than the Convolutional Recurrent Network (CRN)
and additionally does not rely on the noisy phase.
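The perceptually motivated, non-linear downsampling mentioned above can be illustrated with a standard triangular mel filterbank. This is a generic construction under assumed parameters, not the thesis's exact filter design:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular mel filters: a non-linear downsampling of the
    magnitude spectrum that is denser at low frequencies, mirroring
    human perception, and reduces the number of spectral bins."""
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, mid, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, mid):            # rising slope
            fb[i, k] = (k - lo) / max(mid - lo, 1)
        for k in range(mid, hi):            # falling slope
            fb[i, k] = (hi - k) / max(hi - mid, 1)
    return fb
```

Applying `fb @ magnitude_frame` maps a full magnitude spectrum onto far fewer mel bands, which is where the computational savings in the first two models come from.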
All interested parties are cordially invited, registration is not required.
General information on the colloquium, as well as a current list of
dates of the Communication Technology Colloquium can be found at:
http://www.iks.rwth-aachen.de/aktuelles/kolloquium
--
Irina Esser
Institute of Communication Systems (IKS)
RWTH Aachen University
Muffeter Weg 3a, 52074 Aachen, Germany
+49 241 80 26958 (phone)
ronkartz(a)iks.rwth-aachen.de
http://www.iks.rwth-aachen.de/
*Wednesday, March 30, 2022*
*Speaker*: Nora Pöhlau
*Time*: 10:00 a.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Master Lecture*: Performance Evaluation of Sound Field Translation
Methods for Recorded Virtual Reality
The Higher-Order Ambisonics (HOA) format allows directional recording
and playback of sound, making it an attractive tool for spatial audio or
immersive sound applications. Because Higher-Order Ambisonics are
mathematically based on Spherical Harmonics (SHs), they offer full
rotational freedom for the listener (3DoF). However, the sound field can
only be correctly reconstructed in a small area around the original
recording position due to physical constraints. Three algorithms
developed at the Institute of Communication Systems (IKS) make it
possible to allow an additional translational movement of the user, even
beyond the sweet spot. These algorithms deviate from the physically
correct reconstruction in favour of an acoustically plausible playback.
In this thesis, the three algorithms of Space Warping (SW), Adaptive
Space Warping (ASW) and Adaptive Beamforming (ABF) are perceptually
compared by conducting multiple listening tests. ABF and ASW split the
sound signal into a primary and an ambient part and apply the
translation operation only to the primary part. In two web-based
listening tests, it was found that this separation is an acoustically
valid approach: listeners could not distinguish whether the primary
part contained only direct sound or additional early reflections.
In a second step, a listening test in the laboratory was conducted.
Here, the algorithms were compared for different translation distances.
For small distances, ABF showed the best performance of all algorithms.
ABF introduced fluctuating residual noise for higher distances but still
obtained the highest source position ratings. Besides that, a newly
proposed variant of SW has proven to perform surprisingly well and
scored second best in all ratings behind ABF.
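As background on the HOA format discussed above, first-order Ambisonics encoding of a single plane wave can be sketched as follows. ACN channel ordering with SN3D normalization is assumed here; higher orders simply add further spherical-harmonic channels:

```python
import numpy as np

def encode_foa(azimuth, elevation):
    """First-order Ambisonics (ACN/SN3D) encoding gains for a plane
    wave from the given direction (radians). The four channels are the
    real spherical harmonics of orders 0 and 1, which is what enables
    full rotational freedom (3DoF) for the listener."""
    return np.array([
        1.0,                                   # W (order 0, omni)
        np.cos(elevation) * np.sin(azimuth),   # Y
        np.sin(elevation),                     # Z
        np.cos(elevation) * np.cos(azimuth),   # X
    ])
```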
*Tuesday, March 22, 2022*
*Speaker*: Anatolii Skovitin
*Time*: 02:00 p.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Bachelor Lecture*: Speech Synthesis from Distorted Magnitude Spectra
Using Machine Learning Methods
Speech signals are an important means of communication between humans.
In the digital world, speech signals are transmitted via telephones or
the Internet. To this end, they first have to be transformed into the
time-frequency domain. A resulting time-frequency spectrum is composed
of the magnitude spectrum and the phase spectrum. Speech signals are
frequently subject to disturbances that distort the desired signal.
There are methods that can attenuate or remove these distortions.
Often, however, only the magnitude spectrum is considered, and the
phase spectrum is left unchanged due to its comparatively minor
importance. Still, the processed magnitude spectra will normally
approach the ideal magnitude spectra. Other methods provide no phase
spectrum at all, but only an estimate of the magnitude spectrum.
In this thesis, a method is investigated that reconstructs the phase
spectrum of speech signals from estimated or distorted magnitude
spectra. For this purpose, approaches from the field of machine
learning are used. To study the phase reconstruction methods as
independently as possible of the specific types of distortion of a
particular system, an artificial distortion is used. The prepared data
are used to train the neural networks. The best neural network models
are then selected and applied to differently distorted data in order to
determine how well the neural networks cope with different types of
distortion. Finally, the various phase reconstruction methods are
applied and the resulting speech signals are evaluated. In addition, a
comparison with the Griffin-Lim algorithm is carried out.
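The Griffin-Lim baseline used for comparison alternates between enforcing the given magnitude and STFT consistency. A minimal self-contained sketch (window and hop sizes are illustrative, not the thesis's settings):

```python
import numpy as np

def stft(x, win=256, hop=64):
    """Frame-wise windowed FFT; returns (frames, win//2 + 1)."""
    w = np.hanning(win)
    n_frames = 1 + (len(x) - win) // hop
    return np.array([np.fft.rfft(w * x[i * hop:i * hop + win])
                     for i in range(n_frames)])

def istft(S, win=256, hop=64):
    """Overlap-add inverse with window-sum normalization."""
    w = np.hanning(win)
    n = (len(S) - 1) * hop + win
    x, norm = np.zeros(n), np.zeros(n)
    for i, frame in enumerate(S):
        x[i * hop:i * hop + win] += w * np.fft.irfft(frame, win)
        norm[i * hop:i * hop + win] += w * w
    return x / np.maximum(norm, 1e-8)

def griffin_lim(mag, n_iter=30, win=256, hop=64):
    """Griffin-Lim: start from random phase and repeatedly project onto
    the set of consistent spectrograms while keeping the target magnitude."""
    phase = np.exp(2j * np.pi * np.random.default_rng(0).random(mag.shape))
    for _ in range(n_iter):
        x = istft(mag * phase, win, hop)
        phase = np.exp(1j * np.angle(stft(x, win, hop)))
    return istft(mag * phase, win, hop)
```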
--
Irina Esser
Institute of Communication Systems (IKS)
RWTH Aachen University
Muffeter Weg 3a, 52074 Aachen, Germany
+49 241 80 26958 (phone)
esser(a)iks.rwth-aachen.de
http://www.iks.rwth-aachen.de/
Dear subscribers of the colloquium newsletter,
we cordially invite you to a doctoral lecture.
Speaker: Mr. Matthias Schrammen, M. Sc.
Topic: *Front-End Signal Processing for Far-Field Speech Communication*
Time: Friday, March 18, 2022, 10:00 a.m.
Zoom meeting:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
Devices for speech communication operated in hands-free mode offer a
very natural way of human communication. The capturing device, e.g., a
smartphone, smart speaker or tablet, is often located up to several
meters away from the human speaker. Furthermore, detrimental effects
like noise and reverberation are present in everyday acoustic
environments. Therefore, the signal-to-noise ratio at the microphones
mounted on the device is typically too low to offer sufficient speech
quality for the listener at the other end of the communication link. In
addition, the loudspeaker of the device is located much closer to the
microphones than the human speaker. Therefore, a strong echo signal from
the loudspeaker couples into the microphones, degrading the conversation
quality for the remote listener even further.
State-of-the-art approaches that tackle the above-mentioned problems
usually rely on multiple microphones to improve the signal-to-noise
ratio with methods like beamforming. Beamforming combines the digitally
filtered signals of several microphones to obtain an enhanced speech
signal at the output. In addition, acoustic echo cancellation is
employed to attenuate the echo signal more specifically. This is
achieved by adaptive estimation of a digital model of the acoustic echo
path and subsequent subtraction of the synthesized echo signal from the
microphone signal.
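The beamforming idea described above can be sketched in its simplest form, delay-and-sum with integer-sample steering delays. This is a deliberately minimal illustration, not the multi-microphone methods of the dissertation:

```python
import numpy as np

def delay_and_sum(mics, delays):
    """Delay-and-sum beamformer: advance each microphone signal by its
    steering delay (integer samples for simplicity) and average.
    Signals aligned on the target direction add coherently, while sound
    from other directions is attenuated."""
    n = min(len(m) - d for m, d in zip(mics, delays))
    return np.mean([m[d:d + n] for m, d in zip(mics, delays)], axis=0)
```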
However, the solutions are usually optimized for one specific device and
are only applicable when the positions of the microphones are fixed and
known to the algorithm. Furthermore, the combination of multi-microphone
enhancement and echo cancellation is not trivial, and low-complexity
solutions lack performance when tracking dynamic acoustic
scenarios. Finally, low costs, small form factors, and high desired
sound pressure levels result in loudspeakers that operate at their
physical limits. This adds significant nonlinear components to the sound
emitted by the loudspeaker. Therefore, conventional linear acoustic echo
cancellation cannot compensate for the nonlinear parts of the echo and
the conversational quality is not satisfactory.
The task of the dissertation is to alleviate these shortcomings. The
developed signal processing algorithms should be more flexible with
respect to desired features in real devices. Among these are microphone
positions that are unknown or change during operation and the use of
beamforming and acoustic echo cancellation at the same time.
Furthermore, the developed solutions should be able to handle nonlinear
echo paths and should introduce a low computational complexity to be
attractive for battery-powered devices, too.
*Wednesday, March 9, 2022*
*Speaker*: Marcel Kohn
*Time*: 02:00 p.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Master Lecture*: Robust Estimation of the Speaker's Voice in Hearables
and Hearing Aids with Multi-Sensor Systems
Occluding the ear canal with hearing aids or hearables disturbs the
perception of one's own voice. If no countermeasures are taken, one's
own voice is perceived as boomy, which is known as the occlusion
effect. It can be prevented by so-called active occlusion cancellation
(AOC). However, this algorithm usually relies on an acoustic
hear-through, so it is rather disadvantageous in loud environments. In
an alternative approach, the device operates in active noise
cancellation (ANC) mode, so that all ambient sounds are blocked. An
improved perception of one's own voice can then be achieved if an
estimate of the airborne sound of one's own voice is played back by the
device, creating the sensation of an open ear. However, since ANC
devices are usually worn in loud environments, estimating this signal
is challenging. One way to restore natural perception is to reconstruct
the attenuated airborne sound of the voice.
In this thesis, a neural network is integrated into an ANC system to
separate the speech components of the speaker's voice from the ambient
noise. After applying an equalizer that accounts for further acoustic
influences, the denoised speech signal is played back via a headphone
loudspeaker in the ear canal. In contrast to existing speech
enhancement systems, the signal of an additional microphone on the
inside of the headphone is taken into account as side information. In a
series of measurements, the data required for training are recorded
with test subjects. In addition, device-related transfer functions are
measured, which, together with Higher-Order Ambisonics (HOA)
recordings, can be used to enlarge the training data set, resulting in
1736 hours of audio data for 21 test subjects.
An investigation of various convolutional recurrent neural networks
shows in particular that using the inner microphone leads both to the
desired noise reduction and to a degradation of speech quality. Further
changes to the architecture of the investigated network lead to higher
scores on perceptually motivated metrics. In addition, a network
extension based on multi-masking is tested, which can dynamically
reduce the attenuation of interfering signals via a single parameter
while achieving comparably high results.
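The idea of controlling attenuation through a single parameter can be illustrated with a simple spectral-masking step. This is a hypothetical sketch; `max_atten_db` merely stands in for the kind of control parameter described above:

```python
import numpy as np

def apply_limited_mask(spec, mask, max_atten_db=20.0):
    """Apply a real-valued spectral mask with a tunable attenuation
    limit: the mask is floored so interfering signals are never
    attenuated by more than `max_atten_db`, trading noise suppression
    against speech distortion via one parameter."""
    floor = 10.0 ** (-max_atten_db / 20.0)
    return np.maximum(mask, floor) * spec
```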
Speaker: Mr. Stefan Kühl, M. Sc.
Topic: *Adaptive Algorithms for the Identification of Time-Variant
Acoustic Systems*
Time: Monday, March 7, 2022, 11:00 a.m.
Zoom meeting:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
Many digital speech and audio communication systems incorporate models
of acoustic systems during signal processing. Often, the impulse
responses of these acoustic systems have to be identified during or
before using a communication system by means of adaptive algorithms.
Acoustic systems describe how sound is affected during transmission
between a source and a receiver. Examples of acoustic systems comprise
rooms with reflections from boundaries and scattering from objects,
communication devices such as smartphones or smart home devices, or even
a human head where shadowing effects occur. In many situations, the
impulse responses of these systems need to be identified. Possible
scenarios are acoustic measurements or system identification in speech
communication applications, e.g., for acoustic echo cancellation (AEC).
Depending on the specific use case, certain aspects have to be taken
into account. For a measurement the excitation signal can be designed,
whereas for speech communication applications the system's excitation is
the speech signal that cannot be altered. Hence, for the latter case,
correlation has to be considered during the system identification. In
addition, the systems to be identified may vary over time when the
acoustic environment changes, e.g., due to moving objects. Therefore,
adaptive algorithms must be used to track the system's state. Additional
challenges arise when considering the identification of multiple
channels simultaneously.
This thesis considers different aspects of the identification of
time-variant acoustic systems in a joint framework for diverse scenarios.
*Friday, January 21, 2022*
*Speaker*: Ali Yilmaz Yildirim
*Time*: 11:00 a.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Master Lecture*: Dynamic Spectrum Access in Highly Mobile Environments
Dynamic Spectrum Access (DSA) has been introduced to enable the
opportunistic utilization of licensed frequency bands by secondary users
(SUs). DSA was introduced for ground-based users, and algorithms were
developed by considering those users. With the increase in air data
traffic, DSA has started to be considered for airborne communication.
Nevertheless, algorithms developed for ground-based users cannot be
reused for airborne communication. The main reason for that is the high
mobility of aerial vehicles. Therefore, current DSA algorithms should be
examined and adapted for highly mobile environments.
To address this need, this thesis aims to develop algorithms for DSA in
highly mobile environments by focusing on the aspects of DSA that need
to be adapted. Since SUs are considered temporary users, they should
vacate the channel when the primary users (PUs) transmit and find a new
common idle channel to continue their communication. In the highly
varying spectral environment caused by high mobility, communication
between SUs will be interrupted by PU transmissions more frequently
than in stationary environments. SUs should therefore detect PU
transmissions and react by finding a new common idle channel in the
shortest possible time. By considering the need for an
adaptation in this aspect of DSA, this thesis proposes algorithms and a
frame structure to identify the PU presence and to find a new common
channel in highly mobile environments.
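The two building blocks described above, detecting PU activity and agreeing on a new idle channel, can be sketched as follows. Energy detection and the fixed margin are generic illustrations under assumed noise statistics, not the thesis's proposed algorithms or frame structure:

```python
import numpy as np

def sense_channels(rx, noise_power, margin=3.0):
    """Per-channel energy detection: a channel counts as idle when its
    mean received power stays below `margin` times the noise floor.
    `rx` has shape (n_channels, n_samples); margin is illustrative."""
    power = np.mean(np.abs(rx) ** 2, axis=1)   # per-channel power
    return power < margin * noise_power        # True = idle

def next_common_channel(idle_su1, idle_su2):
    """Pick the first channel that both secondary users sensed as idle,
    or None if no common idle channel exists."""
    common = np.flatnonzero(idle_su1 & idle_su2)
    return int(common[0]) if common.size else None
```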
*Wednesday, November 24, 2021*
*Speaker*: Georg Krekel
*Time*: 2:00 p.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Bachelor Lecture*: Comparison of Sound Field Translation Methods
Higher-Order Ambisonics (HOA) signals can be used to allow directional
playback of sound signals, where the listener can rotate their head
around all three rotational axes. HOA signals can also be adapted to
allow translational movement of the listener using sound field
translation, the process of evaluating a sound field at a position
different from the one it was recorded at. At the IKS (Institute of
Communication Systems, RWTH Aachen), multiple algorithms have been
proposed that try to achieve psychoacoustically plausible sound field
translation; however, the inner workings of these algorithms are not
well understood.
This thesis compares different aspects of two algorithms, Adaptive
Beamforming (ABF) and Adaptive Space Warping (ASW), mainly by deriving
analytical expressions that allow a deeper understanding of the
underlying effects. The algorithms were split into two main stages, the
translation stage and the filtering stage. There are significant
differences between ASW and ABF: in the translation stage, ASW is
distorted compared to ABF. However, the differences shrink with the
order of the HOA input signal. The filtering stage of ABF and ASW is
very similar, especially if oracle information about covariance
matrices is not available.
*Tuesday, October 12, 2021*
*Speaker*: Henning Konermann
*Time*: 11:00 a.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Bachelor Lecture*: Model-Based 3D Head Tracking for Use in Adaptive
Crosstalk Cancellation Systems
Adaptive crosstalk cancellation systems require reliable 3D position
and orientation data of a listener in real time. To determine these
data, an additional physical measurement system is usually installed.
Avoiding this setup offers considerable advantages in cost and effort.
By estimating times of flight from the head-related impulse responses
used in adaptive crosstalk cancellation systems, head tracking can be
performed with a geometric approach. The literature shows that
geometrically based 2D and 3D head tracking is possible with additional
physical measurement systems. It has also been shown that, for the 2D
case without an additional measurement system, a model-based approach
is superior to a geometric one.
The goal of this thesis is the design and implementation of a model
that extends the model-based approach to the 3D case. Head-related
transfer functions are used to build an average model as well as
individualized models. Tracking is performed by minimizing an
optimization problem adapted from the literature. Subsequently,
experiments are simulated, and the tracking accuracy of models with
different degrees of personalization is compared with a geometric
approach. In most use cases, the model proves superior. Furthermore, it
is found that the model has to be constrained by orientation limits in
order to obtain meaningful results.
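A geometric baseline of the kind the model is compared against can be sketched as plain multilateration from time-of-flight distances. The loudspeaker layout is hypothetical, and the thesis's model-based optimization is more involved than this least-squares step:

```python
import numpy as np

def multilaterate(speakers, dists):
    """Estimate a 3D position from propagation distances to known
    loudspeaker positions: linearize the range equations
    |p - s_i|^2 = d_i^2 against the first speaker and solve the
    resulting least-squares system."""
    speakers = np.asarray(speakers, float)
    dists = np.asarray(dists, float)
    s0 = speakers[0]
    A = 2.0 * (speakers[1:] - s0)
    b = (np.sum(speakers[1:] ** 2, axis=1) - np.sum(s0 ** 2)
         - dists[1:] ** 2 + dists[0] ** 2)
    pos, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pos
```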
--
Irina Ronkartz
Institute of Communication Systems (IKS)
RWTH Aachen University
Muffeter Weg 3a, 52074 Aachen, Germany
+49 241 80 26958 (phone)
ronkartz(a)iks.rwth-aachen.de
http://www.iks.rwth-aachen.de/
*Friday, October 8, 2021*
*Speaker:* Frederick Pietschmann
*Time:* 10:00 a.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Bachelor Lecture*: Model-Based Algorithms for Retrospective
Individualization of Binaural Signals
With binaural signals, it is possible to capture the sensation of
spatial hearing that humans are used to from real-life auditory scenes.
However, because binaural recordings are always individualized for a
specific target person, the authenticity of a binaural recording is
perceived differently by different listeners. It is therefore desirable
that a binaural recording is always individualized for the respective
listener.
In this thesis, the existing Binaural Cue Adaptation (BCA) system from
the Institute of Communication Systems is adapted and reinterpreted so
that it can retrospectively individualize an existing binaural signal
for a new target person. Existing approaches to the main components of
the BCA system, the Localization block and the Modification block, are
reviewed and analyzed, and multiple new concepts are introduced and
described algebraically: new optimization target functions, a concept
for joint optimization over multiple frequencies, and multiple
approaches that attempt to lessen the impact of artifacts which may
occur due to model mismatches and/or erroneous location estimates.
A theoretical evaluation is conducted in which the localization
performance of both existing and new approaches is benchmarked. The
results confirm that some of the newly proposed ideas and concepts
improve the overall performance of the BCA system.
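One elementary binaural cue a localization stage like the one above can draw on is the interaural time difference. A minimal cross-correlation estimate might look as follows (the sampling rate and shift are illustrative, and this is not the BCA system's actual localization algorithm):

```python
import numpy as np

def estimate_itd(left, right, sr):
    """Estimate the interaural time difference as the lag that
    maximizes the cross-correlation between the two ear signals.
    A positive value means the left channel is delayed relative to
    the right."""
    corr = np.correlate(left, right, mode="full")
    lag = int(np.argmax(corr)) - (len(right) - 1)
    return lag / sr
```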