Invitation: Doctoral Defense Talk

10 Mar 2022

Dear subscribers of the colloquium newsletter,

we are pleased to invite you to a doctoral defense talk.

Speaker: Mr. Matthias Schrammen, M.Sc.
Topic: *Front-End Signal Processing for Far-Field Speech Communication*
Time: Friday, March 18, 2022, 10:00 a.m.
Zoom meeting: https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting ID: 979 0415 7921
Passcode: 481650

Devices for speech communication operated in hands-free mode offer a 
very natural way of human communication. The capturing device, e.g., a 
smartphone, smart speaker or tablet, is often located up to several 
meters away from the human speaker. Furthermore, detrimental effects 
like noise and reverberation are present in everyday acoustic 
environments. Therefore, the signal-to-noise ratio at the microphones 
mounted on the device is typically too low to offer sufficient speech 
quality for the listener at the other end of the communication link. In 
addition, the loudspeaker of the device is located much closer to the 
microphones than the human speaker. As a result, a strong echo signal from
the loudspeaker couples into the microphones, degrading the conversation
quality for the remote listener even further.

State-of-the-art approaches that tackle the above-mentioned problems 
usually rely on multiple microphones to improve the signal-to-noise 
ratio with methods like beamforming. Beamforming combines the digitally 
filtered signals of several microphones to obtain an enhanced speech 
signal at the output. In addition, acoustic echo cancellation is 
employed to attenuate the echo signal more specifically. This is 
achieved by adaptive estimation of a digital model of the acoustic echo 
path and subsequent subtraction of the synthesized echo signal from the 
microphone signal.
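
To illustrate the echo cancellation principle described above, here is a
minimal sketch. It is not the method developed in the dissertation. It
assumes a normalized LMS (NLMS) update as the adaptive estimator, a
hypothetical three-tap echo path, white noise as the loudspeaker signal,
and no near-end speech or background noise. All names and parameter
values are illustrative.

```python
import random


def nlms_echo_canceller(far_end, mic, num_taps=8, step_size=0.5, eps=1e-8):
    """Adaptively model the echo path and subtract the synthesized echo.

    far_end: samples sent to the loudspeaker (reference signal)
    mic:     samples captured by the microphone (contains the echo)
    Returns the echo-reduced signal and the final filter coefficients.
    """
    w = [0.0] * num_taps   # adaptive FIR model of the acoustic echo path
    x = [0.0] * num_taps   # most recent far-end samples, newest first
    out = []
    for far_sample, mic_sample in zip(far_end, mic):
        x = [far_sample] + x[:-1]
        echo_estimate = sum(wi * xi for wi, xi in zip(w, x))
        err = mic_sample - echo_estimate          # subtraction step
        norm = sum(xi * xi for xi in x) + eps     # input power normalization
        w = [wi + step_size * err * xi / norm for wi, xi in zip(w, x)]
        out.append(err)
    return out, w


def simulate(num_samples=4000, seed=0):
    """Create a far-end signal and a microphone signal containing only echo."""
    rng = random.Random(seed)
    echo_path = [0.6, -0.3, 0.1]  # hypothetical echo path, assumption only
    far_end = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    mic = [
        sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
        for n in range(num_samples)
    ]
    return far_end, mic, echo_path


def power(signal):
    """Mean squared value of a signal."""
    return sum(s * s for s in signal) / len(signal)


far_end, mic, echo_path = simulate()
residual, w = nlms_echo_canceller(far_end, mic)
```

In this idealized setting the coefficients `w` converge towards
`echo_path`, and the power of `residual` falls far below that of `mic`.
With near-end speech, noise, or a nonlinear loudspeaker, a linear model
of this kind leaves residual echo behind.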

However, these solutions are usually optimized for one specific device
and are only applicable when the positions of the microphones are fixed
and known to the algorithm. Furthermore, combining multi-microphone
enhancement with echo cancellation is not trivial, and low-complexity
solutions perform poorly when tracking dynamic acoustic scenarios.
Finally, low cost, small form factors, and high desired sound pressure
levels result in loudspeakers that operate at their physical limits.
This adds significant nonlinear components to the sound emitted by the
loudspeaker. Conventional linear acoustic echo cancellation cannot
compensate for these nonlinear parts of the echo, so the conversational
quality remains unsatisfactory.

The dissertation aims to alleviate these shortcomings. The developed
signal processing algorithms should be more flexible with respect to
features desired in real devices. These include microphone positions
that are unknown or change during operation, and the simultaneous use
of beamforming and acoustic echo cancellation. Furthermore, the
developed solutions should handle nonlinear echo paths and have low
computational complexity, so that they are also attractive for
battery-powered devices.

All interested parties are cordially invited; registration is not
required.

-- 
Irina Esser
Institute of Communication Systems (IKS)
RWTH Aachen University
Muffeter Weg 3a, 52074 Aachen, Germany
+49 241 80 26958 (phone)
esser@iks.rwth-aachen.de
http://www.iks.rwth-aachen.de/
