Dear subscribers of the colloquium newsletter,
we are happy to inform you about the next date of our Communication
Technology Colloquium.
*Monday, April 25, 2022*
*Speaker*: Alexander Sobolew
*Time*: 03:00 p.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Master-Lecture*: Investigation of Specialized Recurrent Units for
Acoustic Echo Cancellation
In today's communication, hands-free devices, e.g., for remote
communication, are widely used. Without further measures, these would suffer from an
acoustic echo that arises from the coupling between speaker and
microphone. To minimize these disturbances, acoustic echo cancellation
is indispensable. Model-based adaptive algorithms exist to solve this
issue. However, they require careful tuning of parameters whose optimum
differs between devices and acoustic situations.
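To illustrate the kind of model-based adaptive algorithm meant here, a minimal NLMS echo canceller can be sketched as follows. This is a generic textbook baseline, not the algorithm evaluated in the thesis; the filter length and the step size (the tuning-sensitive parameter the abstract refers to) are illustrative:

```python
import numpy as np

def nlms_echo_canceller(far_end, mic, filter_len=128, step=0.5, eps=1e-8):
    """Classic NLMS adaptive filter: estimates the echo path from the
    far-end (loudspeaker) signal and subtracts the synthesized echo
    from the microphone signal. `step` is the step size whose optimum
    differs between devices and acoustic situations."""
    w = np.zeros(filter_len)          # echo-path estimate
    x_buf = np.zeros(filter_len)      # most recent far-end samples
    out = np.zeros_like(mic)
    for n in range(len(mic)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = far_end[n]
        echo_hat = w @ x_buf          # synthesized echo
        e = mic[n] - echo_hat         # error = echo-cancelled output
        out[n] = e
        w += step * e * x_buf / (x_buf @ x_buf + eps)  # normalized update
    return out
```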
In this thesis, a new data-driven approach for acoustic echo
cancellation is developed and investigated. In contrast to the purely
model-based approach, the algorithm is supposed to learn the optimal
performance from data without the need for manual tuning. In new
situations, the unknown parameters should be estimated. At its core, the
novel structure is similar to a frequency adaptive filter. However, it
is extended by the gating mechanism known from recurrent neural
networks. The development also includes the determination of optimal
training paradigms. When choosing the model structure, attention is paid
to a reasonable training complexity. Major challenges in this thesis
include the investigation of the gating mechanism, which is represented
by a learn gate and a reset gate. The former is used to estimate a
time-varying step size of the iterative algorithm. Gated Recurrent Units
provide an internal memory to accommodate the sequential information in
speech, while skip connections optimize the gradient flow during
training. Independent use of a reset gate to reset the impulse response
estimate when the acoustic situation changes is outperformed by weight
sharing: with shared weights, the learn and reset gates have direct
information about each other's behavior through a shared partial
network. It was also shown that, when using backpropagation through
time, the truncation order can be reduced to a certain extent, which
lowers the training complexity without decreasing performance. The developed
model outperforms the tuned Fast Block Normalized Least-Mean-Square
algorithm in reconvergence speed and steady-state performance in far-end
single talk and double talk. Furthermore, our model repeatedly
outperforms the tuned diagonalized Kalman filter in certain scenarios
and offers significantly improved overall performance in single talk.
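For readers unfamiliar with the gating mechanism referred to above, a single step of a standard Gated Recurrent Unit can be sketched as follows (textbook form with bias terms omitted; the thesis's adapted filter structure differs in its details):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One step of a standard GRU (biases omitted for brevity).
    The update gate z acts like a time-varying step size (the 'learn'
    gate); the reset gate r can discard the previous state when the
    situation changes."""
    z = sigmoid(Wz @ x + Uz @ h)             # learn/update gate
    r = sigmoid(Wr @ x + Ur @ h)             # reset gate
    h_cand = np.tanh(Wh @ x + Uh @ (r * h))  # candidate state
    return (1.0 - z) * h + z * h_cand        # gated interpolation
```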
and
*Monday, April 25, 2022*
*Speaker*: Alexej Sobolew
*Time*: 04:00 p.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Master-Lecture*: Investigation of Generative Neural Networks for Speech
Enhancement
Speech enhancement aims to reduce noise in speech signals and is widely
used in hearing aids and mobile speech communication. Speech synthesis,
on the other hand, aims to generate high-quality human speech and is
used, e.g., in text-to-speech generation. Noise reduction and speech
synthesis can be combined since conventional noise reduction methods
often only improve the magnitude spectrum and keep the noisy phase.
However, the phase has an important influence on speech quality and
intelligibility. In addition, training neural networks with complex
spectrograms is more difficult, so it is reasonable to first denoise the
magnitude spectrum and subsequently synthesize the waveform of the
speech based on it. The applications mentioned often require low
execution times and low computational overhead. This is achievable by
exploiting parallel processors, enabled by the non-autoregressive
property, and by reducing the number of parameters in the neural
network. Hence, this thesis aims to investigate noise reduction, speech
synthesis, and their joint interaction.
In this thesis, the first use case considered is phase reconstruction
and speech synthesis based on clean data. A non-autoregressive
three-stage speech enhancement system is developed for the second use
case of combined noise reduction and speech synthesis based on magnitude
spectra. For speech synthesis on clean magnitude spectra such as
mel-spectrograms, the neural network called WaveGlow from Nvidia is
taken as a basis. WaveGlow achieves subjective performance similar to
the Griffin-Lim algorithm but is better suited for fast applications.
For the reduction of parameters in WaveGlow, the SqueezeWave is used,
resulting in a decrease in the number of parameters and the inference
time by up to 70%. When moving to additional noise reduction alongside
speech synthesis, it is shown that WaveGlow alone is not suitable for
performing both tasks simultaneously. Consequently, the
problem is divided into three stages: masking, inpainting, and
synthesis. The models for masking and inpainting are adapted to
mel-spectrograms and studied in detail for noise reduction. As a
result, they are able to reduce the noise significantly. It is worth
noting that in this thesis the mel filterbank is used as a non-linear
downsampling adapted to human perception, which reduces the number of
computations in the first two models.
Subsequently, the performance of the entire three-stage speech
enhancement system is investigated. It improves the speech quality and
intelligibility of noisy data while exploiting parallel processors and
can compete with existing state-of-the-art methods. The system achieves
better noise reduction than the Convolutional Recurrent Network (CRN)
and additionally does not rely on the noisy phase.
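The perceptually motivated, non-linear downsampling mentioned above can be illustrated with a standard triangular mel filterbank. This is a generic construction under assumed parameters, not the thesis's exact filter design:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular mel filters: a non-linear downsampling of the
    magnitude spectrum that is denser at low frequencies, mirroring
    human perception, and reduces the number of spectral bins."""
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, mid, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, mid):            # rising slope
            fb[i, k] = (k - lo) / max(mid - lo, 1)
        for k in range(mid, hi):            # falling slope
            fb[i, k] = (hi - k) / max(hi - mid, 1)
    return fb
```

Applying `fb @ magnitude_frame` maps a full magnitude spectrum onto far fewer mel bands, which is where the computational savings in the first two models come from.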
All interested parties are cordially invited, registration is not required.
General information on the colloquium, as well as a current list of
dates of the Communication Technology Colloquium can be found at:
http://www.iks.rwth-aachen.de/aktuelles/kolloquium
--
Irina Esser
Institute of Communication Systems (IKS)
RWTH Aachen University
Muffeter Weg 3a, 52074 Aachen, Germany
+49 241 80 26958 (phone)
ronkartz(a)iks.rwth-aachen.de
http://www.iks.rwth-aachen.de/
*Wednesday, March 30, 2022*
*Speaker*: Nora Pöhlau
*Time*: 10:00 a.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Master Lecture*: Performance Evaluation of Sound Field Translation
Methods for Recorded Virtual Reality
The Higher-Order Ambisonics (HOA) format allows directional recording
and playback of sound, making it an attractive tool for spatial audio or
immersive sound applications. Because Higher-Order Ambisonics are
mathematically based on Spherical Harmonics (SHs), they offer full
rotational freedom for the listener (3DoF). However, the sound field can
only be correctly reconstructed in a small area around the original
recording position due to physical constraints. Three algorithms
developed at the Institute of Communication Systems (IKS) make it
possible to allow an additional translational movement of the user, even
beyond the sweet spot. These algorithms deviate from the physically
correct reconstruction in favour of an acoustically plausible playback.
In this thesis, the three algorithms of Space Warping (SW), Adaptive
Space Warping (ASW) and Adaptive Beamforming (ABF) are perceptually
compared by conducting multiple listening tests. ABF and ASW split the
sound signal into a primary and an ambient part and apply the
translation operation only to the primary part. In two web-based
listening tests, it was found that this separation is an acoustically
valid approach: listeners could not distinguish whether the primary
part contained only direct sound or additional early reflections.
In a second step, a listening test in the laboratory was conducted.
Here, the algorithms were compared for different translation distances.
For small distances, ABF showed the best performance of all algorithms.
ABF introduced fluctuating residual noise for higher distances but still
obtained the highest source position ratings. Besides that, a newly
proposed variant of SW has proven to perform surprisingly well and
scored second best in all ratings behind ABF.
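As background on the HOA format discussed above, first-order Ambisonics encoding of a single plane wave can be sketched as follows. ACN channel ordering with SN3D normalization is assumed here; higher orders simply add further spherical-harmonic channels:

```python
import numpy as np

def encode_foa(azimuth, elevation):
    """First-order Ambisonics (ACN/SN3D) encoding gains for a plane
    wave from the given direction (radians). The four channels are the
    real spherical harmonics of orders 0 and 1, which is what enables
    full rotational freedom (3DoF) for the listener."""
    return np.array([
        1.0,                                   # W (order 0, omni)
        np.cos(elevation) * np.sin(azimuth),   # Y
        np.sin(elevation),                     # Z
        np.cos(elevation) * np.cos(azimuth),   # X
    ])
```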
*Tuesday, March 22, 2022*
*Speaker*: Anatolii Skovitin
*Time*: 02:00 p.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Bachelor Lecture*: Speech Synthesis from Distorted Magnitude Spectra
Using Machine Learning Methods
Speech signals are an important means of communication between humans.
In the digital world, speech signals are transmitted via telephones or
the Internet. To this end, they first have to be transformed into the
time-frequency domain. A resulting time-frequency spectrum is composed
of the magnitude spectrum and the phase spectrum. Speech signals are
frequently subject to disturbances that distort the desired signal.
There are methods that can attenuate or remove these distortions.
Often, however, only the magnitude spectrum is considered, and the
phase spectrum is left unchanged due to its comparatively minor
importance. Still, the processed magnitude spectra will normally
approach the ideal magnitude spectra. Other methods provide no phase
spectrum at all, but only an estimate of the magnitude spectrum.
In this thesis, a method is investigated that reconstructs the phase
spectrum of speech signals from estimated or distorted magnitude
spectra. For this purpose, approaches from the field of machine
learning are used. To study the phase reconstruction methods as
independently as possible of the specific types of distortion of a
particular system, an artificial distortion is used. The prepared data
are used to train the neural networks. The best neural network models
are then selected and applied to differently distorted data in order to
determine how well the neural networks cope with different types of
distortion. Finally, the various phase reconstruction methods are
applied and the resulting speech signals are evaluated. In addition, a
comparison with the Griffin-Lim algorithm is carried out.
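The Griffin-Lim baseline used for comparison alternates between enforcing the given magnitude and STFT consistency. A minimal self-contained sketch (window and hop sizes are illustrative, not the thesis's settings):

```python
import numpy as np

def stft(x, win=256, hop=64):
    """Frame-wise windowed FFT; returns (frames, win//2 + 1)."""
    w = np.hanning(win)
    n_frames = 1 + (len(x) - win) // hop
    return np.array([np.fft.rfft(w * x[i * hop:i * hop + win])
                     for i in range(n_frames)])

def istft(S, win=256, hop=64):
    """Overlap-add inverse with window-sum normalization."""
    w = np.hanning(win)
    n = (len(S) - 1) * hop + win
    x, norm = np.zeros(n), np.zeros(n)
    for i, frame in enumerate(S):
        x[i * hop:i * hop + win] += w * np.fft.irfft(frame, win)
        norm[i * hop:i * hop + win] += w * w
    return x / np.maximum(norm, 1e-8)

def griffin_lim(mag, n_iter=30, win=256, hop=64):
    """Griffin-Lim: start from random phase and repeatedly project onto
    the set of consistent spectrograms while keeping the target magnitude."""
    phase = np.exp(2j * np.pi * np.random.default_rng(0).random(mag.shape))
    for _ in range(n_iter):
        x = istft(mag * phase, win, hop)
        phase = np.exp(1j * np.angle(stft(x, win, hop)))
    return istft(mag * phase, win, hop)
```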
--
Irina Esser
Institute of Communication Systems (IKS)
RWTH Aachen University
Muffeter Weg 3a, 52074 Aachen, Germany
+49 241 80 26958 (phone)
esser(a)iks.rwth-aachen.de
http://www.iks.rwth-aachen.de/
Dear subscribers of the colloquium newsletter,
we cordially invite you to a doctoral lecture.
Speaker: Mr. Matthias Schrammen, M. Sc.
Topic: *Front-End Signal Processing for Far-Field Speech Communication*
Time: Friday, March 18, 2022, 10:00 a.m.
Zoom meeting:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
Devices for speech communication operated in hands-free mode offer a
very natural way of human communication. The capturing device, e.g., a
smartphone, smart speaker or tablet, is often located up to several
meters away from the human speaker. Furthermore, detrimental effects
like noise and reverberation are present in everyday acoustic
environments. Therefore, the signal-to-noise ratio at the microphones
mounted on the device is typically too low to offer sufficient speech
quality for the listener at the other end of the communication link. In
addition, the loudspeaker of the device is located much closer to the
microphones than the human speaker. Therefore, a strong echo signal from
the loudspeaker couples into the microphones, degrading the conversation
quality for the remote listener even further.
State-of-the-art approaches that tackle the above-mentioned problems
usually rely on multiple microphones to improve the signal-to-noise
ratio with methods like beamforming. Beamforming combines the digitally
filtered signals of several microphones to obtain an enhanced speech
signal at the output. In addition, acoustic echo cancellation is
employed to attenuate the echo signal more specifically. This is
achieved by adaptive estimation of a digital model of the acoustic echo
path and subsequent subtraction of the synthesized echo signal from the
microphone signal.
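The beamforming idea described above can be sketched in its simplest form, delay-and-sum with integer-sample steering delays. This is a deliberately minimal illustration, not the multi-microphone methods of the dissertation:

```python
import numpy as np

def delay_and_sum(mics, delays):
    """Delay-and-sum beamformer: advance each microphone signal by its
    steering delay (integer samples for simplicity) and average.
    Signals aligned on the target direction add coherently, while sound
    from other directions is attenuated."""
    n = min(len(m) - d for m, d in zip(mics, delays))
    return np.mean([m[d:d + n] for m, d in zip(mics, delays)], axis=0)
```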
However, the solutions are usually optimized for one specific device and
are only applicable when the positions of the microphones are fixed and
known to the algorithm. Furthermore, the combination of multi-microphone
enhancement and echo cancellation is not trivial, and low-complexity
solutions lack performance when tracking dynamic acoustic
scenarios. Finally, low costs, small form factors, and high desired
sound pressure levels result in loudspeakers that operate at their
physical limits. This adds significant nonlinear components to the sound
emitted by the loudspeaker. Therefore, conventional linear acoustic echo
cancellation cannot compensate for the nonlinear parts of the echo and
the conversational quality is not satisfactory.
The task of the dissertation is to alleviate these shortcomings. The
developed signal processing algorithms should be more flexible with
respect to desired features in real devices. Among these are microphone
positions that are unknown or change during operation and the use of
beamforming and acoustic echo cancellation at the same time.
Furthermore, the developed solutions should be able to handle nonlinear
echo paths and should introduce a low computational complexity to be
attractive for battery-powered devices, too.
*Wednesday, March 9, 2022*
*Speaker*: Marcel Kohn
*Time*: 02:00 p.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Master Lecture*: Robust Estimation of the Speaker's Voice in Hearables
and Hearing Aids with Multi-Sensor Systems
Occluding the ear canal with hearing aids or hearables disturbs the
perception of one's own voice. If no countermeasures are taken, one's
own voice is perceived as boomy, which is known as the occlusion
effect. It can be prevented by so-called active occlusion cancellation
(AOC). However, this algorithm usually relies on an acoustic
hear-through, so it is rather disadvantageous in loud environments. In
an alternative approach, the device operates in active noise
cancellation (ANC) mode, so that all ambient sounds are blocked. An
improved perception of one's own voice can then be achieved if an
estimate of the airborne sound of one's own voice is played back by the
device, creating the sensation of an open ear. However, since ANC
devices are usually worn in loud environments, estimating this signal
is challenging. One way to restore natural perception is to reconstruct
the attenuated airborne sound of the voice.
In this thesis, a neural network is integrated into an ANC system to
separate the speech components of the speaker's voice from the ambient
noise. After applying an equalizer that accounts for further acoustic
influences, the denoised speech signal is played back via a headphone
loudspeaker in the ear canal. In contrast to existing speech
enhancement systems, the signal of an additional microphone on the
inside of the headphone is taken into account as side information. In a
series of measurements, the data required for training are recorded
with test subjects. In addition, device-related transfer functions are
measured, which, together with Higher-Order Ambisonics (HOA)
recordings, can be used to enlarge the training data set, resulting in
1736 hours of audio data for 21 test subjects.
An investigation of various convolutional recurrent neural networks
shows in particular that using the inner microphone leads both to the
desired noise reduction and to a degradation of speech quality. Further
changes to the architecture of the investigated network lead to higher
scores on perceptually motivated metrics. In addition, a network
extension based on multi-masking is tested, which can dynamically
reduce the attenuation of interfering signals via a single parameter
while achieving comparably high results.
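The idea of controlling attenuation through a single parameter can be illustrated with a simple spectral-masking step. This is a hypothetical sketch; `max_atten_db` merely stands in for the kind of control parameter described above:

```python
import numpy as np

def apply_limited_mask(spec, mask, max_atten_db=20.0):
    """Apply a real-valued spectral mask with a tunable attenuation
    limit: the mask is floored so interfering signals are never
    attenuated by more than `max_atten_db`, trading noise suppression
    against speech distortion via one parameter."""
    floor = 10.0 ** (-max_atten_db / 20.0)
    return np.maximum(mask, floor) * spec
```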
Speaker: Mr. Stefan Kühl, M. Sc.
Topic: *Adaptive Algorithms for the Identification of Time-Variant
Acoustic Systems*
Time: Monday, March 7, 2022, 11:00 a.m.
Zoom meeting:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
Many digital speech and audio communication systems incorporate models
of acoustic systems during signal processing. Often, the impulse
responses of these acoustic systems have to be identified during or
before using a communication system by means of adaptive algorithms.
Acoustic systems describe how sound is affected during transmission
between a source and a receiver. Examples of acoustic systems comprise
rooms with reflections from boundaries and scattering from objects,
communication devices such as smartphones or smart home devices, or even
a human head where shadowing effects occur. In many situations, the
impulse responses of these systems need to be identified. Possible
scenarios are acoustic measurements or system identification in speech
communication applications, e.g., for acoustic echo cancellation (AEC).
Depending on the specific use case, certain aspects have to be taken
into account. For a measurement the excitation signal can be designed,
whereas for speech communication applications the system's excitation is
the speech signal that cannot be altered. Hence, for the latter case,
correlation has to be considered during the system identification. In
addition, the systems to be identified may vary over time when the
acoustic environment changes, e.g., due to moving objects. Therefore,
adaptive algorithms must be used to track the system's state. Additional
challenges arise when considering the identification of multiple
channels simultaneously.
This thesis considers different aspects of the identification of
time-variant acoustic systems in a joint framework for diverse scenarios.
*Friday, January 21, 2022*
*Speaker*: Ali Yilmaz Yildirim
*Time*: 11:00 a.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Master Lecture*: Dynamic Spectrum Access in Highly Mobile Environments
Dynamic Spectrum Access (DSA) has been introduced to enable the
opportunistic utilization of licensed frequency bands by secondary users
(SUs). DSA was introduced for ground-based users, and algorithms were
developed by considering those users. With the increase in air data
traffic, DSA has started to be considered for airborne communication.
Nevertheless, algorithms developed for ground-based users cannot be
reused for airborne communication. The main reason for that is the high
mobility of aerial vehicles. Therefore, current DSA algorithms should be
examined and adapted for highly mobile environments.
To address this need, this thesis aims to develop algorithms for DSA in
highly mobile environments by focusing on the aspects of DSA that need
to be adapted. Since SUs are considered temporary users, they should
vacate the channel when the primary users (PUs) transmit and find a new
common idle channel to continue their communication. In the highly
varying spectral environment caused by high mobility, communication
between SUs will be interrupted by PU transmissions more frequently
than in stationary environments. SUs should therefore detect PU
transmissions and react by finding a new common idle channel in the
shortest possible time. By considering the need for an
adaptation in this aspect of DSA, this thesis proposes algorithms and a
frame structure to identify the PU presence and to find a new common
channel in highly mobile environments.
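The two building blocks described above, detecting PU activity and agreeing on a new idle channel, can be sketched as follows. Energy detection and the fixed margin are generic illustrations under assumed noise statistics, not the thesis's proposed algorithms or frame structure:

```python
import numpy as np

def sense_channels(rx, noise_power, margin=3.0):
    """Per-channel energy detection: a channel counts as idle when its
    mean received power stays below `margin` times the noise floor.
    `rx` has shape (n_channels, n_samples); margin is illustrative."""
    power = np.mean(np.abs(rx) ** 2, axis=1)   # per-channel power
    return power < margin * noise_power        # True = idle

def next_common_channel(idle_su1, idle_su2):
    """Pick the first channel that both secondary users sensed as idle,
    or None if no common idle channel exists."""
    common = np.flatnonzero(idle_su1 & idle_su2)
    return int(common[0]) if common.size else None
```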
*Wednesday, November 24, 2021*
*Speaker*: Georg Krekel
*Time*: 2:00 p.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Bachelor Lecture*: Comparison of Sound Field Translation Methods
Higher-Order Ambisonics (HOA) signals can be used to allow directional
playback of sound signals, where the listener can rotate their head
around all three rotational axes. HOA signals can also be adapted to
allow translational movement of the listener using sound field
translation, the process of evaluating a sound field at a position
different from the one it was recorded at. At the IKS (Institute of
Communication Systems, RWTH Aachen), multiple algorithms have been
proposed that try to achieve psychoacoustically plausible sound field
translation; however, the inner workings of these algorithms are not
well understood.
This thesis compares different aspects of two algorithms, Adaptive
Beamforming (ABF) and Adaptive Space Warping (ASW), mainly by deriving
analytical expressions that allow a deeper understanding of the
underlying effects. The algorithms were split into two main stages, the
translation stage and the filtering stage. There are significant
differences between ASW and ABF: in the translation stage, ASW is
distorted compared to ABF. However, the differences shrink with the
order of the HOA input signal. The filtering stage of ABF and ASW is
very similar, especially if oracle information about covariance
matrices is not available.
*Tuesday, October 12, 2021*
*Speaker*: Henning Konermann
*Time*: 11:00 a.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Bachelor Lecture*: Model-Based 3D Head Tracking for Use in Adaptive
Crosstalk Cancellation Systems
Adaptive crosstalk cancellation systems require reliable 3D position
and orientation data of a listener in real time. To determine these
data, an additional physical measurement system is usually installed.
Avoiding this setup offers considerable advantages in cost and effort.
By estimating times of flight from the head-related impulse responses
used in adaptive crosstalk cancellation systems, head tracking can be
performed with a geometric approach. The literature shows that
geometrically based 2D and 3D head tracking is possible with additional
physical measurement systems. It has also been shown that, for the 2D
case without an additional measurement system, a model-based approach
is superior to a geometric one.
The goal of this thesis is the design and implementation of a model
that extends the model-based approach to the 3D case. Head-related
transfer functions are used to build an average model as well as
individualized models. Tracking is performed by minimizing an
optimization problem adapted from the literature. Subsequently,
experiments are simulated, and the tracking accuracy of models with
different degrees of personalization is compared with a geometric
approach. In most use cases, the model proves superior. Furthermore, it
is found that the model has to be constrained by orientation limits in
order to obtain meaningful results.
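A geometric baseline of the kind the model is compared against can be sketched as plain multilateration from time-of-flight distances. The loudspeaker layout is hypothetical, and the thesis's model-based optimization is more involved than this least-squares step:

```python
import numpy as np

def multilaterate(speakers, dists):
    """Estimate a 3D position from propagation distances to known
    loudspeaker positions: linearize the range equations
    |p - s_i|^2 = d_i^2 against the first speaker and solve the
    resulting least-squares system."""
    speakers = np.asarray(speakers, float)
    dists = np.asarray(dists, float)
    s0 = speakers[0]
    A = 2.0 * (speakers[1:] - s0)
    b = (np.sum(speakers[1:] ** 2, axis=1) - np.sum(s0 ** 2)
         - dists[1:] ** 2 + dists[0] ** 2)
    pos, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pos
```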
--
Irina Ronkartz
Institute of Communication Systems (IKS)
RWTH Aachen University
Muffeter Weg 3a, 52074 Aachen, Germany
+49 241 80 26958 (phone)
ronkartz(a)iks.rwth-aachen.de
http://www.iks.rwth-aachen.de/
*Friday, October 8, 2021*
*Speaker:* Frederick Pietschmann
*Time:* 10:00 a.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Bachelor Lecture*: Model-Based Algorithms for Retrospective
Individualization of Binaural Signals
With binaural signals, it is possible to capture the sensation of
spatial hearing that humans are used to from real-life auditory scenes.
However, because binaural recordings are always individualized for a
specific target person, the authenticity of a binaural recording is
perceived differently by different listeners. It is therefore desirable
that a binaural recording is always individualized for the respective
listener.
In this thesis, the existing Binaural Cue Adaptation (BCA) system from
the Institute of Communication Systems is adapted and reinterpreted so
that it can retrospectively individualize an existing binaural signal
for a new target person. Existing approaches to the main components of
the BCA system, the Localization block and the Modification block, are
reviewed and analyzed, and multiple new concepts are introduced and
described algebraically: new optimization target functions, a concept
for joint optimization over multiple frequencies, and multiple
approaches that attempt to lessen the impact of artifacts which may
occur due to model mismatches and/or erroneous location estimates.
A theoretical evaluation is conducted in which the localization
performance of both existing and new approaches is benchmarked. The
results confirm that some of the newly proposed ideas and concepts
improve the overall performance of the BCA system.
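One elementary binaural cue a localization stage like the one above can draw on is the interaural time difference. A minimal cross-correlation estimate might look as follows (the sampling rate and shift are illustrative, and this is not the BCA system's actual localization algorithm):

```python
import numpy as np

def estimate_itd(left, right, sr):
    """Estimate the interaural time difference as the lag that
    maximizes the cross-correlation between the two ear signals.
    A positive value means the left channel is delayed relative to
    the right."""
    corr = np.correlate(left, right, mode="full")
    lag = int(np.argmax(corr)) - (len(right) - 1)
    return lag / sr
```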