- Kommunikationstechnik-Kolloquium - lists.rwth-aachen.de

Communication Technology Colloquium at IKS
by Irina Ronkartz 26 Nov '20


Dear subscribers of the colloquium newsletter,

we are happy to inform you about the next date of our Communication Technology Colloquium.

*Thursday, December 3, 2020*
*Speaker*: Timothé Scheich
*Time*: 11:00 a.m.
*Location*: https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting ID: 979 0415 7921
Password: 481650

*Bachelor Lecture*: Methods for the Assessment of Headphones with Active Noise Cancellation

Today, Active Noise Cancelling (ANC) headphones are a product used daily by most people. How good the headphones sound and how well their ANC works with respect to human perception is still an open research topic. Only a few researchers have studied the matter and provided working measurement models. In this thesis, we first lay the groundwork for a comprehensive understanding of existing ANC headphone assessment methods through an extensive review of related work. The thesis also motivates the importance of considering psychoacoustics as an assessment tool by presenting several metrics that describe the pleasantness and annoyance of a sound. Furthermore, two measurement series are conducted to gather ANC headphone performance data using several background noises. The measurement setup is explained, and the applied noise databases are presented. We then propose an assessment model using some of the described metrics. Finally, by comparing on-ear headphones of the same type, we highlight the importance of psychoacoustic parameters that may allow differentiation between similarly performing devices. We conclude by explaining the results and proposing a subjective test to corroborate our findings.

All interested parties are cordially invited; registration is not required.

General information on the colloquium, as well as a current list of the dates of the Communication Technology Colloquium, can be found at: http://www.iks.rwth-aachen.de/aktuelles/kolloquium

--
Irina Ronkartz
Institute of Communication Systems (IKS)
RWTH Aachen University
Muffeter Weg 3a, 52074 Aachen, Germany
+49 241 80 26958 (phone)
ronkartz(a)iks.rwth-aachen.de
http://www.iks.rwth-aachen.de/


Communication Technology Colloquium at IKS
by Irina Ronkartz 17 Nov '20


Dear subscribers of the colloquium newsletter,

we are happy to inform you about the next date of our Communication Technology Colloquium.

*Tuesday, November 24, 2020*
*Speaker*: Fabian Malig
*Time*: 10:00 a.m.
*Location*: https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting ID: 979 0415 7921
Password: 481650

*Bachelor Lecture*: Robust Design of Adaptive ANC Systems Considering Self-Induced Sound

In addition to passive attenuation, Active Noise Control (ANC) offers a practical way to reduce the power of disturbing noise. This technology is becoming increasingly popular, especially in modern headphones. Adaptive filters promise better ANC than time-invariant filters since they adjust to a changing environment. However, they are highly sensitive to noises caused by the users themselves, such as speech or impact sounds. These noises are called self-induced sounds (SIS) in the following. This thesis covers the Kalman filter as the adaptive filter in a feedforward ANC system. It outperforms other adaptive algorithms under different disturbances of the system, such as self-induced sounds. The goal of this thesis is the development of noise estimators and methods that reduce the disturbing influence of self-induced sounds on the adaptation.

All interested parties are cordially invited; registration is not required.

General information on the colloquium, as well as a current list of the dates of the Communication Technology Colloquium, can be found at: http://www.iks.rwth-aachen.de/aktuelles/kolloquium


Communication Technology Colloquium at IKS
by Irina Ronkartz 13 Nov '20


Dear subscribers of the colloquium newsletter,

we are happy to inform you about the next date of our Communication Technology Colloquium.

*Friday, November 20, 2020*
*Speaker*: Christian Schlaiß
*Time*: 2:00 p.m.
*Location*: https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting ID: 979 0415 7921
Password: 481650

*Master Lecture*: Cancelling of Non-Stationary Disturbances in Active Headphones

The occlusion effect is a consequence of sealing the ear canals with an object such as a hearing aid or, more generally, a hearable. It results in an amplification of lower frequencies and an attenuation of higher frequencies, which in turn leads to an unwanted distortion of the own-voice perception. To compensate for this phenomenon, active and passive solutions have been presented in the literature. Passive solutions include venting, where a ventilation hole is integrated into the hearing aid. However, this leads to unwanted feedback. An active approach has been presented by Liebich et al., in which a time-variant robust feedback controller compensates the low-frequency amplification. This solution is limited to attenuating the occlusion effect on the own voice. Other body-conducted (BC) sounds, such as footsteps, chewing and swallowing, are not explicitly considered by Liebich et al.

In this thesis, body-conducted sounds beyond the own-voice components are investigated and tackled. Measurements were conducted to identify the spectral distribution of different BC sounds due to the occlusion effect. Afterwards, a controller was designed to attenuate disturbances in the region of 40 Hz to 80 Hz. This is followed by the introduction of a stable controller interpolation scheme for switching between controllers. With this so-called Youla-Kucera interpolation method, not only stability but also performance is guaranteed during switching for a nominal path. Additionally, robust stability of this interpolation scheme can be proven for discrete steps of delta and additional constraints on the choice of Q. This is supplemented by a simple, low-complexity approach for detecting footsteps, based on a low-pass filter and recursive smoothing over time.

Lastly, the controller was implemented in real time on a dSPACE ultra-low-latency processing system to validate the performance of the footsteps controller and the switching performance. While the controller worked in theory, real-time measurements revealed that additional tuning and possibly additional sensors are needed for a better footsteps controller. However, the switching scheme showed promising results, fading from one controller to the other in a stable and smooth fashion. It also revealed the possibility of real-time controller tuning with the help of the interpolation coefficient. As all combinations of the two controllers (with the interpolation coefficient 2) remain robustly stable, switching to different configurations is made possible.

All interested parties are cordially invited; registration is not required.

General information on the colloquium, as well as a current list of the dates of the Communication Technology Colloquium, can be found at: http://www.iks.rwth-aachen.de/aktuelles/kolloquium
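
Editorial illustration for the Schlaiß abstract: the footstep detector is described only as "a low-pass filter and recursive smoothing over time". The sketch below shows one way such a detector could look. It is not the thesis implementation; the function name `detect_footsteps`, the one-pole filter structure, the 80 Hz cutoff, the 20 ms smoothing constant and the threshold are all assumptions chosen for illustration.

```python
import math


def detect_footsteps(x, fs, cutoff_hz=80.0, smooth_s=0.02, threshold=0.01):
    """Flag samples whose smoothed low-frequency power exceeds a threshold.

    All parameter values are hypothetical defaults, not taken from the thesis.
    """
    # One-pole low-pass coefficient for the chosen cutoff frequency.
    a_lp = math.exp(-2.0 * math.pi * cutoff_hz / fs)
    # Coefficient of the first-order recursive power smoother.
    a_sm = math.exp(-1.0 / (smooth_s * fs))

    lp = 0.0      # low-pass filter state
    power = 0.0   # recursively smoothed power of the low-passed signal
    flags = []
    for sample in x:
        lp = a_lp * lp + (1.0 - a_lp) * sample
        power = a_sm * power + (1.0 - a_sm) * lp * lp
        flags.append(power > threshold)
    return flags
```

With these assumed settings, a 50 Hz burst is flagged while a 2 kHz tone of the same amplitude is not, because the low-pass stage removes most of its energy before the power is smoothed.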


Communication Technology Colloquium at IKS
by Irina Ronkartz 11 Nov '20


Dear subscribers of the colloquium newsletter,

we are happy to inform you about the next dates of our Communication Technology Colloquium.

*Wednesday, November 18, 2020*
*Speaker*: Leonie Geyer
*Time*: 2:00 p.m.
*Location*: https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting ID: 979 0415 7921
Password: 481650

*Master Lecture*: Sound Field Conversion Using Machine Learning Methods

The reproduction of realistic sound fields is necessary for the efficient evaluation of modern communication devices. Depending on the application, different microphone arrangements are used to record the sound fields. Not all desired signals are directly available in the required microphone configuration. The goal of sound field conversion is to convert any signal between different recording systems in an identical sound field.

This thesis compares different approaches to converting sound fields. The conversion between the signals of two concrete measurement systems, a binaural artificial head and a microphone array with eight channels, is investigated. Three new approaches that use artificial neural networks are developed. The first is a convolutional approach with a simple end-to-end structure. The second is a time filter approach, in which the network outputs an FIR filter that is applied to the input signal. The third is a variant of WaveNet that is divided into two subnetworks. One analyses a section of the time signal and outputs a feature vector, which is available to the conversion network when converting the time signal.

The neural networks are trained using data from real and defined acoustic environments. The performance is measured by metrics in the time and frequency domains. As a comparison, sound field conversion by equalising and by measuring with the target recording device is performed. Different experiments are conducted on the structure and parametrisation, as well as the training process, of the neural networks. Their performance is observed and optimised. The WaveNet variant achieves the best results among the neural approaches in all experiments. Training with a loss function that includes the mean square error and frequency metrics can reduce the error in the frequency domain, although a higher error in the time domain is observed compared to the pure mean square error as loss function.

and

*Wednesday, November 18, 2020*
*Speaker*: Shahd Al Hares
*Time*: 3:00 p.m.
*Location*: https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting ID: 979 0415 7921
Password: 481650

*Master Lecture*: Signal-Adaptive Approaches to Sound Field Translation

With the growing demand for spatial audio, Ambisonics has begun to attract more attention in both the recording and the reproduction of sound fields. Thus, the demand has increased for sound field translation that allows the user to move freely in different directions in the acoustic scene. In Higher Order Ambisonics (HOA), the sound field incidence is described at the reference point by a set of mathematical functions known as Spherical Harmonics (SH). However, the reproduction is restricted to the area surrounding the reference point due to the underlying recording hardware, which can be observed in the Ambisonics domain as a bandwidth limitation.

In this thesis, Ambisonics sound field translation is investigated. Recently proposed approaches allow the listener to move a few centimeters away from the reference point (3DoF+). Another approach provides further translation of the sound field (6DoF) but requires multiple Spherical Microphone Arrays (SMA) to be used during the recording process. In this thesis, an enhanced method for sound field translation is proposed. It is based on upscaling the HOA signal to a higher SH order using Compressed Sensing (CS). CS is a framework that is used to recover a signal from an under-determined linear system.

Different aspects of HOA upscaling and sound field translation are studied theoretically and practically. Noise reduction with CS for HOA signals is discussed, and the influence of source distance on the translated signal is investigated in the Near-Field-Compensated Higher Order Ambisonics (NFC-HOA) domain. A systematic comparison between multiple translation approaches, namely plane wave translation and space warping, is performed based on two Monte Carlo experiments. Moreover, a new formula is derived that defines the limits of the upscaling SH order as a function of the normalized translation distance and the initial SH order. Finally, a method is introduced for sound field translation of realistic signals based on upscaling in the frequency domain. The method is evaluated in the frequency domain for multiple Bark bands.

All interested parties are cordially invited; registration is not required.

General information on the colloquium, as well as a current list of the dates of the Communication Technology Colloquium, can be found at: http://www.iks.rwth-aachen.de/aktuelles/kolloquium
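
Editorial illustration for the Geyer abstract: it mentions training with a loss that "includes the mean square error and frequency metrics" without naming the metrics. The sketch below shows one plausible form of such a combined loss. The function names, the weight `freq_weight` and the choice of a squared error on DFT magnitudes as the frequency metric are all assumptions, not the loss used in the thesis.

```python
import cmath


def magnitude_spectrum(x):
    """Normalized one-sided DFT magnitudes (plain O(N^2) DFT, for short frames)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))) / n
        for k in range(n // 2 + 1)
    ]


def combined_loss(pred, target, freq_weight=0.5):
    """Time-domain MSE plus a weighted error between magnitude spectra.

    freq_weight is a hypothetical tuning parameter.
    """
    mse = sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)
    mag_pred = magnitude_spectrum(pred)
    mag_target = magnitude_spectrum(target)
    freq_err = sum((a - b) ** 2 for a, b in zip(mag_pred, mag_target)) / len(mag_target)
    return mse + freq_weight * freq_err
```

A sign-flipped signal has a large time-domain error but an identical magnitude spectrum, so its frequency term is zero. The two terms therefore measure different things, which is consistent with the trade-off the abstract reports between time-domain and frequency-domain error.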


Communication Technology Colloquium at IKS
by Irina Ronkartz 09 Nov '20


Dear subscribers of the colloquium newsletter,

we are happy to inform you about the next date of our Communication Technology Colloquium.

*Monday, November 16, 2020*
*Speaker*: Lorenz Schmidt
*Time*: 11:00 a.m.
*Location*: https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting ID: 979 0415 7921
Password: 481650

*Master Lecture*: Noise Reduction Combining Conventional Approaches and Artificial Neural Networks

The suppression of noise for single-channel speech enhancement is one of the most prominent challenges in signal processing and has been addressed for decades. In recent years, the popularization of machine learning algorithms and advances in deep neural network (DNN) architectures have opened new perspectives and approaches in this field, yielding impressive results. Many of these algorithms, however, require a computational effort that exceeds the available resources of a real-time application. One approach, called RNNoise, combines the methods of classical signal processing with DNNs. Within a common noise reduction architecture, a small weighting mask is sufficient to achieve impressive results. The mask is estimated by a very small neural network with low computational complexity.

In this thesis, RNNoise is subjected to several modifications that are intended to improve its denoising performance while maintaining its affordable complexity. In a first step, the gated recurrent units (GRUs) of the RNNoise architecture are replaced by simple recurrent units (SRUs), which improve its performance while speeding up the training process. The DNN is expanded to estimate the pitch frequency, which is used in the reconstruction of the harmonics with a comb filter. A new binary IIR comb filter is developed and added to the signal processing of RNNoise. Besides the modifications of RNNoise itself, a pitch estimator, based on ordinary regression, and a mutual information metric are developed.

The evaluation shows good performance for pitch estimation and voice activity detection (VAD). A preliminary study analyzes the upper limits that can be achieved by the reduced spectral weighting mask. With Bark scaling, 22 gains are a reasonable tradeoff between performance and complexity. Then, a theoretical evaluation shows that the new network architecture improves the estimation considerably, especially in non-stationary noise situations. A final evaluation compares a classical noise suppression method, an end-to-end neural network approach, classical RNNoise and the improved model by means of their cepstral distances, speech-to-noise enhancement and perceptual measures. The results show that the modifications give the new architecture an edge over classical RNNoise. On the other hand, the developed binary IIR comb filter falls short of expectations and does not improve noise suppression performance.

All interested parties are cordially invited; registration is not required.

General information on the colloquium, as well as a current list of the dates of the Communication Technology Colloquium, can be found at: http://www.iks.rwth-aachen.de/aktuelles/kolloquium
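
Editorial illustration for the Schmidt abstract: the estimated pitch is said to drive a comb filter that reconstructs the harmonics. The sketch below shows the basic idea with a plain feedforward (FIR) comb filter. It is not the binary IIR comb filter developed in the thesis, and the function name `comb_filter` and its `gain` parameter are assumptions for illustration.

```python
import math


def comb_filter(x, period, gain=1.0):
    """Feedforward comb: y[n] = (x[n] + gain * x[n - period]) / (1 + gain).

    `period` is the pitch period in samples (sampling rate / pitch frequency).
    Components at multiples of the pitch frequency pass unchanged, while
    components halfway between two harmonics are attenuated.
    """
    y = []
    for n, sample in enumerate(x):
        delayed = x[n - period] if n >= period else 0.0
        y.append((sample + gain * delayed) / (1.0 + gain))
    return y
```

For example, with a sampling rate of 8 kHz and a pitch period of 80 samples (100 Hz), a 100 Hz tone passes unchanged once the delay line is filled, while a 150 Hz tone is cancelled when `gain` is 1.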


Communication Technology Colloquium at IKS
by Irina Ronkartz 03 Nov '20


Dear subscribers of the colloquium newsletter,

we are happy to inform you about the next date of our Communication Technology Colloquium.

*Tuesday, November 10, 2020*
*Speaker*: Tom Deckenbrunnen
*Time*: 10:00 a.m.
*Location*: https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting ID: 979 0415 7921
Password: 481650

*Bachelor Lecture*: Significance-Aware Filtering for Nonlinear Acoustic Echo Cancellation

The acoustic echoes arising in hands-free applications for mobile speech communication contain considerable nonlinear distortions of the far-end signal. Causes of these nonlinearities include the progressive miniaturization of components and the need for high playback volume, two opposing paradigms. It is therefore necessary to use methods of nonlinear acoustic echo cancellation to provide sufficient quality of communication. However, the limited resources in mobile devices render many of these methods impractical because of their high computational complexity. One approach to alleviating the high computational demand is the so-called Significance-Aware filtering. Specifically, the cascaded models based on a parallel Significance-Aware decomposition of the system identification task are treated in this thesis.

The aim of this thesis is the evaluation and enhancement of those structures. In particular, methods for incorporating nonlinear memory into the parallel Significance-Aware models are proposed. Beyond that, an analysis of the necessary adaptation control for these structures is offered. Results show that better performance can be achieved by adding nonlinear memory to the parallel Significance-Aware models. It is furthermore shown that the proposed enhancements solve a specific problem of adaptation control. Nevertheless, adaptation control remains a major point of interest for the parallel Significance-Aware models.

All interested parties are cordially invited; registration is not required.

General information on the colloquium, as well as a current list of the dates of the Communication Technology Colloquium, can be found at: http://www.iks.rwth-aachen.de/aktuelles/kolloquium


Communication Technology Colloquium at IKS
by Irina Ronkartz 14 Oct '20


Dear subscribers of the colloquium newsletter,

we are happy to inform you about the next dates of our Communication Technology Colloquium.

*Thursday, October 22, 2020*
*Speaker*: Luis Maßny
*Time*: 10:00 a.m.
*Location*: https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting ID: 979 0415 7921
Password: 481650

*Master Lecture*: Analysis and Optimization of Multi-Kernel Polar Codes

Polar codes are a class of linear block codes that exploit the channel polarization effect in order to provide good error correction capabilities at low complexity. The channel polarization is based on a recursive channel transformation by a so-called polarization kernel. As a generalization of conventional polar codes, multi-kernel polar codes have been proposed, which allow polarization kernels of different sizes to be combined within a single code. Accordingly, these codes provide many degrees of freedom, which makes it important to design such a code properly in order to optimize its performance.

Two major aspects of multi-kernel polar code design are analyzed in this thesis. Firstly, the design of good polarization kernels for practical codeword lengths is studied. Secondly, the effect of applying a chosen set of polarization kernels in different orders is examined. In order to construct polarization kernels systematically, a recursive construction rule is developed that describes each kernel as a concatenation of smaller kernels. The performance analysis is based on the computation of the Z parameters of the polarized information bit channels for a binary erasure channel. These yield an upper bound on the block error rate. The observations are verified by error rate simulations over an additive white Gaussian noise channel. The results show that the kernel design depends on the code rate. In particular, it is demonstrated that in some cases an asymptotically suboptimal kernel is the best choice. The optimization of the kernel order in general turns out to be complex and to depend on a variety of code parameters.

and

*Thursday, October 22, 2020*
*Speaker*: Egke Chatzimoustafa
*Time*: 11:00 a.m.
*Location*: https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting ID: 979 0415 7921
Password: 481650

*Bachelor Lecture*: Playback Methods for Multichannel Immersive Binaural Sound

Reproducing spatial audio recordings with headphones allows listeners to perceive sound signals in 3D, where the listeners evaluate the difference between the recorded signals at both ears to localize sound sources. For a more immersive reproduction, sound sources should appear fixed in space in the case of head rotations. So far, several immersive reproduction methods such as Motion Tracked Binaural Sound (MTB) and Binaural Cue Adaptation (BCA) have been proposed, where BCA works with two microphones while MTB requires a larger number of microphones.

The goal of this bachelor's thesis is to extend the BCA algorithm to more than two microphones and to evaluate this new multichannel system in terms of source localization and binaural cue modification. The thesis shows how additional microphones in the multichannel system can be used to improve the sound source localization, reducing the estimation error even for low Signal-to-Noise Ratio (SNR) values. Furthermore, several experiments show that the additional microphones can resolve the front/back confusion and can also discriminate sources at the top and bottom with respect to the head model.

It is further shown how additional microphones can be employed in the cue modification algorithm. Several experiments confirm that additional microphones can increase the quality by modifying coherent components and reducing the incoherent power error and the coherent-to-incoherent power ratio error. As the last extension, an adaptive reference channel selection algorithm is introduced that parameterizes the cue modification based on the optimal reference channel. For larger head movements of the listener, this extension can further improve the quality and even eliminate modification errors for certain head orientations.

All interested parties are cordially invited; registration is not required.

General information on the colloquium, as well as a current list of the dates of the Communication Technology Colloquium, can be found at: http://www.iks.rwth-aachen.de/aktuelles/kolloquium
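
Editorial illustration for the Maßny abstract: the performance analysis there relies on Z parameters of the polarized bit channels for a binary erasure channel (BEC) and the resulting upper bound on the block error rate. The sketch below shows that computation for the conventional 2x2 polarization kernel only, not for the multi-kernel constructions studied in the thesis. The function names are assumptions, and the order of the returned list follows the recursion used here rather than any particular bit-index convention.

```python
def bec_z_parameters(n_levels, erasure_prob):
    """Z parameters of the 2**n_levels bit channels for a BEC.

    For the 2x2 kernel on a BEC the recursion is exact:
    the degraded channel has Z- = 2*Z - Z**2, the upgraded one Z+ = Z**2.
    """
    z = [erasure_prob]
    for _ in range(n_levels):
        z = [v for zi in z for v in (2 * zi - zi * zi, zi * zi)]
    return z


def block_error_bound(z, k):
    """Upper bound on the block error rate under successive cancellation
    decoding when the k most reliable bit channels carry information."""
    most_reliable = sorted(z)[:k]
    return min(1.0, sum(most_reliable))
```

For example, `bec_z_parameters(1, 0.5)` returns `[0.75, 0.25]`: one channel becomes worse and one better, while the sum of the Z parameters stays at twice the original erasure probability.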


Kommunikationstechnisches Kolloquium am IKS | Communication Technology Colloquium at IKS
by Irina Ronkartz 22 Sep '20


Dear subscribers of the colloquium newsletter,

we are happy to inform you about the next date of our Communication Technology Colloquium.

*Monday, September 28, 2020*
*Speaker*: Zach Lee
*Time*: 11:00 a.m.
*Location*: https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting ID: 979 0415 7921
Password: 481650

*Bachelor Lecture*: Time-Varying Simulation of Adaptive Crosstalk Cancellation Systems based on Acoustic Measurements

In order to reproduce binaural audio via loudspeakers, a crosstalk cancellation (CTC) system is necessary to attenuate the crosstalk. The task becomes more challenging when the listener is allowed to move, as a CTC system is often robust only in a small and limited area. A recent idea, the adaptive crosstalk cancellation (ACTC) system, has been proposed. The CTC filter in this system is updated in real time by placing microphones close to the entrance of the ear canal to measure the signals perceived by the listener and to estimate the corresponding head-related impulse response (HRIR).

The goal of this thesis is to evaluate the ACTC system by performing an acoustic measurement in an anechoic chamber to obtain the real and accurate HRIR of the test listener. With this knowledge, the performance of the ACTC system can be evaluated in a simulation. Furthermore, the limit of the ACTC system can be tested in terms of how fast it can adapt to changes. The results show that the ACTC system performs fairly well when the listener's ears are on the ipsilateral side of the loudspeakers, and that the performance falls drastically on the contralateral side. At lower frequencies, the ACTC system works quite well up to a movement (head rotation) speed of 20 deg/s.

All interested parties are cordially invited; registration is not required.

General information on the colloquium, as well as a current list of the dates of the Communication Technology Colloquium, can be found at: http://www.iks.rwth-aachen.de/aktuelles/kolloquium
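
Editorial illustration for the Lee abstract: a CTC filter compensates the acoustic paths between two loudspeakers and the two ears. At a single frequency bin this can be sketched as inverting the 2x2 matrix of acoustic transfer functions. The code below is a textbook-style sketch, not the ACTC system evaluated in the thesis. The function names and the convention that `h_lr` denotes the path from the right loudspeaker to the left ear are assumptions, and a practical design would typically add regularization.

```python
def ctc_filters(h_ll, h_lr, h_rl, h_rr):
    """Invert the 2x2 acoustic transfer matrix at one frequency bin.

    h_xy is the (complex) transfer function from loudspeaker y to ear x.
    Returns the four entries of the inverse matrix in the same order.
    """
    det = h_ll * h_rr - h_lr * h_rl
    if abs(det) < 1e-12:
        raise ValueError("transfer matrix is (nearly) singular at this bin")
    return (h_rr / det, -h_lr / det, -h_rl / det, h_ll / det)


def ear_signals(h, c, b_left, b_right):
    """Signals arriving at the ears when the binaural input (b_left, b_right)
    is passed through the CTC filters c and then the acoustic paths h."""
    h_ll, h_lr, h_rl, h_rr = h
    c_ll, c_lr, c_rl, c_rr = c
    # Loudspeaker signals after the CTC filters.
    s_left = c_ll * b_left + c_lr * b_right
    s_right = c_rl * b_left + c_rr * b_right
    # Ear signals after the acoustic paths, including the crosstalk paths.
    return (h_ll * s_left + h_lr * s_right, h_rl * s_left + h_rr * s_right)
```

With exact knowledge of the paths, each ear receives only its own binaural channel. When the listener moves, the true paths change and the filters no longer match, which is the mismatch that the adaptive update described in the abstract is meant to track.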


Kommunikationstechnisches Kolloquium am IKS | Communication Technology Colloquium at IKS
by Irina Ronkartz 18 Sep '20


--- Englich version below --- Sehr geehrte Abonnenten des Kolloquium-Newsletters, gerne informieren wir Sie über den nächsten Termin unseres Kommunikationstechnischen Kolloquiums. *Freitag, 25. September 2020* *Vortr**agende*: Liuhui Deng *Zeit*: 11:00 Uhr *Ort*: https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09 Meeting-ID: 979 0415 7921 Passwort: 481650 *Master-Vortrag*: Speech Inpainting Using Image Processing Techniques Speech inpainting is a task that reconstructs speech from damaged speech signals, wherein corruption can result from improper storage, packet loss in communication networks and etc. Neural networks are becoming an active research hot-spot in the field of audio inpainting in recent years, including speech inpainting, music inpainting and etc. The networks can either be fed waveforms of audio or other feature representations such as Short-Time Frequency Transform (STFT), Mel Frequency Cepstral Coefficients (MFCC) and etc. in order to reconstruct audio. In this thesis, advanced Convolutional Neural Networks (CNNs) based architectures in image inpainting are adopted to the task speech inpainting. The motivation lie in the facts that the neural techniques in image inpainting are well investigated and turn out to be powerful, besides, the task speech inpainting can be interpreted as image inpainting when speech spectrogram is treated as 2-dimensional image. The involving networks are mainly context encoder, context encoder with Generative Adversarial Networks (GANs), EdgeConnect (w / o GANs) and EdgeConnect (with GANs). In this work, context encoder is an encoder decoder architecture and takes as input STFT magnitudes (and ground truth corruption mask) while EdgeConnect is fed additionally edge map of spectrogram in order to alleviate the blurriness issue observed in image inpainting. EdgeConnect (w / o GANs) is composed of two sub-models, both of which are a context encoder. 
One sub-model is referred to as edge completion model which reconstructs edge map from corrupted edge map and the other is inpainting model which reconstructs spectrogram based on correupted spectrogram and edge map. GANs applied in the models of interest are also intended to mitigate the blurriness by adding adversarial loss from GANs to the loss function of context encoder, edge completion model and inpainting model. Experiments indicate that context encoder (w/ or w/o GANs) outperforms the CNNs which are simply stacking a few convolutional layers. EdgeConnect (w/ or w/o GANs) achieves even better performance than context encoder (w/ or w/o GANs) mainly thanks to additional informative edge map of spectrogram. The best model among them is EdgeConnect (with GANs), its reconstructed speeches achieve 3,03 in terms of PESQ score, 71,2% improvement compared to input corrupted speech. Besides, analyses of edge map quality in EdgeConnect (w/ or w/o GANs) reveal that edge map of low quality heavily degrades the inpainting performance, thus a well performing edge completion model is of great importance and is a promising direction to put more effort into in the future. Alle Interessierten sind herzlich eingeladen, eine Anmeldung ist nicht erforderlich. Allgemeine Informationen zum Kolloquium sowie eine aktuelle Liste der Termine des Kommunikationstechnischen Kolloquiums finden Sie unter: http://www.iks.rwth-aachen.de/aktuelles/kolloquium/ Dear subscribers of the colloquium newsletter, we are happy to inform you about the next date of our communication technology colloquium. *Friday, September 25, 2020* *Speaker*: Liuhui Deng *Time*: 11:00 a.m. 
*Location*: https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09 Meeting-ID: 979 0415 7921 Passwort: 481650 *Master Lecture*: Speech Inpainting Using Image Processing Techniques Speech inpainting is a task that reconstructs speech from damaged speech signals, wherein corruption can result from improper storage, packet loss in communication networks and etc. Neural networks are becoming an active research hot-spot in the field of audio inpainting in recent years, including speech inpainting, music inpainting and etc. The networks can either be fed waveforms of audio or other feature representations such as Short-Time Frequency Transform (STFT), Mel Frequency Cepstral Coefficients (MFCC) and etc. in order to reconstruct audio. In this thesis, advanced Convolutional Neural Networks (CNNs) based architectures in image inpainting are adopted to the task speech inpainting. The motivation lie in the facts that the neural techniques in image inpainting are well investigated and turn out to be powerful, besides, the task speech inpainting can be interpreted as image inpainting when speech spectrogram is treated as 2-dimensional image. The involving networks are mainly context encoder, context encoder with Generative Adversarial Networks (GANs), EdgeConnect (w / o GANs) and EdgeConnect (with GANs). In this work, context encoder is an encoder decoder architecture and takes as input STFT magnitudes (and ground truth corruption mask) while EdgeConnect is fed additionally edge map of spectrogram in order to alleviate the blurriness issue observed in image inpainting. EdgeConnect (w / o GANs) is composed of two sub-models, both of which are a context encoder. One sub-model is referred to as edge completion model which reconstructs edge map from corrupted edge map and the other is inpainting model which reconstructs spectrogram based on correupted spectrogram and edge map. 
The GANs applied in the context encoder and EdgeConnect models are also intended to mitigate blurriness by adding an adversarial loss to the loss functions of the context encoder, the edge completion model and the inpainting model. Experiments indicate that the context encoder (with or without GANs) outperforms CNNs that simply stack a few convolutional layers. EdgeConnect (with or without GANs) performs even better than the context encoder (with or without GANs), mainly thanks to the additional informative edge map of the spectrogram. The best model is EdgeConnect (with GANs): its reconstructed speech achieves a PESQ score of 3.03, a 71.2% improvement over the corrupted input speech. Moreover, analyses of edge map quality in EdgeConnect (with or without GANs) reveal that a low-quality edge map heavily degrades inpainting performance; a well-performing edge completion model is therefore of great importance and a promising direction for future work.

All interested parties are cordially invited; registration is not required.

General information on the colloquium, as well as a current list of the dates of the Communication Technology Colloquium, can be found at: http://www.iks.rwth-aachen.de/aktuelles/kolloquium

--
Irina Ronkartz
Institute of Communication Systems (IKS)
RWTH Aachen University
Muffeter Weg 3a, 52074 Aachen, Germany
+49 241 80 26958 (phone)
ronkartz(a)iks.rwth-aachen.de
http://www.iks.rwth-aachen.de/


Communication Technology Colloquium at IKS
by Irina Ronkartz 25 Aug '20


Dear subscribers of the colloquium newsletter,

we are happy to inform you about the next date of our Communication Technology Colloquium.

*Monday, August 31, 2020*
*Speaker*: Sumedh J. Dongare
*Time*: 2:00 p.m.
*Location*: https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting ID: 979 0415 7921
Password: 481650

*Master Lecture*: Optimized Compression Functions for Reduced Complexity Information Bottleneck Detection and Decoding

Detection and decoding at the receiver side are of crucial importance, and the optimum signal processing algorithms often result in high implementation complexity. Therefore, sub-optimal algorithms with close-to-optimum performance are needed in practice. The information bottleneck method is a novel, low-complexity method for detection and decoding that maximizes the mutual information. The main idea of this signal processing method is to design mutual-information-preserving mappings that replace the traditional signal processing operations in order to reduce complexity. These mappings are typically implemented as look-up tables. For instance, the literature successfully applies this method to the decoding of binary low-density parity-check codes. Low-density parity-check codes have been gaining more and more attention since the discovery of their non-binary generalization, which has better error correction capabilities than its binary equivalent. Owing to advances in the computational capabilities of devices, research in this field is currently a hot topic.

It turns out that the decoding of non-binary low-density parity-check codes does not allow a straightforward application of mutual-information-maximizing signal processing. The main problems are that the decoding requires systems with many input variables and that the symbols from higher-order fields can take more than two values. As a result, the look-up table based approach, which works well for binary codes, here results in look-up tables of prohibitive size.
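The two ideas above, a look-up table that compresses an observation while preserving mutual information and the growth of the table with the number and alphabet size of its inputs, can be made concrete with a small sketch. It is only an illustration under stated assumptions: the toy channel, the hand-picked table `lut`, the helper names and the alphabet sizes in the size calculation are hypothetical and not taken from the thesis.

```python
import numpy as np

def mutual_information(p_xy):
    """I(X;Y) in bits for a joint pmf given as a 2-D array p_xy[x, y]."""
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    nz = p_xy > 0  # skip zero-probability entries to avoid log(0)
    return float(np.sum(p_xy[nz] * np.log2(p_xy[nz] / (p_x @ p_y)[nz])))

def compress(p_xy, lut, n_clusters):
    """Joint pmf p(x, t) after the deterministic mapping t = lut[y]."""
    p_xt = np.zeros((p_xy.shape[0], n_clusters))
    for y, t in enumerate(lut):
        p_xt[:, t] += p_xy[:, y]
    return p_xt

def lut_entries(alphabet_size, n_inputs):
    """Number of entries in a table indexed by n_inputs symbols."""
    return alphabet_size ** n_inputs

# Toy channel: binary X, four-level observation Y (assumed values)
p_y_given_x = np.array([[0.4, 0.3, 0.2, 0.1],
                        [0.1, 0.2, 0.3, 0.4]])
p_xy = 0.5 * p_y_given_x  # uniform prior on X

lut = [0, 0, 1, 1]  # hand-picked mapping from 4 levels to 2 clusters
p_xt = compress(p_xy, lut, n_clusters=2)

mi_before = mutual_information(p_xy)
mi_after = mutual_information(p_xt)
print(f"I(X;Y) = {mi_before:.4f} bit, I(X;T) = {mi_after:.4f} bit")

# Table size grows exponentially with the number of inputs
print(lut_entries(4, 2))   # two inputs with 4 values each
print(lut_entries(16, 6))  # six inputs with 16 values each
```

By the data processing inequality the compressed variable can never carry more information about X than the original observation, so the design goal is a mapping that loses as little as possible. The size calculation shows why a plain table stops being practical once many inputs with large alphabets are involved.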
The prohibitive table size motivates me to explore and investigate compression functions that can maximize the relevant mutual information but can be characterized using far fewer parameters than look-up tables. Such functions are designed in this thesis with a novel approach that relies on genetic algorithms, which are inspired by the natural evolution of species. The novel approach makes it possible to construct and analyze systems that cannot be designed with the look-up table based approach. This thesis compares the resulting systems to other state-of-the-art signal processing systems in terms of symbol error rate performance and the ability to preserve relevant mutual information. The reference system typically considered in this thesis is the soft symbol demodulator, which has to be applied when non-binary low-density parity-check codes are to be used with binary modulation. Demodulators designed using the novel approach are compared with the look-up table based approach and traditional soft symbol demodulators. The thesis develops a powerful class of parametrizable mappings that can be optimized using genetic algorithms. Most importantly, the novel approach achieves performance close to that of a soft symbol demodulator in many of the investigated scenarios.

All interested parties are cordially invited; registration is not required.

General information on the colloquium, as well as a current list of the dates of the Communication Technology Colloquium, can be found at: http://www.iks.rwth-aachen.de/aktuelles/kolloquium

--
Irina Ronkartz
Institute of Communication Systems (IKS)
RWTH Aachen University
Muffeter Weg 3a, 52074 Aachen, Germany
+49 241 80 26958 (phone)
ronkartz(a)iks.rwth-aachen.de
http://www.iks.rwth-aachen.de/
