Dear subscribers of the colloquium newsletter,
we are happy to inform you about the next dates of our Communication
Technology Colloquium.
*Thursday, October 22, 2020*
*Speaker:* Luis Maßny
*Time:* 10:00 a.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Master Lecture*: Analysis and Optimization of Multi-Kernel Polar Codes
Polar codes are a class of linear block codes that exploit the channel
polarization effect in order to provide good error correction
capabilities at a low complexity. The channel polarization is based on a
recursive channel transformation by a so-called polarization kernel. As
a generalization of conventional polar codes, multi-kernel polar codes
have been proposed, which make it possible to combine polarization
kernels of different sizes within a single code. Accordingly, these
codes provide
many degrees of freedom, which makes it important to properly design
such a code in order to optimize its performance. Two major aspects of
the multi-kernel polar code design are analyzed in this thesis. Firstly,
the design of good polarization kernels for practical codeword lengths
is studied. Secondly, the effect of applying a chosen set of
polarization kernels in different orders is investigated.
In order to systematically construct polarization kernels, a recursive
construction rule is developed that describes each kernel as a
concatenation of smaller kernels. The performance analysis is based on
the computation of Z parameters of the polarized information bit
channels for a binary erasure channel. These yield an upper bound on the
block error rate. The observations are verified by error rate
simulations over an additive white Gaussian noise channel.
The results show that the kernel design depends on the code rate. In
particular, it is demonstrated that in some cases an asymptotically
suboptimal kernel is the best choice. The optimization of the kernel
order in general turns out to be complex and to depend on a variety of
code parameters.
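As a rough illustration of the Z-parameter analysis described above, the
following Python sketch computes the Z (Bhattacharyya) parameters of a
binary erasure channel for the conventional 2x2 polarization kernel
only, where one polarization step maps an erasure probability z to
2z - z^2 and z^2. It then picks the k most reliable bit channels and
sums their Z parameters as an upper bound on the block error rate under
successive cancellation decoding. The function names, the index
ordering, the erasure probability and the code size are assumptions made
for this example; the multi-kernel constructions studied in the thesis
are not reproduced here.

```python
def bec_z_parameters(erasure_prob, n_stages):
    """Z parameters of the 2**n_stages bit channels of a BEC (2x2 kernel only)."""
    z = [erasure_prob]
    for _ in range(n_stages):
        nxt = []
        for v in z:
            nxt.append(2 * v - v * v)  # degraded ("minus") channel
            nxt.append(v * v)          # upgraded ("plus") channel
        z = nxt
    return z


def select_information_set(z, k):
    """Pick the k bit channels with the smallest Z and bound the block error rate."""
    info = sorted(sorted(range(len(z)), key=lambda i: z[i])[:k])
    bound = min(1.0, sum(z[i] for i in info))  # union bound, clipped to 1
    return info, bound


if __name__ == "__main__":
    # Assumed example: length-8 code, rate 1/2, erasure probability 0.5
    z = bec_z_parameters(erasure_prob=0.5, n_stages=3)
    info, bound = select_information_set(z, k=4)
    print("information set:", info)
    print("upper bound on block error rate:", bound)
```

Because each step maps z to a pair whose sum is 2z, the Z parameters of
all bit channels always sum to the block length times the erasure
probability, which is a convenient sanity check.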
and
*Thursday, October 22, 2020*
*Speaker:* Egke Chatzimoustafa
*Time:* 11:00 a.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Bachelor Lecture*: Playback Methods for Multichannel Immersive Binaural
Sound
Reproducing spatial audio recordings with headphones allows listeners to
perceive sound signals in 3D, where the listeners evaluate the differences
between the recorded signals at both ears to localize sound sources. For
a more immersive reproduction, sound sources should appear fixed in
space in the case of head rotations. So far, several immersive
reproduction methods like Motion Tracked Binaural Sound (MTB) and
Binaural Cue Adaptation (BCA) have been proposed, where BCA works with two
microphones while MTB requires a larger number of microphones. The goal
of this bachelor's thesis is to extend the BCA algorithm for more than
two microphones and to evaluate this new multichannel system in terms of
source localization and binaural cue modification.
The thesis shows how additional microphones in the multichannel system
can be used to improve the sound source localization, reducing the
estimation error even for low signal-to-noise ratio (SNR) values.
Furthermore, several experiments show that the additional microphones
can resolve the front/back confusion and can also discriminate between
sources above and below the head model. It is further
shown how additional microphones can be employed in the cue modification
algorithm. Several experiments confirm that additional microphones can
increase the quality by modifying coherent components and by reducing
the incoherent power error and the coherent-to-incoherent power ratio
error. As
the last extension, an adaptive reference channel selection algorithm is
introduced that parameterizes the cue modification based on the optimal
reference channel. For larger head movements of the listener, this
extension can further improve the quality and even eliminate
modification errors for certain head orientations.
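The differences between the two ear signals mentioned above are commonly
summarized as the interaural time difference (ITD) and the interaural
level difference (ILD). The following Python sketch shows one simple way
to estimate both from a pair of sampled ear signals, using a
cross-correlation search for the ITD and an energy ratio for the ILD.
The function names and the sign convention are assumptions for this
example and are not taken from the BCA or MTB algorithms.

```python
import math


def estimate_itd_samples(left, right, max_lag):
    """Lag (in samples) that maximizes the cross-correlation of the ear signals.

    A positive result means the right-ear signal is delayed relative to the left.
    """
    best_lag, best_val = 0, float("-inf")
    n = len(left)
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        if acc > best_val:
            best_lag, best_val = lag, acc
    return best_lag


def estimate_ild_db(left, right, eps=1e-12):
    """Level difference in dB; positive if the left-ear signal carries more energy."""
    energy_left = sum(x * x for x in left) + eps
    energy_right = sum(x * x for x in right) + eps
    return 10.0 * math.log10(energy_left / energy_right)


if __name__ == "__main__":
    # Assumed toy signals: the right ear receives a delayed, attenuated copy
    left = [0.0, 0.0, 1.0, -2.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0]
    right = [0.0] * 3 + [0.5 * x for x in left[:-3]]
    print("ITD in samples:", estimate_itd_samples(left, right, max_lag=5))
    print("ILD in dB:", estimate_ild_db(left, right))
```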
All interested parties are cordially invited; registration is not required.
General information on the colloquium, as well as a current list of the
dates of the Communication Technology Colloquium can be found at:
http://www.iks.rwth-aachen.de/aktuelles/kolloquium
--
Irina Ronkartz
Institute of Communication Systems (IKS)
RWTH Aachen University
Muffeter Weg 3a, 52074 Aachen, Germany
+49 241 80 26958 (phone)
ronkartz(a)iks.rwth-aachen.de
http://www.iks.rwth-aachen.de/
Dear subscribers of the colloquium newsletter,
we are happy to inform you about the next date of our communication
technology colloquium.
*Monday, September 28, 2020*
*Speaker*: Zach Lee
*Time:* 11:00 a.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Bachelor Lecture*: Time-Varying Simulation of Adaptive Crosstalk
Cancellation Systems based on Acoustic Measurements
In order to reproduce binaural audio via loudspeakers, a crosstalk
cancellation (CTC) system is necessary to attenuate the crosstalk. The
task becomes more challenging when the listener is allowed to move, as
the CTC system is often robust only in a small, limited area. A recently
proposed idea is the adaptive crosstalk cancellation (ACTC) system. The
CTC filter in this system is updated in real time by placing microphones
close to the entrance of the ear canal to measure the signals perceived
by the listener and estimate the corresponding head-related impulse
response (HRIR).
The goal of this thesis is to evaluate the ACTC system by performing an
acoustic measurement in an anechoic chamber to obtain the real and
accurate HRIR of the test listener. With this knowledge, the performance
of the ACTC system can be evaluated in a simulation. Furthermore, the
limit of the ACTC system can be tested, in terms of how fast it can
adapt to changes.
The results show that the ACTC system performs fairly well when the
listener's ears are on the ipsilateral side of the loudspeakers, while
the performance drops drastically on the contralateral side. At lower
frequencies, the ACTC system works quite well up to a movement (head
rotation) speed of 20 deg/s.
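To make the role of the CTC filter more concrete, the following Python
sketch shows a common textbook formulation for two loudspeakers and two
ears at a single frequency: the acoustic paths form a 2x2 transfer
matrix H, and the ideal CTC filter matrix is its inverse, so that each
binaural channel reaches only the intended ear. The matrix values and
function names are hypothetical, and the adaptive update used in the
ACTC system of the thesis is not modeled here.

```python
def invert_2x2(h):
    """Inverse of a 2x2 complex matrix given as nested lists."""
    (a, b), (c, d) = h
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("transfer matrix is (nearly) singular at this frequency")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul_2x2(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


if __name__ == "__main__":
    # Hypothetical transfer functions at one frequency bin
    # (rows: left/right ear, columns: left/right loudspeaker)
    H = [[1.0 + 0.0j, 0.4 - 0.3j],
         [0.4 - 0.3j, 1.0 + 0.0j]]
    C = invert_2x2(H)          # ideal CTC filter at this frequency
    print(matmul_2x2(H, C))    # close to the identity: no crosstalk left
```

When the listener moves, H changes, so a fixed C no longer inverts it;
this is the reason the filter has to be re-estimated in an adaptive
system.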
All interested parties are cordially invited; registration is not required.
General information on the colloquium, as well as a current list of the
dates of the communication technology colloquium can be found at:
http://www.iks.rwth-aachen.de/aktuelles/kolloquium
Dear subscribers of the colloquium newsletter,
we are happy to inform you about the next date of our communication
technology colloquium.
*Friday, September 25, 2020*
*Speaker*: Liuhui Deng
*Time*: 11:00 a.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Master Lecture*: Speech Inpainting Using Image Processing Techniques
Speech inpainting is the task of reconstructing speech from damaged
speech signals, where the corruption can result from improper storage,
packet loss in communication networks, etc. Neural networks have become
an active research topic in the field of audio inpainting in recent
years, including speech inpainting, music inpainting, etc. The networks
can be fed either audio waveforms or other feature representations such
as the Short-Time Fourier Transform (STFT), Mel Frequency Cepstral
Coefficients (MFCC), etc. in order to reconstruct audio.
In this thesis, advanced architectures from image inpainting based on
Convolutional Neural Networks (CNNs) are adapted to the task of speech
inpainting. The motivation lies in the facts that the neural techniques
in image inpainting are well investigated and have turned out to be
powerful, and that the task of speech inpainting can be interpreted as
image inpainting when the speech spectrogram is treated as a
2-dimensional image. The networks involved are mainly the context
encoder, the context encoder with Generative Adversarial Networks
(GANs), EdgeConnect (without GANs) and EdgeConnect (with GANs).
In this work, the context encoder is an encoder-decoder architecture
that takes STFT magnitudes (and the ground-truth corruption mask) as
input, while EdgeConnect is additionally fed an edge map of the
spectrogram in order to alleviate the blurriness issue observed in image
inpainting. EdgeConnect (without GANs) is composed of two sub-models,
both of which are context encoders. One sub-model, referred to as the
edge completion model, reconstructs the edge map from the corrupted edge
map; the other is the inpainting model, which reconstructs the
spectrogram based on the corrupted spectrogram and the edge map. The
GANs applied in the models of interest are also intended to mitigate the
blurriness by adding an adversarial loss to the loss functions of the
context encoder, the edge completion model and the inpainting model.
Experiments indicate that the context encoder (with or without GANs)
outperforms CNNs that simply stack a few convolutional layers.
EdgeConnect (with or without GANs) achieves even better performance than
the context encoder (with or without GANs), mainly thanks to the
additional informative edge map of the spectrogram. The best model among
them is EdgeConnect (with GANs); its reconstructed speech achieves a
PESQ score of 3.03, a 71.2% improvement compared to the corrupted input
speech. Furthermore, analyses of the edge map quality in EdgeConnect
(with or without GANs) reveal that a low-quality edge map heavily
degrades the inpainting performance. A well-performing edge completion
model is therefore of great importance and is a promising direction for
future work.
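The idea of adding an adversarial loss to the loss function of a
reconstruction model can be sketched as follows. This minimal Python
example combines a mean absolute reconstruction error with a
non-saturating generator loss computed from discriminator scores. The
choice of the L1 loss, the non-saturating form and the weight
`adv_weight` are assumptions for illustration; the abstract does not
state which loss terms or weights were actually used.

```python
import math


def l1_loss(pred, target):
    """Mean absolute error between predicted and clean spectrogram values."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def adversarial_loss(disc_scores, eps=1e-12):
    """Non-saturating generator loss: -mean(log D(G(x))).

    disc_scores are discriminator outputs in (0, 1] for reconstructed samples.
    """
    return -sum(math.log(max(s, eps)) for s in disc_scores) / len(disc_scores)


def generator_loss(pred, target, disc_scores, adv_weight=0.01):
    """Reconstruction loss plus weighted adversarial loss (weight is assumed)."""
    return l1_loss(pred, target) + adv_weight * adversarial_loss(disc_scores)


if __name__ == "__main__":
    # Hypothetical values: two spectrogram bins and two discriminator scores
    print(generator_loss([0.9, 0.2], [1.0, 0.0], [0.3, 0.6]))
```

The adversarial term grows when the discriminator recognizes the
reconstruction as fake, which pushes the model away from overly smooth
(blurry) outputs that a pure reconstruction loss tends to favor.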
All interested parties are cordially invited; registration is not required.
General information on the colloquium, as well as a current list of the
dates of the communication technology colloquium can be found at:
http://www.iks.rwth-aachen.de/aktuelles/kolloquium
Dear subscribers of the colloquium newsletter,
we are happy to inform you about the next date of our communication
technology colloquium.
*Monday, August 31, 2020*
*Speaker*: Sumedh J. Dongare
*Time*: 2:00 p.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Master Lecture*: Optimized Compression Functions for Reduced Complexity
Information Bottleneck Detection and Decoding
Detection and decoding at the receiver side are of crucial importance,
and the optimum signal processing algorithms often result in high
implementation complexity. Therefore, sub-optimal algorithms with
close-to-optimum performance are needed in practice. The information
bottleneck method is a novel low-complexity method for detection and
decoding that maximizes the mutual information. The main idea of this
signal processing method is to design mutual-information-preserving
mappings that replace the traditional signal processing operations in
order to reduce complexity. These mappings are typically implemented as
look-up tables. For instance, the literature successfully applies this
method to the decoding of binary low-density parity-check codes.
Low-density parity-check codes are gaining more and more attention since
the discovery of their non-binary generalization, which has better error
correction capabilities than the binary equivalents. Due to advances in
the computational capabilities of devices, research in this field is
currently a hot topic. It turns out that the decoding of non-binary
low-density parity-check codes does not allow a straightforward
application of mutual-information-maximizing signal processing. The main
problems are that the decoding requires systems with many input
variables and that, in addition, the symbols from higher-order fields
can take more than two values. As a result, the look-up table based
approach, which works well for binary codes, here results in look-up
tables of prohibitive size.
This motivates the investigation of compression functions that maximize
the relevant mutual information but can be characterized using far fewer
parameters than look-up tables. Such functions are designed in this
thesis with a novel approach that relies on genetic algorithms, which
are inspired by the natural evolution of species. The novel approach
makes it possible to construct and analyze systems that cannot be
designed with the look-up table based approach. This thesis compares the
resulting systems to other state-of-the-art signal processing systems in
terms of symbol error rate performance and also in terms of the ability
to preserve relevant mutual information.
The reference system typically considered in this thesis is the soft
symbol demodulator, which has to be applied when non-binary low-density
parity-check codes are used with binary modulation. Demodulators
designed using the novel approach are compared with the look-up table
based approach and traditional soft symbol demodulators. The thesis
develops a powerful class of parametrizable mappings that can be
optimized using genetic algorithms. Most importantly, the novel approach
achieves performance close to that of a soft symbol demodulator in many
of the investigated scenarios.
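The notion of a mutual-information-preserving mapping implemented as a
look-up table can be illustrated with a small Python sketch. It
compresses a quantized channel output Y into fewer clusters T with a
deterministic table t = lut[y] and measures how much of the relevant
mutual information I(X;Y) survives as I(X;T). The joint distribution,
the table and all function names are hypothetical examples; the
optimization of the mapping, whether by the information bottleneck
method or by a genetic algorithm, is not shown.

```python
import math


def mutual_information(joint):
    """Mutual information in bits for a joint pmf given as joint[x][t]."""
    px = [sum(row) for row in joint]
    pt = [sum(row[t] for row in joint) for t in range(len(joint[0]))]
    mi = 0.0
    for x, row in enumerate(joint):
        for t, p in enumerate(row):
            if p > 0.0:
                mi += p * math.log2(p / (px[x] * pt[t]))
    return mi


def compress(joint_xy, lut, n_clusters):
    """Apply the look-up table t = lut[y] and return the joint pmf of (X, T)."""
    joint_xt = [[0.0] * n_clusters for _ in joint_xy]
    for x, row in enumerate(joint_xy):
        for y, p in enumerate(row):
            joint_xt[x][lut[y]] += p
    return joint_xt


if __name__ == "__main__":
    # Hypothetical joint pmf of a binary symbol X and a 4-level observation Y
    joint_xy = [[0.25, 0.15, 0.075, 0.025],
                [0.025, 0.075, 0.15, 0.25]]
    lut = [0, 0, 1, 1]  # assumed 4-to-2 compression table
    print("I(X;Y) =", mutual_information(joint_xy))
    print("I(X;T) =", mutual_information(compress(joint_xy, lut, 2)))
```

A table needs one entry per input combination, so its size grows
exponentially with the number of inputs and with the field order, which
is the scaling problem described above for non-binary codes.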
All interested parties are cordially invited; registration is not required.
General information on the colloquium, as well as a current list of the
dates of the communication technology colloquium can be found at:
http://www.iks.rwth-aachen.de/aktuelles/kolloquium
Dear subscribers of the colloquium newsletter,
we are happy to inform you about the next date of our Communication
Technology Colloquium.
*Wednesday, July 15, 2020*
*Speaker:* Jérôme Biot
*Time:* 11:00 a.m.
*Location*:
https://rwth.zoom.us/j/97904157921?pwd=SWpsbDl0MWhrWjY1ZkZaeFRoYmErZz09
Meeting-ID: 979 0415 7921
Password: 481650
*Master Lecture*: Acoustic Head-Tracking in Adaptive Crosstalk
Cancellation Systems
In order to accurately reproduce an acoustic scene for a listener, a
technique known as binaural reproduction is used. A binaural signal
consists of two channels containing a different signal for each ear
depending on the listener's position relative to the sound source. When
trying to replicate a binaural signal over headphones, perfect channel
separation is guaranteed, meaning that the binaural signals designated
for the left and right ear respectively are only transmitted to the
corresponding ear. When a loudspeaker system is used on the other hand,
crosstalk between the two binaural signals is an issue and has to be
cancelled by the corresponding crosstalk cancellation (CTC) filters. In
adaptive crosstalk cancellation (ACTC) systems, these filters are
adapted in real-time by placing two microphones close to the ear
positions to measure the incoming sound signals and estimate the
corresponding impulse responses.
The goal of this thesis is to acoustically estimate the position and
orientation of a listener's head based on these impulse responses in
order to adapt the binaural signals of the acoustic scene. Based on
geometric relations in free space, an estimation of the position and
orientation of the listener can be done. Several algorithms for 2D and
3D space have been implemented and compared against each other in terms
of accuracy and real-time capability.
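One way such a geometric estimate could look in 2D is sketched below in
Python. It assumes that the propagation delays from two loudspeakers at
known positions to each ear microphone have already been extracted from
the impulse responses, converts them to distances, locates each ear by
intersecting two circles, and derives the head center and yaw angle from
the two ear positions. The speed of sound, the loudspeaker layout, the
assumption that the listener is on one known side of the loudspeaker
axis, and all names are assumptions for this example; the algorithms
actually implemented in the thesis are not described in the abstract.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def intersect_circles(c1, r1, c2, r2):
    """Intersection of two circles that lies left of the directed line c1 -> c2."""
    dx, dy = c2[0] - c1[0], c2[1] - c1[1]
    d = math.hypot(dx, dy)
    a = (r1 * r1 - r2 * r2 + d * d) / (2.0 * d)
    h = math.sqrt(max(r1 * r1 - a * a, 0.0))
    mx, my = c1[0] + a * dx / d, c1[1] + a * dy / d
    return (mx - h * dy / d, my + h * dx / d)


def estimate_head_pose(spk1, spk2, delays_left, delays_right):
    """Head center and yaw (degrees) from per-ear delays in seconds.

    delays_left / delays_right: (delay from spk1, delay from spk2).
    """
    left = intersect_circles(spk1, SPEED_OF_SOUND * delays_left[0],
                             spk2, SPEED_OF_SOUND * delays_left[1])
    right = intersect_circles(spk1, SPEED_OF_SOUND * delays_right[0],
                              spk2, SPEED_OF_SOUND * delays_right[1])
    center = ((left[0] + right[0]) / 2.0, (left[1] + right[1]) / 2.0)
    ex, ey = right[0] - left[0], right[1] - left[1]  # left-to-right ear axis
    yaw = math.degrees(math.atan2(ex, -ey))  # facing direction is (-ey, ex)
    return center, yaw


if __name__ == "__main__":
    # Hypothetical setup: loudspeakers on the x-axis, listener at y = 2 m
    spk1, spk2 = (-1.0, 0.0), (1.0, 0.0)
    ears = {"left": (0.29, 2.0), "right": (0.11, 2.0)}
    delays = {k: tuple(math.hypot(p[0] - s[0], p[1] - s[1]) / SPEED_OF_SOUND
                       for s in (spk1, spk2)) for k, p in ears.items()}
    print(estimate_head_pose(spk1, spk2, delays["left"], delays["right"]))
```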
All interested parties are cordially invited; registration is not
required.
General information on the colloquium, as well as a current list of the
dates of the Communication Technology Colloquium, can be found at:
http://www.iks.rwth-aachen.de/aktuelles/kolloquium/
Dear subscribers of the colloquium newsletter,
we are happy to inform you about the next dates of our Communication
Technology Colloquium.
*Wednesday, June 17, 2020*
*Speaker:* Jingcheng Tian
*Time:* 11:00 a.m.
*Webex Meeting*:
https://meetingsemea22.webex.com/meetingsemea22-de/j.php?MTID=md260f6a6859c…
Meeting number (access code): 162 224 7271
Meeting password: 505027
*Master Lecture*: End-to-End Speech Inpainting Using Convolutional
Network Structures
Speech signals are often subject to interference or damage in the time
or frequency domain during transmission. There are many ways to address
these disturbances. One of them is Packet Loss Concealment (PLC), which
is a technology designed to minimize the practical effect of lost
packets in digital communications. Bandwidth Extension (BWE), on the
other hand, is the process of extending the frequency range of a signal.
Speech inpainting, a generalization of BWE and PLC, addresses the loss
of the signal at arbitrary times and frequencies, rather than at a fixed
time and frequency. The term inpainting comes from image inpainting, a
subarea of digital image processing in which many deep-learning-based
techniques for the reconstruction of damaged pictures already exist. In
the field of speech, however, this technology is not yet widespread;
only a few dictionary-based speech signal inpainting approaches exist.
In this work, we introduce a model for solving the speech inpainting
task. Learning-based methods have been shown to perform better than
traditional algorithms in front-end processing tasks such as speech
noise reduction and BWE. However, most algorithms extract features and
use magnitude spectrograms as input to the model. One disadvantage of
this is the lack of phase information. WaveNet, a well-known model used
for speech synthesis, uses a dilated CNN to generate raw audio directly.
This thesis uses a modified WaveNet that directly reads raw speech and
generates speech, that is, the input and output of the model are
lossless and no information is lost. Instead of an explicit feature
extraction step, the CNN extracts features automatically in its first
layers. At the same time, the large space complexity required by WaveNet
is reduced. In addition, the causal structure is replaced by a symmetric
one: the model can see not only past information but also future
information, which increases the receptive field and improves the
accuracy of the model. We also introduce different loss functions for
comparison. In the experiments, we tried different types of data,
different noise, and different losses in time and frequency.
Computational evaluation shows that this method can reconstruct speech
signals, not only in magnitude, but also in phase.
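The difference between causal and symmetric dilated convolutions can be
made concrete with a small Python sketch. It implements a zero-padded
1-D dilated convolution in both variants and computes the receptive
field of a stack of such layers as 1 + sum((kernel_size - 1) * dilation).
The kernel sizes, dilation pattern and function names are assumptions
for this example and do not describe the actual network configuration
of the thesis.

```python
def receptive_field(kernel_size, dilations):
    """Number of input samples that influence one output of the layer stack."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def dilated_conv1d(x, weights, dilation, causal):
    """Zero-padded 1-D dilated convolution with output length len(x).

    causal=True:  taps reach only into the past (as in the original WaveNet).
    causal=False: taps are centered, so past and future samples are used.
    """
    k = len(weights)
    if causal:
        offsets = [(i - (k - 1)) * dilation for i in range(k)]
    else:
        offsets = [(i - (k - 1) // 2) * dilation for i in range(k)]
    out = []
    for n in range(len(x)):
        acc = 0.0
        for w, off in zip(weights, offsets):
            j = n + off
            if 0 <= j < len(x):
                acc += w * x[j]
        out.append(acc)
    return out


if __name__ == "__main__":
    impulse = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
    print(dilated_conv1d(impulse, [1.0, 1.0, 1.0], dilation=2, causal=True))
    print(dilated_conv1d(impulse, [1.0, 1.0, 1.0], dilation=2, causal=False))
    print(receptive_field(3, [1, 2, 4]))  # assumed dilation pattern
```

In the symmetric variant the impulse spreads to both earlier and later
output positions, which reflects that each output sample depends on
future input samples as well.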
and
*Wednesday, June 17, 2020*
*Speaker*: Daniel Wilhelm
*Time:* 2:00 p.m.
*Zoom Meeting*:
https://rwth.zoom.us/j/91765335911?pwd=TFFHdTBlWStyR1lVU25IbGdCWmNJdz09
Meeting-ID: 917 6533 5911
Password: 297152
*Bachelor Lecture*: Rekonstruktion des Phasenspektrums von
Sprachsignalen mit Machine-Learning-Algorithmen (Reconstruction of the
Phase Spectrum of Speech Signals Using Machine Learning Algorithms)
The processing of audio and speech signals often takes place in the
time-frequency domain. The time-frequency spectrum consists of the
magnitude spectrum and the phase spectrum. Since the magnitude spectrum
is more relevant for the intelligibility of speech signals, computations
such as noise reduction are often carried out on the magnitude spectrum
only, and the phase spectrum is adopted unchanged. To obtain the best
possible speech quality, however, the phase spectrum must also be taken
into account. One option is to reconstruct the phase spectrum on the
basis of the enhanced magnitude spectrum. A widely used approach for
this is the Griffin-Lim algorithm. It is an iterative algorithm that
receives only the magnitude spectrum as input and then approaches the
matching time-domain signal in each step. However, sufficient speech
quality typically requires many iterations that act on the entire
signal, which results in a high computational cost and makes use in a
real-time implementation difficult.
In this thesis, a different approach for reconstructing the phase
spectrum from the magnitude spectrum of a speech signal is therefore
developed, in which the phase spectrum is estimated with the help of a
machine learning algorithm. The applicability to damaged speech signals
with gaps in the time-frequency spectrum is also investigated. In the
algorithm presented here, the derivatives of the phase spectrum (with
respect to time and with respect to frequency) are estimated by a neural
network and then combined into a phase spectrum that fits as well as
possible. Suitable preprocessing steps are sought for the data passed to
the neural network. A reduction of the amount of data to be processed is
proposed in order to reduce the computational cost. Subsequently,
various experiments are carried out to improve the estimation of the
phase spectrum. Among other things, the complexity and further
properties of the neural network are varied, and several ways of
assembling the phase spectrum from the phase derivatives are employed. A
comparison with the Griffin-Lim algorithm is also carried out. Finally,
damaged speech signals are considered, and the applicability of the
developed algorithm to this case is assessed.
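How a phase spectrum can be assembled from estimated phase derivatives
is sketched below in Python for the simplest case: only the time
derivative is used, and the phase of each frequency bin is obtained by
accumulating the estimated frame-to-frame phase advance and wrapping the
result to [-pi, pi). The function names and the input layout are
assumptions for this example; the thesis also uses the frequency
derivative and compares several combination schemes, which are not
reproduced here.

```python
import math


def wrap_phase(phi):
    """Wrap an angle to the interval [-pi, pi)."""
    return (phi + math.pi) % (2.0 * math.pi) - math.pi


def integrate_time_derivative(initial_phase, phase_increments):
    """Accumulate per-frame phase advances into a phase spectrum.

    initial_phase[k]:       phase of frame 0 in frequency bin k
    phase_increments[m][k]: estimated phase advance from frame m to m + 1
    Returns a list of frames, each a list of wrapped phases per bin.
    """
    frames = [[wrap_phase(p) for p in initial_phase]]
    for inc in phase_increments:
        prev = frames[-1]
        frames.append([wrap_phase(p + d) for p, d in zip(prev, inc)])
    return frames


if __name__ == "__main__":
    # Hypothetical estimates for two bins and three frame transitions
    increments = [[math.pi / 2, math.pi / 4]] * 3
    for frame in integrate_time_derivative([0.0, 0.0], increments):
        print(frame)
```

Since each frame depends only on the previous one, this kind of
integration needs a single pass over the signal, in contrast to the
repeated full-signal iterations of the Griffin-Lim algorithm.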
All interested parties are cordially invited; registration is not
required.
General information on the colloquium, as well as a current list of the
dates of the Communication Technology Colloquium, can be found at:
http://www.iks.rwth-aachen.de/aktuelles/kolloquium/
Dear subscribers of the colloquium newsletter,
we are pleased to invite you to a doctoral lecture.
Speaker: Mr. Stefan Liebich, M. Sc.
Topic: *Active Noise and Occlusion Effect Cancellation in Headphones and
Hearing Aids*
Time: Monday, May 18, 2020, 11:00 a.m.
Zoom Meeting: https://us02web.zoom.us/j/88156655771
Meeting-ID: 881 5665 5771
Password: 471696
The perception of one's own voice is distorted when telephoning with
headsets, or wearing hearing aids. The reason for this is the so-called
occlusion effect. The ear canals are completely or partially closed by
the headset or hearing aid. The occlusion causes amplification of low
frequencies, and attenuation of high frequencies of one's own voice. The
unnatural perception of one's own voice and of noise caused by chewing
and swallowing are among the most common complaints of users.
Furthermore, environmental noise might impair perception. In this
thesis, both the unnatural perception of one's own voice and the
disturbance by environmental noise are tackled by novel signal
processing approaches.
The proposed solution addresses the occlusion effect by actively
emitting a compensation signal through the integrated
loudspeaker. This novel approach combines methods of active noise
cancellation (ANC, Noise Cancelling Headphone) with a personalized
design. The binaural headset contains two additional microphones per
side, one inner and one outer, to acquire signals for the calculation of
the compensation signals.
A combination of feedback and feedforward digital filters allows for
either approaching personal silence or a natural perception of one's own
voice and the surroundings.
The main contributions are:
* Novel design concept for ANC / OEC systems which are robust w.r.t.
acoustical variations of ear canals and earpiece fittings
* Novel structure for combined feedback-feedforward filters with
adaptive stability control
* Analysis of variability of acoustic front-end (headset) as well as
electronics back-end (digital signal processing incl.
AD/DA-conversion) and implications on ANC/OEC performance
* Real-time implementation, instrumental and auditive evaluations
The achieved ANC performance is comparable to that of a commercial
reference system. For OEC, both objective measurements and subjective
listening tests revealed significant improvements in own-voice
perception.
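The feedforward idea mentioned above, using the outer microphone signal
to compute a compensation signal, can be illustrated with a textbook LMS
adaptive filter. This is only a sketch and not the filter design of the
thesis: it assumes an ideal (unit) path from loudspeaker to inner
microphone, white noise as the environmental disturbance, and a
hypothetical FIR `primary_path` from the outer to the inner microphone.

```python
import numpy as np

def lms_feedforward_anc(x, d, n_taps=16, mu=0.01):
    """Adapt an FIR filter w so that its output cancels d (plain LMS).

    x: reference signal from the outer microphone.
    d: disturbance at the inner microphone (noise to be cancelled).
    Returns the residual e = d - y at the inner microphone.
    Assumes an ideal (unit) loudspeaker-to-inner-microphone path.
    """
    w = np.zeros(n_taps)
    buf = np.zeros(n_taps)
    e = np.zeros(len(x))
    for n in range(len(x)):
        buf = np.roll(buf, 1)     # shift the delay line by one sample
        buf[0] = x[n]
        y = w @ buf               # compensation signal estimate
        e[n] = d[n] - y           # residual after cancellation
        w += mu * e[n] * buf      # LMS coefficient update
    return e

rng = np.random.default_rng(1)
x = rng.standard_normal(5000)                    # outer microphone noise
primary_path = np.array([0.0, 0.6, 0.3, -0.1])   # hypothetical passive path
d = np.convolve(x, primary_path)[:len(x)]        # noise at the inner microphone
e = lms_feedforward_anc(x, d)

# Compare the noise power over the last 1000 samples, after adaptation.
power_before = np.mean(d[-1000:] ** 2)
power_after = np.mean(e[-1000:] ** 2)
```

A practical system also has to account for the loudspeaker-to-ear path
and for stability of the feedback branch, which this sketch leaves out.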
All interested parties are cordially invited; registration is not
required.
--
Dear subscribers of the colloquium newsletter,
below you will find the link to the video conference for the master's lecture by Nikita Airee. Please note that, due to system overload, the start time has been moved forward to 8:30 a.m.
Best regards,
Irina Ronkartz
-------- Original Message --------
Subject: Final Presentation Master Thesis Nikita Airee
Date: Tuesday, March 24, 2020 09:50 CET
From: "Adrat, Marc" <marc.adrat(a)fkie.fraunhofer.de>
To: Christiane Antweiler <antweiler(a)iks.rwth-aachen.de>, Irina Ronkartz <ronkartz(a)iks.rwth-aachen.de>, Peter Jax <jax(a)iks.rwth-aachen.de>, "Nikita Airee" <nikitaairee(a)gmail.com>, "Saha, Souradip" <souradip.saha(a)fkie.fraunhofer.de>, Souradip Saha <souradipsaha128(a)gmail.com>, "Antweiler, Markus" <markus.antweiler(a)fkie.fraunhofer.de>
Topic: Optimized Power Allocation in NOMA-based Overlay Cognitive Radio Network
Marc Adrat has scheduled this WebEx meeting.
Final Presentation Master Thesis Nikita Airee
Host: Marc Adrat
When it's time, start or join the WebEx meeting from here:
https://conference.fraunhofer.de/orion/joinmeeting.do?MTID=d6d66de63cb6b5c3…
Access Information
Meeting Number: 992 502 355
Meeting Password: (This meeting does not require a password.)
Audio Connection
+4971197077777 (WebEx)
Access Code:
992 502 355
-------- Original Message --------
Subject: Communication Technology Colloquium at IKS | online
Date: Monday, March 23, 2020 12:07 CET
From: Irina Ronkartz <ronkartz(a)iks.rwth-aachen.de>
To: kommunikationstechnik-kolloquium(a)lists.rwth-aachen.de
Dear subscribers of the colloquium newsletter,
we are happy to inform you about the next date of our Communication Technology Colloquium.
Friday, March 27, 2020
Speaker: Nikita Airee
Location: online
Time: 10:30 a.m. | new start time 8:30 a.m.
Master Lecture: Optimized Power Allocation on NOMA-based Overlay Cognitive Radio Networks
Spectral efficiency (SE) is crucial for wireless networks due to the
scarcity of wireless spectrum and the ever-increasing number of users
that need access to it. Cognitive radios (CRs) are a prominent solution
for SE as they facilitate spectrum sharing between two or more users.
Another technique for improving SE, which has recently come into focus,
is nonorthogonal multiple access (NOMA). Together, they have immense
potential for enabling highly spectrally efficient communication. In
this thesis, a type of NOMA called power domain nonorthogonal multiple
access (PD-NOMA) is considered in the downlink of a CR network. PD-NOMA
is essentially a multiple access (MA) technique that multiplexes the
signals for users in the power domain. Hence, the division of power
among the user signals is a critical PD-NOMA process known as power
allocation (PA).
In this thesis, two methods for PA are proposed, namely k-scaled PA and
fair PA. The performance of the users in the CR network has been
extensively evaluated for each of these methods. It has been noted that
k-scaled PA reduces the information overhead when there is a large
number of users in the network. While it has also been observed to be
robust against interference, the k-scaled PA method does not always lead
to an optimum power division. Therefore, fair PA has been proposed for
achieving similar performance at all users. However, it is only
applicable to users with specific channel conditions. In addition, up to
5 users have been accommodated in the network using the k-scaled method,
enabling communication over the same spectral resource.
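To make the role of power allocation concrete, the sketch below computes
the achievable rates of a generic two-user downlink PD-NOMA link with
successive interference cancellation (SIC) at the user with the stronger
channel. This is the standard textbook model, not the k-scaled or fair
PA method of the thesis, whose definitions are not given in the
abstract. The function name and all numerical values are hypothetical.

```python
import math

def noma_two_user_rates(a_weak, p_total, g_weak, g_strong, n0=1.0):
    """Achievable rates (bit/s/Hz) for two-user downlink PD-NOMA.

    a_weak:   fraction of the total power given to the weak-channel user.
    p_total:  total transmit power.
    g_weak, g_strong: channel power gains with g_weak <= g_strong.
    n0:       noise power.
    """
    if not 0.0 < a_weak < 1.0:
        raise ValueError("a_weak must lie strictly between 0 and 1")
    a_strong = 1.0 - a_weak
    # The weak user treats the strong user's signal as interference.
    sinr_weak = a_weak * p_total * g_weak / (a_strong * p_total * g_weak + n0)
    # The strong user first removes the weak user's signal via SIC.
    snr_strong = a_strong * p_total * g_strong / n0
    return math.log2(1 + sinr_weak), math.log2(1 + snr_strong)

# Example: 80 % of the power goes to the user with the weaker channel.
r_weak, r_strong = noma_two_user_rates(a_weak=0.8, p_total=10.0,
                                       g_weak=0.5, g_strong=2.0)
```

Shifting `a_weak` trades the rate of one user against the other, which
is why the choice of the power split matters for fairness.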
All interested parties are cordially invited. The lecture takes place online, and the link will be sent out on Thursday.
General information on the colloquium, as well as a current list of the dates of the Communication Technology Colloquium, can be found at:
http://www.iks.rwth-aachen.de/aktuelles/kolloquium
--
Dear subscribers of the colloquium newsletter,
we are happy to inform you about the next date of our Communication
Technology Colloquium.
*Friday, February 7, 2020*
*Speaker*: Bingying Qin
*Location:* Lecture hall 4G
*Time:* 10:00 a.m.
*Master Lecture*: Acquisition of Individual HRTFs using Acoustic
Headtracking
Virtual reality is now widely applied in different areas.
Headphone-based binaural rendering is commonly used to reconstruct the
sound in a virtual environment. Head-Related Transfer Functions (HRTFs)
are essential for this sound reconstruction. Using an individual HRTF
database gives a better experience in the sound simulation, and the
demand for individual HRTF databases is increasing rapidly. The proposed
dynamic measurement with acoustic head tracking provides a fast and easy
method to obtain individual HRTFs.
This thesis proposes a processing procedure to obtain an individual HRTF
database from HRTFs measured using acoustic head tracking. The
processing procedure constructs a desired, usually regularly spaced HRTF
data set from the irregularly spaced measurement results. The relations
between the sound source positions are quantified as weights, and these
weights are used in the HRTF interpolation. An interpolation algorithm
with separate amplitude and phase interpolation is presented.
Measurements performed in IKS|Lab using acoustic head tracking show that
the interpolation works well.
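The idea of weighting measured positions and interpolating amplitude and
phase separately can be sketched as follows. The weighting rule used
here, inverse great-circle distance between source directions, is an
assumption of this example and not necessarily the rule used in the
thesis. The function names and the two-point toy data set are
hypothetical.

```python
import numpy as np

def unit_vector(az_deg, el_deg):
    """Direction (azimuth, elevation in degrees) as a 3-D unit vector."""
    az, el = np.radians(az_deg), np.radians(el_deg)
    return np.array([np.cos(el) * np.cos(az),
                     np.cos(el) * np.sin(az),
                     np.sin(el)])

def interpolate_hrtf(target, positions, hrtfs, eps=1e-9):
    """Interpolate an HRTF at `target` from measured directions.

    target:    (azimuth, elevation) in degrees.
    positions: list of (azimuth, elevation) of the measurements.
    hrtfs:     complex array of shape (n_meas, n_freq).
    """
    t = unit_vector(*target)
    # Great-circle distance from the target to each measured direction.
    dists = np.array([np.arccos(np.clip(t @ unit_vector(*p), -1.0, 1.0))
                      for p in positions])
    weights = 1.0 / (dists + eps)   # assumed inverse-distance weighting
    weights /= weights.sum()
    # Amplitude and unwrapped phase are interpolated separately.
    magnitude = weights @ np.abs(hrtfs)
    phase = weights @ np.unwrap(np.angle(hrtfs), axis=1)
    return magnitude * np.exp(1j * phase)

# Toy data: two measured directions and four frequency bins.
freqs = np.arange(4)
positions = [(0.0, 0.0), (30.0, 0.0)]
hrtfs = np.array([1.0 * np.exp(-1j * 0.2 * freqs),
                  0.5 * np.exp(-1j * 0.4 * freqs)])
mid = interpolate_hrtf((15.0, 0.0), positions, hrtfs)
```

Interpolating the complex values directly would let phase differences
between neighbours reduce the magnitude, which is a common motivation
for treating amplitude and phase separately.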
All interested parties are cordially invited; registration is not
required.
General information on the colloquium, as well as a current list of the
dates of the Communication Technology Colloquium, can be found at:
http://www.iks.rwth-aachen.de/aktuelles/kolloquium/
--