**********************************************************************
*
*
* Invitation
*
*
*
* Informatik-Oberseminar
*
*
*
**********************************************************************
Time: Friday, 12 July 2019, 10:00
Location: Informatikzentrum, E3, Room 9222
Speaker: Dipl.-Inform. Malte Nuhn
Topic: Unsupervised Training with Applications in Natural Language
Processing
Abstract:
The state-of-the-art algorithms for various natural language processing
tasks require large amounts of labeled training data. At the same time,
obtaining labeled data of high quality is often the most costly step in
setting up natural language processing systems. In contrast, unlabeled
data is much cheaper to obtain and available in larger amounts.
Currently, only a few training algorithms make use of unlabeled data,
and in practice, training with only unlabeled data is not performed at
all. In this thesis, we study how unlabeled data can be used to train a
variety of models used in natural language processing. In particular, we
study models applicable to solving substitution ciphers, spelling
correction, and machine translation. This thesis lays the groundwork for
unsupervised training by presenting and analyzing the corresponding
models and unsupervised training problems in a consistent manner. We
show that the unsupervised training problem that occurs when breaking
one-to-one substitution ciphers is equivalent to the quadratic
assignment problem (QAP) if a bigram language model is incorporated, and
is therefore NP-hard. Based on this analysis, we present an effective
algorithm for unsupervised training for deterministic substitutions. In
the case of English one-to-one substitution ciphers, we show that our
novel algorithm achieves results close to human performance, as
presented in [Shannon 49].
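The combinatorial structure of this decipherment objective can be sketched in a few lines: each candidate one-to-one substitution is scored under a bigram language model, and breaking the cipher means finding the best-scoring assignment. The bigram table, the tiny alphabet, and the brute-force search below are purely illustrative assumptions, not the thesis's actual model or algorithm:

```python
from itertools import permutations

# Toy bigram "language model": log-score per letter pair (illustrative
# numbers, not estimated from real text).
BIGRAM = {("h", "e"): -1.0, ("e", "l"): -1.2, ("l", "l"): -1.5, ("l", "o"): -1.3}
UNSEEN = -10.0  # heavy penalty for bigrams the model has never seen

def lm_score(text):
    """Stand-in for a bigram LM log-likelihood."""
    return sum(BIGRAM.get(p, UNSEEN) for p in zip(text, text[1:]))

def decipher(cipher, cipher_alphabet, plain_alphabet):
    """Brute-force the best one-to-one substitution. Feasible only for tiny
    alphabets -- the general problem is NP-hard (equivalent to the QAP)."""
    best, best_score = None, float("-inf")
    for perm in permutations(plain_alphabet):
        mapping = dict(zip(cipher_alphabet, perm))
        plain = "".join(mapping[c] for c in cipher)
        s = lm_score(plain)
        if s > best_score:
            best, best_score = plain, s
    return best

# "hello" enciphered with h->x, e->y, l->z, o->w gives "xyzzw"
print(decipher("xyzzw", "xyzw", "helo"))  # hello
```

Brute force enumerates all |alphabet|! assignments; the NP-hardness result is exactly why the thesis needs an effective heuristic instead.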
Also, with this algorithm, we present, to the best of our knowledge, the
first automatic decipherment of the second part of the Beale
ciphers. Further, for the task of spelling correction, we work out the
details of the EM algorithm [Dempster & Laird+ 77] and experimentally
show that the error rates achieved using purely unsupervised training
reach those of supervised training. For handling large vocabularies, we
introduce a novel model initialization as well as multiple training
procedures that significantly speed up training without notably hurting
the performance of the resulting models. By incorporating an
alignment model, we further extend this model such that it can be
applied to the task of machine translation. We show that the true
lexical and alignment model parameters can be learned without any
labeled data: We experimentally show that the corresponding likelihood
function attains its maximum for the true model parameters if a
sufficient amount of unlabeled data is available. Further, for the
problem of spelling correction with symbol substitutions and local
swaps, we also show experimentally that the performance achieved with
purely unsupervised EM training reaches that of supervised training.
Finally, using the methods developed in this thesis, we present results
on an unsupervised training task for machine translation with a ten
times larger vocabulary than that of tasks investigated in previous work.
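The flavor of unsupervised EM training can be illustrated on a textbook problem far simpler than the spelling-correction and translation models above: estimating the biases of two coins from flip sequences whose coin labels are hidden. The data and starting values below are made up for illustration; this is the generic EM scheme of [Dempster & Laird+ 77], not the thesis's model:

```python
def em_two_coins(draws, iters=50):
    """EM: fit the heads-probabilities of two coins when the coin that
    produced each flip sequence is hidden (purely unsupervised)."""
    theta = [0.4, 0.6]  # arbitrary starting guesses
    for _ in range(iters):
        counts = [[0.0, 0.0], [0.0, 0.0]]  # per coin: expected [heads, tails]
        for seq in draws:
            h, t = seq.count("H"), seq.count("T")
            lik = [th ** h * (1 - th) ** t for th in theta]
            z = sum(lik)
            for k in range(2):  # E-step: posterior responsibility of coin k
                r = lik[k] / z
                counts[k][0] += r * h
                counts[k][1] += r * t
        # M-step: re-estimate each coin's heads-probability
        theta = [c[0] / (c[0] + c[1]) for c in counts]
    return theta

data = ["HHHHHHHHTH", "TTTTHTTTTT", "HHHHTHHHHH", "TTTHTTTTTT"]
print([round(t, 2) for t in em_two_coins(data)])
```

With enough unlabeled sequences, the likelihood peaks near the true parameters (roughly 0.1 and 0.9 heads here), mirroring the thesis's observation that the true model parameters can be recovered without labels.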
The lecturers of the Department of Computer Science cordially invite you.
--
Stephanie Jansen
Faculty of Mathematics, Computer Science and Natural Sciences
HLTPR - Human Language Technology and Pattern Recognition
RWTH Aachen University
Ahornstraße 55
D-52074 Aachen
Tel. Ms. Jansen: +49 241 80-216 06
Tel. Ms. Andersen: +49 241 80-216 01
Fax: +49 241 80-22219
sek(a)i6.informatik.rwth-aachen.de
www.hltpr.rwth-aachen.de
**********************************************************************
*
*
* Invitation
*
*
*
* Informatik-Oberseminar
*
*
*
**********************************************************************
Time: Tuesday, 15 December 2020, 10:00
Zoom:
https://rwth.zoom.us/j/99233095930?pwd=dHhTV253V1ZYUzRtSkk1L3A1REZVUT09
Meeting ID: 992 3309 5930
Passcode: 626162
Speaker: Philipp Weidel, Dipl.-Inform.
Topic: Learning and decision making in closed loop simulations of
plastic spiking neural networks
Abstract:
Understanding how animals and humans learn, form memories, and make
decisions is a highly relevant goal both for neuroscience and for fields
that take some inspiration from neuroscience, such as machine learning
and artificial intelligence. Many models of learning and decision making
were developed in the fields of machine learning, artificial
intelligence, and computational neuroscience. Although these models aim
to describe similar mechanisms, they do not all pursue the same goal.
These models can be divided into those aiming to reach
optimal performance on a specific task (or set of tasks) and those
trying to explain how animals and humans learn. Some models of the first
class use biologically inspired methods (such as deep learning) but are
usually not biologically realistic and are therefore not well suited to
explain the function of the brain. Models in the second class focus on
being biologically plausible to explain how the brain works, but often
demonstrate their capabilities only on overly simplistic tasks and yield low
performance on well-known tasks from machine learning. This work aims to
close the gap between these two types of models.
In the first part of this talk, tools are described that allow the
combination of biologically plausible neural network models together
with powerful toolkits known from machine learning and robotics. To this
end, MUSIC, a middleware for spiking neural network simulators such as
NEST and NEURON, is interfaced with ROS, a middleware for robotic
hardware and simulators such as Gazebo. This toolchain is extended with
interfaces to reinforcement learning toolkits such as the OpenAI Gym.
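The closed loop such a toolchain has to drive follows the OpenAI Gym reset/step contract. The sketch below uses a hypothetical stand-in environment (no Gym installation, no spiking network) purely to show the loop's shape; in the talk's setup, the action would come from a simulated spiking network instead of the trivial policy used here:

```python
class ToyEnv:
    """Minimal stand-in implementing the OpenAI Gym env interface
    (reset/step) that the described toolchain connects simulators to."""

    def reset(self):
        self.t = 0
        return 0  # initial observation

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 1 else 0.0  # reward "correct" actions
        done = self.t >= 5                    # fixed-length toy episode
        return self.t, reward, done, {}       # obs, reward, done, info

env = ToyEnv()
obs = env.reset()
total = 0.0
done = False
while not done:
    action = 1  # trivial policy; a spiking network would decide here
    obs, reward, done, info = env.step(action)
    total += reward
print(total)  # 5.0
```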
The second part addresses the question of how the brain can represent
its environment in the neural substrate of the cortex and how a
realistic model of reinforcement learning can make use of these
representations. To this end, a spiking neural network model of
unsupervised learning is presented which is able to learn its input
projections such that it can detect and represent repeating patterns. By
using an actor-critic reinforcement learning architecture driven by a
realistic dopamine-modulated plasticity rule, the model can make use of
these representations and learn a range of tasks.
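A drastically simplified, non-spiking analogue of such an actor-critic architecture is tabular TD learning, where the TD error plays the role of the dopamine-like reinforcement signal. The chain task, learning rates, and episode count below are invented for illustration; this is not the talk's spiking model:

```python
import math
import random

random.seed(0)

N = 4                  # chain of states 0..3; reaching state 3 gives reward
ALPHA, GAMMA = 0.1, 0.9
V = [0.0] * N                           # critic: state-value estimates
pref = [[0.0, 0.0] for _ in range(N)]   # actor: preferences for left/right

def sample_action(s):
    """Softmax policy over the actor's action preferences."""
    e = [math.exp(p) for p in pref[s]]
    return 0 if random.random() < e[0] / sum(e) else 1

for _ in range(500):
    s = 0
    while s != N - 1:
        a = sample_action(s)
        s2 = max(0, s - 1) if a == 0 else s + 1
        r = 1.0 if s2 == N - 1 else 0.0
        # TD error: the "dopamine-like" reinforcement signal
        delta = r + (GAMMA * V[s2] if s2 != N - 1 else 0.0) - V[s]
        V[s] += ALPHA * delta          # critic update
        pref[s][a] += ALPHA * delta    # actor update
        s = s2

print([round(v, 2) for v in V[:3]])
```

After training, the critic's values decay geometrically with distance from the reward and the actor prefers moving right, which is the qualitative behavior the biologically plausible model has to reproduce.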
The lecturers of the Department of Computer Science cordially invite you.
--
Prof. Dr. Abigail Morrison
IAS-6 / INM-6 / SimLab Neuroscience
Jülich Research Center
&
Computer Science 3 - Software Engineering
RWTH Aachen
http://www.fz-juelich.de/inm/inm-6
http://www.fz-juelich.de/ias/jsc/slns
http://www.se-rwth.de
Office: +49 2461 61-9805
Fax # : +49 2461 61-9460
------------------------------------------------------------------------------------------------
Forschungszentrum Juelich GmbH
52425 Juelich
Registered office: Juelich
Registered in the commercial register of the district court of Dueren, No. HR B 3498
Chairman of the Supervisory Board: MinDir Volker Rieke
Management Board: Prof. Dr.-Ing. Wolfgang Marquardt (Chairman),
Karsten Beneke (Deputy Chairman), Prof. Dr.-Ing. Harald Bolt
------------------------------------------------------------------------------------------------
**********************************************************************
*
*
* Invitation
*
*
*
* Informatik-Oberseminar
*
*
*
**********************************************************************
Time: Wednesday, 10 February 2021, 14:00
Zoom: https://rwth.zoom.us/j/91716576115?pwd=ZzUwemxXWUJkQURjQjJibmlGb3dYZz09
Meeting ID: 917 1657 6115
Passcode: 496556
Speaker: Konrad Anton Fögen, M.Sc.
Teaching and Research Area Software Construction
Topic: Combinatorial Robustness Testing based on Error-Constraints
Abstract:
In this talk, we present an extension to combinatorial testing (CT), an effective specification-based test method based on an input parameter model (IPM). We argue that robustness is an important property of software that must be tested in addition to its functionality. Testing robustness requires invalid values and invalid value combinations in order to observe the software's reaction to them.
However, the effectiveness of CT deteriorates in the presence of invalid values or invalid value combinations. This phenomenon, called the invalid input masking effect, is already acknowledged in some research and has led to extensions of CT that we call combinatorial robustness testing (CRT). The objective of CRT is to improve fault detection by avoiding invalid input masking. This is achieved by separating the testing of valid values and valid value combinations from the testing of invalid values and invalid value combinations.
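The separation idea can be sketched in a few lines: valid values are combined freely, while each invalid value is placed in a test of its own so that it cannot be masked by another invalid value. The parameters and values below are invented for illustration, and all-combinations generation stands in for a real covering-array algorithm:

```python
from itertools import product

# Toy input parameter model: valid values per parameter, plus invalid values.
valid = {"os": ["linux", "mac"], "browser": ["ff", "chrome"], "lang": ["en", "de"]}
invalid = {"browser": ["<empty>"], "lang": ["xx"]}

# Valid tests: here simply all combinations of valid values (real CT would
# use a covering array to cover all t-way interactions with fewer tests).
valid_tests = [dict(zip(valid, combo)) for combo in product(*valid.values())]

# Invalid tests: exactly one invalid value per test, all other parameters
# valid -- so one invalid value cannot mask the effect of another.
invalid_tests = []
for param, bad_values in invalid.items():
    for bad in bad_values:
        test = {p: vs[0] for p, vs in valid.items()}
        test[param] = bad
        invalid_tests.append(test)

print(len(valid_tests), len(invalid_tests))  # 8 2
```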
While CRT is a promising extension of CT, it is still insufficiently researched. For instance, in related work, IPMs are extended with additional semantic information to specify invalid values. However, invalid value combinations cannot be specified directly.
Therefore, the objective of this work is to further expand the idea of CRT. The aim is to develop a new CRT test method with a modeling approach that specifies invalid values and invalid value combinations equally well. This modeling approach should also be incorporated into explicit test adequacy criteria and test selection strategies. Furthermore, the modeling approach shall be supported by automated techniques.
First, we conduct a controlled experiment to check if CRT is necessary at all or if CT is already appropriate to test robustness. Based on the result, we continue and develop a refined t-factor fault model that incorporates robustness faults and the inherent invalid input masking effect.
Next, we develop a new test method for CRT and introduce a new structure that extends the structure of IPMs. It is called robustness input parameter model (RIPM) and contains the concept of error-constraints: an additional set of logical expressions that describe the validity of values and value combinations.
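Error-constraints can be thought of as predicates over a test input: a combination is invalid as soon as any constraint fires, which covers invalid single values and invalid value combinations uniformly. The parameter names and constraints below are invented for illustration:

```python
# Error-constraints: logical expressions over a test input. A test input
# is invalid if at least one constraint evaluates to true.
error_constraints = [
    lambda t: t["age"] < 0,                          # invalid single value
    lambda t: t["role"] == "guest" and t["admin"],   # invalid combination
]

def is_valid(test):
    """A test input is valid iff no error-constraint fires."""
    return not any(c(test) for c in error_constraints)

print(is_valid({"age": 30, "role": "user", "admin": False}))   # True
print(is_valid({"age": 30, "role": "guest", "admin": True}))   # False
print(is_valid({"age": -1, "role": "user", "admin": False}))   # False
```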
With the refined t-factor fault model and the new RIPM structure, new test adequacy criteria that incorporate the additional semantic information and new test selection strategies that satisfy the test adequacy criteria are developed.
The new concept of error-constraints requires additional modeling effort. Therefore, we develop two techniques to support their modeling. First, we develop a technique to identify and repair inconsistencies among error-constraints. Second, we develop a technique to automatically generate error-constraints based on conformance to another system.
Last but not least, all aforementioned concepts and techniques are operationalized and integrated in a test automation framework which includes a process, an architecture, and a Java-based reference implementation.
The lecturers of the Department of Computer Science cordially invite you.
**********************************************************************
*
*
* Invitation
*
*
*
* Informatik-Oberseminar
*
*
*
**********************************************************************
Time: Wednesday, 10 February 2021, 15:00
Zoom: https://rwth.zoom.us/j/96367107600?pwd=c2lMb2M1OXZXSHZFalRySUR2QTExUT09
Speaker: Oliver Kautz, M.Sc.
Chair of Computer Science 3
Topic: Model Analyses Based on Semantic Differencing and Automatic Model Repair
Abstract:
Models are the primary development artifacts used in model-driven software development.
Therefore, models continuously evolve during the design, development, and
maintenance of software systems. Thus, model differencing is an important task to
understand the syntactic and semantic differences between model versions.
Previous work produced general (and thus language-independent) approaches for syntactic
model differencing, but only a few language-dependent approaches for semantic
model differencing. Approaches combining syntactic with semantic model differencing
by relating the syntactic changes of models to their semantic differences rarely exist.
Previous work neglected the development of language-independent approaches,
abstracting from a concrete model property, for detecting the syntactic
elements of a model that cause the model to violate the property. If the
property encodes a requirement and its violation represents the existence of
a bug, then detecting the syntactic model elements causing the violation
helps developers locate the bug.
In this talk, we present a framework for precisely defining modeling languages, including
syntax, semantics, and model evolution possibilities. We discuss syntactic and semantic
model differencing. The framework is instantiated with four concrete modeling languages:
Time-synchronous port automata, feature diagrams, sequence diagrams, and activity diagrams.
Based on the framework for precisely defining modeling languages, we present a modeling
language and property-independent framework for automatic model repairs. The framework
helps developers detect the syntactic elements of a model that cause the
model to violate a property. Instantiating the framework with a concrete modeling
language and a concrete model property enables the automatic calculation of syntactic
changes that transform a model not satisfying the property into a model that satisfies the
property. The framework relies on the assumption that it is possible to partition the
syntactic changes applicable to each model into finitely many model-specific and property-
specific equivalence classes.
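The repair idea (search over syntactic changes until the property holds) can be sketched on a deliberately tiny instance: the model is a set of edges, the property is reachability, and each candidate change trivially forms its own equivalence class. Everything below is an invented illustration, not the framework itself:

```python
# Model: a set of directed edges. Property: node "a" reaches node "c".
def satisfies(edges):
    """Property check: breadth-first reachability from "a" to "c"."""
    reach, frontier = {"a"}, ["a"]
    while frontier:
        n = frontier.pop()
        for (u, v) in edges:
            if u == n and v not in reach:
                reach.add(v)
                frontier.append(v)
    return "c" in reach

model = {("a", "b")}                       # violates the property
candidate_changes = [("b", "c"), ("c", "a"), ("a", "c")]

# Repair: try syntactic changes until the repaired model satisfies
# the property (here one representative change per equivalence class).
for change in candidate_changes:
    repaired = model | {change}
    if satisfies(repaired):
        print(sorted(repaired))  # [('a', 'b'), ('b', 'c')]
        break
```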
The lecturers of the Department of Computer Science cordially invite you.
**********************************************************************
*
*
* Invitation
*
*
*
* Informatik-Oberseminar
*
*
*
**********************************************************************
Time: Friday, 29 January 2021, 14:00
Zoom: https://rwth.zoom.us/j/95719946489?pwd=S0lITm9pcW45b1k4SW5EVis2a1poQT09
Speaker: Martin Serror, M.Sc.
Chair of Computer Science 4
Topic: On the Benefits of Cooperation for Dependable Wireless Communications
Abstract:
The emerging Industrial Internet-of-Things (IIoT) improves the flexibility, productivity, and
cost-efficiency of industrial processes by connecting sensors, actuators, and controllers to each
other and to the Internet. On the factory floor, such interconnections increasingly rely on wireless
communications, reducing deployment and maintenance costs while supporting the mobility
of communication partners. The industrial domain, however, is mainly characterized by safety-
and mission-critical Machine-to-Machine communication. Therefore, state-of-the-art wireless
communication protocols for home and business environments, such as WLAN and Bluetooth,
are not suited for the IIoT. Consequently, the IIoT requires dependable wireless communication,
achieving both high reliability and low latency.
A promising approach for so-called Ultra-Reliable Low-Latency Communication (URLLC) in the
IIoT is cooperative diversity, since the participating stations already collaborate toward a common
goal, i.e., keeping the industrial process running. There, a sending station exploits multiple
independent transmission paths via cooperating relays to convey a packet to its destination
reliably. In contrast to spatial diversity, this approach also works with single-input single-output
transceivers. However, when considering relaying for URLLC, it is particularly challenging that all
participants have to share the scarce transmission resources.
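The core reliability benefit of independent transmission paths is easy to quantify: with per-path delivery probability p, the loss probability drops exponentially in the number of paths n. The value p = 0.9 below is illustrative; real industrial links are neither identical nor perfectly independent:

```python
def delivery_prob(p, n):
    """Probability that at least one of n independent paths delivers the
    packet, given per-path delivery probability p (idealized model)."""
    return 1 - (1 - p) ** n

for n in (1, 2, 3):
    # loss shrinks from 10% to 1% to 0.1% as relays are added
    print(n, round(delivery_prob(0.9, n), 4))
```

This is precisely why relaying is attractive for URLLC, and also why the shared transmission resources become the bottleneck: every additional path costs airtime.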
Hence, in this talk, we investigate various mechanisms enabling dependable wireless communication,
i.e., increasing communication reliability within a bounded low latency, where we mainly focus on the
benefits of cooperative diversity. Therefore, we explore different design options for URLLC and
evaluate them, leveraging the advantages of different methodological approaches. This talk thus offers
valuable insights into designing communication protocols with challenging requirements.
Using cooperation as an example, we thoroughly retrace the development process from analysis to
prototypical deployment. On the one hand, the achieved results contribute to URLLC for the IIoT;
on the other hand, they provide a critical examination of the selected evaluation methodologies.
The lecturers of the Department of Computer Science cordially invite you.
Happy new year,
You are cordially invited to the next UnRAVeL guest talk on Wednesday, 13.01.2021, 16.00:
Mahesh Viswanathan, University of Illinois at Urbana-Champaign: Verifying the Privacy and Accuracy of Algorithms for Differential Privacy
Differential privacy is a mathematical framework for performing
privacy-preserving computations over sensitive data. One important feature of
differentially private algorithms is their ability to achieve provable
individual privacy guarantees while at the same time ensuring that the
outputs are reasonably accurate. Such algorithms compute noisy versions of
the right answers to aggregate queries on sensitive data to ensure privacy.
Privacy guarantees demand that the algorithm, when run on "similar" data
sets, produce responses that are statistically similar; this provides a very
strong form of individual privacy. Accuracy, on the other hand, demands that
the algorithm's output, though noised, be sufficiently close to the correct
answer for a query. In this talk we will present preliminary results on the
algorithmic complexity of checking the privacy and accuracy requirements for
a given algorithm.
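A canonical example of such a noisy computation is the Laplace mechanism, which adds Laplace(sensitivity/epsilon) noise to a query answer to obtain epsilon-differential privacy. The sketch below is this generic textbook construction, not one of the specific algorithms analyzed in the talk:

```python
import math
import random

random.seed(42)

def laplace_noise(scale):
    """Sample a Laplace(0, scale) variate via inverse-CDF sampling."""
    u = random.random() - 0.5
    sign = 1.0 if u >= 0 else -1.0
    return -scale * sign * math.log(1 - 2 * abs(u))

def noisy_count(true_count, epsilon, sensitivity=1.0):
    """Laplace mechanism: a count query has sensitivity 1, so adding
    Laplace(1/epsilon) noise yields epsilon-differential privacy."""
    return true_count + laplace_noise(sensitivity / epsilon)

# Smaller epsilon => more noise => stronger privacy, less accuracy.
print(round(noisy_count(100, epsilon=0.5), 2))
```

The privacy/accuracy tension is visible directly in the scale parameter: the verification problems discussed in the talk ask whether a given algorithm actually meets both kinds of guarantee.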
Joint work with Gilles Barthe, Rohit Chadha, Vishal Jagannath, Paul Krogmeier, and Prasad Sistla. Based on papers in LICS 2020 and POPL 2021.
Wednesday, 13.01.2021, 16:00 (!)
https://www.unravel.rwth-aachen.de/go/id/kfruu
https://rwth.zoom.us/j/92047949381?pwd=LzIwUW96WEM0MkRjZ01FUmhwd1I3QT09
Meeting ID: 920 4794 9381
Password: unravel
Best regards
Helen Bolke-Hermanns
RTG UnRAVeL - RWTH Aachen University
Ahornstr. 55, D-52074 Aachen
Building E3, 2nd floor, Room 9218
Phone: +49 (241) 80-21 004
Fax: +49 (241) 80-22 215
E-Mail: Helen.Bolke-Hermanns(a)Informatik.RWTH-Aachen.de
Web: www.unravel.rwth-aachen.de
**********************************************************************
*
*
* Invitation
*
*
*
* Informatik-Oberseminar
*
*
*
**********************************************************************
Time: Thursday, 21 January 2021, 15:00
Zoom:
https://rwth.zoom.us/j/98847090675?pwd=TnA5L3NCTkx0TjBmMWJEZHZnQkx1QT09
Meeting ID: 988 4709 0675
Passcode: 904296
Speaker: Isaak Lim, M.Sc.
Chair of Computer Science 8
Topic: Learned Embeddings for Geometric Data
Abstract:
Solving high-level tasks on 3D shapes such as classification,
segmentation, vertex-to-vertex maps or computing the perceived style
similarity between shapes requires methods that are able to extract the
necessary information from geometric data and describe the appropriate
properties. Constructing functions that do this by hand is challenging
because it is unclear how and which information to extract for a task.
Furthermore, it is difficult to determine how to use the extracted
information to provide answers to the questions about shapes that are
being asked (e.g. what category a shape belongs to). To this end, we
propose to learn functions that map geometric data to an embedding
space. The outputs of those maps are compressed encodings of the input
geometric data that can be optimized to contain all necessary
task-dependent information. These encodings can then be compared
directly (e.g. via the Euclidean distance) or by other fairly simple
functions to provide answers to the questions being asked.
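Once shapes live in an embedding space, comparison really is this simple. The sketch below uses hand-crafted two-dimensional vectors as placeholders for learned encodings (the shape names and values are illustrative; in the talk, a neural network would produce these vectors from meshes or point clouds):

```python
import math

# Placeholder "learned" embeddings: shape id -> encoding vector.
embeddings = {
    "chair_1": (1.0, 0.1),
    "chair_2": (0.9, 0.2),
    "table_1": (0.1, 1.0),
}

def euclidean(a, b):
    """Euclidean distance between two encoding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(query_vec, db):
    """Retrieve the shape whose encoding is closest to the query."""
    return min(db, key=lambda name: euclidean(db[name], query_vec))

print(nearest((1.0, 0.12), embeddings))  # chair_1
```

The interesting work, of course, is in learning an embedding for which this nearest-neighbor query answers the high-level question being asked, which is exactly what the objective functions in the talk are designed for.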
Neural networks can be used to implement such maps and comparison
functions. This has the benefit that they offer flexibility and
expressiveness. Furthermore, information extraction and comparison can
be automated by designing appropriate objective functions that are used
to optimize the parameters of the neural networks on geometric data
collections with task-related meta information provided by humans.
We therefore have to answer two questions. Firstly, given the often
irregular nature of representations of 3D shapes, how can geometric data
be represented as input to neural networks and how should such networks
be constructed?
Secondly, how can we design the resulting embedding space provided by
neural networks in such a manner that we are able to achieve good
results on high-level tasks on 3D shapes?
In this talk we provide answers to these two questions. Concretely,
depending on the availability of the data sources and the task-specific
requirements, we compute encodings from geometric data representations in
the form of images, point clouds, and triangle meshes. Once we have a
suitable way to encode the input, we explore different ways in which to
design the learned embedding space by careful construction of
appropriate objective functions that extend beyond straightforward
cross-entropy minimization based approaches on categorical
distributions. We show that these approaches are able to achieve
good results in both discriminative as well as generative tasks.
The lecturers of the Department of Computer Science cordially invite you.