+**********************************************************************
*
*
* Invitation
*
*
*
* Computer Science Advanced Seminar (Informatik-Oberseminar)
*
*
*
+**********************************************************************
Time: Friday, July 12, 2019, 10:00 a.m.
Location: Informatikzentrum, E3, Room 9222
Speaker: Dipl.-Inform. Malte Nuhn
Topic: Unsupervised Training with Applications in Natural Language
Processing
Abstract:
The state-of-the-art algorithms for various natural language processing
tasks require large amounts of labeled training data. At the same time,
obtaining labeled data of high quality is often the most costly step in
setting up natural language processing systems. In contrast,
unlabeled data is much cheaper to obtain and available in larger
amounts. Currently, only a few training algorithms make use of unlabeled
data. In practice, training with only unlabeled data is not performed at
all. In this thesis, we study how unlabeled data can be used to train a
variety of models used in natural language processing. In particular, we
study models applicable to solving substitution ciphers, spelling
correction, and machine translation. This thesis lays the groundwork for
unsupervised training by presenting and analyzing the corresponding
models and unsupervised training problems in a consistent manner. We show
that the unsupervised training problem that occurs when breaking
one-to-one substitution ciphers is equivalent to the quadratic
assignment problem (QAP) if a bigram language model is incorporated and
is therefore NP-hard. Based on this analysis, we present an effective
algorithm for unsupervised training for deterministic substitutions. In
the case of English one-to-one substitution ciphers, we show that our
novel algorithm achieves results close to human performance, as
presented in [Shannon 49].
Also, with this algorithm, we present, to the best of our knowledge, the
first automatic decipherment of the second part of the Beale
ciphers. Further, for the task of spelling correction, we work out the
details of the EM algorithm [Dempster & Laird + 77] and experimentally
show that the error rates achieved using purely unsupervised training
reach those of supervised training. For handling large vocabularies, we
introduce a novel model initialization as well as multiple training
procedures that significantly speed up training without hurting the
performance of the resulting models significantly. By incorporating an
alignment model, we further extend this model such that it can be
applied to the task of machine translation. We show that the true
lexical and alignment model parameters can be learned without any
labeled data: We experimentally show that the corresponding likelihood
function attains its maximum for the true model parameters if a
sufficient amount of unlabeled data is available. Further, for the
problem of spelling correction with symbol substitutions and local
swaps, we also show experimentally that the performance achieved with
purely unsupervised EM training reaches that of supervised training.
Finally, using the methods developed in this thesis, we present results
on an unsupervised training task for machine translation with a ten
times larger vocabulary than that of tasks investigated in previous work.
Hosted by: the lecturers of the Computer Science department
_______________________________________________
--
Stephanie Jansen
Faculty of Mathematics, Computer Science and Natural Sciences
HLTPR - Human Language Technology and Pattern Recognition
RWTH Aachen University
Ahornstraße 55
D-52074 Aachen
Tel. Ms. Jansen: +49 241 80-216 06
Tel. Ms. Andersen: +49 241 80-216 01
Fax: +49 241 80-22219
sek(a)i6.informatik.rwth-aachen.de
www.hltpr.rwth-aachen.de
Dear members of the Computer Science Department,
You are cordially invited to the talk by Prof. Gabriele Kern-Isberner (TU Dortmund) next week, on 4 November at 10:30.
Title: Towards Lifted Inference: Counting Strategies for Relational Maximum Entropy Reasoning
Abstract:
The principle of maximum entropy (MaxEnt) constitutes a meaningful methodology for drawing nonmonotonic inferences from probabilistic conditional knowledge, as it satisfies some fundamental principles of commonsense reasoning. Similar to alternative approaches to probabilistic reasoning in relational settings, straightforward maximum entropy computations suffer from an exponential dependence on the size of the underlying domain, which can lead to intractability in many cases. To overcome this problem, we adopt techniques from weighted first-order model counting (WFOMC), which exploit symmetries and interchangeabilities among the domain elements in order to count models of sentences more efficiently. To meet the requirements of our formalization of knowledge by conditionals, we assign a type to the counted models which captures the three-valued evaluation of the conditionals, i.e., the conditional structure of the models. We present the resulting variant of model counting, which we call 'typed model counting', and discuss its benefits by means of some illustrative examples.
https://www.informatik.rwth-aachen.de/go/id/jgfze
Wednesday, 04.11.2020, 10:30
https://rwth.zoom.us/j/92047949381?pwd=LzIwUW96WEM0MkRjZ01FUmhwd1I3QT09
Meeting ID: 920 4794 9381
Password: unravel
Best regards
Helen Bolke-Hermanns
Fachgruppe Informatik
RWTH Aachen University
Ahornstr. 55, D-52074 Aachen
Building E3, 2nd floor, Room 9218
Phone: +49 (241) 80-21-004
Fax: +49 (241) 80-22 215
E-Mail: Helen.Bolke-Hermanns(a)Informatik.RWTH-Aachen.de
+**********************************************************************
*
*
* Invitation
*
*
*
* Computer Science Advanced Seminar (Informatik-Oberseminar)
*
*
*
+**********************************************************************
Time: Tuesday, November 10, 2020, 2:00 p.m.
Location: Video conference (Zoom meeting, details below)
Speaker: Mathias Obster, M.Sc. RWTH
Informatik 11 - Embedded Software
Topic: Supporting PLC Programming through Static Analysis
During Program Entry
Abstract:
Abstract interpretation, and more specifically value-set analysis, makes it possible to find errors in program code fully automatically, without executing it. For programs that run on programmable logic controllers (PLCs), this is of particular interest for error prevention, since they are deployed in industrial settings.
This talk presents an approach that lets these methods contribute to error detection and prevention already during program entry, that is, while a PLC program is being developed. To this end, the analysis framework ARCADE.PLC was extended so that it can display analysis results in a development environment that is also used in industry. There, both warnings about potentially faulty program behavior and possible variable values can be displayed.
In addition, a newly introduced incremental approach can reduce the computational effort that would otherwise arise from running the analyses frequently. It exploits the fact that, during development, only small changes to the overall program typically occur within short time intervals.
To provide users with an integrated solution, the concepts and an extension for a PLC development environment were implemented as a prototype and evaluated in a small user study. The central question was whether programmers can benefit from static analysis results while entering and editing a PLC program.
Hosted by: the lecturers of the Computer Science department
********************************
Topic: [Doctoral defense Mathias Obster] Talk
Time: Nov 10, 2020, 02:00 PM Amsterdam, Berlin, Rome, Stockholm, Vienna
Join Zoom Meeting
https://rwth.zoom.us/j/98589897540?pwd=OWZidEpuU1N5bjA5UFIxcFhNUEJidz09
Meeting ID: 985 8989 7540
Passcode: 111012
Dear members of the Computer Science Department,
You are cordially invited to the talk by Prof. Michael Schaub, RWTH Aachen.
Title: Learning from signals on graphs with unobserved edges
Abstract:
In many applications we are confronted with the following system identification scenario: we observe a dynamical process that describes the state of a system at particular times. Based on these observations we want to infer the (dynamical) interactions between the entities we observe. In the context of a distributed system, this typically corresponds to a "network identification" task: find the (weighted) edges of the graph of interconnections.
However, the number of samples we can obtain from such a process is often far too small to identify the edges of the network exactly. Can we still reliably infer some aspects of the underlying system?
Motivated by this question, we consider the following identification problem: instead of trying to infer the exact network, we aim to recover a (low-dimensional) statistical model of the network based on the observed signals on the nodes. More concretely, here we focus on observations that consist of snapshots of a diffusive process that evolves over the unknown network. We model the (unobserved) network as an independent draw from a latent stochastic blockmodel (SBM), and our goal is to infer both the partition of the nodes into blocks and the parameters of this SBM. We present simple spectral algorithms that provably solve the partition and parameter inference problems with high accuracy. We further discuss some possible variations and extensions of this problem setup.
https://www.informatik.rwth-aachen.de/go/id/jgfsd
Wednesday, 21.10.2020, 10:00
Join Zoom Meeting https://rwth.zoom.us/j/92047949381?pwd=LzIwUW96WEM0MkRjZ01FUmhwd1I3QT09
Meeting ID: 920 4794 9381 Password: unravel
Best regards
Helen Bolke-Hermanns
Dear members of the Computer Science Department,
You are cordially invited to the talk by the UnRAVeL guest Guy Van den Broeck, UCLA.
Title: From Probabilistic Circuits to Probabilistic Programs and Back
Abstract:
Probabilistic graphical models are a rich staple of probabilistic AI. However, they make a very specific choice of abstraction: probability distributions are represented by their variable-level (in)dependencies. In this talk I present some recent work on probabilistic models that go beyond classical PGMs, and make a radically different choice of abstraction; one that is computational. Concretely, I will discuss two classes of models: probabilistic circuits and probabilistic programs. Probabilistic circuits represent distributions through the computation graph of probabilistic inference. They move beyond PGMs by guaranteeing tractable inference for certain classes of queries. Probabilistic programs represent distributions through higher-level primitives of computation: iteration, branching, and procedural abstraction. They move beyond PGMs by looking "inside" of the dependencies. Finally, I will illustrate how these two computational abstractions are themselves closely related, by showing how the Dice probabilistic programming language compiles probabilistic programs into probabilistic circuits for inference.
Wednesday, Oct. 7th, 5.00 pm
The talk will take place as a Zoom session: https://rwth.zoom.us/j/92047949381?pwd=LzIwUW96WEM0MkRjZ01FUmhwd1I3QT09
Meeting ID: 920 4794 9381, Password: unravel
Best regards
Helen Bolke-Hermanns