October 2020 - informatik-vortraege

Invitation: Informatik-Oberseminar, Malte Nuhn
by Sekretariat I6 30 May '22

+**********************************************************************
*
*                          Invitation
*
*                     Informatik-Oberseminar
*
+**********************************************************************

Time: Friday, 12 July 2019, 10:00
Location: Informatikzentrum, E3, Room 9222
Speaker: Dipl.-Inform. Malte Nuhn
Topic: Unsupervised Training with Applications in Natural Language Processing

Abstract:

The state-of-the-art algorithms for various natural language processing tasks require large amounts of labeled training data. At the same time, obtaining labeled data of high quality is often the most costly step in setting up natural language processing systems. In contrast, unlabeled data is much cheaper to obtain and available in larger amounts. Currently, only a few training algorithms make use of unlabeled data. In practice, training with only unlabeled data is not performed at all.

In this thesis, we study how unlabeled data can be used to train a variety of models used in natural language processing. In particular, we study models applicable to solving substitution ciphers, spelling correction, and machine translation. This thesis lays the groundwork for unsupervised training by presenting and analyzing the corresponding models and unsupervised training problems in a consistent manner.

We show that the unsupervised training problem that occurs when breaking one-to-one substitution ciphers is equivalent to the quadratic assignment problem (QAP) if a bigram language model is incorporated, and is therefore NP-hard. Based on this analysis, we present an effective algorithm for unsupervised training for deterministic substitutions. In the case of English one-to-one substitution ciphers, we show that our novel algorithm achieves results close to human performance, as presented in [Shannon 49]. Also, with this algorithm, we present, to the best of our knowledge, the first automatic decipherment of the second part of the Beale ciphers.

Further, for the task of spelling correction, we work out the details of the EM algorithm [Dempster & Laird + 77] and experimentally show that the error rates achieved using purely unsupervised training reach those of supervised training. For handling large vocabularies, we introduce a novel model initialization as well as multiple training procedures that significantly speed up training without significantly hurting the performance of the resulting models.

By incorporating an alignment model, we further extend this model such that it can be applied to the task of machine translation. We show that the true lexical and alignment model parameters can be learned without any labeled data: we experimentally show that the corresponding likelihood function attains its maximum for the true model parameters if a sufficient amount of unlabeled data is available. Further, for the problem of spelling correction with symbol substitutions and local swaps, we also show experimentally that the performance achieved with purely unsupervised EM training reaches that of supervised training. Finally, using the methods developed in this thesis, we present results on an unsupervised training task for machine translation with a ten times larger vocabulary than that of tasks investigated in previous work.

You are invited by the lecturers of the Computer Science department.

--
Stephanie Jansen
Faculty of Mathematics, Computer Science and Natural Sciences
HLTPR - Human Language Technology and Pattern Recognition
RWTH Aachen University
Ahornstraße 55
D-52074 Aachen
Tel. Ms. Jansen: +49 241 80-216 06
Tel. Ms. Andersen: +49 241 80-216 01
Fax: +49 241 80-22219
sek(a)i6.informatik.rwth-aachen.de
www.hltpr.rwth-aachen.de

04.11.2020, 10:30: Prof. Gabriele Kern-Isberner (TU Dortmund): Towards Lifted Inference: Counting Strategies for Relational Maximum Entropy Reasoning
by Bolke-Hermanns, Helen 30 Oct '20

Dear members of the Computer Science Department,

You are cordially invited to the talk of Prof. Gabriele Kern-Isberner (TU Dortmund) next week, 4.11., 10:30.

Title: Towards Lifted Inference: Counting Strategies for Relational Maximum Entropy Reasoning

Abstract: The principle of maximum entropy (MaxEnt) constitutes a meaningful methodology for drawing nonmonotonic inferences from probabilistic conditional knowledge, as it satisfies some fundamental principles of commonsense reasoning. Similar to alternative approaches to probabilistic reasoning in relational settings, straightforward maximum entropy computations suffer from an exponential dependence on the size of the underlying domain, which can lead to intractability in many cases. To overcome this problem, we adopt techniques from weighted first-order model counting (WFOMC), which exploit symmetries and interchangeabilities among the domain elements in order to count models of sentences more efficiently. To meet the requirements of our formalization of knowledge by conditionals, we assign a type to the counted models which captures the three-valued evaluation of the conditionals, i.e. the conditional structure of the models. We present the resulting variant of model counting, which we call 'typed model counting', and discuss its benefits by means of some illustrative examples.

https://www.informatik.rwth-aachen.de/go/id/jgfze

Wednesday, 04.11.2020, 10:30
https://rwth.zoom.us/j/92047949381?pwd=LzIwUW96WEM0MkRjZ01FUmhwd1I3QT09
Meeting ID: 920 4794 9381
Password: unravel

Best regards
Helen Bolke-Hermanns

Helen Bolke-Hermanns
Fachgruppe Informatik
RWTH Aachen University
Ahornstr. 55, D-52074 Aachen
Building E3, 2nd floor, Room 9218
Phone: +49 (241) 80-21-004
Fax: +49 (241) 80-22 215
E-Mail: Helen.Bolke-Hermanns(a)Informatik.RWTH-Aachen.de

Announcement: Informatik-Oberseminar
by Obster, Mathias 26 Oct '20

+**********************************************************************
*
*                          Invitation
*
*                     Informatik-Oberseminar
*
+**********************************************************************

Time: Tuesday, 10 November 2020, 14:00
Location: Video conference (Zoom meeting, details below)
Speaker: Mathias Obster, M.Sc., RWTH Informatik 11 - Embedded Software
Topic: Supporting PLC Programming with Static Analysis During Program Entry (original title: Unterstützung der SPS-Programmierung durch Statische Analyse während der Programmeingabe)

Abstract:

Abstract interpretation, and more specifically value-set analysis, can find errors in program code fully automatically, without executing it. For programs written for programmable logic controllers (PLCs), this is of particular interest for error prevention, because such programs are used in industrial settings.

This talk presents an approach that lets these methods contribute to error detection and prevention already during program entry, that is, while a PLC program is being developed. To this end, the analysis framework ARCADE.PLC was extended so that it can display analysis results in a development environment that is also used in industry. There, both warnings about potentially faulty program behavior and possible variable values can be displayed.

In addition, a newly introduced incremental approach can reduce the computational effort that otherwise results from running the analyses frequently. It exploits the fact that, during development, typically only small changes to the overall program occur within short time intervals.

To provide an integrated solution for the user, the concepts and an extension for a PLC development environment were implemented as a prototype and evaluated in a small user study. The central question was whether programmers can benefit from static analysis results while entering and editing a PLC program.

You are invited by the lecturers of the Computer Science department.

********************************

Topic: [Promotion Mathias Obster] Vortrag
Time: 10 Nov 2020, 02:00 PM Amsterdam, Berlin, Rome, Stockholm, Vienna

Join Zoom meeting
https://rwth.zoom.us/j/98589897540?pwd=OWZidEpuU1N5bjA5UFIxcFhNUEJidz09
Meeting ID: 985 8989 7540
Passcode: 111012

21.10., 10.00: Prof. Michael Schaub, RWTH Aachen: Learning from signals on graphs with unobserved edges
by Bolke-Hermanns, Helen 18 Oct '20

Dear members of the Computer Science Department,

You are cordially invited to the talk of Prof. Michael Schaub, RWTH Aachen.

Title: Learning from signals on graphs with unobserved edges

Abstract: In many applications we are confronted with the following system identification scenario: we observe a dynamical process that describes the state of a system at particular times. Based on these observations we want to infer the (dynamical) interactions between the entities we observe. In the context of a distributed system, this typically corresponds to a "network identification" task: find the (weighted) edges of the graph of interconnections. However, the number of samples we can obtain from such a process is often far too small to identify the edges of the network exactly. Can we still reliably infer some aspects of the underlying system?

Motivated by this question we consider the following identification problem: instead of trying to infer the exact network, we aim to recover a (low-dimensional) statistical model of the network based on the observed signals on the nodes. More concretely, here we focus on observations that consist of snapshots of a diffusive process that evolves over the unknown network. We model the (unobserved) network as generated from an independent draw from a latent stochastic blockmodel (SBM), and our goal is to infer both the partition of the nodes into blocks and the parameters of this SBM. We present simple spectral algorithms that provably solve the partition and parameter inference problems with high accuracy. We further discuss some possible variations and extensions of this problem setup.

https://www.informatik.rwth-aachen.de/go/id/jgfsd

Wednesday, 21.10.2020, 10:00
Join Zoom Meeting
https://rwth.zoom.us/j/92047949381?pwd=LzIwUW96WEM0MkRjZ01FUmhwd1I3QT09
Meeting ID: 920 4794 9381
Password: unravel

Best regards
Helen Bolke-Hermanns

Helen Bolke-Hermanns
Fachgruppe Informatik
RWTH Aachen University
Ahornstr. 55, D-52074 Aachen
Building E3, 2nd floor, Room 9218
Phone: +49 (241) 80-21-004
Fax: +49 (241) 80-22 215
E-Mail: Helen.Bolke-Hermanns(a)Informatik.RWTH-Aachen.de

7.10., 17.00: Guy Van den Broeck, UCLA: From Probabilistic Circuits to Probabilistic Programs and Back
by Bolke-Hermanns, Helen 05 Oct '20

Dear members of the Computer Science Department,

You are cordially invited to the talk of the UnRAVeL guest Guy Van den Broeck, UCLA.

Title: From Probabilistic Circuits to Probabilistic Programs and Back

Abstract: Probabilistic graphical models are a rich staple of probabilistic AI. However, they make a very specific choice of abstraction: probability distributions are represented by their variable-level (in)dependencies. In this talk I present some recent work on probabilistic models that go beyond classical PGMs and make a radically different choice of abstraction, one that is computational. Concretely, I will discuss two classes of models: probabilistic circuits and probabilistic programs. Probabilistic circuits represent distributions through the computation graph of probabilistic inference. They move beyond PGMs by guaranteeing tractable inference for certain classes of queries. Probabilistic programs represent distributions through higher-level primitives of computation: iteration, branching, and procedural abstraction. They move beyond PGMs by looking "inside" the dependencies. Finally, I will illustrate how these two computational abstractions are themselves closely related, by showing how the Dice probabilistic programming language compiles probabilistic programs into probabilistic circuits for inference.

Wednesday, Oct. 7th, 5:00 pm

The talk will take place as a Zoom session:
https://rwth.zoom.us/j/92047949381?pwd=LzIwUW96WEM0MkRjZ01FUmhwd1I3QT09
Meeting ID: 920 4794 9381, Password: unravel

Best regards
Helen Bolke-Hermanns

Helen Bolke-Hermanns
Fachgruppe Informatik
RWTH Aachen University
Ahornstr. 55, D-52074 Aachen
Building E3, 2nd floor, Room 9218
Phone: +49 (241) 80-21-004
Fax: +49 (241) 80-22 215
E-Mail: Helen.Bolke-Hermanns(a)Informatik.RWTH-Aachen.de
