July 2021 - informatik-vortraege - lists.rwth-aachen.de

Invitation: Informatik-Oberseminar Malte Nuhn
by Sekretariat I6 30 May '22

**********************************************************************
*                             Invitation                             *
*                       Informatik-Oberseminar                       *
**********************************************************************

Time: Friday, 12 July 2019, 10:00
Location: Informatikzentrum, E3, Room 9222
Speaker: Dipl.-Inform. Malte Nuhn
Topic: Unsupervised Training with Applications in Natural Language Processing

Abstract:

The state-of-the-art algorithms for various natural language processing tasks require large amounts of labeled training data. At the same time, obtaining labeled data of high quality is often the most costly step in setting up natural language processing systems. In contrast, unlabeled data is much cheaper to obtain and available in larger amounts. Currently, only a few training algorithms make use of unlabeled data. In practice, training with only unlabeled data is not performed at all.

In this thesis, we study how unlabeled data can be used to train a variety of models used in natural language processing. In particular, we study models applicable to solving substitution ciphers, spelling correction, and machine translation. This thesis lays the groundwork for unsupervised training by presenting and analyzing the corresponding models and unsupervised training problems in a consistent manner.

We show that the unsupervised training problem that occurs when breaking one-to-one substitution ciphers is equivalent to the quadratic assignment problem (QAP) if a bigram language model is incorporated, and is therefore NP-hard. Based on this analysis, we present an effective algorithm for unsupervised training for deterministic substitutions. In the case of English one-to-one substitution ciphers, we show that our novel algorithm achieves results close to human performance, as presented in [Shannon 49]. Also, with this algorithm, we present, to the best of our knowledge, the first automatic decipherment of the second part of the Beale ciphers.

Further, for the task of spelling correction, we work out the details of the EM algorithm [Dempster & Laird + 77] and experimentally show that the error rates achieved using purely unsupervised training reach those of supervised training. For handling large vocabularies, we introduce a novel model initialization as well as multiple training procedures that significantly speed up training without significantly hurting the performance of the resulting models.

By incorporating an alignment model, we further extend this model such that it can be applied to the task of machine translation. We show that the true lexical and alignment model parameters can be learned without any labeled data: we experimentally show that the corresponding likelihood function attains its maximum for the true model parameters if a sufficient amount of unlabeled data is available. Further, for the problem of spelling correction with symbol substitutions and local swaps, we also show experimentally that the performance achieved with purely unsupervised EM training reaches that of supervised training. Finally, using the methods developed in this thesis, we present results on an unsupervised training task for machine translation with a vocabulary ten times larger than that of tasks investigated in previous work.

Invited by: the lecturers of the Department of Computer Science

--
Stephanie Jansen
Faculty of Mathematics, Computer Science and Natural Sciences
HLTPR - Human Language Technology and Pattern Recognition
RWTH Aachen University
Ahornstraße 55
D-52074 Aachen
Tel. Ms. Jansen: +49 241 80-216 06
Tel. Ms. Andersen: +49 241 80-216 01
Fax: +49 241 80-22219
sek(a)i6.informatik.rwth-aachen.de
www.hltpr.rwth-aachen.de
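
The QAP connection stated in the abstract above can be made concrete with a small sketch. With a bigram language model, the score of a candidate key is a sum over pairs of cipher symbols, where each term depends on the images of both symbols, which is the quadratic structure of an assignment problem. The code below is only an illustration under stated assumptions: the function names, the toy log-probabilities, and the exhaustive search are hypothetical and are not the algorithm developed in the thesis.

```python
from collections import Counter
from itertools import permutations


def bigram_counts(text):
    """Count adjacent symbol pairs in the ciphertext."""
    return Counter(zip(text, text[1:]))


def key_score(key, counts, logp):
    """Log-likelihood of the ciphertext bigrams under a candidate key.

    key maps each cipher symbol to a plaintext symbol; logp maps a
    plaintext bigram (u, v) to an assumed log-probability log p(v | u).
    """
    return sum(n * logp[(key[a], key[b])] for (a, b), n in counts.items())


def best_key_brute_force(ciphertext, plain_alphabet, logp):
    """Try every injective key; only feasible for tiny alphabets."""
    counts = bigram_counts(ciphertext)
    symbols = sorted(set(ciphertext))
    candidates = (
        dict(zip(symbols, perm))
        for perm in permutations(plain_alphabet, len(symbols))
    )
    return max(candidates, key=lambda k: key_score(k, counts, logp))


# Toy bigram model over the plaintext alphabet {a, b} (made-up numbers).
toy_logp = {("a", "a"): -1.0, ("a", "b"): -2.0,
            ("b", "a"): -3.0, ("b", "b"): -4.0}
toy_key = best_key_brute_force("xxy", "ab", toy_logp)
```

The exhaustive search grows factorially with the alphabet size, which is consistent with the hardness result mentioned in the abstract and is why a practical system would need a more effective search strategy.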

UnRAVeL "Behind the Scenes" Survey Lecture
by Tim Seppelt 26 Jul '21

Dear all,

part of the programme of the research training group UnRAVeL is a series of introductory lectures on the topics of "randomness" and "uncertainty" in UnRAVeL's research thrusts (algorithms and complexity, verification, logic and languages) and their application scenarios. Each lecture is delivered by one of the researchers involved in UnRAVeL. The main aim is to provide doctoral researchers as well as master's students with a broad overview of the subjects of UnRAVeL.

This year, 12 UnRAVeL professors will answer the following questions, based on one of their recent scientific results:

* How did you get to this result?
* How did you come up with certain key ideas?
* How did you cope with obstacles on the way? Which of your ideas did not work out?

Following these talks, PhD students will give an informal summary of their doctoral studies within UnRAVeL.

All interested doctoral researchers and master's students are invited to attend the UnRAVeL lecture series 2021 and engage in discussions with researchers and doctoral students.

Detailed information can be found at
https://www.unravel.rwth-aachen.de/cms/UnRAVeL/Studium/~pzix/Ringvorlesung-…

All events take place on *Thursdays from 16:30 to 18:00 on Zoom*
https://rwth.zoom.us/j/96043715437?pwd=U0dRczkyQjRCY21abW13TDNmUHlhUT09

* 15/04/2021 Survey Lecture: Erika Ábrahám: Probabilistic Hyperproperties
* 22/04/2021 Jürgen Giesl: Inferring Expected Runtimes of Probabilistic Programs
* 29/04/2021 Erich Grädel: Hidden Variables in Quantum Mechanics and Logics of Dependence and Independence
* 06/05/2021 Christof Löding: Learning Automata for Infinite Words
* 20/05/2021 Martin Grohe: The Logic of Graph Neural Networks
* 10/06/2021 Britta Peis: Sensitivity Analysis for Submodular Function Optimization with Applications in Algorithmic Game Theory
* 17/06/2021 Nils Nießen: Optimised Maintenance of Railway Infrastructure
* 24/06/2021 Gerhard Lakemeyer: Uncertainty in Robotics
* 01/07/2021 Joost-Pieter Katoen: The Surprises of Probabilistic Termination
* 08/07/2021 Christina Büsing: Robust Minimum Cost Flow Problem Under Consistent Flow Constraints
* 15/07/2021 Lecture Series: Gerhard Woeginger: Bilevel optimization
* 22/07/2021 Ulrike Meyer: Malware Detection

We are looking forward to seeing you at the lectures.

Best regards,
Tim Seppelt
for the organisation committee

Invitation: Informatik-Oberseminar Joachim Protze
by Joachim Protze 18 Jul '21

**********************************************************************
*                             Invitation                             *
*                       Informatik-Oberseminar                       *
**********************************************************************

Time: Thursday, 22 July 2021, 13:00
Location: Zoom video conference
https://rwth.zoom.us/j/97475619537?pwd=NzBLYkFqREVISyt3QnNSd1ZoK2NZZz09
Speaker: Dipl.-Inf. Joachim Protze
Chair of Computer Science 12 (Lehrstuhl Informatik 12)
Topic: Modular Techniques and Interfaces for Data Race Detection in Multi-Paradigm Parallel Programming

Abstract:

The demand for ever-growing computing capabilities in scientific computing and simulation has led to heterogeneous computing systems with multiple parallelism levels. The aggregated performance of the Top 500 high-performance computing (HPC) systems showed an annual growth rate of 85% for the years 1993-2013. As this growth rate significantly exceeds the growth rate of 40% to 60% supported by Moore’s law, the additional growth was always supported by an increasing number of computing nodes with distributed memory, connected by a network. The message passing interface (MPI) proved to be the dominating programming paradigm for distributed memory computing as the most coarse-grained level of parallelism in HPC.

While the performance gain from Moore’s law in the last century mainly went into single-core performance by increasing the clock frequency, we have seen an increasing number of computing cores per socket since the beginning of this century. The cores within a socket or a node share the memory. Although MPI can be used and is used for shared memory parallelization, explicit use of shared memory, as with OpenMP, can improve the scalability and performance of parallel applications. As a result, hybrid MPI and OpenMP programming is a common paradigm in HPC.

Memory access anomalies such as data races are a severe issue in parallel programming. Data race detection has been studied for years, and different static and dynamic analysis techniques have been presented. This work does not attempt to propose fundamentally new analysis techniques but shows how the high-level abstractions of MPI and OpenMP can be mapped to the low-level abstraction of analysis tools without impact on the soundness of the analysis.

This work develops and presents analysis workflows to identify memory access anomalies in hybrid, multi-paradigm parallel applications. It collects parallel variants of memory access anomalies known from sequential programming and identifies specific patterns for distributed and shared memory programming. Furthermore, it identifies the high-level synchronization, concurrency, and memory access semantics implicitly and explicitly defined by the specifications of the parallel programming paradigms in order to provide a mapping to the analysis abstraction. As part of these high-level concurrency concepts, we can identify several sources of concurrency within a thread. This work compares two techniques to handle this high-level concurrency for data race analysis and finds that a combined approach works best in the general case. The evaluation shows that the analysis workflow of this work provides high precision while enabling increased recall for concurrency within a thread.

In this talk, we will focus on the mapping of high-level concurrency abstractions to low-level analysis abstractions as a key point of this thesis and present the results of the work.

Invited by: the lecturers of the Department of Computer Science

Invitation: Informatik-Oberseminar Richard Wilke
by Richard Wilke 14 Jul '21

**********************************************************************
*                             Invitation                             *
*                       Informatik-Oberseminar                       *
**********************************************************************

Time: Friday, 23 July 2021, 10:00
Zoom: https://rwth.zoom.us/j/97181863376?pwd=VmZIUzlNTXhQRFl0S25uRFBTRW0wdz09
Meeting ID: 971 8186 3376
Passcode: 867315
Speaker: Richard Wilke, M.Sc.
Research Group Mathematical Foundations of Computer Science (LuFG Mathematische Grundlagen der Informatik)
Topic: Reasoning about Dependence and Independence: Teams and Multiteams

Abstract:

Team semantics is the mathematical basis of modern logics for reasoning about dependence and independence. Its core feature is that formulae are evaluated against a set of assignments, called a team. This approach dates back to Hodges (1997), who used it to provide a compositional semantics for independence-friendly logic. Building on this idea, Väänänen (2007) suggested that dependencies between variables should not be treated as annotations of quantifiers, but as atomic properties of teams. However, being based on sets, team semantics can only be used to reason about the presence or absence of data. Multiteam semantics instead takes multiplicities of data into account and is based on multisets of assignments, called multiteams.

In this talk we give an overview of this formalism, explore a wide spectrum of logics with multiteam semantics, and compare them with regard to their expressive power. We exhibit some striking differences between multiteam and team semantics, and also show where these formalisms are similar. Moreover, we present a game-theoretic semantics for our logic and establish connections between logics with multiteam semantics and variants of existential second-order logic.

Invited by: the lecturers of the Department of Computer Science
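
As a rough illustration of the difference between teams and multiteams described in the abstract above, the sketch below stores the same assignments once as a set and once as a multiset, and checks a dependence atom in the usual sense (assignments that agree on the variables xs also agree on y). The helper names and the sample data are hypothetical and chosen only for this example; they do not come from the talk.

```python
from collections import Counter


def as_hashable(assignment):
    """Turn a dict assignment into a hashable, order-independent tuple."""
    return tuple(sorted(assignment.items()))


def satisfies_dependence(team, xs, y):
    """Dependence atom =(xs, y): assignments that agree on xs agree on y."""
    seen = {}
    for items in team:
        s = dict(items)
        key = tuple(s[x] for x in xs)
        if seen.setdefault(key, s[y]) != s[y]:
            return False
    return True


rows = [
    {"x": 0, "y": 1},
    {"x": 0, "y": 1},
    {"x": 1, "y": 0},
]

# A team is a set: the duplicate assignment collapses into one element.
team = {as_hashable(r) for r in rows}

# A multiteam is a multiset: the multiplicity of each assignment is kept.
multiteam = Counter(as_hashable(r) for r in rows)
```

Here `team` has two elements while `multiteam` records that one assignment occurs twice, which is the kind of quantitative information that a set-based semantics cannot express.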

Invitation: Informatik-Oberseminar Evgeny Kusmenko
by Kusmenko, Evgeny 06 Jul '21

**********************************************************************
*                             Invitation                             *
*                       Informatik-Oberseminar                       *
**********************************************************************

Time: Monday, 19 July 2021, 14:30
Zoom: https://rwth.zoom.us/j/99446295120?pwd=NWU0QmIvVkpONlV2R0xmbmloTkhZZz09
Speaker: Dipl.-Ing. Evgeny Kusmenko
Chair of Computer Science 3 (Lehrstuhl Informatik 3)
Topic: Model-Driven Development Methodology and Domain-Specific Languages for the Design of Artificial Intelligence in Cyber-Physical Systems

Abstract:

The development of cyber-physical systems poses a multitude of challenges requiring experts from different fields. Such systems cannot be developed successfully without the support of appropriate processes, languages, and tools. Model-driven software engineering is an important approach which helps development teams to cope with the increasing complexity of today's cyber-physical systems. In this talk we are going to discuss a model-driven engineering methodology with a particular focus on interconnected intelligent cyber-physical systems such as cooperative vehicles.

The basis of the proposed methodology is a component-and-connector architecture description language focusing on the decomposition and integration of cyber-physical system software. It features a strong, math-oriented type system abstracting away from the technical realization and incorporating physical units. To facilitate the development of highly interconnected self-adaptive systems, the language enables its users to model component and connector arrays and supports architectural runtime reconfiguration. Architectural elements can be altered, added, and removed dynamically upon the occurrence of trigger events.

In order to fully cover the development process, the proposed methodology, in addition to structural modeling, provides means for behavior specification and its seamless integration into the components of the architecture. A matrix-oriented scripting language enables the developer to specify algorithms using a syntax close to the mathematical domain. In addition, a dedicated deep learning modeling language is provided for the development and training of neural networks as directed acyclic graphs of neuron layers. The framework supports different learning methods including supervised, reinforcement, and generative adversarial learning, covering a broad range of applications from image and natural language processing to decision making and test data generation.

The presented toolchain enables an automated generation of fully functional C++ code together with the corresponding build and training scripts based on the architectural models and behavior specifications. Finally, to facilitate the integration and deployment of the modeled software in distributed environments, we use a tagging approach to model the middleware and to control a middleware generation toolchain.

Invited by: the lecturers of the Department of Computer Science

Invitation: Informatik-Oberseminar Patrick Landwehr
by Landwehr, Patrick 05 Jul '21

**********************************************************************
*                             Invitation                             *
*                       Informatik-Oberseminar                       *
**********************************************************************

Time: Wednesday, 14 July 2021, 10:00
Location: Zoom video conference
Link: https://rwth.zoom.us/j/94853770165?pwd=Uzk4TjI4TGNJQ3owaXJDbUxMT2d0UT09
Meeting ID: 948 5377 0165
Passcode: 046795
Speaker: Patrick Landwehr, M.Sc.
Chair of Computer Science 7 (Lehrstuhl Informatik 7)
Topic: Tree Automata with Constraints on Infinite Trees

Abstract:

Tree automata on infinite trees are a powerful tool that is widely used for decision procedures and synthesis of logical specifications. It is well known that finite tree automata have good algorithmic properties, but somewhat limited expressive power. For example, they cannot verify that certain subtrees of an input tree are equal. In order to model such properties, we study extensions of tree automata that use so-called constraints to compare whole subtrees of an input. We distinguish between two types of constraints: local constraints and global constraints.

Local constraints can be used to compare the direct subtrees of each node. In this thesis we first summarize the existing results on tree automata with local constraints for infinite trees. Then we partially answer the open question of whether the class of languages recognizable by these automata is closed under projection. That is, we show that in the case of automata with a Büchi acceptance condition the class of recognizable languages is closed under projection. As a consequence, we obtain a new decision algorithm for the emptiness problem as well as a proof of the fact that each non-empty language recognized by a Büchi tree automaton with sibling constraints contains a regular tree. Moreover, we also study logical characterizations of this class of languages.

Tree automata with global constraints are able to compare subtrees whose positions are defined by the states reached in a run. For example, this model can verify that all subtrees rooted at positions where a certain state is reached are equal. In this thesis we generalize the model introduced on finite trees to the setting of infinite trees. We show that most closure properties and decidability results can be extended from finite to infinite trees. However, new techniques are required in order to do so. While the decidability of the emptiness problem remains an open question in general, we present decidability results for some subclasses of tree automata with global constraints. That is, if the automaton tests only for equality of subtrees (and not for inequality), the emptiness problem is decidable. The same is true if the underlying language (i.e. when ignoring the constraints) is countable. We also study the special case of automata with global constraints on unary infinite trees (omega-words). Here we show that, in contrast to branching trees, the class of languages recognizable by these automata is closed under complement. Finally, we present precise logical characterizations for all of the subclasses mentioned, by extensions of monadic second-order logic on infinite trees (or omega-words).

Invited by: the lecturers of the Department of Computer Science

Invitation: Informatik-Oberseminar Andrea Schnorr
by Andrea Schnorr 03 Jul '21

**********************************************************************
*                             Invitation                             *
*                       Informatik-Oberseminar                       *
**********************************************************************

Time: Friday, 19 February 2021, 11:00
Zoom: https://rwth.zoom.us/j/2452218628
Speaker: Andrea Schnorr, M.Sc.
LuFG i12
Topic: Feature Tracking for Space-Filling Structures

Abstract:

Feature-based visualization is a proven strategy to deal with the massive amounts of data emerging from time-dependent simulations: the analysis focuses on meaningful structures, i.e., said features. Feature tracking algorithms aim at automatically finding corresponding objects in successive time steps of these time-dependent data sets in order to assemble the individual objects into spatio-temporal features. Classically, feature-based visualization has focused on sparse structures, i.e., structures which cover only a small portion of the data domain. Given a sufficiently high temporal resolution, existing tracking approaches are able to reliably resolve the correspondence between feature objects of successive time steps.

Our research is motivated by our collaborators' work on the statistical analysis of structures that are space-filling by definition: dissipation elements. Space-filling structures partition the entire domain. Our collaborators aim at extending their statistical analysis to a time-dependent setting. Hence, we introduce an efficient approach for general feature tracking which handles both sparse and space-filling data. To this end, we develop a framework for automatic evaluation of tracking approaches, an algorithmic framework for feature tracking, and an efficient implementation of this framework.

First, we propose a novel evaluation framework based on algorithmic data generators, which provide synthetic data sets and the corresponding ground truth data. This framework facilitates the structured quantitative analysis of an approach's feature tracking performance and the comparison of different approaches based on the resulting measurements.

Second, we introduce a novel approach for tracking both sparse and space-filling features. The correspondence between neighboring time steps is determined by successively solving two graph optimization problems. In the first phase, one-to-one assignments are resolved by computing a maximum-weight, maximum-cardinality matching on a bipartite graph. In the second phase, the algorithm detects events by finding a maximum-weight independent set in a graph of all possible, potentially conflicting event explanations.

Third, we show an optimized version of the second stage of the tracking framework which exploits the model-specific graph structure arising for the tracking problem.

The method's effectiveness is demonstrated by a set of case studies including the use of the evaluation framework as well as the analysis of miscellaneous real-world simulation data sets.

Invited by: the lecturers of the Department of Computer Science
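
The first tracking phase mentioned in the abstract above, resolving one-to-one assignments through a maximum-weight matching on a bipartite graph, can be illustrated with a very small sketch. It assumes a complete bipartite graph with equally many objects in both time steps, so that every perfect matching also has maximum cardinality, and it uses plain enumeration instead of an efficient matching algorithm. The weight matrix and the function name are hypothetical; the abstract does not say how the weights are computed or which algorithm is used.

```python
from itertools import permutations


def best_one_to_one_assignment(weights):
    """Brute-force maximum-weight perfect matching for a square matrix.

    weights[i][j] is a hypothetical similarity between object i at time t
    and object j at time t + 1. Enumeration is only viable for tiny inputs.
    """
    n = len(weights)
    best = max(
        permutations(range(n)),
        key=lambda perm: sum(weights[i][perm[i]] for i in range(n)),
    )
    return list(enumerate(best))


# Made-up overlap scores between three objects in two successive time steps.
overlap = [
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
    [0.0, 0.3, 0.8],
]
pairs = best_one_to_one_assignment(overlap)
```

For realistic data sizes, a polynomial-time assignment algorithm would replace the enumeration; the sketch only shows what is being optimized.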
