November 2021 - informatik-vortraege

Invitation: Informatik-Oberseminar Malte Nuhn
by Sekretariat I6 30 May '22

***********************************************************************
*
*                             Invitation
*
*                       Informatik-Oberseminar
*
***********************************************************************

Time: Friday, 12 July 2019, 10:00
Location: Informatikzentrum, E3, Room 9222

Speaker: Dipl.-Inform. Malte Nuhn

Topic: Unsupervised Training with Applications in Natural Language Processing

Abstract:

The state-of-the-art algorithms for various natural language processing tasks require large amounts of labeled training data. At the same time, obtaining labeled data of high quality is often the most costly step in setting up natural language processing systems.

In contrast, unlabeled data is much cheaper to obtain and available in larger amounts.

Currently, only a few training algorithms make use of unlabeled data. In practice, training with only unlabeled data is not performed at all. In this thesis, we study how unlabeled data can be used to train a variety of models used in natural language processing. In particular, we study models applicable to solving substitution ciphers, spelling correction, and machine translation. This thesis lays the groundwork for unsupervised training by presenting and analyzing the corresponding models and unsupervised training problems in a consistent manner.

We show that the unsupervised training problem that occurs when breaking one-to-one substitution ciphers is equivalent to the quadratic assignment problem (QAP) if a bigram language model is incorporated, and is therefore NP-hard. Based on this analysis, we present an effective algorithm for unsupervised training for deterministic substitutions. In the case of English one-to-one substitution ciphers, we show that our novel algorithm achieves results close to human performance, as presented in [Shannon 49]. Also, with this algorithm, we present, to the best of our knowledge, the first automatic decipherment of the second part of the Beale ciphers.

Further, for the task of spelling correction, we work out the details of the EM algorithm [Dempster & Laird + 77] and experimentally show that the error rates achieved using purely unsupervised training reach those of supervised training.

For handling large vocabularies, we introduce a novel model initialization as well as multiple training procedures that significantly speed up training without significantly hurting the performance of the resulting models.

By incorporating an alignment model, we further extend this model such that it can be applied to the task of machine translation. We show that the true lexical and alignment model parameters can be learned without any labeled data: we experimentally show that the corresponding likelihood function attains its maximum for the true model parameters if a sufficient amount of unlabeled data is available. Further, for the problem of spelling correction with symbol substitutions and local swaps, we also show experimentally that the performance achieved with purely unsupervised EM training reaches that of supervised training. Finally, using the methods developed in this thesis, we present results on an unsupervised training task for machine translation with a ten times larger vocabulary than that of tasks investigated in previous work.

You are invited by the lecturers of the Computer Science department.

--
Stephanie Jansen
Faculty of Mathematics, Computer Science and Natural Sciences
HLTPR - Human Language Technology and Pattern Recognition
RWTH Aachen University
Ahornstraße 55
D-52074 Aachen
Tel. Ms. Jansen: +49 241 80-216 06
Tel. Ms. Andersen: +49 241 80-216 01
Fax: +49 241 80-22219
sek(a)i6.informatik.rwth-aachen.de
www.hltpr.rwth-aachen.de

Invitation: Informatik-Oberseminar Lucas Beyer
by Lucas Beyer 29 Nov '21

***********************************************************************
*
*                             Invitation
*
*                       Informatik-Oberseminar
*
***********************************************************************

Time: Friday, 3 December 2021, 10:00
Location: Zoom video conference
https://us02web.zoom.us/j/88210359250?pwd=QXFGMUEzTGRDVmJBbHgwT0lXcEhrdz09
Meeting ID: 882 1035 9250
Passcode: 542690

Speaker: Lucas Beyer, Dipl.Ing
Lehrstuhl Informatik 13

Topic: Deep Visual Human Sensing with Application in Robotics

Abstract:

In this talk, I present my thesis contributions to the field of visual human sensing that arise when deploying robots in environments with humans. After motivating the need for visual human sensing, we start by describing a novel human detector based on a 2D lidar sensor (e.g. a "laser scanner"). It is the first of its kind that is learning-based and general; specifically, it does not encode a "two leg prior". Detection being covered, we move on to discuss person re-identification, and specifically our contribution of establishing triplet-loss based methods as a strong contender and principled approach in the field. Using this, we also sketch the way to a completely novel approach to tracking which leverages such triplet-based re-identification models at its core. We then discuss more detailed analysis of individual persons, specifically their head orientation, which can serve as a cue for their intent or an indicator of what is interesting in the scene, among other things. We derive a novel cyclic regression loss based on the von Mises distribution and use it, coupled with our "biternion" output layer, to learn continuous regression models using only discrete, weakly labeled data. Finally, we present a holistic system integrating all of these pieces and several more, highlighting the system-level difficulties of such integration, and proposing some ways around them.

You are invited by the lecturers of the Computer Science department.

Invitation: Informatik-Kolloquium Christopher Morris
by Grohe, Martin 24 Nov '21

***********************************************************************
*
*                             Invitation
*
*                       Informatik-Kolloquium
*
***********************************************************************

Time: Wednesday, 1 December 2021, 13:30
Location: Zoom video conference
https://rwth.zoom.us/j/95857189087?pwd=ajNJYUZFcHVvSHNFUmJya1RqUFhKUT09
Meeting ID: 958 5718 9087
Passcode: 050524

Speaker: Christopher Morris, Quebec AI Institute and McGill University

Topic: Learning with Graphs: From Theory to Applications

Abstract:

Graph-structured data is ubiquitous across domains ranging from chemo- and bioinformatics to image and social network analysis. To develop successful machine learning models in these domains, we need techniques that map the graph's structure to a vectorial representation in a meaningful way (so-called graph embeddings). Starting from the 1960s in chemoinformatics, different research communities have worked in the area under various guises, often leading to recurring ideas. Moreover, triggered by the resurgence of (deep) neural networks, there is an ongoing trend in the machine learning community to design permutation-invariant or -equivariant neural architectures capable of dealing with graph input, often denoted as graph neural networks (GNNs). However, although GNNs are often successful in practice, their capabilities and limits are understood to a lesser extent. In this talk, we give an overview of some results that shed light on the limitations and capabilities of GNNs by leveraging tools from graph theory and related areas. To complement the theory, we show how GNNs can act as an inductive bias to enhance state-of-the-art solvers for combinatorial optimization in a data-driven way.

You are invited by the lecturers of the Computer Science department.

--
Martin Grohe
RWTH Aachen
Lehrstuhl Informatik 7
Ahornstr. 55
52074 Aachen
Germany
e: grohe(a)informatik.rwth-aachen.de
t: +49 241 80 21700
f: +49 241 8022 215

Invitation: Informatik-Oberseminar Theodora Kontogianni
by Theodora Kontogianni 24 Nov '21

***********************************************************************
*
*                             Invitation
*
*                       Informatik-Oberseminar
*
***********************************************************************

Time: Monday, 29 November 2021, 14:00
Location: Zoom video conference
https://rwth.zoom.us/j/93845227037?pwd=cm9qRjhtVm5JbWRYdGkrSUsyRythdz09
Meeting ID: 938 4522 7037
Passcode: 310833

Speaker: Theodora Kontogianni, M.Sc.
Lehrstuhl Informatik 13

Topic: Object Discovery, Interactive and 3D Segmentation for Large-Scale Computer Vision Tasks

Abstract:

In this talk, I present my thesis contributions that deal with issues arising when trying to exploit the large body of data available for computer vision tasks. In particular, we address the problem of unsupervised object discovery in time-varying, large-scale image collections by proposing a novel tree structure that closely approximates the Minimum Spanning Tree. We present an efficient construction approach along with an incremental update mechanism for the tree structure that incorporates new data as it is added to the image database.

We then focus on defining novel 3D convolutional and recurrent operators over unstructured 3D point clouds. The goal is to learn point representations for the task of 3D semantic segmentation. We overcome the limitations of the unstructured and large-scale nature of the 3D point clouds by defining local structure through two clustering methods, and we expand the limited receptive field of previous approaches by modeling long-range relationships with the use of Recurrent Networks.

In the third part, we address the task of interactive object segmentation, where a computer vision algorithm segments an object aided by a human user. We present a method that significantly reduces the number of required user clicks compared to previous works. We use the sparse user corrections to adapt the model parameters on-the-fly during test time. In particular, we look at out-of-domain settings where the test datasets are significantly different from the datasets used to train our deep learning model.

You are invited by the lecturers of the Computer Science department.

Invitation: Informatik-Oberseminar Mr. Martin Liebenberg
by Leany Maaßen 15 Nov '21

***********************************************************************
*
*                             Invitation
*
*                       Informatik-Oberseminar
*
***********************************************************************

Time: Monday, 22 November 2021, 10:00-11:00
Zoom: https://rwth.zoom.us/j/95217813154?pwd=RkQ1ZTllbi94OUZiZDRNRE15eGpHZz09
Meeting ID: 952 1781 3154
Passcode: 596536

Speaker: Mr. Dipl.-Inform. Martin Liebenberg
LuFG Informatik 5

Topic: Autonomous Agents for the World Wide Lab: Artificial Intelligence in the Manufacturing Industry

Abstract:

The Internet of Production (IoP) is a research programme in which 30 interdisciplinary institutes work on revolutionising the manufacturing industry. A central concept of the IoP is the World Wide Lab (WWL), a lab of labs in which the data of many manufacturing processes is to be made available as if it came from one's own manufacturing processes. With the data we receive from the WWL, we want to build Digital Shadows, which are condensed or aggregated data for a specific purpose, such as a reduced mathematical model or a trained neural network. An early vision of the usage of the IoP is a Google-like web search, where one can pose a manufacturing problem and get in return an answer with which one can improve one's production process or build new products.

In my thesis, I propose a solution to realise such a scenario based on Artificial Intelligence (AI) methods, which I call WWL Agents. Inspired by the ideas of the Semantic Web, these agents should automate the search for data, knowledge or Digital Shadows in the WWL for specific manufacturing problems, a search which we think is impractical to do manually. Furthermore, WWL Agents should apply the information they find to build Digital Shadows or improve manufacturing processes.

In this talk, we present the development of WWL Agents from three different perspectives. First, we consider it from the perspective of building Digital Shadows in a cross-domain collaboration. The second perspective relates to modelling the behaviour of WWL Agents. Finally, we discuss the infrastructure required by a WWL Agent to provide semantic interoperability in the WWL. By these means we obtain a powerful concept by which the user can get the precise meaning of an answer and, through provenance information, knowledge about the origin of the entities in the answer.

Moreover, we demonstrate applications for WWL Agents in manufacturing in one exemplary use case where the agents plan production processes. In hot rolling, we show that, with local search, agents can very quickly find schedules that could be used to repair failed rolling schedules during operation.

You are invited by the lecturers of the Computer Science department.

_______________________________
Leany Maaßen
RWTH Aachen University
Lehrstuhl Informatik 5, LuFG Informatik 5
Prof. Dr. Stefan Decker, Prof. Dr. Matthias Jarke, Prof. Gerhard Lakemeyer Ph.D.
Ahornstrasse 55
D-52074 Aachen
Tel: 0241-80-21509
Fax: 0241-80-22321
E-Mail: maassen(a)dbis.rwth-aachen.de
