+**********************************************************************
*
*
* Invitation
*
*
*
* Informatik-Oberseminar
*
*
*
+**********************************************************************
Time: Friday, July 12, 2019, 10:00 a.m.
Location: Informatikzentrum, E3, Room 9222
Speaker: Dipl.-Inform. Malte Nuhn
Topic: Unsupervised Training with Applications in Natural Language Processing
Abstract:
The state-of-the-art algorithms for various natural language processing
tasks require large amounts of labeled training data. At the same time,
obtaining labeled data of high quality is often the most costly step in
setting up natural language processing systems. In contrast, unlabeled
data is much cheaper to obtain and available in larger amounts. Currently,
only a few training algorithms make use of unlabeled data, and in
practice, training with only unlabeled data is not performed at all. In
this thesis, we study how unlabeled data can be used to train a variety of
models used in natural language processing. In particular, we study models
applicable to solving substitution ciphers, spelling correction, and
machine translation. This thesis lays the groundwork for unsupervised
training by presenting and analyzing the corresponding models and
unsupervised training problems in a consistent manner.

We show that the unsupervised training problem that occurs when breaking
one-to-one substitution ciphers is equivalent to the quadratic assignment
problem (QAP) if a bigram language model is incorporated, and is therefore
NP-hard. Based on this analysis, we present an effective algorithm for
unsupervised training for deterministic substitutions. In the case of
English one-to-one substitution ciphers, we show that our novel algorithm
achieves results close to human performance, as presented in [Shannon 49].
Also, with this algorithm, we present, to the best of our knowledge, the
first automatic decipherment of the second part of the Beale ciphers.

Further, for the task of spelling correction, we work out the details of
the EM algorithm [Dempster & Laird+ 77] and experimentally show that the
error rates achieved using purely unsupervised training reach those of
supervised training. For handling large vocabularies, we introduce a novel
model initialization as well as multiple training procedures that
significantly speed up training without noticeably hurting the performance
of the resulting models.

By incorporating an alignment model, we further extend this model such
that it can be applied to the task of machine translation. We show that
the true lexical and alignment model parameters can be learned without any
labeled data: we experimentally show that the corresponding likelihood
function attains its maximum for the true model parameters if a sufficient
amount of unlabeled data is available. Further, for the problem of
spelling correction with symbol substitutions and local swaps, we also
show experimentally that the performance achieved with purely unsupervised
EM training reaches that of supervised training. Finally, using the
methods developed in this thesis, we present results on an unsupervised
training task for machine translation with a vocabulary ten times larger
than that of tasks investigated in previous work.
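To make the EM-based part of the abstract concrete, here is a minimal,
self-contained Python sketch (not code from the thesis): a fixed bigram
language model over a toy alphabet is combined with an unknown substitution
channel, and Baum-Welch EM re-estimates only the channel parameters from
ciphertext alone. The alphabet, corpus, and cipher key are made up for
illustration; the thesis treats deterministic one-to-one substitutions with
a dedicated search algorithm that is not shown here.

# Toy EM decipherment sketch: fixed bigram LM, trainable substitution channel.
# Alphabet, corpus, and key are illustrative assumptions, not data from the thesis.
import numpy as np

ALPHABET = list("abcd ")                     # toy plaintext alphabet
IDX = {ch: i for i, ch in enumerate(ALPHABET)}

def bigram_lm(text, smoothing=0.1):
    """Estimate P(next | previous) from a small plaintext sample."""
    counts = np.full((len(ALPHABET), len(ALPHABET)), smoothing)
    for a, b in zip(text, text[1:]):
        counts[IDX[a], IDX[b]] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def em_decipher(cipher, trans, n_symbols, iters=50, seed=0):
    """Baum-Welch: train P(cipher symbol | plaintext letter), transitions fixed."""
    rng = np.random.default_rng(seed)
    n_states = len(ALPHABET)
    emit = rng.random((n_states, n_symbols))
    emit /= emit.sum(axis=1, keepdims=True)
    pi = np.full(n_states, 1.0 / n_states)
    T = len(cipher)
    for _ in range(iters):
        # forward-backward with per-step scaling
        alpha = np.zeros((T, n_states)); beta = np.zeros((T, n_states))
        scale = np.zeros(T)
        alpha[0] = pi * emit[:, cipher[0]]
        scale[0] = alpha[0].sum(); alpha[0] /= scale[0]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ trans) * emit[:, cipher[t]]
            scale[t] = alpha[t].sum(); alpha[t] /= scale[t]
        beta[-1] = 1.0
        for t in range(T - 2, -1, -1):
            beta[t] = (trans @ (emit[:, cipher[t + 1]] * beta[t + 1])) / scale[t + 1]
        gamma = alpha * beta
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step: re-estimate only the emission (channel) parameters
        new_emit = np.full((n_states, n_symbols), 1e-6)
        for t, obs in enumerate(cipher):
            new_emit[:, obs] += gamma[t]
        emit = new_emit / new_emit.sum(axis=1, keepdims=True)
    return emit

if __name__ == "__main__":
    corpus = "abad cabad dab a cad abac dab cab "
    key = {ch: i for i, ch in enumerate("dcab ")}   # the unknown 1:1 substitution
    plain = corpus * 20
    cipher = [key[ch] for ch in plain]
    emit = em_decipher(cipher, bigram_lm(corpus), n_symbols=len(ALPHABET))
    guess = {s: ALPHABET[int(np.argmax(emit[:, s]))] for s in range(len(ALPHABET))}
    print("cipher symbol -> guessed plaintext:", guess)
    print("cipher symbol -> true plaintext:   ", {v: k for k, v in key.items()})

On enough repeated toy text this may recover the substitution, but EM can
get stuck in local optima on short ciphers, which is part of why the thesis
analyzes the hardness of the problem (equivalence to the QAP).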
The lecturers of the Department of Computer Science cordially invite you to attend.
_______________________________________________
--
Stephanie Jansen
Faculty of Mathematics, Computer Science and Natural Sciences
HLTPR - Human Language Technology and Pattern Recognition
RWTH Aachen University
Ahornstraße 55
D-52074 Aachen
Tel. Ms. Jansen: +49 241 80-216 06
Tel. Ms. Andersen: +49 241 80-216 01
Fax: +49 241 80-22219
sek(a)i6.informatik.rwth-aachen.de
www.hltpr.rwth-aachen.de
Dear members of the Computer Science Department,
You are cordially invited to the talk by Kuldeep Meel (NUS Singapore)<https://www.unravel.rwth-aachen.de/cms/UnRAVeL/Das-Graduiertenkolleg/Gastwi…> next week, on 2.12. at 10:30.
Title: Sparse Hashing for Scalable Approximate Model Counting: When Theory and Practice Finally Meet
Given a Boolean formula F, the problem of model counting, also referred to as #SAT, is to compute the number of solutions of F. Hashing-based techniques for approximate counting have emerged as a dominant approach, promising to achieve both scalability and rigorous theoretical guarantees. The standard construction of strongly 2-universal hash functions employs dense XORs (i.e., involving half of the variables in expectation), which is widely known to degrade the runtime performance of state-of-the-art SAT solvers.
Consequently, the past few years have witnessed intense activity in the design of sparse XORs as hash functions. In this talk, we will first survey the known results and identify the crucial bottleneck behind their inability to provide speedups. We will then formalize a relaxation of universal hashing, called concentrated hashing, and establish a novel and beautiful connection between concentration measures of these hash functions and isoperimetric inequalities on Boolean hypercubes. This allows us to obtain tight bounds on the variance as well as the dispersion index, and to show that logarithmically sized XORs suffice for the design of sparse hash functions belonging to the concentrated hash family. Finally, we use sparse hash functions from this concentrated hash family to develop new approximate counting algorithms. A comprehensive experimental evaluation of our algorithm on 1896 benchmarks, with a computational effort of over 20,000 hours, demonstrates a speedup compared to existing approaches. To the best of our knowledge, this work is the first study to demonstrate a runtime improvement of approximate model counting algorithms through the use of sparse hash functions while still retaining strong theoretical guarantees.
(Based on joint work with S. Akshay, D. Agarwal, and Bhavishya; the corresponding papers were published at SAT-20 and LICS-20.)
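To give a feel for the hashing idea in the abstract, here is a minimal,
self-contained Python toy (not the speaker's tool, which is built on top of
a SAT solver): random XOR parity constraints partition the solution space
into cells, and the size of one small cell, scaled back up, estimates the
total count. The pivot, density, and number of trials are illustrative
assumptions; density=0.5 corresponds to the dense XORs discussed above, and
lowering it mimics the sparse hash functions the talk is about.

# Toy sketch of hashing-based approximate counting with XOR constraints.
# The "solver" is brute-force enumeration over a small solution set.
import itertools
import random

def random_xor(n, density=0.5):
    """Pick each variable with probability `density`, plus a random parity bit."""
    vars_ = [i for i in range(n) if random.random() < density]
    parity = random.randint(0, 1)
    return vars_, parity

def satisfies_xors(assignment, xors):
    return all(sum(assignment[i] for i in vars_) % 2 == parity
               for vars_, parity in xors)

def approx_count(solutions, n, pivot=8, density=0.5, trials=9):
    """Median-of-trials estimate of len(solutions) via XOR hashing."""
    estimates = []
    for _ in range(trials):
        m, cell, xors = 0, list(solutions), []
        while len(cell) > pivot and m < 2 * n:   # keep halving the cell
            m += 1
            xors.append(random_xor(n, density))
            cell = [a for a in solutions if satisfies_xors(a, xors)]
        estimates.append(len(cell) * 2 ** m)     # scale the small cell back up
    estimates.sort()
    return estimates[len(estimates) // 2]

if __name__ == "__main__":
    n = 12
    # A toy solution set: all assignments with an even number of ones.
    S = [a for a in itertools.product([0, 1], repeat=n) if sum(a) % 2 == 0]
    print("true count:", len(S), "estimate:", approx_count(S, n))

In the real setting the small cell is explored with repeated SAT-solver
calls rather than enumeration, which is why the density of the XORs matters
so much for runtime.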
Bio:
Kuldeep Meel is the Sung Kah Kay Assistant Professor of Computer Science in the School of Computing at the National University of Singapore. He received his Ph.D. (2017) and M.S. (2014) degrees from Rice University, and a B.Tech. (with Honors) degree (2012) in Computer Science and Engineering from the Indian Institute of Technology, Bombay. He is a recipient of the 2019 NRF Fellowship for AI. His research interests lie at the intersection of Artificial Intelligence and Formal Methods. His work received the 2018 Ralph Budd Award for Best PhD Thesis in Engineering, the 2014 Outstanding Masters Thesis Award from the Vienna Center for Logic and Algorithms, and the Best Student Paper Award at CP 201
Wednesday, 2.12.2020, 10:30
https://rwth.zoom.us/j/92047949381?pwd=LzIwUW96WEM0MkRjZ01FUmhwd1I3QT09
Meeting ID: 920 4794 9381
Password: unravel
https://www.unravel.rwth-aachen.de/go/id/kfsgc
Best regards
Helen Bolke-Hermanns
Helen Bolke-Hermanns
RTG UnRAVeL - RWTH Aachen University
Ahornstr. 55, D-52074 Aachen
Building E3, 2nd floor, Room 9218
Phone: +49 (241) 80-21 004
Fax: +49 (241) 80-22 215
E-Mail: Helen.Bolke-Hermanns(a)Informatik.RWTH-Aachen.de
Web: www.unravel.rwth-aachen.de
Dear IT students,
You have presumably all already heard something about agile methods during your studies. Now you have the opportunity to gain practical experience and apply agile methods in practice. I would like to cordially invite you to the free ONLINE Agile Programming Training at andrena objects:
Agile Programming Training - Online (from Cologne) - 04.01.-08.01.2021
What awaits you?
- Getting to know agile methods and andrena objects
- Interactive Scrum sprints in groups
- Networking and exchange about entry opportunities
- A free certificate of participation for your application documents
A short registration is requested at:
https://ws4s.de/veranstaltungen/online-agile-development-training/
Places are limited and usually booked out very quickly. Secure one of the limited places now! :-)
Who is andrena objects?
andrena objects is a medium-sized software company and one of the pioneers of agile software development. In this special five-day training, you will gain deep insights into the methods of agile software development and into everyday work at andrena objects. Over the course of these days, you will run through four exemplary Scrum sprints in teams (and with the support of experts) and get to know working practices from Scrum, Extreme Programming, and the project practice of andrena and SAP.
If you have any questions, do not hesitate to write to me directly.
Best regards,
Marcel from the Workshops4Students team
_____________________________
Co-Founder Workshops4Students
E-Mail: studenten(a)ws4s.de
Web: http://www.ws4s.de
This is a short reminder of today's talk:
Prof. Gabriele Kern-Isberner (TU Dortmund)
Title: Towards Lifted Inference: Counting Strategies for Relational Maximum Entropy Reasoning
Abstract:
The principle of maximum entropy (MaxEnt) constitutes a meaningful methodology for drawing nonmonotonic inferences from probabilistic conditional knowledge, as it satisfies some fundamental principles of commonsense reasoning. As with alternative approaches to probabilistic reasoning in relational settings, straightforward maximum entropy computations suffer from an exponential dependence on the size of the underlying domain, which can lead to intractability in many cases. To overcome this problem, we adopt techniques from weighted first-order model counting (WFOMC), which exploit symmetries and interchangeability among the domain elements in order to count models of sentences more efficiently. To meet the requirements of our formalization of knowledge by conditionals, we assign a type to the counted models which captures the three-valued evaluation of the conditionals, i.e., the conditional structure of the models. We present the resulting variant of model counting, which we call 'typed model counting', and discuss its benefits by means of some illustrating examples.
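For readers less familiar with the MaxEnt principle, the underlying
optimization can be sketched as follows; this is a standard textbook
formulation for the propositional (non-lifted) case, with notation that is
illustrative and not taken from the abstract. Given probabilistic
conditionals (B_1|A_1)[p_1], ..., (B_n|A_n)[p_n] over possible worlds
\omega, the MaxEnt distribution is

P^{ME} = \arg\max_{P} \; -\sum_{\omega} P(\omega) \log P(\omega)
\quad \text{s.t.} \quad P(A_i \wedge B_i) = p_i \cdot P(A_i) \;\; (1 \le i \le n),
\qquad \sum_{\omega} P(\omega) = 1.

The lifted-inference challenge described in the abstract is that, in
relational settings, the number of worlds \omega grows exponentially with
the domain size, which is where the model-counting techniques come in.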
https://www.informatik.rwth-aachen.de/go/id/jgfze
Today, 10:30
https://rwth.zoom.us/j/92047949381?pwd=LzIwUW96WEM0MkRjZ01FUmhwd1I3QT09
Meeting ID: 920 4794 9381
Password: unravel
Helen Bolke-Hermanns
Department of Computer Science
RWTH Aachen University
Ahornstr. 55, D-52074 Aachen
Building E3, 2nd floor, Room 9218
Phone: +49 (241) 80-21-004
Fax: +49 (241) 80-22 215
E-Mail: Helen.Bolke-Hermanns(a)Informatik.RWTH-Aachen.de
+**********************************************************************
*
*
* Invitation
*
*
*
* Informatik-Oberseminar
*
*
*
+**********************************************************************
Time: Monday, November 16, 2020, 1:00 p.m.
Zoom:
https://rwth.zoom.us/j/94172280001?pwd=K2Z6Z1ZHbUVVdDBDbjNSWVBVUTZoZz09
Meeting ID: 941 7228 0001
Passcode: N.gdJ5
Speaker: Jan Dreier, M.Sc.
Teaching and Research Area Theoretical Computer Science
Topic: Two New Perspectives on Algorithmic Meta-Theorems
Abstract:
We present two algorithmic meta-theorems. Our first meta-theorem is an
approximation scheme of the parameterized model-checking problem for the
first-order counting logic FO({>0}). This logic extends first-order
logic with the ability to count the number of satisfying assignments of
a variable and compare it to fixed but arbitrarily large numbers. The
model-checking scheme runs in linear fpt time (parameterized by formula
length) on structures with bounded expansion and either gives the
correct answer or says "I do not know." The latter answer may only be
given if small perturbations in the number-symbols of the formula could
make it both satisfied and unsatisfied. As a consequence, every
property expressible in FO({>0}) can be approximated in linear time on
bounded expansion graph classes. A restricted set of counting properties
can also be solved exactly in linear time, leading for example to an fpt
algorithm for partial dominating set on bounded expansion graph classes.
These results are complemented by showing that exactly solving the
model-checking problem for FO({>0}) is already hard on trees of bounded
depth and just slightly increasing the expressiveness of FO({>0}) makes
even approximation hard on trees.
Our second meta-theorem is a first-order model-checking algorithm for
random graphs and complex networks. Roughly speaking, we say a random
graph model is alpha-power-law-bounded if it is unclustered and the
fraction of vertices with degree k is O(k^-alpha). We solve the
parameterized first-order model-checking problem in almost linear fpt
time on random graph models satisfying this property with alpha >= 3.
This means in particular that one can solve every first-order property
in almost linear expected time on these random graph models. This
includes many popular models of complex networks such as preferential
attachment graphs, Chung–Lu graphs, configuration graphs, and sparse
Erdős–Rényi graphs. We present hardness results that show intractability
for alpha < 3, assuming we allow an adversary to color the vertices of
the input graph. We base our algorithms on a new structural
decomposition of alpha-power-law-bounded random graphs and on new
concentration bounds for the degree evolution in preferential attachment
graphs.
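As a small illustration of the random graph models in the second part, the
following Python sketch (illustrative only, not the speaker's code) samples
a Chung–Lu graph whose expected degrees follow a power law with exponent
alpha, so that the fraction of vertices with degree k behaves roughly like
k^-alpha. The weight formula and parameters are standard textbook choices
assumed here for illustration.

# Toy Chung-Lu sampler: P(edge {i,j}) = min(1, w_i * w_j / sum(w)),
# with weights chosen so expected degrees follow a power law.
import random
from collections import Counter

def chung_lu(n, alpha=3.0, avg_weight=5.0, seed=0):
    """Return an edge list of a Chung-Lu random graph with power-law weights."""
    rng = random.Random(seed)
    # w_i proportional to (i+1)^(-1/(alpha-1)), rescaled to average avg_weight.
    raw = [(i + 1) ** (-1.0 / (alpha - 1.0)) for i in range(n)]
    scale = avg_weight * n / sum(raw)
    w = [scale * x for x in raw]
    total = sum(w)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < min(1.0, w[i] * w[j] / total):
                edges.append((i, j))
    return edges

if __name__ == "__main__":
    edges = chung_lu(n=1000, alpha=3.0)
    deg = Counter()
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    hist = Counter(deg.values())
    print("degree -> #vertices (heavy tail expected):",
          sorted(hist.items())[:10], "...")

For alpha >= 3, graphs of this kind fall into the regime where the
meta-theorem above gives almost linear expected fpt time for first-order
model checking.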
The lecturers of the Department of Computer Science cordially invite you to attend.
--
Jan Dreier
RWTH Aachen University
https://tcs.rwth-aachen.de/~dreier/