informatik-vortraege


informatik-vortraege@lists.rwth-aachen.de

January 2022

2 participants
2 discussions

Invitation: Informatik-Oberseminar (Computer Science Advanced Seminar), Malte Nuhn
by Sekretariat I6 30 May '22


**********************************************************************
*
*                             Invitation
*
*                       Informatik-Oberseminar
*
**********************************************************************

Time: Friday, 12 July 2019, 10:00 a.m.
Location: Informatikzentrum, E3, Room 9222
Speaker: Dipl.-Inform. Malte Nuhn
Topic: Unsupervised Training with Applications in Natural Language Processing

Abstract:

The state-of-the-art algorithms for various natural language processing tasks require large amounts of labeled training data. At the same time, obtaining labeled data of high quality is often the most costly step in setting up natural language processing systems. In contrast, unlabeled data is much cheaper to obtain and available in larger amounts. Currently, only a few training algorithms make use of unlabeled data. In practice, training with only unlabeled data is not performed at all.

In this thesis, we study how unlabeled data can be used to train a variety of models used in natural language processing. In particular, we study models applicable to solving substitution ciphers, spelling correction, and machine translation. This thesis lays the groundwork for unsupervised training by presenting and analyzing the corresponding models and unsupervised training problems in a consistent manner.

We show that the unsupervised training problem that occurs when breaking one-to-one substitution ciphers is equivalent to the quadratic assignment problem (QAP) if a bigram language model is incorporated, and is therefore NP-hard. Based on this analysis, we present an effective algorithm for unsupervised training for deterministic substitutions. In the case of English one-to-one substitution ciphers, we show that our novel algorithm achieves results close to human performance, as presented in [Shannon 49]. Also, with this algorithm, we present, to the best of our knowledge, the first automatic decipherment of the second part of the Beale ciphers.

Further, for the task of spelling correction, we work out the details of the EM algorithm [Dempster & Laird + 77] and experimentally show that the error rates achieved using purely unsupervised training reach those of supervised training. For handling large vocabularies, we introduce a novel model initialization as well as multiple training procedures that significantly speed up training without significantly hurting the performance of the resulting models.

By incorporating an alignment model, we further extend this model such that it can be applied to the task of machine translation. We show that the true lexical and alignment model parameters can be learned without any labeled data: we experimentally show that the corresponding likelihood function attains its maximum for the true model parameters if a sufficient amount of unlabeled data is available. Further, for the problem of spelling correction with symbol substitutions and local swaps, we also show experimentally that the performance achieved with purely unsupervised EM training reaches that of supervised training.

Finally, using the methods developed in this thesis, we present results on an unsupervised training task for machine translation with a vocabulary ten times larger than that of tasks investigated in previous work.

You are invited by: the lecturers of Computer Science (die Dozentinnen und Dozenten der Informatik)

_______________________________________________

--
Stephanie Jansen
Faculty of Mathematics, Computer Science and Natural Sciences
HLTPR - Human Language Technology and Pattern Recognition
RWTH Aachen University
Ahornstraße 55
D-52074 Aachen
Tel. Ms. Jansen: +49 241 80-216 06
Tel. Ms. Andersen: +49 241 80-216 01
Fax: +49 241 80-22219
sek(a)i6.informatik.rwth-aachen.de
www.hltpr.rwth-aachen.de


Invitation: Informatik-Oberseminar (Computer Science Advanced Seminar), Ms. Nadja Zaric
by Bombelka, Angelika 17 Jan '22


**********************************************************************
*
*                             Invitation
*
*                       Informatik-Oberseminar
*
**********************************************************************

Time: Friday, 4 February 2022, 3:00 p.m.
Location: Zoom video conference (https://rwth.zoom.us/j/98899850408?pwd=VTJWdUg2ZDUwK21mU2haa01HdEdIQT09)
Speaker: Nadja Zaric, M.Sc., Lehr- und Forschungsgebiet Informatik 9 (Teaching and Research Area Computer Science 9)
Topic: PEGAM - a Personalized Gamification design Model for programming language e-courses

Abstract:

This dissertation addresses low academic participation and engagement, issues often related to student retention in online learning courses. The issues were identified at the Department of Computer Science at RWTH Aachen University, Germany, although high dropout rates are a growing problem in Computer Science studies worldwide. An approach often used to address these problems combines gamification and personalization techniques: gamification is the process of applying game design principles in serious contexts (e.g., learning), while personalization refers to tailoring the context to users' needs and characteristics. In this work, the two techniques are used in combination in the Personalized Gamification Model (PeGaM), created to design an online course for learning programming languages.

PeGaM is theoretically grounded in the principles of the Gamified Learning Theory and the theory of learning tendencies. Learning tendencies define learners' preferences for a particular form of behavior, and those behaviors are seen as possible moderators of gamification success. Moderators are a concept explained in the Gamified Learning Theory and refer to variables that can influence the impact of gamification on the targeted outcomes. Gamification success is a measure of the extent to which students behave in a manner that leads to successful learning.

The conceptual model of PeGaM is an iterative process in which learning tendencies are used to identify students who are believed to be prone to avoid certain activities. Gamification is then incorporated into the activities recognized as 'likely to be avoided' in order to produce a specific learning-related behavior responsible for a particular learning outcome. The PeGaM model includes five conceptual steps and 19 design principles required for the gamification of learning environments that facilitate student engagement and participation.

In practice, PeGaM was applied in an introductory JavaScript course with Bachelor students of Computer Science at RWTH Aachen University. The investigation was guided by the principles of the Design-Based Research approach. Through this approach, PeGaM was created, evaluated, and revised over three iterative cycles. The first cycle had an explorative character, included one control and one treatment group, and gathered 124 participants. The second and third cycles were experimental studies in which 69 and 171 participants, respectively, were randomly distributed across one control and two treatment groups.

Across the three interventions, mixed methods were used to capture students' academic participation (a measure of students' online behavior in the course, collected through activity logs), engagement (evaluated quantitatively through a questionnaire compiled to measure behavioral, emotional, and cognitive engagement), and gameful experience (a quantitative measure of students' experience with the gamified system). In addition, supporting data was collected through semi-structured interviews and open-ended survey questions.

The empirical findings revealed that gamification with PeGaM contributes to learning outcomes and that the success of gamification depends on how well the game elements fit learners' preferences and learning activities. Cross-case comparisons supported the application of the PeGaM design principles and demonstrated the model's potential. Even though only limited support was found for the moderating role of learners' learning tendencies, the study demonstrated that gamifying learning activities that students are likely to avoid can increase their participation, but that such gamification must be carefully designed. Most importantly, educational gamification can support and enhance learning-related behavior but requires relevant and meaningful learning activities in combination with carefully considered reward, collaboration, and feedback mechanisms. The study provides practical and theoretical insights but also highlights challenges and limitations associated with personalized gamification, thus offering suggestions for further investigation.

You are invited by: the lecturers of Computer Science (die Dozentinnen und Dozenten der Informatik)
