{"title":"New High German Texts for Evidence of Phrasemes","authors":"C. Mahlow, Britta Juska-Bacher","doi":"10.21248/jlcl.26.2011.151","DOIUrl":"https://doi.org/10.21248/jlcl.26.2011.151","url":null,"abstract":"Most dictionaries containing phraseological information are restricted to a synchronic perspective. Diachronic information on structural, semantic, and pragmatic change over time has to be reconstructed by a time-consuming consultation of various dictionaries providing only punctual insights. In the OLdPhras project, we construct an online dictionary for diachronic phraseology in German from ca. 1650 to the present by combining dictionary exploration with corpus-based methods. This paper highlights some challenges we have met: How to select the interesting phrasemes, i.e., those that underwent some change? How to deal with historical corpora? How to include different kinds of phraseme variation? We present a semi-automatic corpus-based approach for the investigation of phraseme development. We argue for a combination of dictionary exploration and corpus-based methods to provide reliable and extensive information about the diachronic development of German phrasemes.","PeriodicalId":402489,"journal":{"name":"J. Lang. Technol. Comput. Linguistics","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129965078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Meaning versus Form in Computer-assisted Task-based Language Learning: A Case Study on the German Dative","authors":"Sabrina Wilske, Magdalena Wolska","doi":"10.21248/jlcl.26.2011.134","DOIUrl":"https://doi.org/10.21248/jlcl.26.2011.134","url":null,"abstract":"We report on a study which investigated the effects of three types of feedback realized in instructional dialogues with a computer-based language learning system for German. The interaction was framed within a direction-giving task and the linguistic form in focus was the dative case in prepositional phrases. The feedback types differed with respect to the focus they put on form versus meaning and the explicitness of feedback in response to learner errors. The results of the study suggest that a stronger focus on form is related to greater accuracy gains in using the form. The integration of incidental focus on form within a primarily meaning-based task increases accuracy as well, though to a lesser extent.","PeriodicalId":402489,"journal":{"name":"J. Lang. Technol. Comput. Linguistics","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121108130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Latent-Semantic Analysis and Network Analysis for Monitoring Conceptual Development","authors":"Fridolin Wild, D. Haley, Katja Bülow","doi":"10.21248/jlcl.26.2011.133","DOIUrl":"https://doi.org/10.21248/jlcl.26.2011.133","url":null,"abstract":"This paper describes and evaluates CONSPECT (from concept inspection), an application that analyses states in a learner’s conceptual development. It was designed to help online learners and their tutors monitor conceptual development and also to help reduce the workload of tutors monitoring a learner’s conceptual development. CONSPECT combines two technologies, Latent Semantic Analysis (LSA) and Network Analysis (NA), into a technique called Meaningful Interaction Analysis (MIA). LSA analyses the meaning in the textual digital traces left behind by learners in their learning journey; NA provides the analytic instrument to investigate (visually) the semantic structures identified by LSA. This paper describes the validation activities undertaken to show how well LSA matches first-year medical students in 1) grouping similar concepts and 2) annotating text.","PeriodicalId":402489,"journal":{"name":"J. Lang. Technol. Comput. Linguistics","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124490487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sprachressourcen in der Lehre - Erfahrungen, Einsatzszenarien, Nutzerwünsche","authors":"Frank Binder, H. Lüngen, Henning Lobin","doi":"10.21248/jlcl.26.2011.136","DOIUrl":"https://doi.org/10.21248/jlcl.26.2011.136","url":null,"abstract":"The second part of this special issue focuses on a different constellation. The main question here is how the research-oriented use of language resources (data, tools, services) can be taught in academic courses, and which conditions can make it easier to use language resources in teaching. Against the background of many current efforts to extend the possibilities of the digital humanities by creating research infrastructures, the question arises, alongside the envisaged research-oriented usage scenarios for digital resources, of how the corresponding content and working methods can be integrated into academic teaching. Language resources are being developed and used in research projects in many places. Besides the long-term preservation of these resources in technical infrastructures, their sustainable use and maintenance is in the interest of both users and providers. In the spirit of the Humboldtian ideal of the unity of research and teaching, the transfer from research into teaching and back is of particular interest. Learners and students discover the possibilities of using language resources and tools, and in the long run they become users of them, who in turn secure the continued existence of the resources. A key role between the providers and the future users falls to the lecturers, who teach the technology and the research-methodological handling of the resources. In addition to work with academic textbooks, the direct use of language resources is playing an increasingly important role, whether directly in courses or in the context of term papers or student projects. The use of language resources poses a challenge in several respects, but it also shows parallels to other areas in which computer-based methods are used in research and teaching. Against this background, a workshop on “Sprachressourcen in der Lehre” (language resources in teaching) was held as part of the BMBF-funded project D-SPIN (Deutsche Sprachressourcen-Infrastruktur). Several of the workshop talks have now resulted in contributions to this special issue. We expressly thank the authors who agreed to turn their workshop contributions into articles for this special issue.","PeriodicalId":402489,"journal":{"name":"J. Lang. Technol. Comput. Linguistics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115383276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Morphological and Part-of-Speech Tagging of Historical Language Data: A Comparison","authors":"Stefanie Dipper","doi":"10.21248/jlcl.26.2011.144","DOIUrl":"https://doi.org/10.21248/jlcl.26.2011.144","url":null,"abstract":"This paper deals with morphological and part-of-speech tagging applied to manuscripts written in Middle High German. I present the results of a set of experiments that involve different levels of token normalization and dialect-specific subcorpora. As expected, tagging with “normalized”, quasi-standardized tokens performs best. Normalization improves accuracies by 3.56–7.10 percentage points, resulting in accuracies of > 79% for morphological tagging, and > 91% for part-of-speech tagging. Comparing Middle with New High German data of similar size, the evaluation shows that part-of-speech tagging, but not morphological tagging, is clearly easier with modern data.","PeriodicalId":402489,"journal":{"name":"J. Lang. Technol. Comput. Linguistics","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125317020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Creating a Dual-Purpose Treebank","authors":"Eiríkur Rögnvaldsson, A. Ingason, E. Sigurðsson, Joel C. Wallenberg","doi":"10.21248/jlcl.26.2011.153","DOIUrl":"https://doi.org/10.21248/jlcl.26.2011.153","url":null,"abstract":"We describe the background for and building of IcePaHC, a one-million-word parsed historical corpus of Icelandic which has just been finished. This corpus, which is completely free and open, contains fragments of 60 texts ranging from the late 12th century to the present. We describe the text selection and text collecting process and discuss the quality of the texts and their conversion to modern Icelandic spelling. We explain why we chose to use a phrase structure Penn-style annotation scheme and briefly describe the syntactic annotation process. Furthermore, we advocate the importance of an open source policy as regards language resources.","PeriodicalId":402489,"journal":{"name":"J. Lang. Technol. Comput. Linguistics","volume":"37 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116537550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Digitale Sprachressourcen in Lehramtsstudiengängen: Kompetenzen - Erfahrungen - Desiderate","authors":"Michael Beißwenger, Angelika Storrer","doi":"10.21248/jlcl.26.2011.140","DOIUrl":"https://doi.org/10.21248/jlcl.26.2011.140","url":null,"abstract":"– how work with digital language resources can be integrated into the curricula of German studies at universities and the syllabi for German as a school subject, and why we consider teaching the corresponding competences in teacher-training degree programmes important and highly relevant to the profession, – how (and which) language resources we use in teaching in linguistics and language didactics at Technische Universität Dortmund, – what experiences we have had in doing so, and which wishes and suggestions for facilitating the didactic use of language resources can be derived from them.","PeriodicalId":402489,"journal":{"name":"J. Lang. Technol. Comput. Linguistics","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127275229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chancen und Probleme der Nutzung von Korpora, Taggern und anderen Sprachressourcen in Seminaren","authors":"Heike Zinsmeister","doi":"10.21248/jlcl.26.2011.137","DOIUrl":"https://doi.org/10.21248/jlcl.26.2011.137","url":null,"abstract":"Working with corpora or lexical-semantic resources, and using programs to prepare or analyse the data, is part of everyday life for many computational linguists. Degree programmes in computational linguistics should therefore not only give their students knowledge of theories and algorithms, along with programming skills of their own, but also prepare them to work with existing language resources. This does not mean resources for e-learning, but language resources that were developed independently of any use in teaching. Examples of such language resources are:","PeriodicalId":402489,"journal":{"name":"J. Lang. Technol. Comput. Linguistics","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127287985","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rapid Adaptation of NE Resolvers for Humanities Domains using Active Annotation","authors":"Asif Ekbal, Francesca Bonin, S. Saha, Egon W. Stemle, E. Barbu, F. Cavulli, Christian Girardi, Massimo Poesio","doi":"10.21248/jlcl.26.2011.145","DOIUrl":"https://doi.org/10.21248/jlcl.26.2011.145","url":null,"abstract":"The entities mentioned in collections of scholarly articles in the Humanities (and in other scholarly domains) belong to different types from those familiar from news corpora, hence new resources need to be annotated to create supervised taggers for tasks such as NE extraction. However, in such domains there is a great need for making the best use possible of the annotators. One technique designed for this purpose is active annotation. We discuss our use of active annotation for annotating corpora of articles about Archaeology in the Portale della Ricerca Umanistica Trentina.","PeriodicalId":402489,"journal":{"name":"J. Lang. Technol. Comput. Linguistics","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133553499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Digitale Korpora in der Lehre - Anwendungsbeispiele aus der Theoretischen Linguistik und der Computerlinguistik","authors":"S. Dipper","doi":"10.21248/jlcl.26.2011.138","DOIUrl":"https://doi.org/10.21248/jlcl.26.2011.138","url":null,"abstract":"This article presents several teaching scenarios in which corpora (and other language resources) are used. Two examples each illustrate the use of corpora in theoretical linguistics and in computational linguistics. In theoretical linguistics, corpora serve as sources of evidence or as test data for hypotheses from theoretical research. In computational linguistics, corpora are used for application development or for building resources.","PeriodicalId":402489,"journal":{"name":"J. Lang. Technol. Comput. Linguistics","volume":"114 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124126471","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}