{"title":"Digitising a corpus of Austrian dialect recordings from the 20th century","authors":"Christian Huber, Benjamin Fischer","doi":"10.1553/oe_phonogrammarchiv","DOIUrl":"https://doi.org/10.1553/oe_phonogrammarchiv","url":null,"abstract":"The Corpus of Austrian Dialect Recordings from the 20th Century comprises 2442 dialect recordings from the Phonogrammarchiv’s holdings on magnetic tape from fieldwork conducted in the years 1951 to 1995 by German philologists Eberhard Kranzmayer, Maria Hornung, Werner Bauer, Herbert Tatzreiter and others. They cover all provinces of Austria as well as linguistic varieties of German spoken in Northern Italy, Hungary and former Yugoslavia and Czechoslovakia. In a project cooperation of the Phonogrammarchiv with the Research Department “Variation and Change of German in Austria” (both Austrian Academy of Sciences) and the Austrian Science Fund Special Research Programme “German in Austria” (F60), these recordings are now digitised, annotated and analysed, and will be made searchable in a database. In this article we introduce and discuss the corpus and address various digitisation and metadata-related issues with special attention to real-world questions and problems encountered.","PeriodicalId":210552,"journal":{"name":"Digital Lexis and Beyond","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115301227","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Komparative Zeitreihenanalyse der lexikalischen Stabilität und Emotion in österreichischen Korpusdaten","authors":"","doi":"10.1553/austrian_corpora","DOIUrl":"https://doi.org/10.1553/austrian_corpora","url":null,"abstract":"","PeriodicalId":210552,"journal":{"name":"Digital Lexis and Beyond","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131844163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"“Joy” and “Fear” in Thomas Bernhard’s autobiographies: Aspects of a Computational Sentiment Analysis","authors":"M. Sellner","doi":"10.1553/SENTIMENT_ANALYSISS1","DOIUrl":"https://doi.org/10.1553/SENTIMENT_ANALYSISS1","url":null,"abstract":"This pilot study of a computational analysis of literary texts presents results for selected aspects of a “sentiment analysis”. The data analysed are the autobiographies of the Austrian novelist Thomas Bernhard. The primary objects of attention are the sentiments “joy” and “fear”. We elaborate on and demonstrate the impact of several preprocessing procedures, and describe the characteristics of the dictionary and the annotations of its entries conceived and used for the analysis. We specify the general methodology and the steps involved in quantifying its results using the functions of the R package “Quanteda”. The descriptive output of the procedures is examined with several statistical measures to compare the counts of “joy” vs “fear” that were found in the texts individually, contrastively and in combination as a corpus. We conclude that there is a proportional and relative difference between the frequencies of the sentiments in the individual texts, but that this difference is not significant according to the non-parametric Wilcoxon rank-sum test. A “goodness of fit” test, on the other hand, shows that the two sentiments are homogeneously distributed across the corpus.","PeriodicalId":210552,"journal":{"name":"Digital Lexis and Beyond","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128090035","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring Etymology and Language Contact Through Digital Lexicographical Encoding: The Dictionary of Loanwords in the Midrash Genesis Rabbah (DLGenR)","authors":"Christina Katsikadeli, V. Slepoy, Thomas Klampfl","doi":"10.1553/DLGENR_LOANWORDSS1","DOIUrl":"https://doi.org/10.1553/DLGENR_LOANWORDSS1","url":null,"abstract":"","PeriodicalId":210552,"journal":{"name":"Digital Lexis and Beyond","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122374498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Investigating the linguistic representativeness of Early Modern Greek Corpora","authors":"E. Karantzola, Yannis Kostopoulos, K. Sampanis","doi":"10.1553/EMODERN_GREEKS1","DOIUrl":"https://doi.org/10.1553/EMODERN_GREEKS1","url":null,"abstract":"Following a poorly documented period in the history of vernacular Greek (6th-12th c.), the late 15th century sets the beginning of a linguistic era characterized by a quantitatively and qualitatively incomparable production of prose texts written in “common” language. It is at this point that classicizing Greek stops dominating in writing, and a new linguistic variety – albeit a very diverse and fluid one – Early Modern Greek (EMG) starts growing rapidly as a literacy language. The development of this new variety is manifested in its widespread use as literary language (in texts with aesthetic function), as well as in its use as a simple scripta, namely a written vernacular for legal, administrative, commercial, and other functions. Despite its significance in the history of Greek, this period remains to a large extent unexplored and underrepresented in Greek language corpora. On this view, our understanding of EMG depends crucially on the representativeness of the few available corpora. The aim of this paper is to investigate the linguistic representativeness of EMG corpora, and to explore possible associations between observed linguistic patterns and corpora design. Focusing on the distribution of contrastive and reformulation markers, our study reveals that the linguistic data illustrated in the available EMG corpora are divergent and largely dependent on the representation of variables, such as text form (poetry/prose), period, geographical region, and genre.","PeriodicalId":210552,"journal":{"name":"Digital Lexis and Beyond","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128881510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"(Dis)continuities in the diachrony of the Greek lexicon: The learned component in the light of a corpus analysis","authors":"","doi":"10.1553/greek_lexicon","DOIUrl":"https://doi.org/10.1553/greek_lexicon","url":null,"abstract":"","PeriodicalId":210552,"journal":{"name":"Digital Lexis and Beyond","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131928862","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"VICAV 3.0: Zooming in on Lexical Resources","authors":"Karlheinz Moerth, Daniel Schopper","doi":"10.1553/vicav","DOIUrl":"https://doi.org/10.1553/vicav","url":null,"abstract":"The paper outlines the language documentation platform VICAV which pools information on spoken varieties of contemporary Arabic. It gives a general outline of VICAV’s background and its scope, touches on a number of methodological issues concerning the text-technological setup and discusses conceptual questions focusing on aspects of eLexicography, in particular on practical issues dealing with digital data and tools used to build it. The paper explains in detail the involved language resources and deals with issues pertaining to standards, formats and interoperability. Ample detail is furnished on tools developed as part of VICAV’s evolution, in particular the dictionary editor and the web-interface. The paper is based on the presentation given at OLT in December 2019 in Salzburg and has been supplemented with information on recent developments achieved in the course of 2020.","PeriodicalId":210552,"journal":{"name":"Digital Lexis and Beyond","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122606532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}