{"title":"Images of the Lisbon Treaty Debate in the British Press: A Corpus-Based Approach to Metaphor Analysis. Chiara Nasti","authors":"Charlotte Taylor","doi":"10.1093/llc/fqt022","DOIUrl":"https://doi.org/10.1093/llc/fqt022","url":null,"abstract":"","PeriodicalId":235034,"journal":{"name":"Lit. Linguistic Comput.","volume":"101 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122700655","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inadequacy of the chi-squared test to examine vocabulary differences between corpora","authors":"Yves Bestgen","doi":"10.1093/llc/fqt020","DOIUrl":"https://doi.org/10.1093/llc/fqt020","url":null,"abstract":"Pearson's chi-squared test is probably the most popular statistical test used in corpus linguistics, particularly for studying linguistic variations between corpora. Oakes and Farrow (Literary and Linguistic Computing, 2007, 22, 85-99) proposed various adaptations of this test in order to allow for the simultaneous comparison of more than two corpora, while also yielding an almost correct Type I error rate (i.e. claiming that a word is most frequently found in a variety of English, when in actuality this is not the case). By means of resampling procedures, the present study shows that when used in this context, the chi-squared test produces far too many significant results, even in its modified version. Several potential approaches to circumventing this problem are discussed in the conclusion.","PeriodicalId":235034,"journal":{"name":"Lit. Linguistic Comput.","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115405985","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Found in translation: To what extent is authorial discriminability preserved by translators?","authors":"R. Forsyth, Phoenix W. Y. Lam","doi":"10.1093/llc/fqt018","DOIUrl":"https://doi.org/10.1093/llc/fqt018","url":null,"abstract":"Most authorship attribution studies have focused on works that are available in the language used by the original author (Holmes, 1994; Juola, 2006) because this provides a direct way of examining an author's linguistic habits. Sometimes, however, questions of authorship arise regarding a work only surviving in translation. One example is 'Constance', the putative 'last play' of Oscar Wilde, only existing in a supposed French translation of a lost English original. The present study aims to take a step towards dealing with cases of this kind by addressing two related questions: (1) to what extent are authorial differences preserved in translation; (2) to what extent does this carry-over depend on the particular translator? With these aims, we analysed 262 letters written by Vincent van Gogh and by his brother Theo, dated between 1888 and 1890, each available in the original French and in an English translation. We also performed a more intensive investigation of a subset of this corpus, comprising forty-eight letters, for which two different English translations were obtainable. Using three different indices of discriminability (classification accuracy, Hedges' g, and area under the receiver operating characteristic curve), we found that much of the stylistic discriminability between the two brothers was preserved in the English translations. Subsidiary analyses were used to identify which lexical features were contributing most to inter-author discriminability. Discrimination between translation sources was possible, although less effective than between authors. We conclude that 'handprints' of both author and translator can be found in translated texts, using appropriate techniques.","PeriodicalId":235034,"journal":{"name":"Lit. Linguistic Comput.","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115855073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Semantic Representation of Natural Language. M. Levison, G. Lessard, C. Thomas, and M. Donald","authors":"Christina Unger","doi":"10.1093/llc/fqt029","DOIUrl":"https://doi.org/10.1093/llc/fqt029","url":null,"abstract":"","PeriodicalId":235034,"journal":{"name":"Lit. Linguistic Comput.","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129177972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Calculating syllable count automatically from fixed-meter poetry in English and Welsh","authors":"Michael Hammond","doi":"10.1093/llc/fqt019","DOIUrl":"https://doi.org/10.1093/llc/fqt019","url":null,"abstract":"","PeriodicalId":235034,"journal":{"name":"Lit. Linguistic Comput.","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133484107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Assessing the inter-method reliability and correlational validity of the Body Type Dictionary","authors":"Laura A. Cariola","doi":"10.1093/llc/fqt025","DOIUrl":"https://doi.org/10.1093/llc/fqt025","url":null,"abstract":"The political negotiation, erection, and fall of national and cultural borders represent an issue that frequently occupies the media. Given the historical importance of boundaries as a marker of cultural identity, as well as their function to separate and unite people, the Body Type Dictionary (BTD; Wilson, 2006) represents a suitable computerized content analysis measure to analyse vocabulary qualified to measure body boundaries and their penetrability. Out of this context, this study aimed to assess the inter-method reliability of the BTD (Wilson, 2006) in relation to Fisher and Cleveland’s (1956, 1958) manual scoring system for high and low barrier personalities. The results indicated that Fisher and Cleveland’s manually coded barrier and penetration imagery scores showed an acceptable positive correlation with the computerized frequency counts of the BTD’s coded barrier and penetration imagery scores, thereby indicating an inter-method reliability. In addition, barrier and penetration imagery correlated positively with primordial thought language in the picture response test, and narratives of everyday and dream memories, thereby indicating correlational validity.","PeriodicalId":235034,"journal":{"name":"Lit. Linguistic Comput.","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116144728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Language chunking, data sparseness, and the value of a long marker list: explorations with word n-grams and authorial attribution","authors":"A. Antonia, Hugh Craig, Jack Elliott","doi":"10.1093/llc/fqt028","DOIUrl":"https://doi.org/10.1093/llc/fqt028","url":null,"abstract":"The frequencies of individual words have been the mainstay of computer-assisted authorial attribution over the past three decades. The usefulness of this sort of data is attested in many benchmark trials and in numerous studies of particular authorship problems. It is sometimes argued, however, that since language as spoken or written falls into word sequences, on the ‘idiom principle’, and since language is characteristically produced in the brain in chunks, not in individual words, n-grams with n higher than 1 are superior to individual words as a source of authorship markers. In this article, we test the usefulness of word n-grams for authorship attribution by asking how many good-quality authorship markers are yielded by n-grams of various types, namely 1-grams, 2-grams, 3-grams, 4-grams, and 5-grams. We use two ways of formulating the n-grams, two corpora of texts, and two methods for finding and assessing markers. We find that when using methods based on regularly occurring markers, and drawing on all the available vocabulary, 1-grams perform best. With methods based on rare markers, and all the available vocabulary, strict 3-gram sequences perform best. If we restrict ourselves to a defined word-list of function-words to form n-grams, 2-grams offer a striking improvement on 1-grams.","PeriodicalId":235034,"journal":{"name":"Lit. Linguistic Comput.","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127921350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Patterns of local discourse coherence as a feature for authorship attribution","authors":"V. Feng, Graeme Hirst","doi":"10.1093/llc/fqt021","DOIUrl":"https://doi.org/10.1093/llc/fqt021","url":null,"abstract":"We define a model of discourse coherence based on Barzilay and Lapata’s entity grids as a stylometric feature for authorship attribution. Unlike standard lexical and character-level features, it operates at a discourse (cross-sentence) level. We test it against and in combination with standard features on nineteen book-length texts by nine nineteenth-century authors. We find that coherence alone performs often as well as and sometimes better than standard features, though a combination of the two has the highest performance overall. We observe that despite the difference in levels, there is a correlation in performance of the two kinds of features.","PeriodicalId":235034,"journal":{"name":"Lit. Linguistic Comput.","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132475379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimizing word segmentation tasks using ant colony metaheuristics","authors":"G. Tambouratzis","doi":"10.1093/llc/fqt026","DOIUrl":"https://doi.org/10.1093/llc/fqt026","url":null,"abstract":"In this article, the application of Ant-Colony Optimization (ACO) to a morphological segmentation task is described, where the aim is to analyse a set of words into their constituent stem and ending. A number of criteria for determining the optimal segmentation are evaluated comparatively while at the same time investigating more comprehensively the effectiveness of the ACO system in defining appropriate values for system parameters. Owing to the characteristics of the task at hand, particular emphasis is placed on studying the ACO process for learning sessions of a limited duration. Morphological segmentation becomes hardest in highly inflectional languages, where each stem is associated with a large number of distinct endings. Consequently, the present article investigates morphological segmentation of words from a highly inflectional language, specifically Ancient Greek, by combining pattern-recognition principles with limited linguistic knowledge. To weigh these sources of knowledge, a set of weights is used as a set of system parameters, to be optimized via ACO. ACO-based experimental results are shown to be of a higher quality than those achieved by manual optimisation or ‘randomised generate and test’ methods. This illustrates the applicability of the ACO-based approach to the morphological segmentation task.","PeriodicalId":235034,"journal":{"name":"Lit. Linguistic Comput.","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116849747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hey, this is your journal","authors":"Edward Vanhoutte","doi":"10.1093/llc/fqu003","DOIUrl":"https://doi.org/10.1093/llc/fqu003","url":null,"abstract":"","PeriodicalId":235034,"journal":{"name":"Lit. Linguistic Comput.","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125968540","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}