Stylometric Study of the Fiction Using Sketch Engine
O. Taran, O. Palchevska, A. Luchyk, V. Shabunina, Oksana Labenko
Digital Humanities Workshop, published 2021-12-23
DOI: 10.1145/3526242.3526249 (https://doi.org/10.1145/3526242.3526249)
Citations: 2
Abstract
The paper presents a corpus-based stylometric study of I. Asimov’s idiostyle. For the analysis of stylometric features, a text corpus of I. Asimov’s “Foundation” cycle was created. Quantitative and statistical processing of the corpus is carried out with the Sketch Engine tool, which enables comparison of phrases and words by lemma, token, and subcorpus. The last parameter is important for distinguishing individual authorial features, comparing their combinability, and identifying the dynamics of the idiostyle. The following stylometric features of the corpus are described: quantitative morphological and lexical characteristics of the vocabulary, quantitative characteristics of the word formation of occasionalisms, and statistical estimation of the collocations of occasionalisms. It is stated that the frequency of occasionalisms, as well as their combinability, changes chronologically across the cycle of novels. The paper proposes a method for automated extraction of occasionalisms based on keyness score; however, the method requires subsequent manual verification.
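The abstract does not say how the keyness score is computed. As a rough illustration only, the sketch below assumes the "simple maths" formula that Sketch Engine documents for keyword extraction: the frequency per million in the focus corpus plus a smoothing constant, divided by the same quantity for a reference corpus. The function names, the smoothing default of 1, and the toy token lists are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def per_million(count, corpus_size):
    """Normalize a raw count to a frequency per million tokens."""
    return count * 1_000_000 / corpus_size


def keyness_scores(focus_tokens, reference_tokens, smoothing=1.0):
    """Score each focus-corpus word against a reference corpus.

    Assumed formula ("simple maths"):
        (fpm_focus + smoothing) / (fpm_reference + smoothing)
    Words absent from the reference corpus get the highest scores,
    which is why coinages tend to surface at the top of the list.
    """
    if not focus_tokens or not reference_tokens:
        raise ValueError("both corpora must contain at least one token")

    focus_counts = Counter(focus_tokens)
    ref_counts = Counter(reference_tokens)
    focus_size = len(focus_tokens)
    ref_size = len(reference_tokens)

    scores = {}
    for word, count in focus_counts.items():
        fpm_focus = per_million(count, focus_size)
        fpm_ref = per_million(ref_counts.get(word, 0), ref_size)
        scores[word] = (fpm_focus + smoothing) / (fpm_ref + smoothing)
    return scores


def candidate_occasionalisms(focus_tokens, reference_tokens, top_n=10):
    """Return the top_n highest-keyness words as (word, score) pairs.

    These are only candidates: ordinary rare words and proper names
    also score highly, so the list still needs manual checking.
    """
    scores = keyness_scores(focus_tokens, reference_tokens)
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Toy data, invented for illustration (not from the paper's corpus).
focus = ["the", "psychohistory", "the", "empire"]
reference = ["the", "the", "empire", "city"]
top_candidates = candidate_occasionalisms(focus, reference, top_n=1)
```

In this toy run, the word missing from the reference corpus ranks first, while words with equal relative frequency in both corpora score 1. A ranking of this kind only narrows the search, which is consistent with the abstract's remark that the extracted items require manual verification.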