{"title":"在 19 世纪德语手稿研究中使用计算机和语料库工具 Bok of Notes and Extracts","authors":"Martin Braxatoris, Anita Braxatorisová","doi":"10.2478/jazcas-2023-0046","DOIUrl":null,"url":null,"abstract":"Abstract The study explores the possibilities of using computer and corpus tools in the interpretation of texts of the genre of book of notes and extracts; these are documents consisting of extracts and modified excerpts from contemporary press and literature, records of the author’s own thoughts, etc. Samuel Ferjenčík’s manuscript is a Germanlanguage document by a Slovak author intended for private use; cited or adapted passages are usually given without any reference to the source. The paper introduces the problems of automatic identification of the source base, which relate to the application of OCR and content similarity detection tools. It discusses the results of text matching, which revealed several manipulations of source texts, especially substitutions, indicating attitudes and priority problems in the author’s thought-world. It further interprets the results of the use of the Sketch Engine corpus manager tools by which the frequency of occurrence of key terms and their collocability were investigated, paying special attention to substituted words. The paper is an example of the application of computer and corpus-linguistics methods to the interpretation of literary texts, which is represented by a number of current studies in the field of digital humanities. The proposed approaches are applicable to research on other books of notes and extracts, topical in the context of research trends related to egodocuments, as well as to textual research on monu ments of other genres.","PeriodicalId":262732,"journal":{"name":"Journal of Linguistics/Jazykovedný casopis","volume":"67 1","pages":"287 - 300"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Use of Computer and Corpus Tols in the Research of a 19th Century German -Language Manuscript Bok of Notes and Extracts\",\"authors\":\"Martin Braxatoris, Anita Braxatorisová\",\"doi\":\"10.2478/jazcas-2023-0046\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstract The study explores the possibilities of using computer and corpus tools in the interpretation of texts of the genre of book of notes and extracts; these are documents consisting of extracts and modified excerpts from contemporary press and literature, records of the author’s own thoughts, etc. Samuel Ferjenčík’s manuscript is a Germanlanguage document by a Slovak author intended for private use; cited or adapted passages are usually given without any reference to the source. The paper introduces the problems of automatic identification of the source base, which relate to the application of OCR and content similarity detection tools. It discusses the results of text matching, which revealed several manipulations of source texts, especially substitutions, indicating attitudes and priority problems in the author’s thought-world. It further interprets the results of the use of the Sketch Engine corpus manager tools by which the frequency of occurrence of key terms and their collocability were investigated, paying special attention to substituted words. The paper is an example of the application of computer and corpus-linguistics methods to the interpretation of literary texts, which is represented by a number of current studies in the field of digital humanities. The proposed approaches are applicable to research on other books of notes and extracts, topical in the context of research trends related to egodocuments, as well as to textual research on monu ments of other genres.\",\"PeriodicalId\":262732,\"journal\":{\"name\":\"Journal of Linguistics/Jazykovedný casopis\",\"volume\":\"67 1\",\"pages\":\"287 - 300\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Linguistics/Jazykovedný casopis\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.2478/jazcas-2023-0046\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Linguistics/Jazykovedný casopis","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2478/jazcas-2023-0046","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Use of Computer and Corpus Tols in the Research of a 19th Century German -Language Manuscript Bok of Notes and Extracts
Abstract The study explores the possibilities of using computer and corpus tools in the interpretation of texts of the genre of book of notes and extracts; these are documents consisting of extracts and modified excerpts from contemporary press and literature, records of the author’s own thoughts, etc. Samuel Ferjenčík’s manuscript is a Germanlanguage document by a Slovak author intended for private use; cited or adapted passages are usually given without any reference to the source. The paper introduces the problems of automatic identification of the source base, which relate to the application of OCR and content similarity detection tools. It discusses the results of text matching, which revealed several manipulations of source texts, especially substitutions, indicating attitudes and priority problems in the author’s thought-world. It further interprets the results of the use of the Sketch Engine corpus manager tools by which the frequency of occurrence of key terms and their collocability were investigated, paying special attention to substituted words. The paper is an example of the application of computer and corpus-linguistics methods to the interpretation of literary texts, which is represented by a number of current studies in the field of digital humanities. The proposed approaches are applicable to research on other books of notes and extracts, topical in the context of research trends related to egodocuments, as well as to textual research on monu ments of other genres.