Authorship classification techniques: Bridging textual domains and languages
Authors: Arta Misini, A. Kadriu, Ercan Canhasi
Journal: International Journal on Information Technologies and Security (impact factor 1.0; JCR Q4, Computer Science, Information Systems)
Publication date: 2024-03-01
Publication type: Journal Article
DOI: 10.59035/ukbe1226 (https://doi.org/10.59035/ukbe1226)
Citation count: 0
Abstract
Authorship classification analyzes an author's prior work to identify their writing
style, which reflects both the language and the individual author. This research
conducts a comparative analysis of methods for classifying authorship. The study uses
two corpora: AAALitCorpus, a collection of Albanian literary texts, and CCAT10, a
collection of English columns. We evaluate model-generated features across different
configurations. The richness of the features and the breadth of the analysis give a
detailed understanding of the problem and set a standard for comprehensive linguistic
investigations across multiple languages. The results indicate that machine learning
algorithms can accurately discern authorial writing styles, and they highlight the
complexities of classifying authorship in a cross-linguistic context.
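
As a rough illustration of what classifying authorship from writing style can look like in code, the sketch below builds one character trigram profile per author and assigns an unseen text to the author with the most similar profile. This is not the method or feature set evaluated in the study, which the abstract does not specify. Character n-gram profiles compared by cosine similarity are a common stylometric baseline, chosen here as an assumption. The function names (`char_ngrams`, `cosine`, `train`, `predict`) and the toy texts are invented for the example.

```python
from collections import Counter
import math


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in the lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(samples, n=3):
    """Aggregate one n-gram profile per author from (author, text) pairs."""
    profiles = {}
    for author, text in samples:
        profiles.setdefault(author, Counter()).update(char_ngrams(text, n))
    return profiles


def predict(profiles, text, n=3):
    """Return the author whose profile is most similar to the text."""
    query = char_ngrams(text, n)
    return max(profiles, key=lambda author: cosine(profiles[author], query))


# Toy data, invented for illustration only (not drawn from AAALitCorpus or CCAT10).
samples = [
    ("A", "the sea was calm and the sea was wide"),
    ("B", "quarterly profits rose sharply, analysts said"),
]
profiles = train(samples)
predicted_author = predict(profiles, "the sea was cold")
```

Character n-grams need no tokenizer or language-specific resources, so the same sketch runs unchanged on Albanian and English text. A real experiment would need far more text per author and a held-out evaluation set.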