Authorship classification techniques: Bridging textual domains and languages

Impact factor: 1.0 | JCR Q4 | Computer Science, Information Systems

International Journal on Information Technologies and Security | Pub Date: 2024-03-01 | DOI: 10.59035/ukbe1226

Arta Misini, A. Kadriu, Ercan Canhasi


Citations: 0

Abstract


Authorship classification techniques: Bridging textual domains and languages

Authorship classification analyzes an author's prior work to identify their writing style, a unique trait of each language and individual author. This research aims to conduct a thorough comparative analysis of various methods for classifying authorship. The study leverages two corpora: AAALitCorpus of Albanian literary texts and CCAT10 of English columns. We evaluate model-generated features across different configurations. The richness of the features and the breadth of the analysis provide a significant understanding of the problem, setting a new standard for comprehensive linguistic investigations across multiple languages. The study indicates that machine learning algorithms accurately discern authorial writing styles, highlighting the complexities of classifying authorship in a cross-linguistic context.
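As a rough illustration of what classifying authorship from writing style can look like, the sketch below builds a character n-gram profile per author and assigns an unseen text to the most similar profile by cosine similarity. This is a generic stylometric baseline, not the method evaluated in the paper. The function names, the trigram setting, and the toy texts are all assumptions made for the example, and none of the data comes from AAALitCorpus or CCAT10.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_profiles(samples, n=3):
    """Merge each author's known texts into one n-gram profile."""
    profiles = {}
    for author, text in samples:
        profiles.setdefault(author, Counter()).update(char_ngrams(text, n))
    return profiles


def predict_author(profiles, text, n=3):
    """Return the author whose profile is most similar to the text."""
    query = char_ngrams(text, n)
    return max(profiles, key=lambda author: cosine(profiles[author], query))


# Toy, invented training data (hypothetical authors and sentences).
samples = [
    ("author_a", "the market rallied as investors cheered the quarterly earnings report"),
    ("author_a", "investors sold shares after the earnings report missed forecasts"),
    ("author_b", "the striker scored twice and the keeper saved a late penalty"),
    ("author_b", "the keeper was beaten when the striker curled in a free kick"),
]
profiles = train_profiles(samples)
guess = predict_author(profiles, "shares rallied after the earnings report")
```

Character n-grams are used here because they need no language-specific tokenizer, so the same code runs unchanged on Albanian or English text. A study like the one summarized above would more plausibly compare several feature sets and trained classifiers on held-out documents.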


Source journal

International Journal on Information Technologies and Security (Computer Science, Information Systems)

Self-citation rate

66.70%

Articles published