A Survey on Authorship Analysis Tasks and Techniques

Arta Misini, A. Kadriu, Ercan Canhasi
SEEU Review, 15(1), published 2022-12-01. DOI: https://doi.org/10.2478/seeur-2022-0100
Citations: 2

Abstract

Authorship Analysis (AA) is a natural language processing field that examines the previous works of writers to identify the author of a text based on its features. Studies in authorship analysis include authorship identification, authorship profiling, and authorship verification. Because of its relevance, many applications in this field have received attention. It is widely used in the attribution of historical literature. Other applications include legal linguistics, criminal law, forensic investigations, and computer forensics. This paper provides an overview of the work done and the techniques applied in the authorship analysis domain, with a principal focus on recent developments. Many different criteria can be used to define a writer's style. This paper investigates stylometric features in different author-related tasks, including lexical, syntactic, semantic, structural, and content-specific ones. Many classification methods have been applied to authorship analysis tasks. We examine research studies that use different machine learning and deep learning techniques. To point the direction for future studies, we present the most relevant methods recently proposed. The reviewed studies include documents of different types and different languages. In summary, because each natural language has its own set of features, there is no standard technique generically applicable for solving the AA problem.
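
As a rough illustration of the lexical stylometric features mentioned in the abstract, the sketch below computes a few simple ones (average word length, type-token ratio, function-word frequencies) and attributes an unknown text to the nearest candidate author. It is a minimal sketch under stated assumptions, not a method taken from the survey: the function-word list, the names `lexical_features` and `attribute`, and the Euclidean nearest-profile rule are all hypothetical choices made for illustration.

```python
from collections import Counter
import re

# Hypothetical function-word list chosen for illustration; not from the survey.
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "a", "that", "it")


def lexical_features(text):
    """Return a small vector of lexical stylometric features for one text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        # Empty input: return an all-zero vector of the same length.
        return [0.0] * (len(FUNCTION_WORDS) + 2)
    counts = Counter(tokens)
    avg_word_len = sum(len(t) for t in tokens) / total
    type_token_ratio = len(counts) / total  # distinct words / all words
    func_freqs = [counts[w] / total for w in FUNCTION_WORDS]
    return [avg_word_len, type_token_ratio] + func_freqs


def attribute(unknown_text, known_texts):
    """Pick the candidate author whose feature vector is closest (Euclidean)."""
    target = lexical_features(unknown_text)

    def distance(author):
        vec = lexical_features(known_texts[author])
        return sum((a - b) ** 2 for a, b in zip(target, vec)) ** 0.5

    return min(known_texts, key=distance)
```

Here `known_texts` is assumed to be a dictionary mapping each candidate author to a sample of their writing. Real systems typically combine many more features and a trained classifier; this nearest-profile rule only shows the general shape of the task.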