A Survey on Authorship Analysis Tasks and Techniques

SEEU Review, Pub Date: 2022-12-01, DOI: 10.2478/seeur-2022-0100

Arta Misini, A. Kadriu, Ercan Canhasi


Citations: 2

Abstract


Authorship Analysis (AA) is a natural language processing field that examines the previous works of writers to identify the author of a text based on its features. Studies in authorship analysis include authorship identification, authorship profiling, and authorship verification. Owing to its relevance, the field has attracted attention across many applications. It is widely used in the attribution of historical literature. Other applications include legal linguistics, criminal law, forensic investigations, and computer forensics. This paper provides an overview of the work done and the techniques applied in the authorship analysis domain, with a principal focus on recent developments. Many different criteria can be used to define a writer’s style. This paper investigates stylometric features in different author-related tasks, including lexical, syntactic, semantic, structural, and content-specific ones. Many classification methods have been applied to authorship analysis tasks. We examine numerous research studies that use different machine learning and deep learning techniques. To point the direction for future studies, we present the most relevant methods recently proposed. The reviewed studies cover documents of different types and in different languages. In summary, because each natural language has its own set of features, there is no standard technique generally applicable to solving the AA problem.
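To make the idea of stylometric features feeding a classifier concrete, here is a minimal Python sketch of closed-set authorship identification. It is not a method taken from the survey: the feature set (average word length, type-token ratio, average sentence length, function-word frequencies), the ten-word `FUNCTION_WORDS` list, and the nearest-sample rule in `attribute` are all illustrative assumptions, and the function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny function-word list (an assumption for
# illustration); real studies typically use far larger, language-specific lists.
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "a", "that", "it", "is", "was")


def stylometric_features(text):
    """Return a small vector of lexical and structural style features."""
    words = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    counts = Counter(words)
    features = [
        sum(len(w) for w in words) / len(words),  # average word length
        len(counts) / len(words),                 # type-token ratio
        len(words) / len(sentences),              # average sentence length (words)
    ]
    # Relative frequency of each function word in the text.
    features.extend(counts[w] / len(words) for w in FUNCTION_WORDS)
    return features


def attribute(unknown_text, known_texts):
    """Return the candidate author whose sample is closest in feature space.

    known_texts maps an author name to one writing sample. Features are not
    standardized here, so large-valued ones (such as sentence length) dominate
    the Euclidean distance; a real system would scale them first.
    """
    target = stylometric_features(unknown_text)

    def distance(author):
        return math.dist(target, stylometric_features(known_texts[author]))

    return min(known_texts, key=distance)
```

Calling `attribute(disputed_text, {"author_a": sample_a, "author_b": sample_b})` returns the name of the nearest candidate. The function-word list is English-only, which illustrates the abstract's closing point: features of this kind do not carry over unchanged to other languages.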


Source journal

SEEU Review

Self-citation rate

0.00%

Articles published