Authorship Attribution

IF 8.3 · Zone 2 (Computer Science) · Q1 (COMPUTER SCIENCE, INFORMATION SYSTEMS)
P. Juola
Journal: Foundations and Trends in Information Retrieval, volume "23 1" (as recorded), pages 233-334
Published: 2008-03-06 (Journal Article)
DOI: 10.1561/1500000005 (https://doi.org/10.1561/1500000005)
Indexed via: Semantic Scholar
Citations: 962

Abstract

Authorship Attribution
Authorship attribution, the science of inferring characteristics of the author from the characteristics of documents written by that author, is a problem with a long history and a wide range of application. Recent work in "non-traditional" authorship attribution demonstrates the practicality of automatically analyzing documents based on authorial style, but the state of the art is confusing. Analyses are difficult to apply, little is known about type or rate of errors, and few "best practices" are available. In part because of this confusion, the field has perhaps had less uptake and general acceptance than is its due. This review surveys the history and present state of the discipline, presenting some comparative results when available. It shows, first, that the discipline is quite successful, even in difficult cases involving small documents in unfamiliar and less studied languages; it further analyzes the types of analysis and features used and tries to determine characteristics of well-performing systems, finally formulating these in a set of recommendations for best practices.
Source journal
Foundations and Trends in Information Retrieval (COMPUTER SCIENCE, INFORMATION SYSTEMS)
CiteScore: 39.10
Self-citation rate: 0.00%
Articles published: 3
About the journal: The surge in research across all domains over the past decade has produced a plethora of new publications and exponential growth in published research. Navigating this literature and staying current has become a time-consuming challenge. Electronic publishing provides instant access to more articles than ever, but discerning which ones are essential for a comprehensive understanding of a topic remains difficult. Foundations and Trends® in Information Retrieval (FnTIR) addresses this problem by publishing high-quality survey and tutorial monographs in the field. Each issue features a 50-100 page monograph authored by research leaders, covering tutorial subjects, research retrospectives, and survey papers that provide state-of-the-art reviews within the scope of the journal.