The Role of Traditional Features in Authorship Attribution

L. Shang, Lizhen Liu, Wei Song, Miaomiao Cheng
2020 IEEE 10th International Conference on Electronics Information and Emergency Communication (ICEIEC), published 2020-07-01. DOI: 10.1109/ICEIEC49280.2020.9152360 (https://doi.org/10.1109/ICEIEC49280.2020.9152360)
Citation count: 0

Abstract

As an important direction in natural language processing, authorship attribution has received considerable attention. Current research methods are mainly based on neural networks and have made great progress. However, compared with traditional methods, these methods have limited interpretability: it is difficult to know which features a neural network model actually uses for classification, or how weight is distributed across those features. Without an exact understanding of the model, it is difficult to use it in key fields. We used 162 manually defined features at 5 levels, together with character-level n-gram features, to conduct authorship attribution experiments on a Chinese data set, and carried out a comparative study of these features to identify those that play an important role in the authorship attribution of Chinese text.
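To make the character-level n-gram features mentioned above concrete, the following is a minimal sketch and not the authors' implementation. It assumes character bigrams, a cosine-similarity nearest-profile rule, and two invented toy texts; the names char_ngrams, cosine, and attribute are hypothetical and chosen only for illustration.

```python
from collections import Counter
import math


def char_ngrams(text, n=2):
    """Count overlapping character n-grams in a string."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def attribute(text, profiles, n=2):
    """Return the author whose n-gram profile is closest to the text."""
    query = char_ngrams(text, n)
    return max(profiles, key=lambda author: cosine(query, profiles[author]))


# Toy training texts (invented for this sketch, not from the paper's data set).
samples = {
    "author_a": "我们今天去公园散步,我们看到了很多花。",
    "author_b": "数据模型训练完成,模型精度有所提升。",
}
profiles = {author: char_ngrams(text) for author, text in samples.items()}

predicted = attribute("我们明天去公园。", profiles)
print(predicted)

# Because every feature is a readable character sequence, the most frequent
# n-grams per author can be inspected directly.
print(profiles["author_a"].most_common(3))
```

Count-based features like these stay inspectable, which is the interpretability property the abstract contrasts with neural models. The manually defined features from the paper are not reproduced here, since the abstract does not list them.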