Measuring authorship legitimacy by statistical linguistic modelling

Ghazanfar Hussain, A. Husnain, Rida Zahra, S. M. U. Din
2018 International Conference on Advancements in Computational Sciences (ICACS), published 2018-02-01. DOI: 10.1109/ICACS.2018.8333276 (https://doi.org/10.1109/ICACS.2018.8333276)
Citation count: 0

Abstract

Smart text spinning and paid content writing have jeopardized authorship identity in literary spheres. Authors frequently outsource their work to freelance writers or forge a new piece of writing using text spinners, and these activities seemingly go unnoticed by readers. In this paper, we propose a way of uncovering true authorship by sampling a statistical model of writing features. We acquire a dataset of original work from a group of authors and perform feature vector analysis to formulate each author's profile. The profile includes normalized lexical and grammatical components derived from the sample space of the dataset. Once a new piece of writing is fed in, the algorithm extracts the relevant components, assigns associative weights, and classifies the writing with respect to the author. The algorithm adjusts the weights for swift convergence and precise classification. So far, our system achieves an accuracy of 100% on texts above a certain word count. We have tested it on various text models, spun texts, and plagiarised content, and its performance has been very promising. It can be of great help to academia and professional publishing houses.
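
The abstract does not specify the features, the weighting scheme, or the classifier, so the following is only a minimal sketch of the general idea (per-author profiles built from normalized writing features, then a weighted nearest-profile match). Every name here (`FUNCTION_WORDS`, `extract_features`, `build_profile`, `weighted_distance`, `classify`) is hypothetical, the feature set is an illustrative assumption, and the weights are fixed by the caller rather than adjusted automatically as the paper describes.

```python
import math
import re
from collections import Counter

# Hypothetical feature set: the abstract does not list the actual features.
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "that", "it", "but")


def extract_features(text):
    """Return a length-normalized feature vector for one text."""
    words = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    counts = Counter(words)
    n = len(words)
    features = [
        sum(len(w) for w in words) / n,  # mean word length (lexical)
        len(counts) / n,                 # type-token ratio (lexical)
        n / len(sentences),              # mean sentence length in words
    ]
    # Relative frequency of each function word (a rough grammatical signal).
    features.extend(counts[w] / n for w in FUNCTION_WORDS)
    return features


def build_profile(samples):
    """Average the feature vectors of an author's known writing samples."""
    vectors = [extract_features(s) for s in samples]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def classify(text, profiles, weights=None):
    """Return the author whose profile lies closest to the text's features."""
    vector = extract_features(text)
    if weights is None:
        # Assumption: equal weights; the paper adjusts weights adaptively.
        weights = [1.0] * len(vector)
    return min(
        profiles,
        key=lambda author: weighted_distance(vector, profiles[author], weights),
    )
```

A caller would build one profile per author with `build_profile(list_of_known_texts)`, collect them in a dictionary keyed by author name, and pass a disputed text to `classify`. With equal weights the unscaled sentence-length feature tends to dominate the distance, which is one reason a real system would need some form of feature scaling or learned weighting.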