Ghazanfar Hussain, A. Husnain, Rida Zahra, S. M. U. Din
2018 International Conference on Advancements in Computational Sciences (ICACS). Published 2018-02-01. DOI: 10.1109/ICACS.2018.8333276 (https://doi.org/10.1109/ICACS.2018.8333276)
Measuring authorship legitimacy by statistical linguistic modelling
Smart text spinning and paid content writing have jeopardized authorship identity in literary spheres. Authors frequently outsource their work to freelance writers or forge a new piece of writing using text spinners. These activities seemingly go unnoticed by readers. In this paper, we propose a way of uncovering true authorship by sampling a statistical model of writing features. We acquire a dataset of original work from a group of authors and perform feature vector analysis to build a profile for every author. Each profile includes normalized lexical and grammatical components derived from the sample space of the dataset. When a new piece of writing is submitted, the algorithm extracts the relevant components, assigns associative weights, and classifies the writing with respect to the author. The algorithm intelligently adjusts the weights for swift convergence and precise classification. So far, our system achieves an accuracy of 100% on texts above a certain length in words. We have tested it on various text models, spun texts, and plagiarised content, and the performance of our algorithm has been very promising. It can be of great help to academia and professional publishing houses.
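The pipeline outlined in the abstract (extract features, build a profile per author, classify new text by weighted comparison) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration, not the authors' method: the abstract does not list the actual features, so the sketch uses mean word length, type-token ratio, mean sentence length, and the relative frequencies of a small hypothetical function-word list. It also replaces the paper's adaptive weight adjustment with fixed, caller-supplied weights and a weighted Euclidean distance.

```python
import math
import re
from collections import Counter

# Hypothetical function-word list; the abstract does not name its features.
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "that", "is", "it")


def extract_features(text):
    """Return a vector of simple lexical and grammatical statistics."""
    words = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    counts = Counter(words)
    total = len(words)
    features = [
        sum(len(w) for w in words) / total,  # mean word length
        len(counts) / total,                 # type-token ratio
        total / len(sentences),              # mean sentence length in words
    ]
    # Relative frequency of each function word, normalized by text length.
    features.extend(counts[w] / total for w in FUNCTION_WORDS)
    return features


def build_profile(samples):
    """Average the feature vectors of an author's known writing samples."""
    vectors = [extract_features(s) for s in samples]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def classify(text, profiles, weights=None):
    """Return the author whose profile is nearest under a weighted distance.

    `weights` stands in for the paper's adjusted weights; uniform weights
    are assumed when none are given.
    """
    vector = extract_features(text)
    weights = weights or [1.0] * len(vector)

    def distance(profile):
        return math.sqrt(
            sum(w * (a - b) ** 2 for w, a, b in zip(weights, vector, profile))
        )

    return min(profiles, key=lambda author: distance(profiles[author]))
```

A caller would build `profiles` as a dictionary mapping each author name to `build_profile(samples)` and then pass an unseen text to `classify`. Because the features sit on different scales, a realistic system would also rescale them or learn the weights, which this sketch leaves out.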