On sequential selection of attributes to be discretized for authorship attribution

Grzegorz Baron
Venue: 2017 IEEE International Conference on INnovations in Intelligent SysTems and Applications (INISTA)
Published: 2017-07-01
DOI: 10.1109/INISTA.2017.8001162 (https://doi.org/10.1109/INISTA.2017.8001162)
Citations: 3

Abstract

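Various data mining techniques are employed in stylometry to perform authorship attribution. Discretization of the input data is sometimes applied to improve the decision system, and in many cases it yields better classification results. In other cases, however, discretization decreased the overall performance of the system. This raises the question of what happens when only some selected attributes are discretized. The paper presents the results of research on forward sequential selection of the attributes to be discretized, and shows how this approach influences the performance of a decision system based on a Naive Bayes classifier in the authorship attribution domain. Several basic discretization methods and different approaches to discretizing the test datasets are taken into consideration.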
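The idea of forward sequential selection of attributes to be discretized can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not state the stopping rule, the number of bins, or the evaluation protocol, so the sketch assumes greedy selection driven by accuracy on a held-out set, equal-width binning, and cut points learned on the training data and reused for the test data (only one of several possible ways to discretize a test set). All function names are hypothetical.

```python
import math
from collections import Counter


def equal_width_cuts(values, k):
    """Cut points splitting [min, max] of the training values into k equal-width bins."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / k
    return [lo + step * i for i in range(1, k)]


def to_bin(x, cuts):
    """Bin index in 0..len(cuts); values outside the training range fall into the end bins."""
    return sum(x > c for c in cuts)


def nb_accuracy(train_X, train_y, test_X, test_y, discretized, k=3):
    """Accuracy of a Naive Bayes model in which attributes listed in `discretized`
    are binned and treated as categorical, and all others are modelled as Gaussian.
    Assumption: cut points come from the training data only."""
    classes = sorted(set(train_y))
    n_attr = len(train_X[0])
    cuts = {j: equal_width_cuts([r[j] for r in train_X], k) for j in discretized}
    model = {}
    for c in classes:
        rows = [r for r, y in zip(train_X, train_y) if y == c]
        params = []
        for j in range(n_attr):
            col = [r[j] for r in rows]
            if j in discretized:
                counts = Counter(to_bin(v, cuts[j]) for v in col)
                # Laplace smoothing keeps every bin probability above zero.
                params.append([(counts[b] + 1) / (len(col) + k) for b in range(k)])
            else:
                mean = sum(col) / len(col)
                var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
                params.append((mean, var))
        model[c] = (math.log(len(rows) / len(train_y)), params)

    correct = 0
    for row, y in zip(test_X, test_y):
        best_class, best_score = None, -math.inf
        for c in classes:
            score, params = model[c]
            for j, v in enumerate(row):
                if j in discretized:
                    score += math.log(params[j][to_bin(v, cuts[j])])
                else:
                    mean, var = params[j]
                    score += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
            if score > best_score:
                best_class, best_score = c, score
        correct += best_class == y
    return correct / len(test_y)


def forward_select_discretized(train_X, train_y, val_X, val_y, k=3):
    """Greedily add the attribute whose discretization most improves validation
    accuracy; stop when no remaining attribute gives a strict improvement."""
    selected = set()
    best = nb_accuracy(train_X, train_y, val_X, val_y, selected, k)
    remaining = set(range(len(train_X[0])))
    while remaining:
        # Ties are broken in favour of the lowest attribute index.
        acc, _, j = max(
            (nb_accuracy(train_X, train_y, val_X, val_y, selected | {j}, k), -j, j)
            for j in remaining
        )
        if acc <= best:
            break
        selected.add(j)
        remaining.remove(j)
        best = acc
    return sorted(selected), best
```

Calling `forward_select_discretized(train_X, train_y, val_X, val_y)` on lists of numeric feature rows returns the indices chosen for discretization and the resulting validation accuracy. By construction that accuracy is never lower than the all-continuous baseline `nb_accuracy(..., discretized=set())`. This guarantee holds only on the validation data used for selection.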