What Number of Features is Optimal: A New Method Based on Approximation Function for Stance Detection Task

S. Vychegzhanin, E. Razova, E. Kotelnikov
Published in: Proceedings of the 9th International Conference on Information Communication and Management, 2019-08-23
DOI: 10.1145/3357419.3357430 (https://doi.org/10.1145/3357419.3357430)
Citations: 6

Abstract

Selecting a text representation model raises a crucial problem: choosing the optimal number of features. The optimality criterion is the minimum number of features that achieves (or preserves) the maximum performance. This article proposes a new method for determining the optimal number of features that takes both components of the optimality criterion into account. With the proposed method, we first construct the dependence of task performance on the number of features, then approximate this dependence on the basis of the Weibull distribution function, and finally determine the optimal number of features by analyzing the growth rate of the approximating function. We call this method DOFNAF (Determining the Optimal Feature Number by the Approximating Function). The method is tested on the stance detection task, which consists in identifying the position ("for" or "against") that the author of a text takes towards the object (or objects) under discussion. The comparison involves constant methods, a method based on a function of the total number of features, the performance-maximum method, and the Recursive Feature Elimination with Cross-Validation (RFECV) and Correlation-based Feature Selection (CFS) methods. Compared with the existing methods, DOFNAF selects the smallest number of features while maintaining classification performance.
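The three steps named in the abstract (build the performance curve, approximate it with a Weibull-based function, analyze the growth rate) can be illustrated with a rough sketch. The abstract gives no formulas, so everything specific here is an assumption rather than the authors' procedure: the scaled Weibull CDF form `a * (1 - exp(-(n / lam) ** k))`, the grid-search fit, the slope threshold `tol`, and the synthetic scores are all hypothetical choices made for illustration.

```python
import math

def weibull_curve(n, a, lam, k):
    """Scaled Weibull CDF (assumed form): a * (1 - exp(-(n / lam) ** k))."""
    return a * (1.0 - math.exp(-((n / lam) ** k)))

def weibull_slope(n, a, lam, k):
    """Derivative of weibull_curve with respect to n, valid for n > 0."""
    z = n / lam
    return a * (k / lam) * z ** (k - 1) * math.exp(-(z ** k))

def fit_weibull(ns, scores, lams, ks):
    """Least-squares fit by grid search over (lam, k).

    For a fixed (lam, k) the best scale a has a closed form, so only
    the two shape parameters are searched.
    """
    best = None
    for lam in lams:
        for k in ks:
            g = [1.0 - math.exp(-((n / lam) ** k)) for n in ns]
            gg = sum(v * v for v in g)
            if gg == 0.0:
                continue
            a = sum(y * v for y, v in zip(scores, g)) / gg
            err = sum((y - a * v) ** 2 for y, v in zip(scores, g))
            if best is None or err < best[0]:
                best = (err, a, lam, k)
    return best[1:]

def optimal_feature_count(a, lam, k, n_max, tol):
    """Smallest n past the steepest point whose slope is below tol.

    The threshold rule is an assumed stand-in for "analyzing the
    growth rate"; the abstract does not state the actual rule.
    """
    slopes = [weibull_slope(n, a, lam, k) for n in range(1, n_max + 1)]
    peak = slopes.index(max(slopes))
    for i in range(peak, len(slopes)):
        if slopes[i] < tol:
            return i + 1  # slopes[i] belongs to n = i + 1
    return n_max

# Step 1: performance measured at several feature counts.
# Synthetic, noise-free values generated from a known curve (not real data).
ns = [10, 25, 50, 100, 200, 400, 800, 1600]
scores = [weibull_curve(n, 0.8, 200.0, 1.5) for n in ns]

# Step 2: approximate the dependence with the Weibull-based function.
a, lam, k = fit_weibull(ns, scores,
                        lams=[50, 100, 200, 400, 800],
                        ks=[0.5, 1.0, 1.5, 2.0, 3.0])

# Step 3: pick the feature count where growth has effectively stopped.
tol = 1e-5  # hypothetical: performance gain per added feature
n_opt = optimal_feature_count(a, lam, k, n_max=1600, tol=tol)
print(a, lam, k, n_opt)
```

On this synthetic curve the selected count sits well below the largest tested feature set while the fitted performance is already within 1% of its plateau, which is the trade-off the optimality criterion describes. With real, noisy scores, a continuous optimizer would likely be a better fit than the coarse grid used here.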