Performance Evaluation of Feature Selection Algorithms Applied to Online Learning in Concept Drift Environments

Anais do XV Encontro Nacional de Inteligência Artificial e Computacional (ENIAC 2018). Published: 2018-10-22. DOI: 10.5753/ENIAC.2018.4438

M. B. D. Moraes, A. Gradvohl

Abstract

Data streams are transmitted at high speed and in large volumes, and may contain critical information that needs to be processed in real time. To reduce computational cost and time, a system may therefore apply a feature selection algorithm. However, this is not a trivial task because of concept drift. In this work, we show that two feature selection algorithms, Information Gain and Online Feature Selection, show lower performance than classification without feature selection. Each algorithm nevertheless produced better results in one distinct scenario, with final accuracies up to 14% higher. The experiments, which use both real and artificial datasets, indicate that these methods have potential because they adapt better in some concept drift situations.
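To make the difficulty concrete, the following is a minimal sketch of information-gain feature ranking applied to two consecutive windows of a stream. It is an illustration only, not the authors' implementation: the function names (`entropy`, `information_gain`, `top_k_features`), the windowing scheme, and the toy data are all assumptions introduced here. The sketch shows that the feature with the highest information gain can change from one window to the next, which is one way concept drift can invalidate an earlier selection.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """H(Y) - H(Y | X) for one discrete feature X and class labels Y."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def top_k_features(window, labels, k):
    """Rank feature indices by information gain on one window; keep the best k."""
    n_features = len(window[0])
    gains = [
        information_gain([row[j] for row in window], labels)
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: gains[j], reverse=True)
    return ranked[:k], gains


# Hypothetical toy stream: two windows with the same labels.
labels = [0, 0, 1, 1]
# Before the drift, feature 0 determines the class.
window_before = [(0, 1, 0), (0, 0, 1), (1, 1, 0), (1, 0, 1)]
# After the drift, feature 2 determines the class instead.
window_after = [(0, 1, 0), (1, 0, 0), (0, 1, 1), (1, 0, 1)]

selected_before, gains_before = top_k_features(window_before, labels, k=1)
selected_after, gains_after = top_k_features(window_after, labels, k=1)
print(selected_before, selected_after)
```

In this toy setting, a selection computed once on the first window would discard the feature that becomes informative in the second, so a selector used on a stream would need to be re-evaluated or updated as data arrives.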
