Open challenges for data stream mining research

SIGKDD explorations: newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining | Pub Date: 2014-09-25 | DOI: 10.1145/2674026.2674028

G. Krempl, I. Žliobaitė, D. Brzezinski, E. Hüllermeier, Mark Last, V. Lemaire, T. Noack, Ammar Shaker, S. Sievi, M. Spiliopoulou, J. Stefanowski


Citations: 267

Abstract


Every day, huge volumes of sensory, transactional, and web data are continuously generated as streams, which need to be analyzed online as they arrive. Streaming data can be considered one of the main sources of what is called big data. While predictive modeling for data streams and big data has received a lot of attention over the last decade, many research approaches are designed for well-behaved, controlled problem settings and overlook important challenges imposed by real-world applications. This article discusses eight open challenges for data stream mining. Our goal is to identify gaps between current research and meaningful applications, highlight open problems, and define new application-relevant research directions for data stream mining. The identified challenges cover the full cycle of knowledge discovery and involve such problems as protecting data privacy, dealing with legacy systems, handling incomplete and delayed information, analyzing complex data, and evaluating stream mining algorithms. The resulting analysis is illustrated by practical applications and provides general suggestions concerning lines of future research in data stream mining.

