A systematic review on detection and adaptation of concept drift in streaming data using machine learning techniques

WIREs Data Mining and Knowledge Discovery Pub Date : 2024-03-19 DOI:10.1002/widm.1536

Shruti Arora, Rinkle Rani, Nitin Saxena


Citations: 0

Abstract

The last decade has seen massive growth in organizational data, which continues to increase multi-fold as millions of records are updated every second. Handling such vast and continuous data is challenging, which in turn opens up many research areas. Data that flows continuously and in real time from various sources is termed streaming data. When statistics are derived from data streams, a change in the underlying data distribution over time is called concept drift. Such drifts play a significant role in a variety of disciplines, including data mining, machine learning, ubiquitous knowledge discovery, and quantitative decision theory. As a result, a substantial amount of research has been carried out on methodologies and approaches for dealing with drifts. However, the available material is scattered and lacks guidelines for selecting an effective technique for a particular application. The primary objective of this survey is to present an understanding of concept drift challenges and allied studies. Further, it helps researchers from diverse domains adopt detection and adaptation algorithms for concept drift in their applications. Overall, this study aims to provide deeper insight into the classification of various types of drift and into methods for detection and adaptation, along with their key features and limitations. It also highlights the performance metrics used to evaluate concept drift detection methods for streaming data. Finally, the paper outlines the scope for future research by highlighting gaps in the existing literature on techniques for handling concept drift.
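To make the notions of drift detection and adaptation concrete, the following is a minimal sketch and not a method taken from the survey. It assumes a univariate numeric stream and flags drift when the mean of a sliding window of recent values departs from the mean of a reference window by more than a threshold. The names `WindowDriftDetector` and `detect_drifts`, and the parameters `window_size` and `threshold`, are hypothetical and chosen only for illustration.

```python
from collections import deque
from statistics import mean


class WindowDriftDetector:
    """Toy detector: compares a reference-window mean with a recent-window mean.

    Illustrative assumption only; window_size and threshold are hypothetical.
    """

    def __init__(self, window_size=50, threshold=1.0):
        self.window_size = window_size
        self.threshold = threshold
        self.reference = []                      # sample of the current concept
        self.recent = deque(maxlen=window_size)  # sliding sample of newest data

    def update(self, value):
        """Consume one stream value; return True if drift is flagged now."""
        if len(self.reference) < self.window_size:
            self.reference.append(value)  # still learning the reference window
            return False
        self.recent.append(value)
        if len(self.recent) < self.window_size:
            return False                  # not enough recent data to compare
        if abs(mean(self.recent) - mean(self.reference)) > self.threshold:
            # Adaptation step: forget the old concept and relearn the
            # reference window from the values that arrive next.
            self.reference = []
            self.recent.clear()
            return True
        return False


def detect_drifts(stream, window_size=50, threshold=1.0):
    """Return the stream indices at which the detector flags drift."""
    detector = WindowDriftDetector(window_size, threshold)
    return [i for i, x in enumerate(stream) if detector.update(x)]


if __name__ == "__main__":
    # Synthetic stream with one abrupt change in mean at index 200.
    stream = [0.0] * 200 + [5.0] * 200
    print(detect_drifts(stream))
```

On this synthetic stream the change at index 200 is flagged a few steps later, once enough post-change values have entered the recent window; that lag is the detection delay that evaluation metrics for drift detectors typically measure. Resetting the reference window after a detection is one simple adaptation strategy among many.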


