Structuring the processing frameworks for data stream evaluation and application
Joanna Komorniczak, Paweł Ksieniewicz, Paweł Zyblewski
Pattern Recognition, volume 172, Article 112516, published 2025-09-28
Impact factor: 7.6; JCR Q1 (Computer Science, Artificial Intelligence)
DOI: 10.1016/j.patcog.2025.112516
URL: https://www.sciencedirect.com/science/article/pii/S0031320325011793
Abstract
This work addresses the problem of frameworks for data stream processing that can be used to evaluate solutions in an environment resembling real-world applications. The definition of structured frameworks stems from the need to reliably assess data stream classification methods under the constraints of delayed label access, the cost of acquiring labels, and the cost of model adaptation. Current experimental evaluation often relies without restriction on the assumption of immediate label access, both to monitor recognition quality and to adapt methods to changing concepts. The problem is approached by reviewing currently described methods and techniques for data stream processing and verifying their outcomes in a simulated environment. This work defines a taxonomy of data stream processing frameworks and presents four processing schemes that link the tasks of drift detection and classification while accounting for the natural phenomenon of label delay. The presented research shows that classification quality is significantly affected not only by the disruptive phenomena of concept drift and label delay, but also by the adopted processing scheme, which describes the flow of labels in the recognition system. Choosing a specific processing framework according to real-world constraints proves to be a critical aspect of reliable and realistic experimental evaluation.
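To make the effect of label delay concrete, the sketch below runs a chunk-based test-then-train loop in which the labels of a chunk become available only a fixed number of chunks later. It is an illustration under stated assumptions, not one of the four processing schemes from the paper: the synthetic stream, the `CentroidClassifier`, and the `run` and `make_stream` helpers are all hypothetical, and drift detection is omitted entirely.

```python
import random
from collections import deque


def make_stream(n_chunks=40, chunk_size=50, drift_at=20, seed=0):
    """Yield (X, y) chunks of a 1-D two-class stream (hypothetical data).

    The class centers swap at chunk `drift_at`, imitating a sudden concept drift.
    """
    rng = random.Random(seed)
    for i in range(n_chunks):
        flipped = i >= drift_at
        X, y = [], []
        for _ in range(chunk_size):
            label = rng.randint(0, 1)
            center = 1.0 if (label == 1) != flipped else -1.0
            X.append(rng.gauss(center, 0.5))
            y.append(label)
        yield X, y


class CentroidClassifier:
    """Nearest-centroid classifier refitted on the latest labeled chunk."""

    def __init__(self):
        self.centroids = None

    def fit(self, X, y):
        sums, counts = [0.0, 0.0], [0, 0]
        for x, label in zip(X, y):
            sums[label] += x
            counts[label] += 1
        # Refit only when both classes are present in the chunk.
        if min(counts) > 0:
            self.centroids = [sums[c] / counts[c] for c in (0, 1)]

    def predict(self, X):
        if self.centroids is None:
            return [0] * len(X)  # untrained model: default to class 0
        return [min((0, 1), key=lambda c: abs(x - self.centroids[c])) for x in X]


def run(delay):
    """Test-then-train; labels of chunk t arrive while processing chunk t + delay."""
    clf = CentroidClassifier()
    pending = deque()  # chunks whose labels have not arrived yet
    accuracies = []
    for X, y in make_stream():
        preds = clf.predict(X)  # test first, on the current chunk
        accuracies.append(sum(p == t for p, t in zip(preds, y)) / len(y))
        pending.append((X, y))
        if len(pending) > delay:  # the oldest pending labels are now available
            clf.fit(*pending.popleft())
    return accuracies


if __name__ == "__main__":
    for d in (0, 5):
        acc = run(d)
        print(f"delay={d}: mean accuracy {sum(acc) / len(acc):.3f}")
```

With `delay=0` the model is refitted right after each chunk is scored, which corresponds to the immediate-label assumption the abstract criticizes. With a positive delay, the model keeps predicting with outdated centroids for several chunks after the drift, so the measured accuracy depends on how labels flow through the loop as well as on the classifier.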
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.