Feature Drift Detection using Overlapping Window and Mann-Whitney U Test
Jafseer K T, S. S, S. A.
2023 4th International Conference on Innovative Trends in Information Technology (ICITIIT), 2023-02-11
DOI: 10.1109/ICITIIT57246.2023.10068710 (https://doi.org/10.1109/ICITIIT57246.2023.10068710)
Because data is ubiquitous in real-world problems, data stream mining is a rapidly growing research area. Data stream sources are expected to undergo changes in data distribution because of their ephemeral nature; this phenomenon is called concept drift. One particular type of drift, feature drift, has received very little study, and this paper explores it. Under feature drift, learners must detect and adapt to changes in the relevant subset of features as well as to the changing nature of the learning task itself. This work develops an approach to detecting feature drift. We use overlapping landmark windowing to retain the properties of the previous window's data while analyzing the most recent data. Using the Mann-Whitney U test, we compare the distribution of each feature across two consecutive windows and store the result. Drift is detected whenever the statistical properties of the window fall outside a particular boundary of the distribution. We validated the effectiveness of our proposal through experiments on real data.
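The abstract does not give implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes fixed-size windows that advance by a step smaller than the window size (so consecutive windows share rows), and it treats "drift" for a feature as a two-sided Mann-Whitney U p-value below a threshold `alpha`. The function names, the window size, the step, and `alpha` are all hypothetical choices made for illustration. The p-value uses the tie-corrected normal approximation to keep the example dependency-free; in practice a library routine such as `scipy.stats.mannwhitneyu` could be used instead.

```python
import math
from itertools import groupby


def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney U p-value (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering which sample each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y], key=lambda t: t[0])
    rank_sum_x = 0.0
    tie_term = 0.0
    pos = 0  # number of pooled values already ranked
    for _, group in groupby(pooled, key=lambda t: t[0]):
        group = list(group)
        t = len(group)
        avg_rank = pos + (t + 1) / 2.0  # mean of ranks pos+1 .. pos+t
        rank_sum_x += avg_rank * sum(1 for _, src in group if src == 0)
        tie_term += t ** 3 - t
        pos += t
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return 1.0  # every pooled value is identical
    # Continuity-corrected z score, clamped at zero.
    z = max((abs(u - mean_u) - 0.5) / math.sqrt(var_u), 0.0)
    return math.erfc(z / math.sqrt(2.0))


def overlapping_windows(rows, size, step):
    """Yield (start, window); consecutive windows share size - step rows."""
    for start in range(0, len(rows) - size + 1, step):
        yield start, rows[start:start + size]


def detect_feature_drift(rows, size=200, step=100, alpha=0.01):
    """Return (window_start, feature_index, p_value) for each flagged feature."""
    flagged = []
    previous = None
    for start, window in overlapping_windows(rows, size, step):
        if previous is not None:
            for j in range(len(window[0])):
                p = mann_whitney_p([r[j] for r in previous],
                                   [r[j] for r in window])
                if p < alpha:  # assumed decision rule, not taken from the paper
                    flagged.append((start, j, p))
        previous = window
    return flagged


# Synthetic stream: feature 0 is stable, feature 1 shifts by +5 from row 500 on.
rows = [(i % 10, (i % 10) + (5 if i >= 500 else 0)) for i in range(1000)]
for start, feature, p in detect_feature_drift(rows):
    print(f"window starting at row {start}: feature {feature} drifted (p={p:.2e})")
```

On this synthetic stream the sketch flags only feature 1, at the windows starting at rows 400 and 500, which are the two windows that straddle the shift. One caveat of this design: because overlapping windows share rows, the two samples are not independent, so the p-values should be read as a heuristic drift score rather than as exact significance levels.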