2022 IEEE International Conference on Data Mining Workshops (ICDMW): Latest Publications, Page 2

HMM-Boost: Improved Time Series State Prediction Via Supervised Hidden Markov Models: Case Studies in Epileptic Seizure and Complex Care Management

2022 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2022-11-01 DOI: 10.1109/ICDMW58026.2022.00050

Georgios Mavroudeas, M. Magdon-Ismail, Xiao Shou, Kristin P. Bennett

Citations: 0

Augmenting Graph Convolution with Distance Preserving Embedding for Improved Learning

2022 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2022-11-01 DOI: 10.1109/ICDMW58026.2022.00012

Guojing Cong, Seung-Hwan Lim, Steven Young

Citations: 0

MetaSieve: Performance vs. Complexity Sieve for Time Series Forecasting

2022 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2022-11-01 DOI: 10.1109/ICDMW58026.2022.00037

Pavel Shumkovskii, A. Kovantsev, Elizaveta Stavinova, P. Chunaev

Citations: 0

Identifying Patterns of Vulnerability Incidence in Foundational Machine Learning Repositories on GitHub: An Unsupervised Graph Embedding Approach

2022 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2022-11-01 DOI: 10.1109/ICDMW58026.2022.00084

Agrim Sachdeva, Ben Lazarine, Ruchik Dama, S. Samtani, Hongyi Zhu

Abstract: The rapid pace of development of artificial intelligence (AI) solutions is enabled by foundational tools and frameworks that allow AI developers to focus on application logic and rapid prototyping. However, security vulnerabilities present in foundational repositories might cause irreparable damage, because AI solutions built on these libraries are deployed in production environments. Our research leverages source code hosted on the prevailing social coding platform GitHub to identify vulnerabilities in foundational repositories commonly used for modern AI development (Linux, BERT, PyTorch, and Transformers), as well as in the AI repositories that use these foundational repositories as dependencies. Using an unsupervised graph embedding approach, we generate graph embeddings that capture vulnerability information and the relationships between repositories. Based on these embeddings, we perform clustering as our downstream task to group similarly vulnerable repositories. Our research identifies patterns and similarities between repositories and will help develop effective mitigation of vulnerabilities present in groups of repositories based on foundational AI repositories. We also discuss the implications of identifying such clusters of vulnerable repositories.

Citations: 0

SV-Learn: Learning Matrix Singular Values with Neural Networks

2022 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2022-11-01 DOI: 10.1109/ICDMW58026.2022.00039

Derek Xu, William Shiao, Jia Chen, E. Papalexakis

Citations: 0

cSmartML-Glassbox: Increasing Transparency and Controllability in Automated Clustering

2022 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2022-11-01 DOI: 10.1109/ICDMW58026.2022.00015

Radwa El Shawi, S. Sakr

Abstract: Machine learning algorithms have been widely employed in various applications and fields. Novel technologies in automated machine learning (AutoML) ease the complexity of algorithm selection and hyperparameter optimization. AutoML frameworks have achieved notable success in hyperparameter tuning and have surpassed the performance of human experts. However, relying on such frameworks as black boxes can leave machine learning practitioners without insight into the inner workings of the AutoML process and hence influence their trust in the models produced. In addition, excluding humans from the loop creates several limitations. For example, most current AutoML frameworks ignore user preferences for defining or controlling the search space, which can affect both the performance of the models produced and the acceptance of these models by end users. Research on the transparency and controllability of AutoML has attracted much interest lately, both in academia and in industry. However, existing tools are usually restricted to supervised learning tasks such as classification and regression, while unsupervised learning, particularly clustering, remains largely unexplored. Motivated by these shortcomings, we design and implement cSmartML-GlassBox, an interactive visualization tool that enables users to refine the search space of AutoML and analyze the results. cSmartML-GlassBox is equipped with a recommendation engine that recommends a time budget likely to be adequate for obtaining a well-performing pipeline on a new dataset. In addition, the tool supports multi-granularity visualization to enable machine learning practitioners to monitor the AutoML process, analyze the explored configurations, and refine or control the search space. Furthermore, cSmartML-GlassBox is equipped with a logging mechanism so that repeated runs on the same dataset can be more effective by avoiding re-evaluation of previously considered configurations. We demonstrate the effectiveness and usability of cSmartML-GlassBox through a user evaluation study with 23 participants and an expert-based usability study with four experts. We find that the proposed tool increases users' understanding of and trust in AutoML frameworks.

Citations: 0

A Case Study on Periodic Spatio-Temporal Hotspot Detection in Azure Traffic Data

2022 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2022-11-01 DOI: 10.1109/ICDMW58026.2022.00135

Venkata M. V. Gunturi, Rakesh Rajeev, Vipul Bondre, Aaditya Barnwal, Samir Jain, Ashank Anshuman, Manish Gupta

Abstract: Given a spatio-temporal event framework E and a collection of time-stamped events A (over E), the goal of the periodic spatio-temporal hotspot detection (PST-Hotspot) problem is to determine spatial regions that show a high “intensity” of events at certain periodic intervals. The output of the PST-Hotspot detection problem consists of the following: (a) a collection of spatial regions (which show a high intensity of events) and (b) their respective time intervals of high activity and periodicity values (e.g., daily, weekday-only, etc.). PST-Hotspot detection poses a significant challenge in designing a suitable interest measure. The aim is to design a mathematical representation of a PST-Hotspot that can differentiate interesting periodic patterns from trivial persistent patterns in the dataset. The current state of the art in spatial and spatio-temporal hotspot detection focuses on non-periodic patterns. In contrast, our proposed approach is able to determine periodic hotspots. We experimentally evaluated our proposed algorithm using a real Azure traffic dataset from the Indian region.

Citations: 0

Joint Low-rank and Orthogonal Deep Multi-view Subspace Clustering based on Local Fusion

2022 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2022-11-01 DOI: 10.1109/ICDMW58026.2022.00017

Guixiang Wang, Hongwei Yin, Wenjun Hu, Y. Liu, Ruiqin Wang

Abstract: In recent years, a number of multi-view clustering methods have been proposed under a global fusion paradigm. These methods take the entire sample space as the fusion object, exploring and exploiting the global complementarity between views to improve clustering performance. However, local structures with strong or weak clustering capacity can coexist in each view. The traditional global fusion paradigm ignores the differences in clustering capacity among local structures, which makes it impossible to explore and exploit local complementarity between views. In this paper, a novel deep multi-view subspace clustering method based on local fusion is proposed to solve this problem. First, a low-rank self-expression layer is inserted into the deep autoencoder to eliminate the influence of noise when obtaining the local cluster structure. Then, the fusion object is refined from the entire sample space to the local cluster structure, where a self-weighted strategy is designed to assign contribution weights according to the clustering capacity of the local cluster structure. Meanwhile, we jointly impose an orthogonal constraint to enhance the discriminability of the local cluster structure, making it more suitable for the downstream clustering task. Experiments on several real-world datasets show that the proposed method achieves better clustering performance than most traditional multi-view clustering methods based on global fusion.

Citations: 0

Incremental Learning in Time-series Data using Reinforcement Learning

2022 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2022-11-01 DOI: 10.1109/ICDMW58026.2022.00115

Mustafa Shuqair, J. Jimenez-shahed, B. Ghoraani

Abstract: System monitoring has become an area of interest with the growth of wearable sensors and continuous monitoring tools. However, the generalizability of classification models to unseen incoming data remains challenging. This paper proposes a novel architecture based on reinforcement learning (RL) to incrementally learn patterns of time-series data and detect changes in the system state. Our rationale is that RL's ability to learn from past experiences can help increase the performance and generalizability of classification models in time-series monitoring applications. Our novel definition of the environment consists of a set of one-class anomaly detectors, which define environment states based on the dynamics of the incoming data, and a reward function, which rewards the RL agent according to its actions. A deep RL agent incrementally learns to perform continuous binary classification predictions according to the environment states and the received reward. We applied the proposed model to detecting response to medication (ON or OFF) in patients with Parkinson's disease (PD). The PD dataset consisted of 170 minutes of time-series movement signals collected from 12 patients using two wearable sensors. Our proposed model, with a testing accuracy of 77.95%, outperformed Adaptive Boosting, Multi-layer Perceptron, and Support Vector Machines, which reached 53.10%, 44.92%, and 52.70% testing accuracy, respectively. The F-score of the proposed model declined from 88.15% in validation to 78.42% in testing, a considerably smaller decline than that of the other three models. These results demonstrate the potential of the proposed RL-based classifier in time-series monitoring applications as a highly generalizable model for unseen incoming data.

Citations: 2

Using Image Processing Techniques to Identify and Quantify Spatiotemporal Carbon Cycle Extremes

2022 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2022-11-01 DOI: 10.1109/ICDMW58026.2022.00148

Bharat Sharma, J. Kumar, A. Ganguly, F. Hoffman

Abstract: Rising atmospheric carbon dioxide due to human activities, through fossil fuel emissions and land use changes, has increased climate extremes such as heat waves and droughts, which have led to and are expected to increase the occurrence of carbon cycle extremes. Carbon cycle extremes represent large anomalies in the carbon cycle that are associated with gains or losses in carbon uptake. Carbon cycle extremes could be continuous in space and time and cross political boundaries. Here, we present a methodology to identify large spatiotemporal extremes (STEs) in the terrestrial carbon cycle using image processing tools for feature detection. We characterized the STE events based on neighborhood structures, which are three-dimensional adjacency matrices for the detection of spatiotemporal manifolds of carbon cycle extremes. We found that the area affected and carbon loss during negative carbon cycle extremes were consistent with continuous neighborhood structures. In the gross primary production data we used, 100 carbon cycle STEs accounted for more than 75% of all the negative carbon cycle extremes. This paper presents a comparative analysis of the magnitude of carbon cycle STEs and attribution of those STEs to climate drivers as a function of neighborhood structures for two observational datasets and an Earth system model simulation.
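The abstract does not spell out the detection algorithm, so the following is only an illustrative sketch of the general idea it mentions: grouping flagged cells of a (time, latitude, longitude) grid into events by connected-component labeling under a chosen three-dimensional neighborhood structure. The function names `neighbor_offsets` and `label_events`, the `connectivity` parameter, and the toy mask are all hypothetical and are not taken from the paper.

```python
from collections import deque
from itertools import product


def neighbor_offsets(connectivity):
    """Offsets of a 3-D neighborhood structure (hypothetical helper).

    connectivity=1 keeps face neighbors (6), 2 adds edge neighbors (18),
    and 3 uses the full 3x3x3 cube around a cell (26).
    """
    return [
        d for d in product((-1, 0, 1), repeat=3)
        if 0 < sum(abs(c) for c in d) <= connectivity
    ]


def label_events(mask, connectivity):
    """Group flagged (t, y, x) cells into connected events with a breadth-first search."""
    offsets = neighbor_offsets(connectivity)
    # Collect coordinates of every flagged cell in the nested-list mask.
    flagged = {
        (t, y, x)
        for t, plane in enumerate(mask)
        for y, row in enumerate(plane)
        for x, value in enumerate(row)
        if value
    }
    events = []
    while flagged:
        seed = flagged.pop()
        event = {seed}
        queue = deque([seed])
        while queue:
            t, y, x = queue.popleft()
            for dt, dy, dx in offsets:
                nb = (t + dt, y + dy, x + dx)
                if nb in flagged:
                    flagged.remove(nb)
                    event.add(nb)
                    queue.append(nb)
        events.append(event)
    return events


# Toy mask (assumed data): two time steps on a 2x2 grid, with one flagged
# cell per step that touches the other only at a cube corner.
toy_mask = [
    [[1, 0], [0, 0]],
    [[0, 0], [0, 1]],
]
face_events = label_events(toy_mask, connectivity=1)
cube_events = label_events(toy_mask, connectivity=3)
```

In this toy case the two flagged cells form separate events under the face-only structure and merge into a single event under the full-cube structure, which illustrates why event counts and sizes can depend on the neighborhood structure chosen.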

Citations: 1