2021 IEEE International Conference on Data Mining (ICDM): Latest Publications

Matrix Profile XXIII: Contrast Profile: A Novel Time Series Primitive that Allows Real World Classification
2021 IEEE International Conference on Data Mining (ICDM) Pub Date : 2021-12-01 DOI: 10.1109/ICDM51629.2021.00151
Ryan Mercer, S. Alaee, Alireza Abdoli, Shailendra Singh, Amy Murillo, Eamonn J. Keogh
{"title":"Matrix Profile XXIII: Contrast Profile: A Novel Time Series Primitive that Allows Real World Classification","authors":"nonymous” Ryan Mercer, S. Alaee, Alireza Abdoli, Shailendra Singh, Amy Murillo, Eamonn J. Keogh","doi":"10.1109/ICDM51629.2021.00151","DOIUrl":"https://doi.org/10.1109/ICDM51629.2021.00151","url":null,"abstract":"Time series data remains a perennially important datatype considered in data mining. In the last decade there has been an increasing realization that time series data can best understood by reasoning about time series subsequences on the basis of their similarity to other subsequences: the two most familiar such time series concepts being motifs and discords. Time series motifs refer to two particularly close subsequences, whereas time series discords indicate subsequences that are far from their nearest neighbors. However, we argue that it can sometimes be useful to simultaneously reason about a subsequence’s closeness to certain data and its distance to other data. In this work we introduce a novel primitive called the Contrast Profile that allows us to efficiently compute such a definition in a principled way. As we will show, the Contrast Profile has many downstream uses, including anomaly detection, data exploration, and preprocessing unstructured data for classification.","PeriodicalId":320970,"journal":{"name":"2021 IEEE International Conference on Data Mining (ICDM)","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123380837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
SSDNet: State Space Decomposition Neural Network for Time Series Forecasting
2021 IEEE International Conference on Data Mining (ICDM) Pub Date : 2021-12-01 DOI: 10.1109/ICDM51629.2021.00048
Yang Lin, I. Koprinska, Mashud Rana
{"title":"SSDNet: State Space Decomposition Neural Network for Time Series Forecasting","authors":"Yang Lin, I. Koprinska, Mashud Rana","doi":"10.1109/ICDM51629.2021.00048","DOIUrl":"https://doi.org/10.1109/ICDM51629.2021.00048","url":null,"abstract":"In this paper, we present SSDNet, a novel deep learning approach for time series forecasting. SSDNet combines the Transformer architecture with state space models to provide probabilistic and interpretable forecasts, including trend and seasonality components and previous time steps important for the prediction. The Transformer architecture is used to learn the temporal patterns and estimate the parameters of the state space model directly and efficiently, without the need for Kalman filters. We comprehensively evaluate the performance of SSDNet on five data sets, showing that SSDNet is an effective method in terms of accuracy and speed, outperforming state-of-the-art deep learning and statistical methods, and able to provide meaningful trend and seasonality components.","PeriodicalId":320970,"journal":{"name":"2021 IEEE International Conference on Data Mining (ICDM)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127924635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
BaT: Beat-aligned Transformer for Electrocardiogram Classification
2021 IEEE International Conference on Data Mining (ICDM) Pub Date : 2021-12-01 DOI: 10.1109/ICDM51629.2021.00043
Xiaoyu Li, Chen Li, Yuhua Wei, Yuyao Sun, Jishang Wei, Xiang Li, B. Qian
{"title":"BaT: Beat-aligned Transformer for Electrocardiogram Classification","authors":"Xiaoyu Li, Chen Li, Yuhua Wei, Yuyao Sun, Jishang Wei, Xiang Li, B. Qian","doi":"10.1109/ICDM51629.2021.00043","DOIUrl":"https://doi.org/10.1109/ICDM51629.2021.00043","url":null,"abstract":"Electrocardiogram (ECG) is one of the critical diagnostic tools in healthcare. Various deep learning models, except Transformers, have been explored and applied to map ECG patterns to heart abnormalities. Transformer models have been adopted from natural language processing to computer vision with advanced features. Most recently, vision transformers show exceptional performances, even on moderate-scale datasets. However, naively applying vision transformers on electrocardiogram datasets leads to poor results. In this paper, we propose a novel network called Beat-aligned Transformer (BaT), a hierarchical Transformer that sufficiently exploits the cyclicity of ECG. We organize and treat an input ECG as multiple aligned beats instead of a single time series. In the BaT, shifted-window-based Transformer blocks (SW Block) are adopted to learn the representation for each beat, and aggregation blocks are designed to exchange information among the beat representations. Nested SW Blocks and aggregation blocks form a beat-aware hierarchical structure of BaT. In this way, the new data format and the BaT hierarchical structure boost Transformer performance on ECG classification. From the experiments on public ECG datasets, we observe BaT outperforms other Transformer-based models and achieves competitive performance compared with other state-of-the-art methods.","PeriodicalId":320970,"journal":{"name":"2021 IEEE International Conference on Data Mining (ICDM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116340108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Precise Bayes Classifier: Summary of Results
2021 IEEE International Conference on Data Mining (ICDM) Pub Date : 2021-12-01 DOI: 10.1109/ICDM51629.2021.00076
Amin Vahedian, Xun Zhou
{"title":"Precise Bayes Classifier: Summary of Results","authors":"Amin Vahedian, Xun Zhou","doi":"10.1109/ICDM51629.2021.00076","DOIUrl":"https://doi.org/10.1109/ICDM51629.2021.00076","url":null,"abstract":"The Bayes Classifier is shown to have the minimal classification error, in addition to interpretable predictions. However, it requires the knowledge of underlying distributions of the predictors to be usable. This requirement is almost never satisfied. Naive Bayes classifiers and variants estimate this classifier by assuming the independence among predictors. This restrictive assumption hinders both the accuracy of these classifiers and their interpretability, as the calculated probabilities become less reliable. Moreover, it is argued in the literature that interpretability comes at the expense of accuracy and vice versa. In this paper, we are motivated by the accurate and interpretable nature of the Bayes Classifier. We propose Precise Bayes, which is a computationally efficient estimation of the Bayes Classifier based on a new formulation. Our method makes no assumptions, neither on independence nor on underlying distributions. We devise a new theoretical minimal error rate for our formulation and show that the error rate of Precise Bayes approaches this limit with increasing number of samples learned. Moreover, the calculated posterior probabilities, are actual empirical probabilities calculated by counting the observations and outcomes. This makes the predictions made by Precise Bayes fully explainable. Our evaluations on generated datasets and real datasets validate our theoretical claims on prediction error rate and computational efficiency.","PeriodicalId":320970,"journal":{"name":"2021 IEEE International Conference on Data Mining (ICDM)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126901349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Composition-Enhanced Graph Collaborative Filtering for Multi-behavior Recommendation
2021 IEEE International Conference on Data Mining (ICDM) Pub Date : 2021-12-01 DOI: 10.1109/ICDM51629.2021.00183
Daqing Wu, Xiao Luo, Zeyu Ma, Chong Chen, Pengfei Wang, Minghua Deng, Jinwen Ma
{"title":"Composition-Enhanced Graph Collaborative Filtering for Multi-behavior Recommendation","authors":"Daqing Wu, Xiao Luo, Zeyu Ma, Chong Chen, Pengfei Wang, Minghua Deng, Jinwen Ma","doi":"10.1109/ICDM51629.2021.00183","DOIUrl":"https://doi.org/10.1109/ICDM51629.2021.00183","url":null,"abstract":"Rapid and accurate prediction of user preferences is the ultimate goal of today’s recommender systems. More and more researchers pay attention to multi-behavior recommender systems which utilize the auxiliary types of user-item interaction data, such as page view and add-to-cart to help estimate user preferences. Recently, graph-based methods were proposed to showcase an advanced capability in representation learning and capturing collaborative signals. However, we argue that these methods ignore the intrinsic difference between the two types of nodes in the bipartite graph and aggregate information from neighboring nodes with the same functions. Besides, these models do not fully explore the collaborative signals implied by the meta-path across different types of behavior, which causes a huge loss of the potential semantic information across behaviors. To address the above limitations, we present a unified graph model named SaGCN (short for Semantic-aware Graph Convolutional Networks). Specifically, we construct separate user-user and item-item graphs by meta-path, and apply separate aggregation and transformation functions to propagate user and item information. To perform better semantic propagation, we design a relation composition function and a semantic propagation architecture for heterogeneous collaborative filtering signals learning. Extensive experiments on two real-world datasets show that SaGCN outperforms a wide range of state-of-the-art methods in multi-behavior scenarios.","PeriodicalId":320970,"journal":{"name":"2021 IEEE International Conference on Data Mining (ICDM)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126679451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
PaGAN: Generative Adversarial Network for Patent understanding
2021 IEEE International Conference on Data Mining (ICDM) Pub Date : 2021-12-01 DOI: 10.1109/ICDM51629.2021.00126
Guillaume Guarino, Ahmed Samet, Amir Nafi, D. Cavallucci
{"title":"PaGAN: Generative Adversarial Network for Patent understanding","authors":"Guillaume Guarino, Ahmed Samet, Amir Nafi, D. Cavallucci","doi":"10.1109/ICDM51629.2021.00126","DOIUrl":"https://doi.org/10.1109/ICDM51629.2021.00126","url":null,"abstract":"In recent years, Deep Learning methods have become very popular in Natural Language Processing (NLP), especially transformer-based architecture. NLP domain requires a high volume of annotated data to work. Unfortunately, obtaining high-quality and voluminous labeled data is expensive and time-consuming. One promising method which has singled out for its performance in the context of data deficiency is semi-supervised learning with Generative Adversarial Networks (GAN). In this paper, we propose a new approach called PaGAN which is a combination of a document classifier and a sentence-level classifier inside a GAN for patent documents understanding. The idea is to mine the patent’s motivating problem (aka contradiction in TRIZ domain) which is fundamentally important to understand the underlying invention and its originality. PaGAN is applied and evaluated on a real-world dataset. Experiments show outperforming results of PaGAN comparatively to baseline approaches.","PeriodicalId":320970,"journal":{"name":"2021 IEEE International Conference on Data Mining (ICDM)","volume":"279 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131713572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Summarizing User-Item Matrix By Group Utility Maximization
2021 IEEE International Conference on Data Mining (ICDM) Pub Date : 2021-12-01 DOI: 10.1145/3578586
Yongjie Wang, Ke Wang, Cheng Long, C. Miao
{"title":"Summarizing User-Item Matrix By Group Utility Maximization","authors":"Yongjie Wang, Ke Wang, Cheng Long, C. Miao","doi":"10.1145/3578586","DOIUrl":"https://doi.org/10.1145/3578586","url":null,"abstract":"A user-item matrix conveniently represents the utility measure associated with (user, item) pairs, such as citation counts, users’ rating/vote on items or locations, and clicks on items. A high utility value indicates a strong association of the pair. In this work, we consider the problem of summarizing strong associations for a large user-item matrix using a small summary size. The traditional techniques fail to distinguish user groups associated with different items, such as top-l item selection, or fail to focus on high utility, such as similarity based subspace clustering and biclustering. We define a new problem, called Group Utility Maximization, to summarize the entire user population through k groups and l items for each group; the goal is to maximize the sum of utility of selected items over all groups collectively. We propose the k-max algorithm for it, which iteratively refines existing k groups. We evaluate the proposed algorithm on two real-life datasets. The results provide an easyto-understand overview of the whole dataset efficiently.","PeriodicalId":320970,"journal":{"name":"2021 IEEE International Conference on Data Mining (ICDM)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132143264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
MetaGB: A Gradient Boosting Framework for Efficient Task Adaptive Meta Learning
2021 IEEE International Conference on Data Mining (ICDM) Pub Date : 2021-12-01 DOI: 10.1109/ICDM51629.2021.00020
Manqing Dong, Lina Yao, Xianzhi Wang, Xiwei Xu, Liming Zhu
{"title":"MetaGB: A Gradient Boosting Framework for Efficient Task Adaptive Meta Learning","authors":"Manqing Dong, Lina Yao, Xianzhi Wang, Xiwei Xu, Liming Zhu","doi":"10.1109/ICDM51629.2021.00020","DOIUrl":"https://doi.org/10.1109/ICDM51629.2021.00020","url":null,"abstract":"Deep learning frameworks generally require sufficient training data to generalize well while fail to adapt on small or few-shot datasets. Meta-learning offers an effective means of tackling few-shot scenarios and has drawn increasing attention in recent years. Meta-optimization aims to learn a shared set of parameters across tasks for meta-learning while facing challenges in determining whether an initialization condition can be generalized to tasks with diverse distributions. In this regard, we propose a meta-gradient boosting framework that can fit diverse distributions based on a base learner (which learns shared information across tasks) and a series of gradient-boosted modules (which capture task-specific information). We evaluate the model on several few-shot learning benchmarks and demonstrate the effectiveness of our model in modulating task-specific meta-learned priors and handling diverse distributions.","PeriodicalId":320970,"journal":{"name":"2021 IEEE International Conference on Data Mining (ICDM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133382434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
TRIO: Task-agnostic dataset representation optimized for automatic algorithm selection
2021 IEEE International Conference on Data Mining (ICDM) Pub Date : 2021-12-01 DOI: 10.1109/ICDM51629.2021.00018
Noy Cohen-Shapira, L. Rokach
{"title":"TRIO: Task-agnostic dataset representation optimized for automatic algorithm selection","authors":"Noy Cohen-Shapira, L. Rokach","doi":"10.1109/ICDM51629.2021.00018","DOIUrl":"https://doi.org/10.1109/ICDM51629.2021.00018","url":null,"abstract":"With the growing number of machine learning (ML) algorithms, the selection of the top-performing algorithms for a given dataset, task, and evaluation measure is known to be a challenging task. The human expertise required for this task has fueled the demand for automatic solutions. Meta-learning is a popular approach for automatic algorithm selection based on dataset characterization. Existing meta-learning methods often represent the datasets using predefined features and thus cannot be generalized for various ML tasks, or alternatively, learn their representations in a supervised fashion, and thus cannot address unsupervised tasks. In this study, we first propose a novel learning-based task-agnostic method for dataset representation. Second, we present TRIO, a meta-learning approach based on the proposed dataset representation, which is capable of accurately recommending top-performing algorithms for unseen datasets. TRIO first learns graphical representations from the datasets and then utilizes a graph convolutional neural network technique to extract their latent representations. An extensive evaluation on 337 datasets and 195 ML algorithms demonstrates the effectiveness of our approach over state-of-the-art methods for algorithm selection for both supervised (classification and regression) and unsupervised (clustering) tasks.","PeriodicalId":320970,"journal":{"name":"2021 IEEE International Conference on Data Mining (ICDM)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123930102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
MASCOT: A Quantization Framework for Efficient Matrix Factorization in Recommender Systems
2021 IEEE International Conference on Data Mining (ICDM) Pub Date : 2021-12-01 DOI: 10.1109/ICDM51629.2021.00039
Yunyong Ko, Jae-Seo Yu, Hong-Kyun Bae, Y. Park, Dongwon Lee, Sang-Wook Kim
{"title":"MASCOT: A Quantization Framework for Efficient Matrix Factorization in Recommender Systems","authors":"Yunyong Ko, Jae-Seo Yu, Hong-Kyun Bae, Y. Park, Dongwon Lee, Sang-Wook Kim","doi":"10.1109/ICDM51629.2021.00039","DOIUrl":"https://doi.org/10.1109/ICDM51629.2021.00039","url":null,"abstract":"In recent years, quantization methods have successfully accelerated the training of large deep neural network (DNN) models by reducing the level of precision in computing operations (e.g., forward/backward passes) without sacrificing its accuracy. In this work, therefore, we attempt to apply such a quantization idea to the popular Matrix factorization (MF) methods to deal with the growing scale of models and datasets in recommender systems. However, to our dismay, we observe that the state-of-the-art quantization methods are not effective in the training of MF models, unlike their successes in the training of DNN models. To this phenomenon, we posit that two distinctive features in training MF models could explain the difference: (i) the training of MF models is much more memory-intensive than that of DNN models, and (ii) the quantization errors across users and items in recommendation are not uniform. From these observations, we develop a quantization framework for MF models, named MASCOT, employing novel strategies (i.e., m-quantization and g-switching) to successfully address the aforementioned limitations of quantization in the training of MF models. The comprehensive evaluation using four real-world datasets demonstrates that MASCOT improves the training performance of MF models by about 45%, compared to the training without quantization, while maintaining low model errors, and the strategies and implementation optimizations of MASCOT are quite effective in the training of MF models. For the detailed information about MASCOT, we release the code of MASCOT and the datasets at: https://github.com/Yujaeseo/lCDM-2021_MASCOT.","PeriodicalId":320970,"journal":{"name":"2021 IEEE International Conference on Data Mining (ICDM)","volume":"165 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116395968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7