2021 IEEE International Conference on Data Mining (ICDM): Latest Papers, Page 5

Robust Low-rank Deep Feature Recovery in CNNs: Toward Low Information Loss and Fast Convergence

2021 IEEE International Conference on Data Mining (ICDM) Pub Date : 2021-12-01 DOI: 10.1109/ICDM51629.2021.00064

Jiahuan Ren, Zhao Zhang, Jicong Fan, Haijun Zhang, Mingliang Xu, Meng Wang

Abstract: Deep models built on Convolutional Neural Networks (CNNs) have achieved impressive performance in image representation. However, their representation ability may still be restricted, and they usually need more epochs to converge in training, because useful information is lost during the convolution and pooling operations. We therefore propose a general feature recovery layer, termed Low-rank Deep Feature Recovery (LDFR), that enhances the representation ability of convolutional features by seamlessly integrating low-rank recovery into CNNs, and that can be easily extended to all existing CNN-based models. Specifically, to recover the information lost during the convolution operation, LDFR learns low-rank projections that embed the feature maps onto a low-rank subspace based on selected informative convolutional feature maps. This low-rank recovery operation allows all convolutional feature maps to be reconstructed easily, recovering the underlying subspace with more useful and detailed information; for example, the strokes of characters or the texture of clothes can be enhanced after LDFR. In addition, to make the learnt low-rank subspaces more powerful for feature recovery, we design a fusion strategy that obtains a generalized subspace by averaging over all learnt subspaces in each LDFR layer, so that the convolutional feature maps in the test phase can be recovered effectively via low-rank embedding. Extensive results on several image datasets show that existing CNN-based models equipped with our LDFR layer obtain better performance.

Citations: 6

MC-RGCN: A Multi-Channel Recurrent Graph Convolutional Network to Learn High-Order Social Relations for Diffusion Prediction

2021 IEEE International Conference on Data Mining (ICDM) Pub Date : 2021-12-01 DOI: 10.1109/ICDM51629.2021.00130

Ningbo Huang, Gang Zhou, Mengli Zhang, Mengli Zhang

Abstract: Information diffusion prediction aims to predict the tendency of information spreading in a network. Previous methods focus on extracting chronological features from diffusion paths and leverage relations in the social graph as side information to facilitate diffusion prediction. However, abundant high-order social relations in information diffusion, such as co-repost and co-following, which can further reveal potential common user preferences, have not been sufficiently utilized. In this paper, we construct a heterogeneous diffusion network (HDN) from the social graph and information cascades to model the high-order social relations in information diffusion. We then design a novel model named Multi-Channel Recurrent Graph Convolutional Network (MC-RGCN), which extracts high-order social relation semantics from the channels of the HDN to improve prediction performance. In each channel, we depict a specific social relation from the views of global topology, pairwise strength, and local structure. Finally, we conduct extensive experiments on three real-world datasets, and the results show that our proposed method outperforms state-of-the-art models on diffusion prediction.

Citations: 1

A Linear Primal-Dual Multi-Instance SVM for Big Data Classifications

2021 IEEE International Conference on Data Mining (ICDM) Pub Date : 2021-12-01 DOI: 10.1109/ICDM51629.2021.00012

Lodewijk Brand, L. Baker, Carla Ellefsen, Jackson Sargent, Hua Wang

Abstract: Multi-instance learning (MIL) is an area of machine learning that handles data organized into sets of instances known as bags. Traditionally, MIL is used in the supervised-learning setting and can classify bags that contain any number of instances. This property allows MIL to be applied naturally to a wide variety of real-world applications, from computer vision to healthcare. However, many traditional MIL algorithms do not scale efficiently to large datasets. In this paper we present a novel Primal-Dual Multi-Instance Support Vector Machine (pdMISVM) derivation and implementation that can operate efficiently on large-scale data. Our method relies on an algorithm derived using a multi-block variation of the alternating direction method of multipliers (ADMM). The approach scales to large data because it avoids iteratively solving the quadratic programming problems that are generally used to optimize SVM-based MIL algorithms. In addition, we modify our derivation to include an optimization that avoids solving a least-squares problem during our algorithm; this makes our approach better able to handle a large number of features as well as bags. Finally, we apply our approach to synthetic and real-world multi-instance datasets to illustrate the scalability, promising predictive performance, and interpretability of our proposed method. We end our discussion with an extension of our approach to handle non-linear decision boundaries. Code and data for our methods are available online at: https://github.com/minds-mines/pdMISVM.jl.

Citations: 4

Bi-Level Attention Graph Neural Networks

2021 IEEE International Conference on Data Mining (ICDM) Pub Date : 2021-12-01 DOI: 10.1109/ICDM51629.2021.00133

Roshni G. Iyer, Wen Wang, Yizhou Sun

Abstract: Recent graph neural networks (GNNs) with the attention mechanism have historically been limited to small-scale homogeneous graphs (HoGs). However, GNNs handling heterogeneous graphs (HeGs), which contain several entity and relation types, all have shortcomings in handling attention. Most GNNs that learn graph attention for HeGs learn either node-level or relation-level attention, but not both, limiting their ability to predict both important entities and relations in the HeG. Even the best existing method that learns both levels of attention has the limitation of assuming that graph relations are independent, so its learned attention disregards the dependencies among them. To effectively model both multi-relational and multi-entity large-scale HeGs, we present Bi-Level Attention Graph Neural Networks (BA-GNN), scalable neural networks (NNs) that use a novel bi-level graph attention mechanism. BA-GNN models both node-node and relation-relation interactions in a personalized way, by hierarchically attending to both types of information from local neighborhood contexts instead of the global graph context. Rigorous experiments on seven real-world HeGs show that BA-GNN consistently outperforms all baselines, and demonstrate the quality and transferability of its learned relation-level attention for improving the performance of other GNNs.
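
The idea of attending hierarchically, first over neighboring nodes and then over relation types, can be illustrated with a minimal sketch. This is not the BA-GNN architecture: the abstract does not give its scoring functions, so the dot-product scores, the function name `bi_level_attend`, and the dictionary input format below are all assumptions made for illustration.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


def bi_level_attend(target, neighbors_by_relation):
    """Hypothetical two-level attention over a node's local neighborhood.

    target: embedding of the node being updated (list of floats).
    neighbors_by_relation: dict mapping a relation name to a non-empty
    list of neighbor embeddings reached through that relation.
    """
    names, summaries = [], []
    for relation, neighbors in neighbors_by_relation.items():
        # Node level: attend over neighbors reached through this relation.
        # Dot-product scoring is an assumption, not the paper's choice.
        node_weights = softmax([dot(target, n) for n in neighbors])
        names.append(relation)
        summaries.append(weighted_sum(node_weights, neighbors))
    # Relation level: attend over the per-relation summaries.
    relation_weights = softmax([dot(target, s) for s in summaries])
    output = weighted_sum(relation_weights, summaries)
    return output, dict(zip(names, relation_weights))
```

Both levels use only the node's local neighborhood, which matches the abstract's emphasis on local rather than global graph context; everything else about the sketch is a simplification.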

Citations: 12

Limited-memory Common-directions Method With Subsampled Newton Directions for Large-scale Linear Classification

2021 IEEE International Conference on Data Mining (ICDM) Pub Date : 2021-12-01 DOI: 10.1109/ICDM51629.2021.00188

Jui-Nan Yen, Chih-Jen Lin

Citations: 0

Towards Interpretability and Personalization: A Predictive Framework for Clinical Time-series Analysis

2021 IEEE International Conference on Data Mining (ICDM) Pub Date : 2021-12-01 DOI: 10.1109/ICDM51629.2021.00045

Yang Li, Xianli Zhang, B. Qian, Zeyu Gao, Chong Guan, Yefeng Zheng, Hansen Zheng, Fenglang Wu, Chen Li

Abstract: Clinical time-series analysis has received long-term attention in the data mining and machine learning communities and has boosted a variety of data-driven applications. Identifying similar patients or subgroups from clinical time-series is an essential step in designing tailored treatments in clinical practice. However, most existing methods are either purely unsupervised, and so tend to neglect patient outcome information, or cannot generate personalized patient representations through supervised learning; they may therefore fail to identify ‘truly similar patients’ (i.e., patients who are similar in both outcomes and individual outcome-related clinical variables). To tackle these limitations, we propose a novel predictive clinical time-series analysis framework. Specifically, our framework uses task-specific information to rule out the task-irrelevant factors in each patient's data individually and generates contribution scores that reveal each factor's importance for the patient outcome. A patient representation construction method then generates task-related and personalized representations by combining the remaining factors and their contribution scores. Finally, similarity measurement or cluster analysis can be conducted. We evaluate our framework on three real-world clinical time-series datasets and empirically demonstrate that it achieves improvements in prediction performance, similarity measurement, and clustering, thus potentially benefiting patient-similarity-based precision medicine applications.

Citations: 1

ENGINE: Enhancing Neuroimaging and Genetic Information by Neural Embedding

2021 IEEE International Conference on Data Mining (ICDM) Pub Date : 2021-12-01 DOI: 10.1109/ICDM51629.2021.00139

Wonjun Ko, Wonsik Jung, Eunjin Jeon, A. Mulyadi, Heung-Il Suk

Abstract: Recently, deep learning, a branch of machine learning and data mining, has gained widespread acceptance in many applications thanks to its unprecedented successes. In this regard, pioneering studies have employed deep learning frameworks for imaging genetics by virtue of their representational power. However, existing approaches suffer from some limitations: (i) exploiting a simple concatenation strategy for joint analysis, (ii) a lack of extension to biomedical applications, and (iii) insufficient and inappropriate interpretations from the viewpoint of both data science and bio-neuroscience. In this work, we propose a novel deep learning framework to tackle these issues simultaneously. Our proposed framework learns to effectively represent the neuroimaging and the genetic data jointly, and achieves state-of-the-art performance in Alzheimer’s disease and mild cognitive impairment identification. Further, unlike the existing methods in the literature, the framework allows learning the relation between imaging phenotypes and genotypes in a nonlinear way without any prior neuroscientific knowledge. To demonstrate the validity of our proposed framework, we conducted experiments on a publicly available dataset and analyzed the results from diverse perspectives. Based on our experimental results, we believe that the proposed framework has great potential to give new insights and perspectives in deep learning-based imaging genetics studies.

Citations: 1

Pest-YOLO: Deep Image Mining and Multi-Feature Fusion for Real-Time Agriculture Pest Detection

2021 IEEE International Conference on Data Mining (ICDM) Pub Date : 2021-12-01 DOI: 10.1109/ICDM51629.2021.00169

Zhe Tang, Zhengyun Chen, Fang Qi, Lingyan Zhang, Shuhong Chen

Abstract: Frequent outbreaks of agricultural pests have caused heavy losses in crop production. The small size and high similarity of agricultural pests make prompt and accurate pest detection with imaging technologies challenging. The key aim of this paper is to achieve a good balance between efficiency and accuracy for pest detection on the basis of agricultural image data mining. This paper proposes Pest-YOLO, a real-time agricultural pest detection method based on an improved convolutional neural network (CNN) and YOLOv4. First, a squeeze-and-excitation attention mechanism module is introduced into the CNN for mining image data, extracting key features, and suppressing unrelated features. Then, a cross-stage multi-feature fusion method is designed to improve the structure of the feature pyramid network and path aggregation network, thus enhancing the feature expressiveness of small targets like pests. Finally, Pest-YOLO realizes end-to-end real-time pest detection with high accuracy based on the improved CNN and YOLOv4. We evaluate the performance of our method on a typical large-scale pest dataset including 28k images and 24 classes. Experimental results demonstrate that our method outperforms state-of-the-art solutions including Faster R-CNN and YOLO-based detectors, and achieves good performance with 71.6% mAP and 83.5% Recall. The proposed method is effective and applicable for accurate, real-time intelligent pest detection without expert feature engineering.
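
The squeeze-and-excitation mechanism mentioned in the abstract can be sketched generically as three steps: pool each channel to a scalar, pass the scalars through a small bottleneck network, and rescale the channels by the resulting gates. The sketch below is a plain-Python illustration of that general technique, not the Pest-YOLO module; the function name `squeeze_excite`, the nested-list tensor layout, and the bias-free weight matrices `w1` and `w2` are assumptions.

```python
import math


def squeeze_excite(feature_maps, w1, w2):
    """Reweight the channels of a C x H x W tensor given as nested lists.

    w1 (hidden x C) and w2 (C x hidden) are hypothetical weights that
    would normally be learned; biases are omitted for brevity.
    """
    # Squeeze: global average pooling yields one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_maps, gates)
    ]
    return scaled, gates
```

Because each gate lies strictly between 0 and 1, channels the gating network scores low are attenuated rather than removed, which is one way to read the abstract's "suppressing unrelated features".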

Citations: 9

AS-GCN: Adaptive Semantic Architecture of Graph Convolutional Networks for Text-Rich Networks

2021 IEEE International Conference on Data Mining (ICDM) Pub Date : 2021-12-01 DOI: 10.1109/ICDM51629.2021.00095

Zhizhi Yu, Di Jin, Ziyang Liu, Dongxiao He, Xiao Wang, Hanghang Tong, Jiawei Han

Abstract: Graph Neural Networks (GNNs) have demonstrated great power in many network analytical tasks. However, graphs (i.e., networks) in the real world are usually text-rich, implying that valuable semantic information needs to be carefully considered. Existing GNNs for text-rich networks typically treat text as attribute words alone, which inevitably leads to the loss of important semantic structures, limiting the representation capability of GNNs. In this paper, we propose an end-to-end adaptive semantic architecture of graph convolutional networks, namely AS-GCN, which unifies a neural topic model and graph convolutional networks for text-rich network representation. Specifically, we use a neural topic model to extract the global topic semantics, and accordingly augment the original text-rich network into a tri-typed heterogeneous network, capturing both the local word-sequence semantic structure and the global topic semantic structure from text. We then design an effective semantic-aware propagation of information by introducing a discriminative convolution mechanism. We further propose two strategies, distribution sharing and joint training, to adaptively generate a proper network structure based on the learning objective to improve network representation. Extensive experiments on text-rich networks show that our new architecture outperforms state-of-the-art methods by a significant margin. This architecture can also be applied to e-commerce search scenarios, and experiments on a real e-commerce problem from JD further demonstrate the superiority of the proposed architecture over the baselines.

Citations: 36

Area Chairs and Program Committee

2021 IEEE International Conference on Data Mining (ICDM) Pub Date : 2021-12-01 DOI: 10.1109/icdm51629.2021.00007

Citations: 0