2017 IEEE International Conference on Data Mining (ICDM): Latest Papers

Multi-level Multiple Attentions for Contextual Multimodal Sentiment Analysis
2017 IEEE International Conference on Data Mining (ICDM) Pub Date : 2017-11-01 DOI: 10.1109/ICDM.2017.134
Soujanya Poria, E. Cambria, Devamanyu Hazarika, Navonil Majumder, Amir Zadeh, Louis-Philippe Morency
Abstract: Multimodal sentiment analysis involves identifying sentiment in videos and is a developing field of research. Unlike current works, which model utterances individually, we propose a recurrent model that is able to capture contextual information among utterances. In this paper, we also introduce attention-based networks for improving both context learning and dynamic feature fusion. Our model shows 6-8% improvement over the state of the art on a benchmark dataset.
Citations: 142
Incorporating Spatio-Temporal Smoothness for Air Quality Inference
2017 IEEE International Conference on Data Mining (ICDM) Pub Date : 2017-11-01 DOI: 10.1109/ICDM.2017.158
Xiangyu Zhao, Tong Xu, Yanjie Fu, Enhong Chen, Hao Guo
Abstract: It is well recognized that air quality inference is of great importance for environmental protection. However, due to the limited monitoring stations and various impact factors, e.g., meteorology, traffic volume and human mobility, inference of air quality index (AQI) could be a difficult task. Recently, with the development of new ways for collecting and integrating urban, mobile, and public service data, there is a potential to leverage spatial relatedness and temporal dependencies for better AQI estimation. To that end, in this paper, we exploit a novel spatio-temporal multi-task learning strategy and develop an enhanced framework for AQI inference. Specifically, both time dependence within a single monitoring station, and spatial relatedness across all the stations will be captured, and then well trained with effective optimization to support AQI inference tasks. As air-quality related features from cross-domain data have been extracted and quantified, comprehensive experiments based on real-world datasets validate the effectiveness of our proposed framework by a significant margin compared with several state-of-the-art baselines, which supports the hypothesis that our spatio-temporal multi-task learning framework could better predict and interpret AQI fluctuation.
Citations: 17
Novel Exact and Approximate Algorithms for the Closest Pair Problem
2017 IEEE International Conference on Data Mining (ICDM) Pub Date : 2017-11-01 DOI: 10.1109/ICDM.2017.136
S. Rajasekaran, Subrata Saha, Xingyu Cai
Abstract: The closest pair problem (CPP) is an important problem that has numerous applications in clustering, graph partitioning, image processing, patterns identification, intrusion detection, etc. Numerous algorithms have been presented for solving the CPP. For instance, on n points there exists an O(n log n) time algorithm for CPP (when the dimension is a constant). There also exist randomized algorithms with an expected linear run time. However, these algorithms do not perform well in practice. The algorithms that are employed in practice have a worst case quadratic run time. One of the best performing algorithms for the CPP is MK (originally designed for solving the time series motif finding problem). In this paper we present an elegant exact algorithm called MPR for the CPP that performs better than MK. Also, we present approximation algorithms for the CPP that are faster than MK by up to a factor of more than 40, while maintaining a very good accuracy.
Citations: 2
A Short-Term Rainfall Prediction Model Using Multi-task Convolutional Neural Networks
2017 IEEE International Conference on Data Mining (ICDM) Pub Date : 2017-11-01 DOI: 10.1109/ICDM.2017.49
Minghui Qiu, P. Zhao, K. Zhang, Jun Huang, Xing Shi, Xiaoguang Wang, Wei Chu
Abstract: Precipitation prediction, such as short-term rainfall prediction, is a very important problem in the field of meteorological service. In practice, most recent studies focus on leveraging radar data or satellite images to make predictions. However, there is another scenario where a set of weather features are collected by various sensors at multiple observation sites. The observations of a site are sometimes incomplete but provide important clues for weather prediction at nearby sites, which are not fully exploited in existing work yet. To solve this problem, we propose a multi-task convolutional neural network model to automatically extract features from the time series measured at observation sites and leverage the correlation between the multiple sites for weather prediction via multi-tasking. To the best of our knowledge, this is the first attempt to use multi-task learning and deep learning techniques to predict short-term rainfall amount based on multi-site features. Specifically, we formulate the learning task as an end-to-end multi-site neural network model which allows leveraging the learned knowledge from one site to other correlated sites, and modeling the correlations between different sites. Extensive experiments show that the learned site correlations are insightful and the proposed model significantly outperforms a broad set of baseline models including the European Centre for Medium-range Weather Forecasts system (ECMWF).
Citations: 88
A Self-Paced Category-Aware Approach for Unsupervised Adaptation Networks
2017 IEEE International Conference on Data Mining (ICDM) Pub Date : 2017-11-01 DOI: 10.1109/ICDM.2017.115
Wenzhen Huang, Peipei Yang, Kaiqi Huang
Abstract: The success of deep neural networks usually relies on a large number of labeled training samples, which unfortunately are not easy to obtain in practice. Unsupervised domain adaptation focuses on the problem where there is no labeled data in the target domain. In this paper, we propose a novel deep unsupervised domain adaptation method that learns transferable features. Different from most existing methods, it attempts to learn a better domain-invariant feature representation by performing a category-wise adaptation to match the conditional distributions of samples with respect to each category. A self-paced learning strategy is used to bring the awareness of label information gradually, which makes the category-wise adaptation feasible even if the labels are unavailable in the target domain. Then, we give detailed theoretical analysis to explain how the better performance is obtained. The experimental results show that our method outperforms the current state of the art on standard domain adaptation datasets.
Citations: 2
A Probabilistic Geographical Aspect-Opinion Model for Geo-Tagged Microblogs
2017 IEEE International Conference on Data Mining (ICDM) Pub Date : 2017-11-01 DOI: 10.1109/ICDM.2017.82
Aman Ahuja, Wei Wei, Wei Lu, Kathleen M. Carley, C. Reddy
Abstract: Due to the rapid increase in the number of users owning location-based devices, there is a considerable amount of geo-tagged data available on social media websites, such as Twitter and Facebook. This geo-tagged data can be useful in a variety of ways to extract location-specific information, as well as to comprehend the variation of information across different geographical regions. A lot of techniques have been proposed for extracting location-based information from social media, but none of these techniques aim to utilize an important characteristic of this data, which is the presence of aspects and their opinions, expressed by the users on these platforms. In this paper, we propose Geographic Aspect Opinion model (GASPOP), a probabilistic model that jointly discovers the variation of aspect and opinion that correspond to different topics across various geographical regions from geo-tagged social media data. It incorporates the syntactic features of text in the generative process to differentiate aspect and opinion words from general background words. The user-based modeling of topics also enables it to determine the interest distribution of various users. Furthermore, our model can be used to predict the location of different tweets based on their text. We evaluated our model on Twitter data, and our experimental results show that GASPOP can jointly discover latent aspect and opinion words for different topics across latent geographical regions. Moreover, a quantitative analysis of GASPOP using widely used evaluation metrics shows that it outperforms the state-of-the-art methods.
Citations: 4
Mining Customer Valuations to Optimize Product Bundling Strategy
2017 IEEE International Conference on Data Mining (ICDM) Pub Date : 2017-11-01 DOI: 10.1109/ICDM.2017.65
Li Ye, Hong Xie, Weijie Wu, John C.S. Lui
Abstract: Product bundling is widely adopted for information goods and online services because it can increase profit for companies. For example, cable companies often bundle Internet access and video streaming services together. However, it is challenging to obtain an optimal bundling strategy, not only because it is computationally expensive, but also because customers’ private information (e.g., valuations for products) is needed for the decision, and we need to infer it from accessible datasets. As customers’ purchasing data are getting richer due to the popularity of online shopping, doors are open for us to infer this information. This paper aims to address: (1) How to infer customers’ valuations from the purchasing data? (2) How to determine the optimal product bundle to maximize the profit? We first formulate a profit maximization framework to select the optimal bundle set. We show that finding the optimal bundle set is NP-hard. We then identify key factors that impact the profitability of product bundling. These findings give us insights to develop a computationally efficient algorithm to approximate the optimal product bundle with a provable performance guarantee. To obtain the input of the bundling algorithm, we infer the distribution of customers’ valuations from their purchasing data, based on which we run our bundling algorithm and conduct experiments on an Amazon co-purchasing dataset. We extensively evaluate the accuracy of our inference and the bundling algorithm. Our results reveal conditions under which bundling is highly profitable and provide insights to guide the deployment of product bundling.
Citations: 4
Exploratory Analysis of Graph Data by Leveraging Domain Knowledge
2017 IEEE International Conference on Data Mining (ICDM) Pub Date : 2017-11-01 DOI: 10.1109/ICDM.2017.28
Di Jin, Danai Koutra
Abstract: Given the soaring amount of data being generated daily, graph mining tasks are becoming increasingly challenging, leading to tremendous demand for summarization techniques. Feature selection is a representative approach that simplifies a dataset by choosing features that are relevant to a specific task, such as classification, prediction, and anomaly detection. Although it can be viewed as a way to summarize a graph in terms of a few features, it is not well-defined for exploratory analysis, and it operates on a set of observations jointly rather than conditionally (i.e., feature selection from many graphs vs. selection for an input graph conditioned on other graphs). In this work, we introduce EAGLE (Exploratory Analysis of Graphs with domain knowLEdge), a novel method that creates interpretable, feature-based, and domain-specific graph summaries in a fully automatic way. That is, the same graph in different domains–e.g., social science and neuroscience–will be described via different EAGLE summaries, which automatically leverage the domain knowledge and expectations. We propose an optimization formulation that seeks to find an interpretable summary with the most representative features for the input graph so that it is: diverse, concise, domain-specific, and efficient. Extensive experiments on synthetic and real-world datasets with up to ~1M edges and ~400 features demonstrate the effectiveness and efficiency of EAGLE and its benefits over existing methods. We also show how our method can be applied to various graph mining tasks, such as classification and exploratory analysis.
Citations: 13
Collective Entity Resolution in Familial Networks
2017 IEEE International Conference on Data Mining (ICDM) Pub Date : 2017-11-01 DOI: 10.1109/ICDM.2017.32
Pigi Kouki, J. Pujara, C. Marcum, L. Koehly, L. Getoor
Abstract: Entity resolution in settings with rich relational structure often introduces complex dependencies between co-references. Exploiting these dependencies is challenging - it requires seamlessly combining statistical, relational, and logical dependencies. One task of particular interest is entity resolution in familial networks. In this setting, multiple partial representations of a family tree are provided, from the perspective of different family members, and the challenge is to reconstruct a family tree from these multiple, noisy, partial views. This reconstruction is crucial for applications such as understanding genetic inheritance, tracking disease contagion, and performing census surveys. Here, we design a model that incorporates statistical signals, such as name similarity, relational information, such as sibling overlap, and logical constraints, such as transitivity and bijective matching, in a collective model. We show how to integrate these features using probabilistic soft logic, a scalable probabilistic programming framework. In experiments on real-world data, our model significantly outperforms state-of-the-art classifiers that use relational features but are incapable of collective reasoning.
Citations: 27
Effective Large-Scale Online Influence Maximization
2017 IEEE International Conference on Data Mining (ICDM) Pub Date : 2017-11-01 DOI: 10.1109/ICDM.2017.118
Paul Lagrée, O. Cappé, Bogdan Cautis, S. Maniu
Abstract: In this paper, we study a highly generic version of influence maximization (IM), one of optimizing influence campaigns by sequentially selecting "spread seeds" from a set of candidates, a small subset of the node population, under the hypothesis that, in a given campaign, previously activated nodes remain "persistently" active throughout and thus do not yield further rewards. We call this problem online influence maximization with persistence. We introduce an estimator on the candidates' missing mass – the expected number of nodes that can still be reached from a given seed candidate – and justify its strength to rapidly estimate the desired value. We then describe a novel algorithm, GT-UCB, relying on upper confidence bounds on the missing mass. We show that our approach leads to high-quality spreads on classic IM datasets, even though it makes almost no assumptions on the diffusion medium. Importantly, it is orders of magnitude faster than state-of-the-art IM methods.
Citations: 8