2013 IEEE 13th International Conference on Data Mining: Latest Publications

An Unsupervised Algorithm for Learning Blocking Schemes

2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.60

M. Kejriwal, Daniel P. Miranker

Citations: 65

Binary Time-Series Query Framework for Efficient Quantitative Trait Association Study

2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.42

Hongfei Wang, Xiang Zhang

Abstract: Quantitative trait association study examines the association between quantitative traits and genetic variants. As a promising tool, it has been widely applied to dissect the genetic basis of complex diseases. However, such a study usually involves testing trillions of variant-trait pairs and demands intensive computational resources. Recently, several algorithms have been developed to improve its efficiency. In this paper, we propose a framework, Fabrique, which models quantitative trait association study as querying binary time-series and bridges the two seemingly different problems. Specifically, in the proposed framework, genetic variants are treated as a database consisting of binary time-series. Finding trait-associated variants is equivalent to finding the nearest neighbors of the trait. To make queries efficient, Fabrique partitions and normalizes the binary time-series, and estimates a tight upper bound for each group of time-series to prune the search space. Extensive experimental results demonstrate that Fabrique only needs to search a very small portion of the database to locate the target variants and significantly outperforms the state-of-the-art method. We also show that Fabrique can be applied to other binary time-series query problems in addition to the genetic association study.
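The nearest-neighbor view described in the abstract above can be illustrated with a small brute-force sketch. This is not Fabrique itself: it omits the partitioning and upper-bound pruning, and it assumes that association is measured by signed Pearson correlation, which the abstract does not state. The function names and the toy data are hypothetical. The sketch relies on the standard identity that, for z-normalized series of length n, the squared Euclidean distance equals 2n(1 - r), so the variant closest to the trait is also the one most correlated with it.

```python
import math


def z_normalize(series):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    if std == 0:
        raise ValueError("constant series cannot be z-normalized")
    return [(v - mean) / std for v in series]


def pearson(x, y):
    """Pearson correlation, computed from the z-normalized series."""
    zx, zy = z_normalize(x), z_normalize(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


def squared_distance(x, y):
    """Squared Euclidean distance between the z-normalized series."""
    zx, zy = z_normalize(x), z_normalize(y)
    return sum((a - b) ** 2 for a, b in zip(zx, zy))


def nearest_variant(variants, trait):
    """Brute-force scan over every variant (no pruning, unlike the paper)."""
    return min(variants, key=lambda name: squared_distance(variants[name], trait))


# Hypothetical toy data: three binary variants over six individuals.
variants = {
    "v1": [0, 0, 1, 1, 0, 1],
    "v2": [1, 0, 0, 1, 1, 0],
    "v3": [0, 1, 1, 0, 0, 1],
}
trait = [1.2, 0.9, 3.1, 2.8, 1.0, 3.3]

best = nearest_variant(variants, trait)
```

Because distance and correlation are tied by a monotone relation, ranking by smallest distance and ranking by largest correlation give the same order; a pruning scheme such as the one the abstract describes would aim to skip most of the distance computations in this scan.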

Citations: 0

Discriminatively Enhanced Topic Models

2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.107

Snigdha Chaturvedi, Hal Daumé, Taesun Moon

Citations: 0

Context-Aware MIML Instance Annotation

2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.115

Forrest Briggs, Xiaoli Z. Fern, R. Raich

Abstract: In multi-instance multi-label (MIML) instance annotation, the goal is to learn an instance classifier while training on a MIML dataset, which consists of bags of instances paired with label sets; instance labels are not provided in the training data. The MIML formulation can be applied in many domains. For example, in an image domain, bags are images, instances are feature vectors representing segments in the images, and the label sets are lists of objects or categories present in each image. Although many MIML algorithms have been developed for predicting the label set of a new bag, only a few have been specifically designed to predict instance labels. We propose MIML-ECC (ensemble of classifier chains), which exploits bag-level context through label correlations to improve instance-level prediction accuracy. The proposed method is scalable in all dimensions of a problem (bags, instances, classes, and feature dimension), and has no parameters that require tuning (which is a problem for prior methods). In experiments on two image datasets, a bioacoustics dataset, and two artificial datasets, MIML-ECC achieves higher or comparable accuracy in comparison to several recent methods and baselines.
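The MIML data layout described in the abstract above (bags of instances paired with label sets, with no instance labels) can be sketched as a small data structure. This shows only the input format, not MIML-ECC. The `Bag` class and `candidate_labels` helper are hypothetical names, and the rule that each instance's label must come from its bag's label set is an assumption made for illustration, not something the abstract states.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Bag:
    """One training example, e.g. one image in the image domain."""
    instances: List[List[float]]  # one feature vector per segment
    labels: Set[str]              # bag-level label set; no per-instance labels


def candidate_labels(bag: Bag) -> List[Set[str]]:
    """Candidate labels per instance.

    Assumption for illustration: an instance can only carry a label
    that appears in the label set of the bag it belongs to.
    """
    return [set(bag.labels) for _ in bag.instances]


# Hypothetical toy bag: an image with three segments and two labels.
bag = Bag(
    instances=[[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    labels={"sky", "tree"},
)
candidates = candidate_labels(bag)
```

An instance annotator trained on such bags has to decide, for each feature vector, which of the candidate labels applies, even though the training data never says so directly.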

Citations: 5

Markov Blanket Feature Selection with Non-faithful Data Distributions

2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.154

Kui Yu, Xindong Wu, Zan Zhang, Yang Mu, Hao Wang, W. Ding

Citations: 11

Spatio-Temporal Topic Modeling in Mobile Social Media for Location Recommendation

2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.139

Bo Hu, Mohsen Jamali, M. Ester

Citations: 68

Co-ClusterD: A Distributed Framework for Data Co-Clustering with Sequential Updates

2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.76

Sen Su, Xiang Cheng, Lixin Gao, Jiangtao Yin

Citations: 4

Reconstructing Individual Mobility from Smart Card Transactions: A Space Alignment Approach

2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.37

Nicholas Jing Yuan, Yingzi Wang, Fuzheng Zhang, Xing Xie, Guangzhong Sun

Abstract: Smart card transactions capture rich information about human mobility and urban dynamics, and are therefore of particular interest to urban planners and location-based service providers. However, since most transaction systems are designed only for billing purposes, fine-grained location information, such as the exact boarding and alighting stops of a bus trip, is typically only partially available or not available at all, which blocks deep exploitation of this rich and valuable data at the individual level. This paper presents a "space alignment" framework to reconstruct individual mobility history from a large-scale smart card transaction dataset pertaining to a metropolitan city. Specifically, we show that by delicately aligning the monetary space and geospatial space with the temporal space, we are able to extrapolate a series of critical domain-specific constraints. Later, these constraints are naturally incorporated into a semi-supervised conditional random field to infer the exact boarding and alighting stops of all transit routes with a surprisingly high accuracy, e.g., given only 10% of trips with known alighting/boarding stops, we successfully inferred more than 78% of alighting and boarding stops from all unlabeled trips. In addition, we demonstrated that the smart card data enriched by the proposed approach dramatically improved the performance of a conventional method for identifying users' home and work places (with 88% improvement on home detection and 35% improvement on work place detection). The proposed method offers the possibility to mine individual mobility from common public transit transactions, and showcases how uncertain data can be leveraged with domain knowledge and constraints to support cross-application data mining tasks.

Citations: 56

Hibernating Process: Modelling Mobile Calls at Multiple Scales

2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.82

Siyuan Liu, Lei Li, Ramayya Krishnan

Citations: 0

Efficient Proper Length Time Series Motif Discovery

2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.111

Sorrachai Yingchareonthawornchai, Haemwaan Sivaraks, T. Rakthanmanon, C. Ratanamahatana

Citations: 28