Proceedings of the 21st ACM international conference on Information and knowledge management: latest papers, page 10

Mining high utility itemsets without candidate generation

Proceedings of the 21st ACM international conference on Information and knowledge management Pub Date : 2012-10-29 DOI: 10.1145/2396761.2396773

Mengchi Liu, Jun-Feng Qu

Abstract: High utility itemsets are sets of items with high utility, such as profit, in a database. Efficient mining of high utility itemsets plays a crucial role in many real-life applications and is an important research issue in the data mining area. To identify high utility itemsets, most existing algorithms first generate candidate itemsets by overestimating their utilities, and subsequently compute the exact utilities of these candidates. These algorithms generate a very large number of candidates, most of which turn out not to be high utility once their exact utilities are computed. In this paper, we propose an algorithm, called HUI-Miner (High Utility Itemset Miner), for high utility itemset mining. HUI-Miner uses a novel structure, called utility-list, to store both the utility information about an itemset and the heuristic information for pruning the search space of HUI-Miner. By avoiding the costly generation and utility computation of numerous candidate itemsets, HUI-Miner can efficiently mine high utility itemsets from the utility-lists constructed from a mined database. We compared HUI-Miner with the state-of-the-art algorithms on various databases, and experimental results show that HUI-Miner outperforms these algorithms in terms of both running time and memory consumption.

Citations: 594
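
To make the notion of itemset utility in the abstract above concrete, here is a minimal sketch that computes exact utilities by brute-force enumeration. It is not HUI-Miner and does not build utility-lists; the toy transactions, the unit profits, and the function names are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 3},
]
# Hypothetical unit profit of each item.
PROFIT = {"a": 5, "b": 2, "c": 1}


def itemset_utility(itemset, transactions, profit):
    """Sum quantity * profit over every transaction containing the whole itemset."""
    total = 0
    for transaction in transactions:
        if all(item in transaction for item in itemset):
            total += sum(transaction[item] * profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, profit, min_util):
    """Brute force: enumerate every itemset and keep those reaching min_util."""
    items = sorted(profit)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions, profit)
            if utility >= min_util:
                result[itemset] = utility
    return result


if __name__ == "__main__":
    print(high_utility_itemsets(TRANSACTIONS, PROFIT, min_util=15))
```

On this toy data the itemset ("b",) has utility 6 and falls below a threshold of 15, while its superset ("a", "b") has utility 16 and passes. Utility is therefore not anti-monotone, which is one reason candidate-based methods resort to overestimated bounds rather than pruning on exact utility.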

An evaluation of corpus-driven measures of medical concept similarity for information retrieval

Proceedings of the 21st ACM international conference on Information and knowledge management Pub Date : 2012-10-29 DOI: 10.1145/2396761.2398661

B. Koopman, G. Zuccon, P. Bruza, Laurianne Sitbon, Michael Lawley

Citations: 44

Information-complete and redundancy-free keyword search over large data graphs

Proceedings of the 21st ACM international conference on Information and knowledge management Pub Date : 2012-10-29 DOI: 10.1145/2396761.2398712

Byron J. Gao, Zhumin Chen, Qi Kang

Citations: 1

Finding top k most influential spatial facilities over uncertain objects

Proceedings of the 21st ACM international conference on Information and knowledge management Pub Date : 2012-10-29 DOI: 10.1145/2396761.2396878

Liming Zhan, Ying Zhang, W. Zhang, Xuemin Lin

Citations: 18

Time feature selection for identifying active household members

Proceedings of the 21st ACM international conference on Information and knowledge management Pub Date : 2012-10-29 DOI: 10.1145/2396761.2398628

P. Campos, Alejandro Bellogín, F. Díez, Iván Cantador

Citations: 8

A tensor encoding model for semantic processing

Proceedings of the 21st ACM international conference on Information and knowledge management Pub Date : 2012-10-29 DOI: 10.1145/2396761.2398617

Mike Symonds, P. Bruza, Laurianne Sitbon, I. Turner

Citations: 5

Accelerating locality preserving nonnegative matrix factorization

Proceedings of the 21st ACM international conference on Information and knowledge management Pub Date : 2012-10-29 DOI: 10.1145/2396761.2398618

Guanhong Yao, Deng Cai

Abstract: Matrix factorization techniques have been frequently applied in information retrieval, computer vision and pattern recognition. Among them, Non-negative Matrix Factorization (NMF) has received considerable attention due to its psychological and physiological interpretation of naturally occurring data whose representation may be parts-based in the human brain. Locality Preserving Non-negative Matrix Factorization (LPNMF) is a recently proposed graph-based NMF extension which tries to preserve the intrinsic geometric structure of the data. Compared with the original NMF, LPNMF has more discriminating power on data representation thanks to its geometrical interpretation and outstanding ability to discover the hidden topics. However, the computational complexity of LPNMF is O(n^3), where n is the number of samples. In this paper, we propose a novel approach called Accelerated LPNMF (A-LPNMF) to solve the computational issue of LPNMF. Specifically, A-LPNMF selects p (p ≪ n) landmark points from the data and represents all the samples as sparse linear combinations of these landmarks. The non-negative factors which incorporate the geometric structure can then be efficiently computed. Experimental results on real data sets demonstrate the effectiveness and efficiency of our proposed method.

Citations: 1
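
As background for the abstract above, the following is a minimal sketch of plain NMF using the standard multiplicative update rules of Lee and Seung. It is not LPNMF or A-LPNMF: there is no graph regularization and no landmark selection. The function names, the small constant `eps`, and the example matrix are assumptions made for illustration, and the code is written in pure Python only to stay dependency-free.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative n x m matrix V into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random initialisation.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise.
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise.
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


if __name__ == "__main__":
    V = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]  # hypothetical non-negative data
    W, H = nmf(V, rank=2)
    print(frobenius_error(V, W, H))
```

Because each update only multiplies by non-negative ratios, W and H stay non-negative throughout, which is what yields the parts-based representation the abstract refers to.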

PLEAD 2012: politics, elections and data

Proceedings of the 21st ACM international conference on Information and knowledge management Pub Date : 2012-10-29 DOI: 10.1145/2396761.2398759

Ingmar Weber, A. Popescu, M. Pennacchiotti

Citations: 4

Sort-based query-adaptive loading of R-trees

Proceedings of the 21st ACM international conference on Information and knowledge management Pub Date : 2012-10-29 DOI: 10.1145/2396761.2398577

Daniar Achakeev, B. Seeger, P. Widmayer

Citations: 25

Exploring simultaneous keyword and key sentence extraction: improve graph-based ranking using wikipedia

Proceedings of the 21st ACM international conference on Information and knowledge management Pub Date : 2012-10-29 DOI: 10.1145/2396761.2398706

Xun Wang, Lei Wang, Jiwei Li, Sujian Li

Abstract: Summarization and keyword selection are two important tasks in the NLP community. Although both aim to summarize the source articles, they are usually treated separately, using either sentences or words. In this paper, we propose a two-level graph-based ranking algorithm to generate summaries and extract keywords at the same time. Previous works have reached a consensus that important sentences are composed of important keywords. In this paper, we further study the mutual impact between them through context analysis. We use Wikipedia to build a two-level concept-based graph, instead of a traditional term-based graph, to express their homogeneous and heterogeneous relationships. We run PageRank and HITS on the graph to adjust both the homogeneous and the heterogeneous relationships, which yields a more reasonable relatedness value for key sentence selection and keyword selection. We evaluate our algorithm on the TAC 2011 data set. The traditional term-based approach achieves a score of 0.255 in ROUGE-1 and 0.037 in ROUGE-2, and our approach improves them to 0.323 and 0.048 respectively.

Citations: 10
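
The abstract above relies on graph-based ranking. As a rough illustration of that ingredient only, here is a minimal PageRank sketch using power iteration on a tiny graph. The graph, the node names `s1` to `s4`, and the parameter defaults are hypothetical; the paper's two-level concept graph built from Wikipedia and its use of HITS are not reproduced here.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    graph maps each node to the list of nodes it links to; every linked
    node must also appear as a key.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the teleportation share first.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, neighbours in graph.items():
            if neighbours:
                # Split this node's damped rank evenly among its out-links.
                share = damping * rank[node] / len(neighbours)
                for neighbour in neighbours:
                    new_rank[neighbour] += share
            else:
                # Dangling node: spread its damped rank over all nodes.
                for other in nodes:
                    new_rank[other] += damping * rank[node] / n
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical sentence graph: an edge means "is related to".
    sentence_graph = {
        "s1": ["s2", "s3"],
        "s2": ["s3"],
        "s3": ["s1"],
        "s4": ["s3"],
    }
    scores = pagerank(sentence_graph)
    for node in sorted(scores, key=scores.get, reverse=True):
        print(node, round(scores[node], 4))
```

In an extraction setting, the highest-scoring nodes would be taken as the key sentences or keywords; how edge weights are derived from concept relatedness is left out of this sketch.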