2010 IEEE International Conference on Data Mining Workshops: Latest Papers, Page 4

Sequence Alignment Based Analysis of Player Behavior in Massively Multiplayer Online Role-Playing Games (MMORPGs)

2010 IEEE International Conference on Data Mining Workshops Pub Date : 2010-12-13 DOI: 10.1109/ICDMW.2010.166

Kyong Jin Shim, J. Srivastava

Citations: 6

Ensemble-Based Method for Task 2: Predicting Traffic Jam

2010 IEEE International Conference on Data Mining Workshops Pub Date : 2010-12-13 DOI: 10.1109/ICDMW.2010.54

Jingrui He, Qing He, G. Swirszcz, Y. Kamarianakis, Richard D. Lawrence, Wei Shen, L. Wynter

Citations: 8

Contextual Sequential Pattern Mining

2010 IEEE International Conference on Data Mining Workshops Pub Date : 2010-12-13 DOI: 10.1109/ICDMW.2010.182

Julien Rabatel, S. Bringay, P. Poncelet

Citations: 15

Parametric Templates: A New Enzyme Active-Site Prediction Algorithm

2010 IEEE International Conference on Data Mining Workshops Pub Date : 2010-12-13 DOI: 10.1109/ICDMW.2010.176

Tsuyoshi Kato, Kazuhiro Suwa, N. Nagano

Citations: 1

Empirical Analysis: News Impact on Stock Prices Based on News Density

2010 IEEE International Conference on Data Mining Workshops Pub Date : 2010-12-13 DOI: 10.1109/ICDMW.2010.124

Xiaodong Li, Xiaotie Deng, Feng Wang, Keren Dong

Citations: 7

Traffic Velocity Prediction Using GPS Data: IEEE ICDM Contest Task 3 Report

2010 IEEE International Conference on Data Mining Workshops Pub Date : 2010-12-13 DOI: 10.1109/ICDMW.2010.52

Wei Shen, Y. Kamarianakis, L. Wynter, Jingrui He, Qing He, Richard D. Lawrence, G. Swirszcz

Citations: 11

Automated Prompting in a Smart Home Environment

2010 IEEE International Conference on Data Mining Workshops Pub Date : 2010-12-13 DOI: 10.1109/ICDMW.2010.147

Barnan Das, Chao Chen, N. Dasgupta, D. Cook, Adriana M. Seelye

Citations: 21

Clutter-Adaptive Visualization for Mobile Data Mining

2010 IEEE International Conference on Data Mining Workshops Pub Date : 2010-12-13 DOI: 10.1109/ICDMW.2010.134

Brett Gillick, Hasnain AlTaiar, S. Krishnaswamy, J. Liono, Nicholas Nicoloudis, Abhijat Sinha, A. Zaslavsky, M. Gaber

Citations: 9

dMaximalCliques: A Distributed Algorithm for Enumerating All Maximal Cliques and Maximal Clique Distribution

2010 IEEE International Conference on Data Mining Workshops Pub Date : 2010-12-13 DOI: 10.1109/ICDMW.2010.13

Li Lu, Yunhong Gu, R. Grossman

Citations: 30

Semi-supervised PLSA for Document Clustering

2010 IEEE International Conference on Data Mining Workshops Pub Date : 2010-12-13 DOI: 10.1109/ICDMW.2010.85

Lingfeng Niu, Yong Shi

Abstract: By utilizing must-link or cannot-link pairwise constraints in the data, semi-supervised clustering significantly improves on the performance of unsupervised clustering. A number of semi-supervised clustering algorithms have been proposed to exploit such pairwise constraints. However, most of them assign a hard label to each data item and produce little information about the cluster itself. In this work, we propose a Probabilistic Latent Semantic Analysis (PLSA) based semi-supervised algorithm for document clustering that employs must-link supervision between pairs of documents, which is available in many real-world data sets. The new algorithm produces a soft cluster label assignment for each document as well as a probabilistic representation of the latent topics in each cluster. No additional parameters need to be estimated beyond those of standard PLSA, which reduces the risk of over-fitting, especially when the data is sparse. We provide an Expectation Maximization (EM) procedure for semi-supervised PLSA to determine locally optimal parameters that maximize the likelihood. To utilize multiple computation nodes for large-scale data sets, we also propose a distributed implementation of the EM procedure based on the MapReduce framework. Experimental results on a public data set validate the effectiveness and efficiency of the new method.
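
For readers unfamiliar with the EM procedure the abstract above refers to, the following is a minimal sketch of standard, unsupervised PLSA in plain Python. It is not the paper's semi-supervised algorithm: the must-link supervision and the MapReduce distribution are not implemented here, since the listing does not describe how they work. The function names (`plsa_em`, `log_likelihood`, `normalize`) and the count-matrix input format are assumptions made for illustration only.

```python
import math
import random


def normalize(row):
    """Scale a list of non-negative numbers to sum to 1 (uniform if all zero)."""
    total = sum(row)
    if total == 0:
        return [1.0 / len(row)] * len(row)
    return [x / total for x in row]


def plsa_em(counts, n_topics, n_iters=50, seed=0):
    """Fit standard PLSA by EM (illustrative sketch, not the paper's method).

    counts[d][w] is the number of times word w occurs in document d.
    Returns (p_z_d, p_w_z): P(topic | doc) and P(word | topic) as nested lists.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    # Random positive initialization, normalized into distributions.
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]

    for _ in range(n_iters):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                if counts[d][w] == 0:
                    continue
                # E-step: P(z | d, w) is proportional to P(z | d) * P(w | z).
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                total = sum(joint)
                if total == 0:
                    continue
                for z in range(n_topics):
                    # Expected count of (d, w) occurrences assigned to topic z.
                    r = counts[d][w] * joint[z] / total
                    new_z_d[d][z] += r
                    new_w_z[z][w] += r
        # M-step: renormalize the expected counts into distributions.
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [normalize(row) for row in new_w_z]
    return p_z_d, p_w_z


def log_likelihood(counts, p_z_d, p_w_z):
    """Sum over (d, w) of n(d, w) * log P(w | d) under the fitted model."""
    ll = 0.0
    for d, row in enumerate(counts):
        for w, n in enumerate(row):
            if n == 0:
                continue
            p = sum(p_z_d[d][z] * p_w_z[z][w] for z in range(len(p_w_z)))
            ll += n * math.log(p)
    return ll
```

Each row of the returned `p_z_d` is a soft cluster assignment for one document, which corresponds to the "soft cluster label" the abstract mentions; how must-link pairs would constrain these rows is left to the paper itself.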

Citations: 13