2010 IEEE International Conference on Data Mining Workshops: Latest Papers

Meerkat: Community Mining with Dynamic Social Networks
2010 IEEE International Conference on Data Mining Workshops Pub Date: 2010-12-13 DOI: 10.1109/ICDMW.2010.40
Jiyang Chen, Justin Fagnan, R. Goebel, Reihaneh Rabbany, Farzad Sangi, M. Takaffoli, Eric Verbeek, Osmar R Zaiane
Abstract: Meerkat is a tool for visualization and community mining of social networks. It is being developed to offer novel algorithms and functionality that other tools do not possess. Meerkat’s features include navigation through graphical representations of networks, network querying and filtering, a multitude of graphical layout algorithms, community mining using recently developed algorithms, and dynamic network event analysis using recently published algorithms. These features will allow more insightful exploratory analysis and more robust inferences about communities and the significance of entity relationships. Meerkat is under active development, and future features will include additional options for community mining and visualization, focusing on algorithms and user interface designs not existing in other social network analysis tools.
Citations: 21
Frequent Closed Itemset Mining with Privacy Preserving for Distributed Databases
2010 IEEE International Conference on Data Mining Workshops Pub Date: 2010-12-13 DOI: 10.1109/ICDMW.2010.135
Shin-ya Kuno, K. Doi, Akihiro Yamamoto
Abstract: In the present paper we introduce closed item sets into frequent item set mining from horizontally-partitioned transaction databases while preserving privacy. Closed item sets originate from the research area of Formal Concept Analysis, and it is shown that even if results of frequent item set mining are restricted to closed item sets, all frequent item sets can be recovered from the results. This property suggests that using closed item sets would contribute to decreasing the cost of communication among distributed sites, each of which stores a piece of a horizontally-partitioned database. We present a mining procedure revising and amalgamating two previous works: one is for mining closed item sets from horizontally-partitioned databases, and the other is for privacy-preserving mining of item sets from such databases. We analyze the procedure from the viewpoints of both communication cost and security. We also show experimental results from applying the procedure to a well-known dataset.
Citations: 3
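The recovery property mentioned in the abstract above (all frequent item sets can be rebuilt from the closed ones) can be illustrated with a small sketch. This is not the paper's distributed, privacy-preserving procedure; it is a brute-force, single-machine illustration, and the function names, toy transactions, and support threshold are all hypothetical.

```python
from itertools import combinations

def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of all non-empty frequent item sets."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            s = frozenset(combo)
            count = support(s, transactions)
            if count >= min_support:
                result[s] = count
    return result

def closed_itemsets(frequent):
    """Keep item sets that have no proper superset with the same support."""
    return {s: c for s, c in frequent.items()
            if not any(s < t and c == frequent[t] for t in frequent)}

def recover(closed):
    """Rebuild every frequent item set and its support from the closed ones.

    Each frequent item set is a subset of some closed item set, and its
    support equals the largest support among its closed supersets.
    """
    recovered = {}
    for c_set, c_sup in closed.items():
        for size in range(1, len(c_set) + 1):
            for combo in combinations(sorted(c_set), size):
                s = frozenset(combo)
                recovered[s] = max(recovered.get(s, 0), c_sup)
    return recovered

# Toy data (hypothetical), minimum support of 2 transactions.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)
print(len(frequent), len(closed))   # 5 3
print(recover(closed) == frequent)  # True
```

In this toy case only three closed item sets need to be exchanged instead of five frequent ones, which is the kind of saving the abstract points to for communication between sites.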
Domain-Driven Data Mining for IT Infrastructure Support
2010 IEEE International Conference on Data Mining Workshops Pub Date: 2010-12-13 DOI: 10.1109/ICDMW.2010.132
Girish Keshav Palshikar, H. Vin, Mohammed Mudassar, M. Natu
Abstract: Support analytics (i.e., statistical analysis, modeling and mining of customer/operations support tickets data) is important in service industries. In this paper, we adopt a domain-driven data mining approach to support analytics with a focus on IT infrastructure Support (ITIS) services. We identify specific business questions and then propose algorithms for answering them. The questions are: (1) How to reduce the overall workload? (2) How to improve efforts spent in ticket processing? (3) How to improve compliance with service level agreements? We propose novel formalizations of these notions and propose rigorous statistics-based algorithms for these questions. The approach is domain-driven in the sense that the results produced are directly usable by and easy to understand for end-users having no expertise in data mining, do not require any experimentation, and often discover novel and non-obvious answers. All this helps in better acceptance among end-users and more active use of the results produced. The algorithms have been implemented and have produced satisfactory results on more than 25 real-life ITIS datasets, one of which we use for illustration.
Citations: 7
Identifying Similar Neighborhood Structures in Private Social Networks
2010 IEEE International Conference on Data Mining Workshops Pub Date: 2010-12-13 DOI: 10.1109/ICDMW.2010.165
L. Singh, Clare Schramm
Abstract: Many social networks being analyzed today are generated from sources with privacy concerns. A number of network centrality measures have been introduced to better quantify various social dynamics of interest to social scientists. In this paper, we propose an approximation of a social network that allows for certain centrality measures to be calculated while hiding information about the full network. Our approximation is not a perturbed graph, but rather a generalized trie structure containing a network hop expansion set for each node in the graph. We show that a network with certain topological structures naturally hides nodes and increases the number of candidate nodes in each equivalence class. The storage of our graph approximation naturally clusters nodes of the network with similar graph expansion structure and can therefore also be used as the basis for identifying ’like’ nodes in terms of similar structural position in the network. For branches of the trie that are not private enough, we introduce heuristics that locally merge segments of the trie to enforce k-node anonymity.
Citations: 9
XML Documents Clustering Using Tensor Space Model -- A Preliminary Study
2010 IEEE International Conference on Data Mining Workshops Pub Date: 2010-12-13 DOI: 10.1109/ICDMW.2010.106
Sangeetha Kutty, R. Nayak, Yuefeng Li
Abstract: A hierarchical structure is used to represent the content of semi-structured documents such as XML and XHTML. The traditional Vector Space Model (VSM) is not sufficient to represent both the structure and the content of such web documents. Hence in this paper, we introduce a novel method of representing XML documents in a Tensor Space Model (TSM) and then utilize it for clustering. Empirical analysis shows that the proposed method is scalable for a real-life dataset, and that the factorized matrices produced by the proposed method help to improve the quality of clusters due to the enriched document representation with both the structure and the content information.
Citations: 4
Analysis of Collaborative Writing Processes Using Hidden Markov Models and Semantic Heuristics
2010 IEEE International Conference on Data Mining Workshops Pub Date: 2010-12-13 DOI: 10.1109/ICDMW.2010.118
Vilaythong Southavilay, K. Yacef, R. Calvo
Abstract: In this paper we are interested in discovering collaborative writing patterns in student data collected from a system we designed to support student collaborative writing, and which has been used by over 1,000 students in the past year. A particular functionality that we are investigating is the extraction and display to learners and teachers of the process followed during the course of the writing. We used a heuristic to derive semantic interpretation of specific sequences of raw data and Markov models (MM) to derive the processes. We propose two models, a Heuristic MM and a Hidden MM, for analysing students’ writing behavior. We also refined the semantic preprocessing by adding the notion of pauses between activities. We illustrate our approach and compare these models using real data from two groups of high and low performance level and highlight the different information they each provide.
Citations: 10
Mining Research Topics Evolving Over Time Using a Diachronic Multi-source Approach
2010 IEEE International Conference on Data Mining Workshops Pub Date: 2010-12-13 DOI: 10.1109/ICDMW.2010.198
Jean-Charles Lamirel, Ghada Safi, Navesh Priyankar, Pascal Cuxac
Abstract: The acquisition of new scientific knowledge and the evolution of the needs of society regularly call into question the orientations of research. Means to recall and visualize these evolutions are thus necessary. The existing tools for research survey give only one fixed vision of the research activity, which does not allow performing tasks of dynamic topic mining. The objective of this paper is thus to propose a new incremental approach in order to follow the evolution of research themes and research groups for a given scientific discipline in terms of emergence or decline. These behaviors are detectable by various methods of filtering. However, we chose to exploit neural clustering methods in a multi-view context. This new approach makes it possible to take into account the incremental and chronological aspect of information by opening the way to the detection of convergences and divergences of research themes and groups.
Citations: 8
Gaussian Processes for Dispatching Rule Selection in Production Scheduling: Comparison of Learning Techniques
2010 IEEE International Conference on Data Mining Workshops Pub Date: 2010-12-13 DOI: 10.1109/ICDMW.2010.19
B. Scholz-Reiter, Jens Heger, T. Hildebrandt
Abstract: Decentralized scheduling with dispatching rules is applied in many fields of logistics and production, especially in semiconductor manufacturing, which is characterized by high complexity and dynamics. Many dispatching rules have been found that perform well in different scenarios; however, no rule has been found that outperforms other rules across various objectives. To tackle this drawback, approaches that select dispatching rules depending on the current system conditions have been proposed. Most of these use learning techniques to switch between rules regarding the current system status. Since the study of Rasmussen [1] has shown that Gaussian processes as a machine learning technique have outperformed other techniques like neural networks under certain conditions, we propose to use them for the selection of dispatching rules in dynamic scenarios. Our analysis has shown that Gaussian processes perform very well in this field of application. Additionally, we showed that the prediction quality Gaussian processes provide could be used successfully.
Citations: 16
Distributed Classification on Peers with Variable Data Spaces and Distributions
2010 IEEE International Conference on Data Mining Workshops Pub Date: 2010-12-13 DOI: 10.1109/ICDMW.2010.125
Quach Vinh Thanh, V. Gopalkrishnan, Hock Hee Ang
Abstract: The promise of distributed classification is to improve the classification accuracy of peers on their respective local data, using the knowledge of other peers in the distributed network. Though in reality, data across peers may be drastically different from each other (in the distribution of observations and/or the labels), current explorations implicitly assume that all learning agents receive data from the same distribution. We remove this simplifying assumption by allowing peers to draw from arbitrary data distributions and be based on arbitrary spaces, thus formalizing the general problem of distributed classification. We find that this problem is difficult because it does not admit state-of-the-art solutions in distributed classification. We also discuss the relation between the general problem and transfer learning, and show that transfer learning approaches cannot be trivially fitted to solve the problem. Finally, we present a list of open research problems in this challenging field.
Citations: 0
A Comparison of Objective Functions in Network Community Detection
2010 IEEE International Conference on Data Mining Workshops Pub Date: 2010-12-13 DOI: 10.1109/ICDMW.2010.107
C. Shi, Yanan Cai, Philip S. Yu, Zhenyu Yan, Bin Wu
Abstract: Community detection, as an important unsupervised learning problem in social network analysis, has attracted great interest in various research areas. Many objective functions for community detection that can capture the intuition of communities have been introduced from different research fields. Based on the classical single-objective optimization framework, this paper compares a variety of these objective functions and explores the characteristics of communities they can identify. Experiments show that most objective functions have the resolution limit and that their community structures have many different characteristics.
Citations: 4
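As a concrete instance of what a community-detection objective function looks like, the sketch below computes Newman-Girvan modularity, a widely used objective that is known to exhibit the resolution limit. The abstract does not list which objective functions the paper compares, so the choice of modularity here is an assumption; the function name and the toy graph are hypothetical as well.

```python
def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected simple graph.

    Q = sum over communities c of  L_c / m - (d_c / (2 m)) ** 2
    where m is the number of edges, L_c the number of edges with both
    endpoints in c, and d_c the total degree of the nodes in c.
    """
    m = len(edges)
    internal = {}       # L_c per community
    total_degree = {}   # d_c per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        total_degree[cu] = total_degree.get(cu, 0) + 1
        total_degree[cv] = total_degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in total_degree.items())

# Toy graph (hypothetical): two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_group = {node: "A" for node in range(6)}

print(modularity(edges, two_groups))  # 5/14, about 0.357
print(modularity(edges, one_group))   # 0.0
```

An optimizer in the single-objective setting would search over partitions for the one that maximizes this score; swapping in a different objective changes which partitions are preferred.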