2009 IEEE International Conference on Data Mining Workshops: Latest Papers

Weighted Frequent Subgraph Mining in Weighted Graph Databases
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.12
Masaki Shinoda, Tomonobu Ozaki, T. Ohkawa
Abstract: We focus on the problem of pattern discovery from externally and internally weighted labeled graphs because the target data can be modeled more naturally and in detail by using weighted graphs. For example, while external weight can be used for representing a degree of importance and reliability of a graph itself, internal weight reflects utility and significance of each component in a graph. Therefore, we can expect to realize more precise knowledge discovery by employing weighted graphs. From these backgrounds, in this paper, we discuss two pattern mining problems with external and internal weighted frequencies, and propose two algorithms to solve them efficiently.
Citations: 7
Improving Similarity Join Algorithms Using Fuzzy Clustering Technique
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.50
L. Tan, F. Fotouhi, W. Grosky, Horia F. Pop, N. Mouaddib
Abstract: In this paper, we propose a pre-processing technique to improve existing string similarity join algorithms using fuzzy clustering. Our approach first identifies groups of related attributes and then, using this information, we apply existing string similarity join algorithms on these attributes. To identify the clustered attributes we use fuzzy techniques. This approach can be applied to the integration of knowledge bases and databases, as well as handle inconsistent values and naming conventions, incorrect or missing data values, and incomplete information from multiple sources with semi-compatible attributes or homogeneous attributes. Using an experimental study, we have shown our preprocessing approach improves existing string similarity join algorithms by about 10 percent on precision and recall.
Citations: 1
Bucket Learning: Improving Model Quality through Enhancing Local Patterns
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1016/j.knosys.2011.09.013
Guangzhi Qu, Hui Wu
Citations: 1
MSRA-MM 2.0: A Large-Scale Web Multimedia Dataset
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.46
Hao Li, Meng Wang, Xiansheng Hua
Abstract: In this paper, we introduce the second version of Microsoft Research Asia Multimedia (MSRA-MM), a dataset that aims to facilitate research in multimedia information retrieval and related areas. The images and videos in the dataset are collected from a commercial search engine with more than 1000 queries. It contains about 1 million images and 20,000 videos. We also provide the surrounding texts that are obtained from more than 1 million web pages. The images and videos have been comprehensively annotated, including their relevance levels to corresponding queries, semantic concepts of images, and category and quality information of videos. We define six standard tasks on the dataset: (1) image search reranking; (2) image annotation; (3) query-by-example image search; (4) video search reranking; (5) video categorization; and (6) video quality assessment.
Citations: 107
Differential Privacy for Clinical Trial Data: Preliminary Evaluations
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.52
Duy Vu, A. Slavkovic
Abstract: The concept of differential privacy as a rigorous definition of privacy has emerged from the cryptographic community. However, further careful evaluation is needed before we can apply these theoretical results to privacy preservation in everyday data mining and statistical analysis. In this paper we demonstrate how to integrate a differential privacy framework with the classical statistical hypothesis testing in the domain of clinical trials where personal information is sensitive. We develop concrete methodology that researchers can use. We derive rules for the sample size adjustment whereby both statistical efficiency and differential privacy can be achieved for the specific tests for binomial random variables and in contingency tables.
Citations: 109
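The differential-privacy entry above can be made a little more concrete with a minimal sketch. It is not the methodology of the listed paper, whose details are not given in the abstract; it only shows the standard Laplace mechanism applied to a single count. The names `laplace_noise` and `private_count`, the responder count of 40, and epsilon = 0.5 are all assumptions chosen for illustration.

```python
import random

def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(true_count, epsilon, rng):
    """Release a count with epsilon-differential privacy.

    A count changes by at most 1 when one participant is added or removed,
    so its sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)  # fixed seed so the illustration is repeatable
responders = 40         # hypothetical number of responders in one trial arm
releases = [private_count(responders, epsilon=0.5, rng=rng) for _ in range(20000)]
mean_release = sum(releases) / len(releases)
print(round(mean_release, 1))
```

Each individual release is noisy, but the noise has zero mean and variance 2 / epsilon^2. That extra variance is one plausible reason a hypothesis test on privatized counts would need a larger sample to keep its power, which is the kind of trade-off the abstract refers to as sample size adjustment.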
Multilayer Scene Similarity Assessment
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.117
A. Stefanidis, Caixia Wang, Xu Lu, Kevin M. Curtin
Abstract: As we move increasingly towards multi-source data analysis, the assessment of similarity of complex, multilayer scenes is becoming increasingly important for spatial data mining. In this paper, we present a content-based approach for scene similarity assessment. The proposed approach is based on a graph-matching scheme that models linear feature networks (road network) as graphs and additional GIS information (e.g. buildings) as layer content. This allows us to combine diverse but co-located pieces of information (e.g. roads and buildings) in an integrated similarity assessment process. In the paper we present key theoretical concepts and provide experimental results to demonstrate the capability and robustness of the proposed approach.
Citations: 2
Mining of Attribute Interactions Using Information Theoretic Metrics
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.51
P. Chanda, Young-Rae Cho, A. Zhang, M. Ramanathan
Abstract: Knowledge of the statistical interactions between the attributes in a data set provides insight into the underlying structure of the data and explains the relationships (independence, synergy, redundancy) between the attributes. In a supervised learning problem, normally, only a small subset of the classifying attributes is actually associated with the class label. Interaction information among the attributes captures the multivariate dependencies (synergy and redundancy) among the attributes and the class label. Mining the significant statistical interactions that contain information about the class label is a computationally challenging task: the number of possible interactions increases exponentially and most of these interactions contain redundant information when a number of correlated attributes are present. In this paper, we present a data mining method (named IM or Interaction Mining) to mine non-redundant attribute sets that have significant interactions with the class label.
We further demonstrate that the mined statistical interactions are useful for improved feature selection as they successfully capture the multivariate inter-dependencies among the attributes.
Citations: 63
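The synergy and redundancy mentioned in the entry above can be illustrated with three-way interaction information. The sketch below is not the IM method of the listed paper; it only computes the textbook quantity I(X;Y;C) = I(X,Y;C) - I(X;C) - I(Y;C) from empirical frequencies, under the sign convention where positive values mean synergy. The function names and the toy data are assumptions made for the example.

```python
from collections import Counter
from math import log2

def entropy(samples):
    """Empirical Shannon entropy (in bits) of a sequence of hashable values."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())

def mutual_information(a, b):
    """Empirical I(A;B) = H(A) + H(B) - H(A,B)."""
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))

def interaction_information(x, y, c):
    """I(X;Y;C) = I(X,Y;C) - I(X;C) - I(Y;C); positive means synergy."""
    xy = list(zip(x, y))
    return (mutual_information(xy, c)
            - mutual_information(x, c)
            - mutual_information(y, c))

# Toy data: two balanced binary attributes covering all four combinations.
x = [0, 0, 1, 1]
y = [0, 1, 0, 1]
xor_label = [a ^ b for a, b in zip(x, y)]  # label depends only on the pair

synergy = interaction_information(x, y, xor_label)
redundancy = interaction_information(x, x, x)  # attribute duplicated as label
print(synergy, redundancy)
```

With an XOR label, neither attribute alone says anything about the class but the pair determines it, so the value is positive. When the same attribute is supplied twice, the second copy adds nothing and the value is negative.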
A New Measure of Feature Selection Algorithms' Stability
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.32
J. Novovicová, P. Somol, P. Pudil
Abstract: Stability or robustness of feature selection methods is a topic of recent interest. A new stability measure based on the Shannon entropy is proposed in this paper to evaluate the overall occurrence of individual features in selected subsets of possibly varying cardinality. We compare the new measure to stability measures proposed recently by Somol et al. The new measure is computationally very efficient and adds another type of insight into the stability problem. All considered measures have been used to compare the stability of several feature selection methods (individually best ranking, sequential forward selection, sequential forward floating selection and dynamic oscillating search) on a set of examples.
Citations: 14
Discovery of Quantitative Sequential Patterns from Event Sequences
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.13
Fumiya Nakagaito, Tomonobu Ozaki, T. Ohkawa
Abstract: In this paper, we consider the problem of frequent pattern mining in databases of temporal events with intervals. Since quantitative temporal information might play important roles in many application domains, it is critical to discover patterns to which numerical attributes are associated. To this end, we consider two kinds of temporal patterns with quantitative information on the durations and time differences of events, and propose corresponding algorithms by incorporating numerical clustering techniques into existing temporal pattern miners. The effectiveness of the proposed algorithms was assessed by using real world datasets.
Citations: 23
Video2Text: Learning to Annotate Video Content
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.79
H. Aradhye, G. Toderici, J. Yagnik
Abstract: This paper discusses a new method for automatic discovery and organization of descriptive concepts (labels) within large real-world corpora of user-uploaded multimedia, such as YouTube.com. Conversely, it also provides validation of existing labels, if any. While training, our method does not assume any explicit manual annotation other than the weak labels already available in the form of video title, description, and tags. Prior work related to such auto-annotation assumed that a vocabulary of labels of interest (e.g., indoor, outdoor, city, landscape) is specified a priori. In contrast, the proposed method begins with an empty vocabulary. It analyzes audiovisual features of 25 million YouTube.com videos (nearly 150 years of video data), effectively searching for consistent correlation between these features and text metadata. It autonomously extends the label vocabulary as and when it discovers concepts it can reliably identify, eventually leading to a vocabulary with thousands of labels and growing.
We believe that this work significantly extends the state of the art in multimedia data mining, discovery, and organization based on the technical merit of the proposed ideas as well as the enormous scale of the mining exercise in a very challenging, unconstrained, noisy domain.
Citations: 68