2009 IEEE International Conference on Data Mining Workshops: Latest Papers

Induction of Mean Output Prediction Trees from Continuous Temporal Meteorological Data
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.30
Dima Alberg, Mark Last, Roni Neuman, Avi Sharon
{"title":"Induction of Mean Output Prediction Trees from Continuous Temporal Meteorological Data","authors":"Dima Alberg, Mark Last, Roni Neuman, Avi Sharon","doi":"10.1109/ICDMW.2009.30","DOIUrl":"https://doi.org/10.1109/ICDMW.2009.30","url":null,"abstract":"In this paper, we present a novel method for fast data-driven construction of regression trees from temporal datasets including continuous data streams. The proposed Mean Output Prediction Tree (MOPT) algorithm transforms continuous temporal data into two statistical moments according to a user-specified time resolution and builds a regression tree for estimating the prediction interval of the output (dependent) variable. Results on two benchmark data sets show that the MOPT algorithm produces more accurate and easily interpretable prediction models than other state-of-the-art regression tree methods.","PeriodicalId":351078,"journal":{"name":"2009 IEEE International Conference on Data Mining Workshops","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122169241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Localized Content Based Image Retrieval with Self-Taught Multiple Instance Learning
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.105
Qifeng Qiao, P. Beling
{"title":"Localized Content Based Image Retrieval with Self-Taught Multiple Instance Learning","authors":"Qifeng Qiao, P. Beling","doi":"10.1109/ICDMW.2009.105","DOIUrl":"https://doi.org/10.1109/ICDMW.2009.105","url":null,"abstract":"There are many scenarios in which multi-instance learning problems may be difficult to solve because of a lack of correctly labeled examples for algorithm training. Labeled examples may be difficult or expensive to obtain because human effort is often needed to produce labels and because there may be limitations on the ability to collect large samples for training from a homogeneous population. In this paper, we present a technique called self-taught multiple-instance learning (STMIL) that deals with learning from a limited number of ambiguously labeled examples. STMIL uses a sparse representation for examples belonging to different classes in terms of a shared dictionary derived from the unlabeled data. This sparse representation can be optimized under the multiple instance setting to both construct high-level features and unite the data distribution. We present an optimization procedure for STMIL along with experiments on localized content-based image retrieval. 
Our experimental results suggest that, though it learns from a small number of labeled examples, STMIL is superior to standard algorithms in terms of computational efficiency and is at least competitive in terms of accuracy.","PeriodicalId":351078,"journal":{"name":"2009 IEEE International Conference on Data Mining Workshops","volume":"226 6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131654079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
The Flexible Climate Data Analysis Tools (CDAT) for Multi-model Climate Simulation Data
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.64
Dean N. Williams, C. Doutriaux, B. Drach, R. McCoy
{"title":"The Flexible Climate Data Analysis Tools (CDAT) for Multi-model Climate Simulation Data","authors":"Dean N. Williams, C. Doutriaux, B. Drach, R. McCoy","doi":"10.1109/ICDMW.2009.64","DOIUrl":"https://doi.org/10.1109/ICDMW.2009.64","url":null,"abstract":"Being able to incorporate, inspect, and analyze data with newly developed technologies, diagnostics, and visualizations in an easy and flexible way has been a longstanding challenge for scientists interested in understanding the intrinsic and extrinsic empirical assessment of multi-model climate output. To improve research ability and productivity, these technologies and tools must be made easily available to help scientists understand and solve complex scientific climate changes. To increase productivity and ease the challenges of incorporating new tools into the hands of scientists, the Program for Climate Model Diagnosis and Intercomparison (PCMDI) developed the Climate Data Analysis Tools (CDAT). CDAT is an application for developing and bringing together disparate software tools for the discovery, examination, and intercomparison of coupled multi-model climate data. By collaborating with top climate institutions, computational organizations, and other science communities, the CDAT community of developers is leading the way to provide proven data management, analysis, visualization, and diagnostics capabilities to scientists. This communitywide effort has developed CDAT into a powerful and insightful application for knowledge discovery of observed and simulation climate data. 
As an analysis engine in the Earth System Grid (ESG) data infrastructure, CDAT is making it possible to remotely access and analyze climate data located at multiple sites around the world.","PeriodicalId":351078,"journal":{"name":"2009 IEEE International Conference on Data Mining Workshops","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127811952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Fast Visual Trajectory Analysis Using Spatial Bayesian Networks
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.44
T. Liebig, Christine Kopp, M. May
{"title":"Fast Visual Trajectory Analysis Using Spatial Bayesian Networks","authors":"T. Liebig, Christine Kopp, M. May","doi":"10.1109/ICDMW.2009.44","DOIUrl":"https://doi.org/10.1109/ICDMW.2009.44","url":null,"abstract":"During the past years the first tools for visual analysis of trajectory data appeared. Considering the growing sizes of trajectory collections, one important task is to ensure user interactivity during data analysis. In this paper we present a fast, model-based visualization approach for the analysis of location dependencies in large trajectory collections. Existing approaches are not suitable for visual dependency analysis as the size and complexity of trajectory data constrain ad hoc and advance computations. Also recent developments in the area of trajectory data warehouses cannot be applied because the spatial correlations are lost during trajectory aggregation. Our approach builds a compact model which represents the dependency structures of the data. The visualisation toolkit then interacts only with the model and is thus independent of the size of the underlying trajectory database. More precisely, we build a Bayesian Network model using the Scalable Sparse Bayesian Network Learning (SSBNL) algorithm, which we improve to represent also negative correlations. We implement our approach into the GIS MapInfo using MapBasic scripts for the user interface and an independent mediator script to retrieve patterns from the model. 
We demonstrate our approach using mobile phone data of the city of Milan, Italy.","PeriodicalId":351078,"journal":{"name":"2009 IEEE International Conference on Data Mining Workshops","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128158564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
Detecting Similarity of Transferring Datasets Based on Features of Classification Rules
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.99
H. Abe, S. Tsumoto
{"title":"Detecting Similarity of Transferring Datasets Based on Features of Classification Rules","authors":"H. Abe, S. Tsumoto","doi":"10.1109/ICDMW.2009.99","DOIUrl":"https://doi.org/10.1109/ICDMW.2009.99","url":null,"abstract":"In order to transfer mined knowledge for various datasets obtained from transferring situations, it is important to detect not only whether the knowledge can be transferred but also the limitations of the transfer. Although most methods for detecting the limitations use performance indices of sets of classifiers, such as accuracies of classifier sets, those of each classifier are also useful. Data characterizing techniques have been developed to control learning algorithm selection by using statistical measurements of a dataset. Expanding this framework, we consider a method to reuse objective rule evaluation indices of classification rules such as support, precision, and recall, to measure similarity of different datasets. In this paper, we present a method to characterize given datasets based on objective rule evaluation indices and classification learning algorithms. The experimental results show the method can detect similarity of datasets even if the datasets have totally different attribute sets. 
This indicates that the limitations of transferring both of classifiers and learning algorithms can be detected as the similarity among datasets by using a learning algorithm.","PeriodicalId":351078,"journal":{"name":"2009 IEEE International Conference on Data Mining Workshops","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129120375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Semantic Linking between Video Ads and Web Services with Progressive Search
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.43
Bo Wang, Jinqiao Wang, Shi Chen, Ling-yu Duan, Hanqing Lu
{"title":"Semantic Linking between Video Ads and Web Services with Progressive Search","authors":"Bo Wang, Jinqiao Wang, Shi Chen, Ling-yu Duan, Hanqing Lu","doi":"10.1109/ICDMW.2009.43","DOIUrl":"https://doi.org/10.1109/ICDMW.2009.43","url":null,"abstract":"With the proliferation of online media services, ad video has become an important way to promote various products, services and ideas. Research efforts have been devoted to contextual advertising, whereas comprehensive recommendation of video ads is less explored. In this paper, we propose to establish a semantic linking between video ads and relevant products/services online in a cross-media manner. First, we extract a representative key frame from the ad video and then conduct a three-step progressive search (i.e., visual search, tag aggregation and textual re-search) to link video ads with relevant Web services. We search visually similar product images, rank the context textual information by tag aggregation, and refine the results by textual re-search. Finally, we collect relevant products for user recommendation. Experiments on some popular E-commerce websites like eBay and Amazon have demonstrated the attractiveness of the semantic linking.","PeriodicalId":351078,"journal":{"name":"2009 IEEE International Conference on Data Mining Workshops","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132237688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
A Differentially Private Graph Estimator
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.96
Darakhshan J. Mir, R. Wright
{"title":"A Differentially Private Graph Estimator","authors":"Darakhshan J. Mir, R. Wright","doi":"10.1109/ICDMW.2009.96","DOIUrl":"https://doi.org/10.1109/ICDMW.2009.96","url":null,"abstract":"We consider the problem of making graph databases such as social network structures available to researchers for knowledge discovery while providing privacy to the participating entities. We show that for a specific parametric graph model, the Kronecker graph model, one can construct an estimator of the true parameter in a way that both satisfies the rigorous requirements of differential privacy and is asymptotically efficient in the statistical sense. The estimator, which may then be published, defines a probability distribution on graphs. Sampling such a distribution yields a synthetic graph that mimics important properties of the original sensitive graph and, consequently, could be useful for knowledge discovery.","PeriodicalId":351078,"journal":{"name":"2009 IEEE International Conference on Data Mining Workshops","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127041544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 44
Information Services and Middleware for the Coastal Sensor Web
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.108
S. Durbha, R. King, Santhosh K. Amanchi, Shruthi Bheemireddy, N. Younan
{"title":"Information Services and Middleware for the Coastal Sensor Web","authors":"S. Durbha, R. King, Santhosh K. Amanchi, Shruthi Bheemireddy, N. Younan","doi":"10.1109/ICDMW.2009.108","DOIUrl":"https://doi.org/10.1109/ICDMW.2009.108","url":null,"abstract":"It is well recognized that semantic conflicts are responsible for the most serious data heterogeneity problems hindering the efficient interoperability between heterogeneous information sources. In recent years, ontologies have been widely used as a means for solving the information heterogeneity problems because of their capability to provide explicit meaning to the information. Several organizations are undertaking the development of domain specific ontologies to resolve the semantic ambiguities between various domain specific representations. These ontologies designed for a particular task could be a unique representation of their project needs. Hence, there arises a need to align heterogeneous ontologies to facilitate meaningful knowledge interchange between various sources. Thus, ontology mapping has emerged as an important requirement to enable semantic interoperability between different representations within a domain. In this paper we focus on the semantic heterogeneities present in the coastal information sources whose data are highly heterogeneous in syntax, structure and semantics. Ontological modeling was carried out for the various information sources. A data mining approach was adopted to align the concepts belonging to various land cover ontologies. 
We present a set of standardized information services and middleware for seamless access to information from various networks.","PeriodicalId":351078,"journal":{"name":"2009 IEEE International Conference on Data Mining Workshops","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127629815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Data Mining Geophysical Content from Satellites and Global Climate Models
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.109
D. Erickson, Jamison Daniel, M. Allen, A. Ganguly, F. Hoffman, S. Pawson, L. Ott, Eric Neilson
{"title":"Data Mining Geophysical Content from Satellites and Global Climate Models","authors":"D. Erickson, Jamison Daniel, M. Allen, A. Ganguly, F. Hoffman, S. Pawson, L. Ott, Eric Neilson","doi":"10.1109/ICDMW.2009.109","DOIUrl":"https://doi.org/10.1109/ICDMW.2009.109","url":null,"abstract":"We present an example of a simulated global climate model that is intended to stream real-time NASA data into the geophysical and climate science and assessment community over the next 5-10 years. It is known that the 3-D atmospheric wave structures and transport physics interact with spatially and time varying surface sources and sinks of CO2, and that this communication between surface and atmosphere results in an exceedingly complicated evolution of atmospheric CO2 in time and space. Data mining techniques may be applied to the further development of this 4-D model by incorporating satellite-generated data sets for hundreds of geophysical climate variables into existing simulation structures. These data sets are on the order of hundreds of terabytes. Data mining will allow the determination of the fluxes of atmospheric CO2. Data mining and knowledge acquisition contribute to the accurate determination of the sources and sinks of atmospheric CO2, facilitating, among other scientific discoveries, global treaty verification.","PeriodicalId":351078,"journal":{"name":"2009 IEEE International Conference on Data Mining Workshops","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127418771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Feature Selection with High-Dimensional Imbalanced Data
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.35
J. V. Hulse, T. Khoshgoftaar, Amri Napolitano, Randall Wald
{"title":"Feature Selection with High-Dimensional Imbalanced Data","authors":"J. V. Hulse, T. Khoshgoftaar, Amri Napolitano, Randall Wald","doi":"10.1109/ICDMW.2009.35","DOIUrl":"https://doi.org/10.1109/ICDMW.2009.35","url":null,"abstract":"Feature selection is an important topic in data mining, especially for high dimensional datasets. Filtering techniques in particular have received much attention, but detailed comparisons of their performance are lacking. This work considers three filters using classifier performance metrics and six commonly-used filters. All nine filtering techniques are compared and contrasted using five different microarray expression datasets. In addition, given that these datasets exhibit an imbalance between the number of positive and negative examples, the utilization of sampling techniques in the context of feature selection is examined.","PeriodicalId":351078,"journal":{"name":"2009 IEEE International Conference on Data Mining Workshops","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125794848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 151