2012 Conference on Intelligent Data Understanding: Latest Papers

EddyScan: A physically consistent ocean eddy monitoring application
2012 Conference on Intelligent Data Understanding Pub Date : 2012-12-24 DOI: 10.1109/CIDU.2012.6382189
James H. Faghmous, L. Styles, Varun Mithal, S. Boriah, S. Liess, Vipin Kumar, F. Vikebø, M. D. Mesquita
Abstract: Rotating coherent structures of water known as ocean eddies are the oceanic analog of storms in the atmosphere and a crucial component of ocean dynamics. In addition to dominating the ocean's kinetic energy, eddies play a significant role in the transport of water, salt, heat, and nutrients. Therefore, understanding current and future eddy activity is a central challenge to address future sustainability of marine ecosystems. The emergence of sea surface height observations from satellite radar altimeter has recently enabled researchers to track eddies at a global scale. The majority of studies that identify eddies from observational data employ highly parametrized connected component algorithms using expert filtered data, effectively making reproducibility and scalability challenging. In this paper, we improve upon the state-of-the-art connected component eddy monitoring algorithms to track eddies globally. This work makes three main contributions: first, we do not pre-process the data therefore minimizing the risk of wiping out important signals within the data. Second, we employ a physically-consistent convexity requirement on eddies based on theoretical and empirical studies to improve the accuracy and computational complexity of our method from quadratic to linear time in the size of each eddy. Finally, we accurately separate eddies that are in close spatial proximity, something existing methods cannot accomplish. We compare our results to those of the state of the art and discuss the impact of our improvements on the difference in results.
Citations: 26
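The EddyScan abstract above builds on connected component algorithms applied to sea surface height fields. As a rough illustration of that general technique only, the following sketch labels 4-connected regions of a grid whose values exceed a threshold. It is a generic labeling routine, not the EddyScan method: the function name `label_components`, the threshold, and the tiny grid of height anomalies are all hypothetical, and the convexity requirement described in the abstract is not implemented.

```python
from collections import deque


def label_components(grid, threshold):
    """Label 4-connected regions of cells whose value exceeds `threshold`.

    Generic connected-component labeling (breadth-first search); this is
    an illustration, not the EddyScan algorithm.
    Returns (labels, count): labels is 0 for background, 1..count for regions.
    """
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            # Skip background cells and cells that already carry a label.
            if grid[r][c] <= threshold or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] > threshold
                            and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Hypothetical sea surface height anomalies (cm) on a tiny grid.
ssh = [
    [0, 5, 6, 0, 0],
    [0, 7, 0, 0, 4],
    [0, 0, 0, 5, 6],
]
labels, count = label_components(ssh, threshold=3)
```

Each cell is visited a constant number of times, so this labeling pass is linear in the number of grid cells.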
Importance of vegetation type in forest cover estimation
2012 Conference on Intelligent Data Understanding Pub Date : 2012-12-24 DOI: 10.1109/CIDU.2012.6382203
A. Karpatne, Mace Blank, Michael Lau, S. Boriah, K. Steinhaeuser, M. Steinbach, Vipin Kumar
Abstract: Forests are an important natural resource that play a major role in sustaining a number of vital geochemical and bioclimatic processes. Since damage to forests due to natural and anthropogenic factors can have long-lasting impacts on the health of the planet, monitoring and estimating forest cover and its losses at global, regional and local scales is of primary concern. Developing forest cover estimation techniques that utilize remote sensing datasets offers global applicability at high temporal frequencies. However, estimating forest cover using satellite observations is challenging in the presence of heterogeneous vegetation types, each having its unique data characteristics. In this paper, we explore techniques for incorporating information about the vegetation type in forest cover estimation algorithms. We show that utilizing the vegetation type improves performance regardless of the choice of input data or forest cover learning algorithm. We also provide a mechanism to automatically extract information about the vegetation type by partitioning the input data using clustering.
Citations: 8
Learning ensembles of Continuous Bayesian Networks: An application to rainfall prediction
2012 Conference on Intelligent Data Understanding Pub Date : 2012-12-24 DOI: 10.1109/CIDU.2012.6382191
Scott Hellman, A. McGovern, M. Xue
Abstract: We introduce Ensembled Continuous Bayesian Networks (ECBN), an ensemble approach to learning salient dependence relationships and to predicting values for continuous data. By training individual Bayesian networks on both a subset of the data (bagging) and a subset of the attributes in the data (randomization), ECBN produces models for continuous domains that can be used to identify important variables in a dataset and to identify relationships between those variables. We use linear Gaussian distributions within our ensembles, providing efficient network-level inference. By ensembling these networks, we are able to represent nonlinear relationships. We empirically demonstrate that ECBN outperforms the meteorological forecast on a rainfall prediction task across the United States, and performs comparably to results reported for Random Forests.
Citations: 12
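The ECBN abstract above describes training each ensemble member on a subset of the data (bagging) and a subset of the attributes (randomization). The sketch below shows only that sampling step, under the assumption that bagging means drawing rows with replacement, as is conventional. The function name `sample_ensemble_inputs`, its parameters, and the attribute names are hypothetical; fitting the linear Gaussian networks themselves is not shown.

```python
import random


def sample_ensemble_inputs(n_rows, attributes, n_models, n_attrs, seed=0):
    """Draw one bootstrap row sample and one attribute subset per member.

    Illustrative only: the row/attribute sampling sizes used by ECBN are
    not stated in the abstract and are assumptions here.
    """
    rng = random.Random(seed)
    members = []
    for _ in range(n_models):
        # Bagging: sample row indices with replacement.
        rows = [rng.randrange(n_rows) for _ in range(n_rows)]
        # Randomization: sample distinct attributes without replacement.
        attrs = rng.sample(attributes, n_attrs)
        members.append((rows, attrs))
    return members


# Hypothetical attribute names for a rainfall data set.
attributes = ["humidity", "pressure", "temperature", "wind_speed"]
members = sample_ensemble_inputs(
    n_rows=100, attributes=attributes, n_models=3, n_attrs=2
)
```

Each member would then be trained on its own `(rows, attrs)` view of the data, and the members' predictions combined.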
Species distribution modeling and prediction: A class imbalance problem
2012 Conference on Intelligent Data Understanding Pub Date : 2012-12-24 DOI: 10.1109/CIDU.2012.6382186
Reid A. Johnson, N. Chawla, J. Hellmann
Abstract: Predicting the distributions of species is central to a variety of applications in ecology and conservation biology. With increasing interest in using electronic occurrence records, many modeling techniques have been developed to utilize this data and compute the potential distribution of species as a proxy for actual observations. As the actual observations are typically overwhelmed by non-occurrences, we approach the modeling of species' distributions with a focus on the problem of class imbalance. Our analysis includes the evaluation of several machine learning methods that have been shown to address the problems of class imbalance, but which have rarely or never been applied to the domain of species distribution modeling. Evaluation of these methods includes the use of the area under the precision-recall curve (AUPR), which can supplement other metrics to provide a more informative assessment of model utility under conditions of class imbalance. Our analysis concludes that emphasizing techniques that specifically address the problem of class imbalance can provide AUROC and AUPR results competitive with traditional species distribution models.
Citations: 36
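The species distribution abstract above evaluates models with both AUROC and the area under the precision-recall curve (AUPR). To make the difference concrete, here is a small sketch that computes AUROC by pairwise ranking and estimates AUPR as average precision, a common step-wise estimate. The data are invented for illustration, the function names are hypothetical, and the average precision routine assumes no tied scores; nothing here is taken from the paper's experiments.

```python
def average_precision(y_true, scores):
    """Mean of the precision values at the rank of each positive example.

    A common step-wise estimate of the area under the precision-recall
    curve. Assumes scores contain no ties.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("need at least one positive example")
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_pos


def auroc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly (ties count half)."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    correct = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return correct / (len(pos) * len(neg))


# Hypothetical imbalanced data: 2 presences among 10 surveyed sites.
y_true = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

ap = average_precision(y_true, scores)   # 0.75
roc = auroc(y_true, scores)              # 0.875
```

On this toy data the two false positives ranked above the second presence cost more in average precision than in AUROC, which is the kind of sensitivity to the rare class that motivates reporting AUPR under class imbalance.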
A new data mining framework for forest fire mapping
2012 Conference on Intelligent Data Understanding Pub Date : 2012-12-24 DOI: 10.1109/CIDU.2012.6382190
Xi C. Chen, A. Karpatne, Yashu Chamber, Varun Mithal, Michael Lau, K. Steinhaeuser, S. Boriah, M. Steinbach, Vipin Kumar, C. Potter, S. Klooster, Teji Abraham, J. Stanley, Juan Carlos Castilla-Rubio
Abstract: Forests are an important natural resource that support economic activity and play a significant role in regulating the climate and the carbon cycle, yet forest ecosystems are increasingly threatened by fires caused by a range of natural and anthropogenic factors. Mapping these fires, which can range in size from less than an acre to hundreds of thousands of acres, is an important task for supporting climate and carbon cycle studies as well as informing forest management. Currently, there are two primary approaches to fire mapping: field- and aerial-based surveys, which are costly and limited in their extent; and remote sensing-based approaches, which are more cost-effective but pose several interesting methodological and algorithmic challenges. In this paper, we introduce a new framework for mapping forest fires based on satellite observations. Specifically, we develop unsupervised spatio-temporal data mining methods for Moderate Resolution Imaging Spectroradiometer (MODIS) data to generate a history of forest fires. A systematic comparison with alternate approaches in two diverse geographic regions demonstrates that our algorithmic paradigm is able to overcome some of the limitations in both data and methods employed by prior efforts.
Citations: 10
Data Understanding using Semi-Supervised Clustering
2012 Conference on Intelligent Data Understanding Pub Date : 2012-10-01 DOI: 10.1109/CIDU.2012.6382192
Vasudha Bhatnagar, Rashmi Dobariyal, P. Jain, A. Mahabal
Abstract: In the era of E-science, most scientific endeavors depend on intense data analysis to understand the underlying physical phenomenon. Predictive modeling is one of the popular machine learning tasks undertaken in such endeavors. Labeled data used for training the predictive model reflects understanding of the domain. In this paper we introduce data understanding as a computational problem and propose a solution for enhancing domain understanding based on semi-supervised clustering. The proposed DU-SSC (Data Understanding using Semi-Supervised Clustering) algorithm is incremental, parameterless and performs a single scan of the data. The given labeled (training) data is discretized at a user-specified resolution, and finer (micro) data distributions are identified within classes, along with outliers. The discovery process is based on grouping similar instances in data space, while taking into account the degree of influence each attribute exercises on the class label. The Maximal Information Coefficient measure is used during similarity computations for this purpose. The study is supported by experiments, and a detailed account of the understanding gained is presented for two selected UCI data sets. General observations on nine other UCI datasets are presented, along with experiments that demonstrate use of discovered knowledge for improved classification.
Citations: 4
Estimation and bias correction of aerosol abundance using data-driven machine learning and remote sensing
2012 Conference on Intelligent Data Understanding Pub Date : 2012-10-01 DOI: 10.1109/CIDU.2012.6382197
N. Malakar, David John Lary, A. Moore, D. Gençaga, B. Roscoe, A. Albayrak, Jennifer C. Wei
Abstract: Air quality information is increasingly becoming a public health concern, since some of the aerosol particles pose harmful effects to people's health. One widely available metric of aerosol abundance is the aerosol optical depth (AOD). The AOD is the integrated light extinction coefficient over a vertical atmospheric column of unit cross section, which represents the extent to which the aerosols in that vertical profile prevent the transmission of light by absorption or scattering. The comparison between the AOD measured from the ground-based Aerosol Robotic Network (AERONET) system and the satellite MODIS instruments at 550 nm shows that there is a bias between the two data products. We performed a comprehensive search exploring possible factors which may be contributing to the inter-instrumental bias between the MODIS-Aqua land data set and AERONET. The analysis used several measured variables, including the MODIS AOD, as input in order to train a neural network in regression mode to predict the AERONET AOD values. This not only allowed us to obtain an estimate, but also allowed us to infer the optimal sets of variables that played an important role in the prediction. In addition, we applied machine learning to infer the global abundance of ground level PM2.5 from the AOD data and other ancillary satellite and meteorology products. This research is part of our goal to provide air quality information, which can also be useful for global epidemiology studies.
Citations: 9
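The aerosol abstract above defines AOD in words as the extinction coefficient integrated over a vertical column. Written out, that definition takes roughly the following form. The symbols are assumptions chosen here for illustration and do not come from the paper: $\tau$ for the AOD at wavelength $\lambda$, $\beta_{\mathrm{ext}}$ for the aerosol extinction coefficient, $z$ for altitude, and $z_{\mathrm{TOA}}$ for the top of the atmosphere.

```latex
\tau(\lambda) = \int_{0}^{z_{\mathrm{TOA}}} \beta_{\mathrm{ext}}(\lambda, z)\,\mathrm{d}z,
\qquad
\beta_{\mathrm{ext}} = \beta_{\mathrm{abs}} + \beta_{\mathrm{sca}}
```

The second relation reflects the abstract's statement that extinction arises from absorption or scattering. Because $\beta_{\mathrm{ext}}$ has units of inverse length, $\tau$ is dimensionless.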
Machine learning enhancement of Storm Scale Ensemble precipitation forecasts
2012 Conference on Intelligent Data Understanding Pub Date : 2011-08-21 DOI: 10.1145/2023568.2023581
D. Gagne, A. McGovern, M. Xue
Abstract: Precipitation forecasts provide both a crucial service for the general populace and a challenging forecasting problem due to the complex, multi-scale interactions required for precipitation formation. The Center for the Analysis and Prediction of Storms (CAPS) Storm Scale Ensemble Forecast (SSEF) system is a promising method of providing high-resolution forecasts of the intensity and uncertainty in precipitation forecasts. The SSEF incorporates multiple models with varied parameterization scheme combinations and produces forecasts every 4 km over the continental US. The SSEF precipitation forecasts exhibit significant negative biases and placement errors. In order to correct these issues, multiple machine learning algorithms have been applied to the SSEF precipitation forecasts to correct the forecasts using the NSSL National Mosaic and Multisensor QPE (NMQ) grid as verification. The 2010 SSEF was used for training. Two levels of post-processing are performed. In the first, probabilities of any precipitation are determined and used to find optimal thresholds for the precipitation areas. Then, three types of forecasts are produced in those areas. First, the probability of the 1-hour accumulated precipitation exceeding a threshold is predicted with random forests, logistic regression, and multivariate adaptive regression splines (MARS). Second, deterministic forecasts based on a correction from the ensemble mean are made with linear regression, random forests, and MARS. Third, fixed probability interval forecasts are made with quantile regressions and quantile regression forests. Models are generated from points sampled from the western, central, and eastern sections of the domain. Verification statistics and case study results show improvements in the reliability and skill of the forecasts compared to the original ensemble while controlling for the over-prediction of the precipitation areas and without sacrificing smaller scale details from the model runs.
Citations: 14