Hybrid data warehouse model for climate big data analysis

2017 International Conference on Circuit, Power and Computing Technologies (ICCPCT). Pub Date: 2017-04-01. DOI: 10.1109/ICCPCT.2017.8074229

Doreswamy, Ibrahim Gad, B. Manjunatha


Citations: 9

Abstract

The amount of data being collected and stored in the world is growing at an unprecedented rate. Managing and processing huge data sets is time-consuming and costly, and it hinders research. Storing, managing, and analyzing this vast volume of data, and extracting meaningful value from it, is therefore a major challenge for researchers. A data warehouse is a Decision Support System (DSS) technology that allows historical data from different sources to be extracted, grouped, and analyzed in order to discover information relevant to decision making. Climate data is collected and stored by the National Climatic Data Center (NCDC), and the dataset format supports a rich set of meteorological elements. A data warehouse can manage data in the terabyte range or larger; the data is collected from different meteorological stations and stored as records for later analysis. Big data analysis has become increasingly important in the field of climate analysis, which requires rapid and transparent data access. Recently, MapReduce, a distributed computing paradigm implemented in the open-source Hadoop framework, has been widely adopted for its scalability and its flexibility in handling structured, unstructured, and semi-structured data. The purpose of this paper is to develop a conceptual data model and to implement a hybrid data warehouse model that stores NCDC's weather variables. The hybrid data warehouse model for climate big data enables the identification of weather patterns that would be useful for agriculture, climate change studies, and contingency planning for extreme weather conditions.
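The MapReduce paradigm mentioned in the abstract can be illustrated with a minimal, single-process sketch. It is not the authors' implementation and does not use Hadoop. The record layout (`station_id,date,temperature`), the sample values, and the function names `map_record`, `shuffle`, `reduce_max`, and `run_job` are all assumptions made for illustration; real NCDC records use a different and much richer format.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Key = Tuple[str, str]  # (station_id, year)

# Hypothetical simplified records: "station_id,YYYY-MM-DD,temperature_celsius".
# This is NOT the real NCDC record layout; it only stands in for it.
SAMPLE_RECORDS = [
    "ST001,2015-01-01,12.5",
    "ST001,2015-06-30,31.0",
    "ST002,2015-03-15,8.2",
    "ST001,2016-02-10,14.1",
    "ST002,2015-07-04,",  # missing reading, dropped by the mapper
]


def map_record(line: str) -> Iterator[Tuple[Key, float]]:
    """Map step: emit ((station, year), temperature) for each valid record."""
    station, date, temp = line.split(",")
    if temp:  # skip records that carry no temperature reading
        yield (station, date[:4]), float(temp)


def shuffle(pairs: Iterable[Tuple[Key, float]]) -> Dict[Key, List[float]]:
    """Shuffle step: group values by key, as a MapReduce framework would."""
    groups: Dict[Key, List[float]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_max(key: Key, values: List[float]) -> Tuple[Key, float]:
    """Reduce step: keep the maximum temperature seen for one key."""
    return key, max(values)


def run_job(lines: Iterable[str]) -> Dict[Key, float]:
    """Run map, shuffle, and reduce in sequence within one process."""
    mapped = (pair for line in lines for pair in map_record(line))
    return dict(reduce_max(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    for (station, year), max_temp in sorted(run_job(SAMPLE_RECORDS).items()):
        print(station, year, max_temp)
```

In a real Hadoop deployment the map and reduce functions would run in parallel across cluster nodes and the framework would perform the shuffle. The sketch only shows how a per-station, per-year aggregate of a weather variable decomposes into those three steps.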
