Authors: Sree Lakshmi K, Theertha Jayarajan N, Nitha L
Published in: 2021 5th International Conference on Trends in Electronics and Informatics (ICOEI), 2021-06-03
DOI: 10.1109/ICOEI51242.2021.9452980 (https://doi.org/10.1109/ICOEI51242.2021.9452980)
Ascendancy of MapReduce with Hadoop for Weather Data and Word Count Analytics
Data arrives from many sources in structured, semi-structured, or unstructured form, and data of this kind is referred to as big data. Because of their large scale, rapid growth, and diverse formats, such datasets are difficult to manage with conventional tools and techniques. Big data analysis is a daunting task because it requires large decentralized file systems that are adaptive, resilient, and fault-tolerant. MapReduce is commonly used to analyze big data effectively. Big data analysis helps researchers, scholars, and business users extract value and knowledge. In the information age, huge amounts of data have become accessible to decision-makers. Because this data is growing rapidly, strategies for managing these datasets and obtaining value and knowledge from them must be studied and delivered. Moreover, decision-makers must be able to extract useful information from such a dynamic and rapidly changing body of data, which ranges from daily transactions to customer contact and social media data. In this paper, we explore Hadoop's parallel processing power in two application areas. The first scenario computes minimum and maximum temperatures from a large volume of weather data collected from an open source. The application analyzes the entire weather station dataset and displays the minimum and maximum temperatures (in Fahrenheit) for each weather station. The second scenario is a word count over large datasets, which reports the frequency of each word in a given dataset irrespective of the data volume.
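The two scenarios can be illustrated with a minimal single-process sketch of the map, shuffle, and reduce steps. This is not the authors' Hadoop implementation: the helper `run_mapreduce`, the mapper and reducer names, and the weather record layout `station_id,date,temperature_f` are all assumptions made for illustration, and a real Hadoop job would distribute these steps across a cluster.

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Simulate map, shuffle (group by key), and reduce in one process."""
    groups = defaultdict(list)
    for record in records:
        # Map step: each record may emit zero or more (key, value) pairs.
        for key, value in mapper(record):
            # Shuffle step: collect all values that share a key.
            groups[key].append(value)
    # Reduce step: collapse each key's list of values into one result.
    return {key: reducer(key, values) for key, values in sorted(groups.items())}


# Scenario 1: minimum and maximum temperature per weather station.
# Assumed (hypothetical) record layout: "station_id,date,temperature_f".
def temperature_mapper(line):
    fields = line.strip().split(",")
    if len(fields) != 3:
        return  # skip malformed records
    station, _date, temp = fields
    try:
        yield station, float(temp)
    except ValueError:
        return  # skip records whose temperature is not numeric


def temperature_reducer(_station, temps):
    return min(temps), max(temps)


# Scenario 2: frequency of each word.
def word_count_mapper(line):
    for token in line.lower().split():
        word = token.strip(".,;:!?\"'()")
        if word:
            yield word, 1


def word_count_reducer(_word, counts):
    return sum(counts)


if __name__ == "__main__":
    weather = [
        "S1,2021-01-01,30.5",
        "S1,2021-01-02,42.0",
        "S2,2021-01-01,71.2",
        "S2,2021-01-02,65.0",
    ]
    print(run_mapreduce(weather, temperature_mapper, temperature_reducer))

    text = ["Big data is big", "Data flows."]
    print(run_mapreduce(text, word_count_mapper, word_count_reducer))
```

Both jobs reuse the same driver and differ only in their mapper and reducer, which is the property that lets a framework such as Hadoop run the map and reduce phases in parallel over partitions of the input.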