Replication-Based Query Management for Resource Allocation Using Hadoop and MapReduce over Big Data

Impact factor: 6.2. Region 1 (Computer Science). JCR Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Big Data Mining and Analytics, vol. 6, no. 4, pp. 465-477. Pub Date: 2023-08-29. DOI: 10.26599/BDMA.2022.9020026. Source: https://ieeexplore.ieee.org/document/10233249/

Ankit Kumar; Neeraj Varshney; Surbhi Bhatiya; Kamred Udham Singh


Citations: 0

Abstract


Replication-Based Query Management for Resource Allocation Using Hadoop and MapReduce over Big Data

We live in an age where everything around us is being created. Data generation rates are alarmingly high, creating pressure to implement costly and straightforward data storage and recovery processes. The MapReduce model is used to run parallel, distributed algorithms over large datasets on a cluster. The MapReduce strategy from Hadoop helps develop a community of non-commercial use to offer a new algorithm for resolving such problems for commercial applications as expected from this working algorithm with insights as a result of disproportionate or discriminatory Hadoop cluster results. Expected results are obtained in the work and the exam conducted under this job; many of them are scheduled to set schedules, match matrices' data positions, clustering before determining to click, and accurate mapping and internal reliability to be closed together to avoid running and execution times. Mapper output and proponents have been implemented, and the map has been used to reduce the function. The execution input key/value pair and output key/value pair have been set. This paper focuses on evaluating this technique for the efficient retrieval of large volumes of data. The technique allows for capabilities to inform a massive database of information, from storage and indexing techniques to the distribution of queries, scalability, and performance in heterogeneous environments. The results show that the proposed work reduces the data processing time by 30%.
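The abstract refers to mapper output and to input and output key/value pairs without showing how they fit together. As a rough illustration only, the following is a minimal single-process sketch of the general MapReduce pattern (map, group by key, reduce), using word counting as a stand-in task. It is not the paper's replication-based algorithm and it does not use the Hadoop API; the names `map_fn`, `reduce_fn`, and `run_job` and the `(offset, line)` record shape are assumptions made for this example.

```python
from collections import defaultdict
from typing import Iterable, Iterator

# Hypothetical record type for illustration: (offset, line) pairs, similar in
# spirit to the (key, value) input a text-based MapReduce job hands a mapper.
Record = tuple[int, str]


def map_fn(key: int, value: str) -> Iterator[tuple[str, int]]:
    """Map step: emit one (word, 1) pair per token in the input line."""
    for word in value.lower().split():
        yield word, 1


def reduce_fn(key: str, values: Iterable[int]) -> tuple[str, int]:
    """Reduce step: collapse all counts that share a key into one total."""
    return key, sum(values)


def run_job(records: Iterable[Record]) -> dict[str, int]:
    """Run map, shuffle (group values by key), and reduce in one process."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)  # shuffle: group by output key
    # Reduce each group; keys are sorted so the output order is deterministic.
    return dict(reduce_fn(k, vs) for k, vs in sorted(groups.items()))


if __name__ == "__main__":
    sample = [(0, "big data big cluster"), (21, "data node")]
    print(run_job(sample))
```

Run as a script, this prints `{'big': 2, 'cluster': 1, 'data': 2, 'node': 1}`. On a real cluster the map and reduce steps would run on different nodes and the grouping step would be handled by the framework's shuffle phase; the sketch only shows how the input pairs `(offset, line)` become output pairs `(word, count)`.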


Source journal

Big Data Mining and Analytics (Computer Science: Computer Science Applications)

CiteScore

20.90

Self-citation rate

2.20%

Number of publications

About the journal: Big Data Mining and Analytics, published by Tsinghua University Press, presents research on big data and its applications. The journal covers the exploration and analysis of large volumes of data from diverse sources to uncover hidden patterns, correlations, insights, and knowledge. It features the latest developments, research issues, and solutions in data mining and data analytics, along with their practical applications. The journal is indexed and abstracted in ESCI, EI, Scopus, DBLP Computer Science, Google Scholar, INSPEC, CSCD, DOAJ, CNKI, and more.