Research of Target Characteristics Storage Based on RDBMS and Hadoop

Yanqi Wang, Yusheng Jia, Xiaodan Xie
DOI: 10.1109/IIKI.2016.33 (https://doi.org/10.1109/IIKI.2016.33)
Published in: 2016 International Conference on Identification, Information and Knowledge in the Internet of Things (IIKI)
Publication date: 2016-10-01
Citations: 1

Abstract
As the amount of target characteristics data increases rapidly, traditional methods can no longer meet the needs of storing and managing it. Based on the features of this data, a new storage system built on an RDBMS and Hadoop is proposed. Structured data and the metadata of unstructured data are stored in the RDBMS under a defined schema, while the bulk of the unstructured data is distributed across the nodes of the Hadoop cluster. To make the best use of each store, HBase holds the massive number of small unstructured data items and HDFS holds the large ones. In addition, access control and a multi-threaded upload and download approach, combined with load balancing and caching mechanisms, are applied to improve the efficiency of data transmission. Experimental results show that the proposed storage system is reasonable and practicable.
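
The placement rule described in the abstract (metadata in the RDBMS, small objects in HBase, large objects in HDFS, with multi-threaded upload) can be illustrated with a minimal sketch. This is not the authors' implementation: the size cutoff `SMALL_OBJECT_LIMIT`, the `FeatureStore` class, and the `metadata` table layout are all hypothetical, `sqlite3` stands in for the RDBMS, and in-memory dictionaries stand in for HBase and HDFS. Access control, load balancing, and caching are left out.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor
from threading import Lock

# Hypothetical cutoff between "small" and "large" objects.
# The abstract does not state a value; 10 MiB is an assumption.
SMALL_OBJECT_LIMIT = 10 * 1024 * 1024


class InMemoryStore:
    """Stand-in for an HBase table or an HDFS directory (not a real client)."""

    def __init__(self):
        self._blobs = {}
        self._lock = Lock()

    def put(self, key, data):
        with self._lock:
            self._blobs[key] = data

    def get(self, key):
        with self._lock:
            return self._blobs[key]


def choose_backend(size_bytes, limit=SMALL_OBJECT_LIMIT):
    """Route small objects to HBase and large ones to HDFS."""
    return "hbase" if size_bytes < limit else "hdfs"


class FeatureStore:
    """Hypothetical facade: metadata in an RDBMS, payloads in HBase or HDFS."""

    def __init__(self, limit=SMALL_OBJECT_LIMIT):
        self.limit = limit
        self.backends = {"hbase": InMemoryStore(), "hdfs": InMemoryStore()}
        # sqlite3 plays the role of the RDBMS. The connection is shared
        # across worker threads, so every use is guarded by db_lock.
        self.db = sqlite3.connect(":memory:", check_same_thread=False)
        self.db_lock = Lock()
        self.db.execute(
            "CREATE TABLE metadata ("
            "object_id TEXT PRIMARY KEY, "
            "size_bytes INTEGER NOT NULL, "
            "backend TEXT NOT NULL)"
        )

    def upload(self, object_id, data):
        """Store the payload, then record where it went in the metadata table."""
        backend = choose_backend(len(data), self.limit)
        self.backends[backend].put(object_id, data)
        with self.db_lock:
            self.db.execute(
                "INSERT OR REPLACE INTO metadata VALUES (?, ?, ?)",
                (object_id, len(data), backend),
            )
            self.db.commit()
        return backend

    def download(self, object_id):
        """Look up the backend in the metadata table, then fetch the payload."""
        with self.db_lock:
            row = self.db.execute(
                "SELECT backend FROM metadata WHERE object_id = ?",
                (object_id,),
            ).fetchone()
        if row is None:
            raise KeyError(object_id)
        return self.backends[row[0]].get(object_id)

    def upload_many(self, items, workers=4):
        """Upload several objects concurrently with a thread pool."""
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = {
                oid: pool.submit(self.upload, oid, data)
                for oid, data in items.items()
            }
            return {oid: future.result() for oid, future in futures.items()}
```

Calling `FeatureStore(limit=8).upload_many({"a": b"tiny", "b": b"x" * 32})` would place `a` in the HBase stand-in and `b` in the HDFS stand-in. Keeping the backend name in the metadata row means a download needs only one RDBMS lookup to find the payload, which is one plausible reading of why the abstract stores unstructured-data metadata in the RDBMS.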