Research of Target Characteristics Storage Based on RDBMS and Hadoop

2016 International Conference on Identification, Information and Knowledge in the Internet of Things (IIKI), Pub Date: 2016-10-01, DOI: 10.1109/IIKI.2016.33

Yanqi Wang, Yusheng Jia, Xiaodan Xie


Citations: 1

Abstract



As the amount of target characteristics data increases rapidly, traditional methods can no longer meet the needs of storing and managing it. Based on the features of this data, a new storage system built on an RDBMS and Hadoop is proposed. Structured data and the metadata of unstructured data are stored in the RDBMS under a defined schema, while the bulk of the unstructured data is distributed across the nodes of the Hadoop cluster. To make the best use of each storage layer, HBase stores the massive volume of small unstructured data and HDFS holds the large-scale data. In addition, access control and a multi-threaded upload and download approach, combined with load balancing and caching mechanisms, are applied to improve the efficiency of data transmission. Experimental results show that the proposed storage system is reasonable and practicable.
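The storage split described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `HybridStore`, `SMALL_OBJECT_LIMIT`, and the 1 MiB cutoff are hypothetical, the abstract does not state where "small" ends, `sqlite3` stands in for the RDBMS, and plain dictionaries stand in for HBase and HDFS. Load balancing, caching, and access control are left out.

```python
import sqlite3
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cutoff; the abstract does not say where "small" ends.
SMALL_OBJECT_LIMIT = 1024 * 1024  # 1 MiB


class HybridStore:
    """Toy model: metadata in an RDBMS, payloads in an HBase-like or HDFS-like store."""

    def __init__(self, small_limit=SMALL_OBJECT_LIMIT):
        self.small_limit = small_limit
        # sqlite3 stands in for the RDBMS; the lock below serializes access
        # so that worker threads can share this one connection safely.
        self.db = sqlite3.connect(":memory:", check_same_thread=False)
        self.db.execute(
            "CREATE TABLE metadata (name TEXT PRIMARY KEY, size INTEGER, backend TEXT)"
        )
        # Dictionaries stand in for an HBase table and for HDFS files.
        self.stores = {"hbase": {}, "hdfs": {}}
        self.lock = threading.Lock()

    def backend_for(self, size):
        """Route small objects to the HBase stand-in, large ones to the HDFS stand-in."""
        return "hbase" if size <= self.small_limit else "hdfs"

    def put(self, name, payload):
        """Store the payload in the chosen backend and record its metadata."""
        backend = self.backend_for(len(payload))
        with self.lock:
            self.stores[backend][name] = payload
            self.db.execute(
                "INSERT OR REPLACE INTO metadata VALUES (?, ?, ?)",
                (name, len(payload), backend),
            )
            self.db.commit()
        return backend

    def get(self, name):
        """Look up the backend in the metadata table, then fetch the payload."""
        with self.lock:
            row = self.db.execute(
                "SELECT backend FROM metadata WHERE name = ?", (name,)
            ).fetchone()
            if row is None:
                raise KeyError(name)
            return self.stores[row[0]][name]

    def put_many(self, items, workers=4):
        """Upload several objects with a thread pool; results follow input order."""
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(lambda kv: self.put(*kv), items.items()))
```

In a real deployment the two dictionaries would be replaced by HBase and HDFS clients, and the lock would give way to per-thread database connections or a connection pool, so that uploads actually run in parallel rather than being serialized as they are in this toy model.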
