High-Speed Retrieval Method for Unstructured Big Data Platform Based on K-Ary Search Tree Algorithm
Zhang Helin, Jiang Meiling, Wang Yiting, Zhang Hang, Liao Huadong, C. Anqing
2022 IEEE Conference on Telecommunications, Optics and Computer Science (TOCS), published 2022-12-11
DOI: 10.1109/TOCS56154.2022.10016179 (https://doi.org/10.1109/TOCS56154.2022.10016179)
Citations: 0
Abstract
With the spread of the Internet, the amount of data people create every day is growing exponentially, and most of it is unstructured. This paper addresses the problem of finding useful information in such large volumes of data. To that end, it builds an unstructured big data platform for data retrieval and runs two experiments. The first tests the platform's retrieval efficiency in stand-alone mode and in distributed mode, and the retrieval efficiency is found to be better. The second tests the effect of the k-ary search tree algorithm on the platform's retrieval computing efficiency: when the amount of data to be retrieved exceeds 400M, using this algorithm effectively improves the platform's computing speed.
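
The abstract does not say how the platform implements its k-ary search tree, so the following is only a minimal sketch of the general data structure, not the authors' method. It assumes distinct integer keys held in memory, and the names `Node`, `build`, `search`, and `height` are hypothetical. Each node stores up to k-1 sorted keys and up to k subtrees, so a larger k gives a shallower tree and fewer node visits per lookup.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    keys: List[int]                    # up to k-1 sorted keys
    children: List[Optional["Node"]]   # len(keys) + 1 subtrees (None = empty)


def build(sorted_keys: List[int], k: int) -> Optional[Node]:
    """Build a balanced k-ary search tree from sorted, distinct keys."""
    if k < 2:
        raise ValueError("k must be at least 2")
    n = len(sorted_keys)
    if n == 0:
        return None
    if n <= k - 1:
        # Few enough keys to fit in a single leaf node.
        return Node(list(sorted_keys), [None] * (n + 1))
    # Pick k-1 separator positions that split the keys into k similar runs.
    seps = [(i * n) // k for i in range(1, k)]
    bounds = [-1] + seps + [n]
    children = [
        build(sorted_keys[bounds[i] + 1:bounds[i + 1]], k) for i in range(k)
    ]
    return Node([sorted_keys[s] for s in seps], children)


def search(node: Optional[Node], target: int) -> bool:
    """Return True if target is stored in the tree rooted at node."""
    while node is not None:
        # Binary search inside the node locates the key or the child to descend.
        i = bisect_left(node.keys, target)
        if i < len(node.keys) and node.keys[i] == target:
            return True
        node = node.children[i]
    return False


def height(node: Optional[Node]) -> int:
    """Number of levels in the tree (0 for an empty tree)."""
    if node is None:
        return 0
    return 1 + max(height(child) for child in node.children)


if __name__ == "__main__":
    keys = list(range(0, 200, 2))      # illustrative data, not from the paper
    for k in (2, 4, 16):
        root = build(keys, k)
        print(k, height(root), search(root, 42), search(root, 43))
```

The sketch only illustrates why raising k reduces the number of levels visited per lookup. It says nothing about the 400M threshold reported in the abstract, which depends on the platform and data used in the paper's experiments.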