Power global multi-source heterogeneous unified metadata query method under pluggable storage framework
Jiwei Li, Bo Li, Shi Liu, Hongwei Lv, Fei Zheng, Qing Liu
Results in Engineering, Volume 26, Article 104600. Published 2025-03-07. DOI: 10.1016/j.rineng.2025.104600
https://www.sciencedirect.com/science/article/pii/S2590123025006784
Abstract
When power-sector users access and query multiple data sources, the iterative querying of multi-source heterogeneous data is too time-consuming. This paper studies a unified metadata query method for global multi-source heterogeneous electric power data under a pluggable storage framework. Hive is used to build the pluggable storage framework for this metadata: Hive serves purely as the computing engine, HDFS provides the underlying storage, and Hive metadata is shared through the pluggable metadata framework. Local ontology integration is adopted to construct a domain ontology for the global multi-source heterogeneous unified metadata of electric power, and the theme concepts of the metadata are extracted from that ontology. A locality-sensitive hashing (LSH) index is then located layer by layer, and a Top-k distance query is carried out to complete the unified metadata query. Experimental results show that the method queries the global multi-source heterogeneous unified metadata of electric power effectively, with a metadata query speedup ratio above 1.5. It reduces the complexity of data management and querying, improves data consistency and accuracy, allows data storage components to be added or removed dynamically according to actual needs, and lets users obtain the required data faster. The method thereby addresses the query and management problems of multi-source heterogeneous data and supports the digital transformation of the electric power industry.
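To make the LSH-plus-Top-k step concrete, here is a minimal sketch of the general technique, not the authors' implementation. It assumes metadata entries have already been mapped to fixed-length numeric feature vectors, and it uses a flat multi-table random-hyperplane index rather than the layer-by-layer index the abstract mentions. The class name `HyperplaneLSH`, the parameters, and the sample entries are all hypothetical.

```python
import heapq
import math
import random
from collections import defaultdict


class HyperplaneLSH:
    """Random-hyperplane LSH index (illustrative; not the paper's layered index)."""

    def __init__(self, dim, num_tables=4, num_bits=6, seed=0):
        rng = random.Random(seed)
        # One independent set of random hyperplanes per hash table.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]
            for _ in range(num_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        self.vectors = {}

    def _key(self, planes, vec):
        # Each bit records which side of one hyperplane the vector falls on.
        return tuple(
            sum(p * v for p, v in zip(plane, vec)) >= 0.0 for plane in planes
        )

    def add(self, item_id, vec):
        self.vectors[item_id] = vec
        for planes, table in zip(self.planes, self.tables):
            table[self._key(planes, vec)].append(item_id)

    def query(self, vec, k):
        # Gather candidates from the matching bucket of every table.
        candidates = set()
        for planes, table in zip(self.planes, self.tables):
            candidates.update(table.get(self._key(planes, vec), ()))
        # Rank only the candidates by exact Euclidean distance; keep the k nearest.
        return heapq.nsmallest(
            k, ((math.dist(vec, self.vectors[c]), c) for c in candidates)
        )


# Hypothetical metadata entries with made-up 3-dimensional feature vectors.
index = HyperplaneLSH(dim=3)
index.add("meter_readings", [0.9, 0.1, 0.0])
index.add("transformer_status", [0.1, 0.8, 0.3])
index.add("billing_records", [0.2, 0.1, 0.9])

top = index.query([0.9, 0.1, 0.0], k=2)
```

Because only the entries that share a bucket with the query are ranked, the exact distance computation touches a fraction of the stored entries, which is where a query speedup would come from. The trade-off is that the result is approximate: a true neighbor that hashes into different buckets in every table is missed, so the number of tables and bits per key would need tuning against real data.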