The semantic correlation mining method of multimodal data in constructing techno-economic knowledge graph of power grid
Authors: Ling Qiu, Mengqi Pan, Nuoya Lv
Journal: Intelligent Systems with Applications, volume 28, Article 200588 (impact factor 4.3)
Publication date: 2025-09-21
Publication type: Journal Article
DOI: 10.1016/j.iswa.2025.200588
URL: https://www.sciencedirect.com/science/article/pii/S2667305325001140
Citation count: 0
Abstract
Multimodal data come in diverse formats with complex structures, which makes managing their complexity and correlations difficult. On large-scale data, traditional methods also tend to suffer from low computational efficiency and inaccurate results. This paper proposes a semantic association mining method for multimodal data. The method uses ETL (extract, transform, load) technology to convert text and table data from different files into nodes and relational edges of a knowledge graph. Optimizing the word vector matrix with the skip character model lets the method better capture the semantic information of text data and accurately reflect semantic similarity. Nodes such as equipment, design technologies, and installation addresses are then integrated to construct a techno-economic knowledge graph of the power grid. To calculate multimodal object associations, the data first undergo label preprocessing, feature processing, and semantic relationship structuring; the association is then computed with the cosine similarity formula. An association rule algorithm mines the correlations among time-series variables, revealing latent relationships, for example between the operating status of equipment and the overall performance of the power grid, which improves the understanding and prediction of the grid's operating status. Experimental results show that the proposed method achieves the highest accuracy and recall, both 98.20%, with an F-measure of 93.89%, a bit error rate below 0.9, and a time consumption of approximately 7.34 s.
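The two computations named in the abstract, cosine similarity between feature vectors and association rules over time-series variables, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the feature vectors, and the discretized time-series windows below are all hypothetical, and the rule metrics shown (support and confidence) are the standard definitions, which the paper may or may not use in this form.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen for this sketch: an all-zero vector is
        # treated as having no similarity to anything.
        return 0.0
    return dot / (norm_u * norm_v)


def rule_metrics(transactions, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent.

    Each transaction is a set of items, e.g. the discretized states
    observed in one time window.
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_both / n if n else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Hypothetical feature vectors for two knowledge-graph nodes.
node_a = [1.0, 0.0, 2.0]
node_b = [2.0, 0.0, 4.0]
similarity = cosine_similarity(node_a, node_b)

# Hypothetical time windows, each holding discretized variable states.
windows = [
    {"load_high", "temp_high"},
    {"load_high", "temp_high", "voltage_dip"},
    {"load_high"},
    {"temp_normal"},
]
support, confidence = rule_metrics(windows, {"load_high"}, {"temp_high"})
```

Here `node_a` and `node_b` point in the same direction, so their similarity is 1 up to floating-point rounding, and the rule `load_high -> temp_high` holds in two of the four windows (support 0.5) and in two of the three windows that contain `load_high` (confidence 2/3).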