FlexIM: Efficient and Verifiable Index Management in Blockchain
Authors: Binhong Li; Licheng Lin; Shijie Zhang; Jianliang Xu; Jiang Xiao; Bo Li; Hai Jin
Journal: IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 6, pp. 3399-3412 (impact factor 8.9; JCR Q1, Computer Science, Artificial Intelligence)
Published: 2025-03-03
DOI: 10.1109/TKDE.2025.3546997
URL: https://ieeexplore.ieee.org/document/10908875/
Citation count: 0
Abstract
Blockchain-based querying, with its traceability and data provenance, has become increasingly popular and is widely adopted in numerous applications. Yet existing index-based query approaches are efficient only under static blockchain query workloads, where the query attribute or type must be fixed. Constructing an efficient index for dynamic workloads is particularly challenging because of prohibitively long construction time and excessive storage consumption. In this paper, we present FlexIM, the first efficient and verifiable index management system for dynamic blockchain queries. The key innovation in FlexIM is to uncover the inherent characteristics of blockchain, i.e., data distribution and block access frequency, and then to optimally choose the index under varying workloads by using a reinforcement learning technique. In addition, we enhance and facilitate verifiability with low storage overhead by leveraging a Root Merkle Tree (RMT) and a Bloom Filter Merkle Tree (BMT). Our comprehensive evaluations demonstrate that FlexIM outperforms the state-of-the-art blockchain query mechanism, vChain+, achieving a 26.5% speedup while consuming 94.2% less storage, on average, over real-world Bitcoin datasets.
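The abstract attributes verifiability to Merkle-tree structures (RMT and BMT) but does not describe how they are built. As background only, the sketch below shows a generic binary Merkle tree with an inclusion proof, which is the general mechanism that lets a client check a returned record against a trusted root hash. It is not FlexIM's RMT or BMT. The function names (`build_levels`, `prove`, `verify`), the use of SHA-256, and the rule of duplicating the last node on odd-sized levels are all assumptions made for illustration.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest (hash choice is an assumption for this sketch)."""
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return all tree levels, from the hashed leaves up to the root."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumed padding rule: pair the last node with itself.
            level = level + [level[-1]]
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Collect sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # odd-sized level: node was paired with itself
        # Record the sibling hash and whether it sits to the right.
        proof.append((level[sibling], index % 2 == 0))
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the root from a leaf and its proof, then compare."""
    node = h(leaf)
    for sibling, sibling_is_right in proof:
        node = h(node + sibling) if sibling_is_right else h(sibling + node)
    return node == root


# Hypothetical records standing in for transactions in a block.
txs = [b"tx0", b"tx1", b"tx2"]
levels = build_levels(txs)
root = levels[-1][0]
ok = verify(txs[2], prove(levels, 2), root)
```

A client that trusts only `root` can accept `txs[2]` when `verify` returns `True`, and the proof holds one hash per tree level rather than the whole data set. A production design would also separate leaf and internal-node hashing and avoid the duplicate-last-node rule, both of which this sketch omits for brevity.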
About the journal:
The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.