Storage optimisation and distributed architecture for time series reconstruction of massive astronomical catalogues
Qing Zhao, Le Sun, Mengxiang Zhang, Chengkui Zhang, Chenzhou Cui, Dongwei Fan
Experimental Astronomy, volume 56, issues 2-3, pages 821-845. Published 2023-09-28. Impact factor 2.7 (JCR Q2, Astronomy & Astrophysics).
DOI: 10.1007/s10686-023-09913-9
https://link.springer.com/article/10.1007/s10686-023-09913-9
Citation count: 0
Abstract
Time series reconstruction of astronomical catalogues is an important part of data archiving and a basis for time-domain astronomical analysis. As the field of view and sampling frequency of time-domain telescopes increase, the volume of data to be processed keeps growing, and optimizing the space and time efficiency of reconstruction has become a pressing problem. To improve space efficiency, we propose a time series data compression algorithm based on the negative database and dynamic programming and, building on it, design a multi-level storage and query architecture for hot and non-hot data, which greatly reduces storage space while preserving query efficiency. To improve time efficiency, we propose a spatio-temporal data partitioning and layout algorithm suited to distributed architectures; its nested round-robin strategy balances load across different spatial locations, temporal locations, and ranges of temporal queries, which helps ensure the execution efficiency of the distributed system. Experimental results show that the proposed optimizations keep the system at a low load skewness of about 4% and save about 83% of storage space.
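The abstract does not spell out the nested round-robin layout or how load skewness is computed, so the following is only an illustrative sketch under stated assumptions, not the authors' algorithm. It assumes data is cut into spatial cells and time slices, that a block goes to node `(space_idx + time_idx) % num_nodes`, and that skewness is `(max - mean) / mean` over per-node block counts. The names `assign_node`, `query_loads`, and `load_skewness` are hypothetical.

```python
from collections import Counter
from itertools import product


def assign_node(space_idx: int, time_idx: int, num_nodes: int) -> int:
    """Assumed nested round-robin: a round-robin over time slices whose
    starting node is itself rotated round-robin by spatial cell."""
    return (space_idx + time_idx) % num_nodes


def query_loads(space_cells, time_slices, num_nodes: int) -> list[int]:
    """Count how many blocks each node must read for a query that covers
    the given spatial cells and time slices."""
    counts = Counter(
        assign_node(s, t, num_nodes) for s, t in product(space_cells, time_slices)
    )
    return [counts.get(node, 0) for node in range(num_nodes)]


def load_skewness(loads: list[int]) -> float:
    """Assumed skewness metric: relative excess of the busiest node over
    the mean load (0.0 means perfectly balanced)."""
    mean = sum(loads) / len(loads)
    return (max(loads) - mean) / mean if mean else 0.0


if __name__ == "__main__":
    nodes = 4
    # Full scan, one spatial cell over all times, one time slice over all cells.
    for cells, slices in [(range(8), range(8)), (range(3, 4), range(8)), (range(8), range(5, 6))]:
        loads = query_loads(cells, slices, nodes)
        print(loads, load_skewness(loads))
```

Under these assumptions, a query that is narrow in space but long in time, or narrow in time but wide in space, still touches every node equally, which is the kind of balance across spatial locations, temporal locations, and query ranges that the abstract attributes to the strategy.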
About the journal:
Many new instruments for observing astronomical objects at a variety of wavelengths have been and are continually being developed. Furthermore, a vast amount of effort is being put into the development of new techniques for data analysis in order to cope with great streams of data collected by these instruments.
Experimental Astronomy acts as a medium for the publication of papers of contemporary scientific interest on astrophysical instrumentation and methods necessary for the conduct of astronomy at all wavelength fields.
Experimental Astronomy publishes full-length articles, research letters and reviews on developments in detection techniques, instruments, and data analysis and image processing techniques. Occasional special issues are published, giving an in-depth presentation of the instrumentation and/or analysis connected with specific projects, such as satellite experiments or ground-based telescopes, or of specialized techniques.