Title: Evaluation of Remote Data Compression Methods
Authors: Romina Druta, C. Druta, I. Silea
DOI: 10.24846/v31i1y202206 (https://doi.org/10.24846/v31i1y202206)
Journal: Studies in Informatics and Control (JCR Q4, Automation & Control Systems; Impact Factor 1.2)
Publication date: 2022-03-30
Publication type: Journal Article
Citations: 0
Abstract
The present era is one of Big Data, digitalization, the Internet of Things and the Internet of Everything, all of which imply the daily creation of an enormous amount of useful content by a very large number of producers and consumers of online information. The ascending trend of Internet data has made clear the necessity of defining and engineering innovative solutions for coping with redundant transfers. Smart data transfers increase throughput, data availability and resource utilization, and implicitly reduce costs while avoiding bottlenecks and denial-of-service issues. The data employed by an Internet user must be consistent, so distributed systems are attracting research interest with regard to concurrency control, atomic transfers, data replication and synchronization, compression and decompression, error correction and other potential problems. Two different versions of a file are usually highly similar; as far as synchronization is concerned, transferring only the delta between the second version and the initial version of the file, and applying it to the initial version, provides a better transfer throughput. An efficient data deduplication technique is therefore necessary and worth analyzing in order to minimize the cost of synchronization. This paper focuses on optimizing bandwidth utilization for remote data synchronization and proposes a prototype based on three classic open-source data compression methods. The experiments carried out show how these compression utilities, together with the transfer of data, perform the synchronization of large data sets between two remote sites, and how the use of compression helps to reduce the data size on storage devices while decreasing the network bandwidth usage significantly. The novelty of this paper lies in combining two different compression algorithms in order to achieve better compression rates.
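The delta-plus-compression idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's prototype; it is an illustrative example using Python's standard-library `difflib` and `zlib` modules, with a naive opcode-based delta format invented here for demonstration. The sender computes a delta from the old version to the new one, compresses it, and transmits only that; the receiver reconstructs the new version from its local old copy plus the delta.

```python
import difflib
import json
import zlib

def make_delta(old: bytes, new: bytes) -> bytes:
    """Compute a delta from old to new as a list of opcodes, then compress it."""
    sm = difflib.SequenceMatcher(None, old, new)
    ops = []
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag == "equal":
            # The receiver already has these bytes; send only the range to copy.
            ops.append(["copy", i1, i2])
        else:
            # replace/insert: send the literal new bytes (delete becomes an empty run).
            ops.append(["data", new[j1:j2].decode("latin-1")])
    return zlib.compress(json.dumps(ops).encode("latin-1"))

def apply_delta(old: bytes, delta: bytes) -> bytes:
    """Rebuild the new version from the receiver's old copy plus the delta."""
    ops = json.loads(zlib.decompress(delta).decode("latin-1"))
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1].encode("latin-1")
    return bytes(out)

# Two versions of a "file" that share most of their content.
old = b"The quick brown fox jumps over the lazy dog. " * 50
new = old.replace(b"quick", b"clever", 10)

delta = make_delta(old, new)
assert apply_delta(old, delta) == new
# The compressed delta is far smaller than the full new version,
# which is the bandwidth saving the abstract describes.
print(len(new), len(delta))
```

A production tool would use a content-defined chunking or rolling-hash scheme (as in rsync) rather than `difflib`, and a binary delta encoding rather than JSON, but the transfer-size saving follows the same principle.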
Journal description:
Studies in Informatics and Control journal provides important perspectives on topics relevant to Information Technology, with an emphasis on useful applications in the most important areas of IT.
This journal is aimed at advanced practitioners and researchers in the field of IT and welcomes original contributions from scholars and professionals worldwide.
SIC is published both in print and online by the National Institute for R&D in Informatics, ICI Bucharest. Abstracts, full text and graphics of all articles in the online version of SIC are identical to the print version of the Journal.