Compression Performance Analysis of Different File Formats
Han Yang, Guangjun Qin, Yongqing Hu
arXiv - CS - Other Computer Science, published 2023-07-08
DOI: arxiv-2308.12275 (https://doi.org/arxiv-2308.12275)
Citation count: 0
Abstract
In data storage and transmission, file compression is a common technique for
reducing data volume, and with it the required storage space, transmission
time, and bandwidth. However, compression performance differs significantly
across file formats, so the benefit of compressing varies. In this paper, 22
file formats totaling approximately 178 GB of data were collected and
compressed with the Zlib algorithm in order to compare the compression gains
of different file types. The experimental results show that some file types
compress poorly: their file size remains almost unchanged while compression
takes a long time, resulting in low gains. Other file types shrink
significantly and compress in less time, so compressing them effectively
reduces the data volume. Based on these results, this paper applies
compression selectively, according to file type, in data storage and
transmission in order to obtain the maximum compression yield.
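
The kind of measurement described above can be sketched with Python's standard `zlib` module. This is a minimal illustration, not the authors' benchmark. The function name `measure_compression`, the compression level, the ratio definition (compressed size divided by original size), and the two synthetic inputs are all assumptions made for the example; the abstract does not specify them.

```python
import os
import time
import zlib


def measure_compression(data: bytes, level: int = 6) -> dict:
    """Compress data with zlib and report size and timing figures.

    Hypothetical helper; level 6 is an assumed setting, not one taken
    from the paper.
    """
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return {
        "original_bytes": len(data),
        "compressed_bytes": len(compressed),
        # Assumed metric: a ratio well below 1 means the data shrank,
        # a ratio near 1 means compression brought almost no gain.
        "ratio": len(compressed) / len(data) if data else 1.0,
        "seconds": elapsed,
    }


# Synthetic stand-ins, not the paper's data set: repetitive text plays the
# role of a highly compressible format, random bytes the role of a format
# that is already compressed.
text_like = b"timestamp=2023-07-08 level=INFO msg=ok\n" * 10_000
random_like = os.urandom(len(text_like))

for name, payload in [("text_like", text_like), ("random_like", random_like)]:
    result = measure_compression(payload)
    print(
        name,
        result["original_bytes"],
        result["compressed_bytes"],
        f"ratio={result['ratio']:.3f}",
        f"time={result['seconds']:.4f}s",
    )
```

In a setup like this, the repetitive input should shrink to a small fraction of its original size, while the random input should stay at roughly its original size. That contrast mirrors the one the abstract draws between file types with high and low compression gains.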