Evolutionary Trends in a Supercomputing Tertiary Storage Environment

J. Frank, E. L. Miller, I. Adams, Daniel C. Rosenthal
Published in: 2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems
Publication date: 2012-08-07
DOI: 10.1109/MASCOTS.2012.53 (https://doi.org/10.1109/MASCOTS.2012.53)
Citations: 9

Abstract

Tracking archival usage and data migration in a long-term supercomputing system is critical to understanding not only how users' needs and habits have changed over time, but also how the archive itself evolves in response to these external factors. Yet this type of study has not previously been performed. To address this need, we conducted an in-depth comparison of user-initiated file activity on the mass storage system (MSS) at the National Center for Atmospheric Research (NCAR) during two periods, one in the early 1990s and another nearly twenty years later. In addition to confirming earlier findings, our analysis turned up three surprising results. First, the read:write ratio went from 2:1 in the earlier trace to 1:2 in the later trace, a reduction by a factor of four in reads relative to writes. Second, only 30% of the current archive was accessed during the three-year period of the study, in stark contrast to the 80% seen in the 1992 trace analysis. Third, access latency to the first byte of data actually got slower despite much faster computers and storage devices. These findings indicate that archival behavior has shifted towards a write-heavy workload, and that future archives can be optimized for write activity more than previously believed. Furthermore, it may be worth considering the value of data at the time it is archived, since later retrieval is increasingly unlikely.
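
The "factor of four" in the first result follows directly from the two ratios quoted in the abstract. As a brief worked check (the symbols R and W for reads and writes, and the subscripts "early" and "late" for the two traces, are notation introduced here and not taken from the paper):

```latex
\[
\frac{R_{\text{early}}/W_{\text{early}}}{R_{\text{late}}/W_{\text{late}}}
  = \frac{2/1}{1/2} = 4,
\qquad
\frac{R_{\text{early}}}{R_{\text{early}} + W_{\text{early}}} = \frac{2}{3},
\qquad
\frac{R_{\text{late}}}{R_{\text{late}} + W_{\text{late}}} = \frac{1}{3}.
\]
```

In other words, in whatever unit the ratio is measured, reads fall from about two thirds of combined read and write activity in the earlier trace to about one third in the later one.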