{"title":"ESSA 2022特邀演讲者:科学工作流程中数据的奇怪事件","authors":"L. Ramakrishnan","doi":"10.1109/IPDPSW55747.2022.00181","DOIUrl":null,"url":null,"abstract":"The volume, veracity, and velocity of data generated by the accelerators, colliders, supercomputers, light sources and neutron sources have grown exponentially in the last decade. Data has fundamentally changed the scientific workflow running on high performance computing (HPC) systems. It is necessary that we develop appropriate capabilities and tools to understand, analyze, preserve, share, and make optimal use of data. Intertwined with data are complex human processes, policies and decisions that need to be accounted for when building software tools. In this talk, I will outline our work addressing data lifecycle challenges on HPC systems including effective use of storage hierarchy, managing complex scientific data processing, and enabling search on large-scale scientific data.","PeriodicalId":286968,"journal":{"name":"2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"ESSA 2022 Invited Speaker: The Curious Incident of the Data in the Scientific Workflow\",\"authors\":\"L. Ramakrishnan\",\"doi\":\"10.1109/IPDPSW55747.2022.00181\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The volume, veracity, and velocity of data generated by the accelerators, colliders, supercomputers, light sources and neutron sources have grown exponentially in the last decade. Data has fundamentally changed the scientific workflow running on high performance computing (HPC) systems. It is necessary that we develop appropriate capabilities and tools to understand, analyze, preserve, share, and make optimal use of data. Intertwined with data are complex human processes, policies and decisions that need to be accounted for when building software tools. In this talk, I will outline our work addressing data lifecycle challenges on HPC systems including effective use of storage hierarchy, managing complex scientific data processing, and enabling search on large-scale scientific data.\",\"PeriodicalId\":286968,\"journal\":{\"name\":\"2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)\",\"volume\":\"59 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IPDPSW55747.2022.00181\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPSW55747.2022.00181","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
ESSA 2022 Invited Speaker: The Curious Incident of the Data in the Scientific Workflow
The volume, veracity, and velocity of data generated by the accelerators, colliders, supercomputers, light sources and neutron sources have grown exponentially in the last decade. Data has fundamentally changed the scientific workflow running on high performance computing (HPC) systems. It is necessary that we develop appropriate capabilities and tools to understand, analyze, preserve, share, and make optimal use of data. Intertwined with data are complex human processes, policies and decisions that need to be accounted for when building software tools. In this talk, I will outline our work addressing data lifecycle challenges on HPC systems including effective use of storage hierarchy, managing complex scientific data processing, and enabling search on large-scale scientific data.