Mengying Yang, Xinyu Liu, W. Kroeger, A. Sim, Kesheng Wu
Proceedings of the 1st International Workshop on Autonomous Infrastructure for Science, published 2018-06-11. DOI: https://doi.org/10.1145/3217197.3217203
Identifying Anomalous File Transfer Events in LCLS Workflow
This short paper reports our ongoing work to study and identify anomalous file transfers at a large scientific facility, the Linac Coherent Light Source (LCLS). We identify anomalies using statistical models extracted from recent observations of file transfer events. This data-driven approach could be applied in other use cases to identify unusual events. More specifically, we propose two identification strategies based on the different properties of the observed file transfers. Because these methods capture key aspects of two different segments of the data transfer pipeline, they are able to make accurate identifications for their respective workflow components. The current anomaly detection algorithms use file size as the primary feature. We anticipate that integrating more information will improve the prediction accuracy. Additional work is planned to validate the identification algorithms on more data and in different use cases.
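To make the general idea concrete, the following is a minimal sketch of flagging unusual file sizes against a statistical summary of recent observations. It is not the paper's method: the abstract does not specify the two identification strategies, so the median/MAD rule, the function name `flag_anomalous_sizes`, and the `window`, `min_history`, and `threshold` parameters are all assumptions chosen for illustration.

```python
from collections import deque
from statistics import median


def flag_anomalous_sizes(sizes, window=20, min_history=5, threshold=3.5):
    """Return indices of file sizes that deviate strongly from recent ones.

    Hypothetical illustration only: a robust z-score (median and MAD)
    computed over a sliding window of the most recent observations.
    """
    recent = deque(maxlen=window)  # sliding window of recent file sizes
    flagged = []
    for i, size in enumerate(sizes):
        if len(recent) >= min_history:
            med = median(recent)
            mad = median([abs(s - med) for s in recent])
            # 1.4826 makes the MAD comparable to a standard deviation
            # when the data are roughly normally distributed.
            scale = 1.4826 * mad
            if scale == 0:
                # All recent sizes are (nearly) identical: any change stands out.
                is_anomaly = size != med
            else:
                is_anomaly = abs(size - med) / scale > threshold
            if is_anomaly:
                flagged.append(i)
        # Every observation enters the window; the median and MAD are
        # robust enough to tolerate an occasional outlier in the history.
        recent.append(size)
    return flagged


# Example with made-up sizes (arbitrary units); index 8 is far too small.
example_sizes = [100, 102, 98, 101, 99, 100, 103, 97, 5, 100, 101]
print(flag_anomalous_sizes(example_sizes))  # -> [8]
```

A window-based rule like this adapts to gradual changes in typical file size, at the cost of needing a few observations before it can flag anything. Adding further features, as the authors anticipate, would require a different scoring rule than this single-variable one.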