Thanarit Lertwuthikarn, V. C. Barroso, K. Akkarajitsakul
2022 IEEE 5th International Conference on Knowledge Innovation and Invention (ICKII), published 2022-07-22. DOI: 10.1109/ICKII55100.2022.9983590 (https://doi.org/10.1109/ICKII55100.2022.9983590)
Resource Optimization for Log Shipper and Preprocessing Pipeline in a Large-Scale Logging System
Resource optimization is a common practice in resource management: most professional organizations use it to reduce expenses and eliminate unnecessary resource usage. The European Organization for Nuclear Research (CERN) intends to implement an AI-based logging system for the A Large Ion Collider Experiment (ALICE) detector. The system is being built on Elasticsearch, Kibana, Beats, and Logstash, together known as the ELK Stack, which aggregates logs from systems and applications. Beats collects log data from the CERN servers involved, called First Level Processor (FLP) nodes. These nodes run a large number of services when tasks are executed and generate a large volume of log data. Filebeat serves as the log shipper that transfers the data to Logstash, a server-side preprocessing pipeline. When Filebeat and Logstash work together, many configurable factors affect their efficiency. We therefore apply a factorial experiment to identify the significant factors and their correlations. We then optimize these parameters to find the best possible values for their configurations, so that resource usage is minimized while suitable system performance is maintained. The results show that the adjusted parameter values increase the efficiency of the system. They can serve as a guideline for tuning configurable parameters to optimize resource usage when a large amount of log data must be handled.
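As an illustration of the factorial experiment mentioned in the abstract, the following is a minimal sketch of a two-level full factorial design and a main-effect estimate. It is not the authors' procedure. The abstract does not say which factors or levels were studied, so the factors chosen here (the Logstash settings `pipeline.workers` and `pipeline.batch.size`, and the Filebeat Logstash-output setting `bulk_max_size`), their levels, and the response function `toy_cpu_cost` are all assumptions; the response is synthetic and stands in for a real measurement such as CPU or memory usage.

```python
from itertools import product
from statistics import mean


def full_factorial(levels):
    """Return every combination of factor levels as a list of dicts."""
    names = list(levels)
    return [dict(zip(names, combo))
            for combo in product(*(levels[n] for n in names))]


def main_effects(runs, responses, levels):
    """Main effect of a two-level factor:
    mean response at the high level minus mean response at the low level."""
    effects = {}
    for name, (low, high) in levels.items():
        at_high = mean(r for run, r in zip(runs, responses) if run[name] == high)
        at_low = mean(r for run, r in zip(runs, responses) if run[name] == low)
        effects[name] = at_high - at_low
    return effects


# Hypothetical factors and (low, high) levels; not taken from the paper.
LEVELS = {
    "pipeline.workers": (2, 8),
    "pipeline.batch.size": (125, 1000),
    "bulk_max_size": (512, 4096),
}


def toy_cpu_cost(run):
    """Synthetic placeholder response, NOT a measurement.
    In a real study this would be the resource usage observed for the run."""
    return 10.0 + 3.0 * run["pipeline.workers"] + 0.002 * run["pipeline.batch.size"]


runs = full_factorial(LEVELS)                      # 2**3 = 8 configurations
responses = [toy_cpu_cost(run) for run in runs]    # one response per run
effects = main_effects(runs, responses, LEVELS)

for name, effect in effects.items():
    print(f"{name}: {effect:+.3f}")
```

With real measurements in place of `toy_cpu_cost`, factors with large absolute effects would be the candidates for tuning; judging statistical significance would additionally need replicated runs and an analysis of variance, which this sketch omits.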