Resource Optimization for Log Shipper and Preprocessing Pipeline in a Large-Scale Logging System

2022 IEEE 5th International Conference on Knowledge Innovation and Invention (ICKII). Pub Date: 2022-07-22. DOI: 10.1109/ICKII55100.2022.9983590

Thanarit Lertwuthikarn, V. C. Barroso, K. Akkarajitsakul

Abstract

Resource optimization is a common practice in professional organizations for reducing expenses and eliminating unnecessary resource usage. The European Organization for Nuclear Research (CERN) intends to implement an AI-based logging system for the ALICE (A Large Ion Collider Experiment) detector. The system is being implemented with Elasticsearch, Logstash, Kibana, and Beats, also known as the ELK Stack, which provides log aggregation from systems and applications. Beats collects log data from the servers involved at CERN, called First Level Processor (FLP) nodes. These nodes run a large number of services when tasks are executed and generate a large volume of log data. Filebeat is used as a log shipper to transfer the data to Logstash, a server-side preprocessing pipeline. When Filebeat and Logstash work together, many configurable factors affect their efficiency. We apply a factorial experiment to identify the significant factors and their correlations, and then optimize these parameters to find the best possible values for their configuration, so that resource usage is minimized while suitable system performance is maintained. The results show that the adjusted parameter values increase the efficiency of the system. This can serve as a guideline for tuning configurable parameters to optimize resource usage when a large amount of log data must be handled.
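The factorial experiment mentioned in the abstract can be illustrated with a minimal Python sketch of a two-level full factorial design and its main-effect estimates. The abstract does not list the factors, levels, or responses that were studied, so everything concrete here is an assumption: the three settings are chosen only as plausible Filebeat and Logstash tuning knobs, their low and high values are arbitrary, and `synthetic_cpu` is a placeholder function, not measured data.

```python
from itertools import product
from statistics import mean

# Hypothetical factors with (low, high) levels; the abstract does not name them.
FACTORS = {
    "filebeat.bulk_max_size": (512, 4096),
    "logstash.pipeline.workers": (2, 8),
    "logstash.pipeline.batch.size": (125, 1000),
}


def full_factorial(factors):
    """Return every run of a 2^k design as a tuple of coded levels (-1 or +1)."""
    return list(product((-1, +1), repeat=len(factors)))


def decode(run, factors):
    """Map the coded levels of one run back to concrete configuration values."""
    return {
        name: levels[0] if coded < 0 else levels[1]
        for coded, (name, levels) in zip(run, factors.items())
    }


def main_effects(runs, responses, names):
    """Main effect = mean response at the high level minus mean at the low level."""
    effects = {}
    for i, name in enumerate(names):
        high = [y for run, y in zip(runs, responses) if run[i] > 0]
        low = [y for run, y in zip(runs, responses) if run[i] < 0]
        effects[name] = mean(high) - mean(low)
    return effects


def synthetic_cpu(run):
    """Placeholder response, NOT measured data; a real study would benchmark here."""
    a, b, c = run
    return 50.0 + 2.0 * a + 10.0 * b + 0.0 * c


runs = full_factorial(FACTORS)                  # 2^3 = 8 runs
responses = [synthetic_cpu(r) for r in runs]    # one response per run
effects = main_effects(runs, responses, list(FACTORS))

for name, effect in effects.items():
    print(f"{name}: {effect:+.1f}")
```

In a real study, each run would be decoded with `decode(run, FACTORS)`, applied to the shipper and pipeline configuration, and benchmarked to obtain the response (for example CPU or memory usage). Factors with large effects would then be candidates for the optimization step the abstract describes.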
