Resource Optimization for Log Shipper and Preprocessing Pipeline in a Large-Scale Logging System

Thanarit Lertwuthikarn, V. C. Barroso, K. Akkarajitsakul
Published in: 2022 IEEE 5th International Conference on Knowledge Innovation and Invention (ICKII), 2022-07-22
DOI: https://doi.org/10.1109/ICKII55100.2022.9983590

Abstract

Resource optimization is a common resource-management technique that organizations use to reduce expenses and eliminate unnecessary resource usage. The European Organization for Nuclear Research (CERN) intends to implement an AI-based logging system for the A Large Ion Collider Experiment (ALICE) detector. The system is being implemented with Elasticsearch, Kibana, Beats, and Logstash, also called the ELK Stack, which provides log aggregation from systems and applications. Beats collects log data from the servers involved at CERN, called First Level Processor (FLP) nodes. These nodes run a large number of services when tasks are executed and generate a large volume of log data. Filebeat is used as a log shipper to transfer the data to Logstash, a server-side preprocessing pipeline. When Filebeat and Logstash work together, many configurable factors affect their efficiency. We therefore apply a factorial experiment to identify the significant factors and their correlation. These parameters are then optimized to find the best possible configuration values, so that resource usage is minimized while suitable system performance is maintained. The results of this study show that the adjusted parameter values increase the efficiency of the system. They can serve as a guideline for tuning configurable parameters to optimize resource usage when a large amount of log data must be handled.
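The factorial experiment mentioned in the abstract can be illustrated with a minimal sketch of a two-level full factorial design. Everything specific here is an assumption made for illustration: the abstract does not name the factors studied, so `batch_size`, `workers`, and `queue_size` are hypothetical, and `toy_cost` is a synthetic formula standing in for a measured response such as CPU usage, not data from the paper. The sketch only shows how main effects and a two-factor interaction are estimated from coded runs; it is not the authors' analysis.

```python
from itertools import product

# Hypothetical factor names; the abstract does not list the factors studied.
FACTORS = ["batch_size", "workers", "queue_size"]


def toy_cost(batch_size, workers, queue_size):
    """Synthetic stand-in for a measured response; not real data."""
    return 10 + 3 * batch_size - 2 * workers + 0 * queue_size + batch_size * workers


def full_factorial(k):
    """All 2**k runs of a two-level design, coded as -1 (low) and +1 (high)."""
    return list(product((-1, 1), repeat=k))


def main_effect(runs, responses, index):
    """Mean response at the high level minus mean response at the low level."""
    high = [y for run, y in zip(runs, responses) if run[index] == 1]
    low = [y for run, y in zip(runs, responses) if run[index] == -1]
    return sum(high) / len(high) - sum(low) / len(low)


def interaction_effect(runs, responses, i, j):
    """Two-factor interaction: contrast on the product of two coded columns."""
    same = [y for run, y in zip(runs, responses) if run[i] * run[j] == 1]
    diff = [y for run, y in zip(runs, responses) if run[i] * run[j] == -1]
    return sum(same) / len(same) - sum(diff) / len(diff)


runs = full_factorial(len(FACTORS))
responses = [toy_cost(*run) for run in runs]

effects = {name: main_effect(runs, responses, i) for i, name in enumerate(FACTORS)}
effects["batch_size:workers"] = interaction_effect(runs, responses, 0, 1)
print(effects)
```

In a real study, each run would be replaced by a measurement of the system under that configuration, and the estimated effects would be tested for statistical significance (for example with an analysis of variance) before any parameter is tuned.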