Towards Unlocking Insights from Logbooks Using AI
Antonin Sulc, Alex Bien, Annika Eichler, Daniel Ratner, Florian Rehm, Frank Mayet, Gregor Hartmann, Hayden Hoschouer, Henrik Tuennermann, Jan Kaiser, Jason St. John, Jennefer Maldonado, Kyle Hazelwood, Raimund Kammering, Thorsten Hellert, Tim Wilksen, Verena Kain, Wan-Lin Hu
arXiv - PHYS - Accelerator Physics, published 2024-05-25, DOI: arxiv-2406.12881 (https://doi.org/arxiv-2406.12881)
Abstract
Electronic logbooks contain valuable information about activities and events
concerning their associated particle accelerator facilities. However, the
highly technical nature of logbook entries can hinder their usability and
automation. As natural language processing (NLP) continues to advance, it offers
opportunities to address the challenges that logbooks present. This work
explores jointly testing a tailored Retrieval Augmented Generation (RAG) model
for enhancing the usability of particle accelerator logbooks at institutes like
DESY, BESSY, Fermilab, BNL, SLAC, LBNL, and CERN. The RAG model uses a corpus
built on logbook contributions and aims to unlock insights from these logbooks
by leveraging retrieval over facility datasets; we also discuss
potential multimodal sources. Our goals are to increase the FAIR-ness
(findability, accessibility, interoperability, and reusability) of logbooks by
exploiting their information content to streamline everyday use, enable
macro-analysis for root cause analysis, and facilitate problem-solving
automation.
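
As a rough illustration of the retrieval step that a RAG pipeline over logbook entries relies on, the following minimal sketch ranks entries against a query by bag-of-words cosine similarity and assembles the top matches into a prompt. It is not the model described in this work: the example entries, the function names (`retrieve`, `build_prompt`), and the choice of word-count vectors instead of learned embeddings are all assumptions made for illustration, and the generation step (passing the prompt to a language model) is omitted.

```python
import math
import re
from collections import Counter

# Hypothetical logbook entries, invented for illustration only.
ENTRIES = [
    "RF cavity trip in injector, reset interlock and restored beam",
    "Vacuum pump replaced in sector 3 after pressure spike",
    "Beam position monitor calibration completed for the linac",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def vectorize(text):
    """Represent a text as a sparse word-count vector."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, entries, k=2):
    """Return the k entries most similar to the query (ties keep input order)."""
    q = vectorize(query)
    ranked = sorted(entries, key=lambda e: cosine(q, vectorize(e)), reverse=True)
    return ranked[:k]


def build_prompt(query, entries, k=2):
    """Assemble retrieved entries and the question into one prompt string."""
    context = "\n".join(f"- {entry}" for entry in retrieve(query, entries, k))
    return f"Context from logbook:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    print(build_prompt("vacuum pressure spike", ENTRIES, k=1))
```

In practice the word-count vectors would likely be replaced by dense embeddings and an index suited to the size of the facility's corpus; the ranking-then-prompting structure stays the same.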