Comprehensive Lipidomic Automation Workflow using Large Language Models

Connor Beveridge, Sanjay Iyer, Caitlin E. Randolph, Matthew Muhoberac, Palak Manchanda, Amy C. Clingenpeel, Shane Tichy, Gaurav Chopra
arXiv - QuanBio - Subcellular Processes. Published 2024-03-22. DOI: arxiv-2403.15076 (https://doi.org/arxiv-2403.15076)

Abstract

Lipidomics generates large datasets that make manual annotation and interpretation challenging. The chemical and structural diversity of lipids, including structural isomers, further complicates annotation. Although several commercial and open-source software tools for targeted lipid identification exist, they lack automated method-generation workflows and integration with statistical and bioinformatics tools. We have developed the Comprehensive Lipidomic Automated Workflow (CLAW), a platform with an integrated workflow for parsing, detailed statistical analysis, and lipid annotation based on custom multiple reaction monitoring (MRM) precursor and product ion pair transitions. CLAW contains several modules, including one that identifies carbon-carbon double bond position(s) in unsaturated lipids when combined with ozone electrospray ionization (OzESI)-MRM methodology. To demonstrate the utility of the automated workflow in CLAW, large-scale lipidomics data were collected with traditional and OzESI-MRM profiling on biological and non-biological samples. Specifically, a total of 1497 transitions organized into 10 MRM-based mass spectrometry methods were used to profile lipid droplets isolated from different brain regions of 18-24 month-old Alzheimer's disease mice and age-matched wild-type controls. Additionally, triacylglycerol (TG) profiles with carbon-carbon double bond specificity were generated from canola oil samples using OzESI-MRM profiling. We also developed an integrated language user interface with large language models using artificial intelligence (AI) agents, which permits users to interact with the CLAW platform through a chatbot terminal to perform statistical and bioinformatic analyses. We envision the CLAW pipeline being used in high-throughput lipid structural identification tasks, helping users generate automated lipidomics workflows ranging from data acquisition to AI agent-based bioinformatic analysis.
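
To make the idea of annotating lipids from custom MRM precursor and product ion pair transitions concrete, the following is a minimal sketch in Python. It is not CLAW's code: the `Transition` type, the `annotate` function, the 0.5 m/z tolerance, and every entry in `MRM_LIST` are hypothetical, and the m/z values are placeholders rather than real lipid masses. The sketch only illustrates one plausible way to look up an observed ion pair in a user-supplied transition list.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Transition:
    """One entry of a custom MRM list (hypothetical structure)."""
    lipid: str        # annotation label
    precursor: float  # precursor ion m/z
    product: float    # product ion m/z


def annotate(precursor: float, product: float,
             transitions: Sequence[Transition],
             tol: float = 0.5) -> Optional[str]:
    """Return the label of the closest listed transition whose precursor
    and product m/z both lie within `tol` of the observed pair, or None
    if no listed transition matches."""
    best: Optional[Transition] = None
    best_err = float("inf")
    for t in transitions:
        d_precursor = abs(t.precursor - precursor)
        d_product = abs(t.product - product)
        if d_precursor <= tol and d_product <= tol:
            err = d_precursor + d_product
            if err < best_err:
                best, best_err = t, err
    return best.lipid if best is not None else None


# Placeholder transition list: labels and m/z values are illustrative only.
MRM_LIST = [
    Transition("Lipid A", 700.5, 184.1),
    Transition("Lipid B", 702.5, 184.1),
    Transition("Lipid C", 850.8, 577.5),
]

print(annotate(700.6, 184.0, MRM_LIST))  # Lipid A
print(annotate(500.0, 100.0, MRM_LIST))  # None
```

Requiring both ions to match is what lets a transition list separate lipids that share a common product ion, as "Lipid A" and "Lipid B" do in this placeholder list.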