Connor Beveridge, Sanjay Iyer, Caitlin E. Randolph, Matthew Muhoberac, Palak Manchanda, Amy C. Clingenpeel, Shane Tichy, Gaurav Chopra
arXiv - QuanBio - Subcellular Processes, published 2024-03-22. DOI: arxiv-2403.15076 (https://doi.org/arxiv-2403.15076)
Comprehensive Lipidomic Automation Workflow using Large Language Models
Lipidomics generates large datasets that make manual annotation and
interpretation challenging. Lipid chemical and structural diversity with
structural isomers further complicates annotation. Although several commercial
and open-source software tools for targeted lipid identification exist, they lack
automated method generation workflows and integration with statistical and
bioinformatics tools. We have developed the Comprehensive Lipidomic Automated
Workflow (CLAW) platform with an integrated workflow for parsing, detailed
statistical analysis and lipid annotations based on custom multiple reaction
monitoring (MRM) precursor and product ion pair transitions. CLAW contains
several modules including identification of carbon-carbon double bond
position(s) in unsaturated lipids when combined with ozone electrospray
ionization (OzESI)-MRM methodology. To demonstrate the utility of the automated
workflow in CLAW, large-scale lipidomics data were collected with traditional
and OzESI-MRM profiling on biological and non-biological samples. Specifically,
a total of 1497 transitions organized into 10 MRM-based mass spectrometry
methods were used to profile lipid droplets isolated from different brain
regions of 18-24-month-old Alzheimer's disease mice and age-matched wild-type
controls. Additionally, triacylglycerol (TG) profiles with carbon-carbon
double bond specificity were generated from canola oil samples using OzESI-MRM
profiling. We also developed an integrated language user interface with large
language models using artificially intelligent (AI) agents that permits users
to interact with the CLAW platform using a chatbot terminal to perform
statistical and bioinformatic analyses. We envision the CLAW pipeline being used in
high-throughput lipid structural identification tasks, helping users generate
automated lipidomics workflows ranging from data acquisition to AI agent-based
bioinformatic analysis.
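
The annotation step described above, in which lipids are assigned from MRM precursor and product ion pair transitions, can be illustrated with a minimal sketch. This is not CLAW's implementation: the `Transition` class, the `annotate` function, the tolerance value, and the reference list are all hypothetical, and the m/z values are placeholders rather than real lipid masses. The sketch only shows the general idea of matching a measured ion pair against a reference list within a tolerance.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Transition:
    """One MRM transition: a precursor/product ion m/z pair with a label."""
    lipid: str
    precursor_mz: float
    product_mz: float


def annotate(precursor_mz: float, product_mz: float,
             reference: list[Transition], tol: float = 0.5) -> Optional[str]:
    """Return the label of the closest reference transition whose precursor
    and product m/z both lie within `tol`, or None if nothing matches."""
    best_label, best_error = None, None
    for ref in reference:
        d_prec = abs(precursor_mz - ref.precursor_mz)
        d_prod = abs(product_mz - ref.product_mz)
        if d_prec <= tol and d_prod <= tol:
            error = d_prec + d_prod
            if best_error is None or error < best_error:
                best_label, best_error = ref.lipid, error
    return best_label


# Hypothetical reference list; m/z values are placeholders, not real lipid masses.
REFERENCE = [
    Transition("Lipid A", 700.0, 184.0),
    Transition("Lipid B", 760.5, 184.0),
    Transition("Lipid C", 760.5, 500.0),
]

print(annotate(760.6, 184.1, REFERENCE))  # Lipid B
print(annotate(900.0, 184.0, REFERENCE))  # None
```

Requiring both ions to match is what distinguishes "Lipid B" from "Lipid C" in this toy list, since the two share a precursor m/z and differ only in the product ion.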