Leveraging Large Language Models and Agent-Based Systems for Scientific Data Analysis: Validation Study

IF 4.8 | CAS Zone 2 (Medicine) | JCR Q1 (Psychiatry)
JMIR Mental Health | Pub Date: 2025-02-13 | DOI: 10.2196/68135
Dale Peasley, Rayus Kuplicki, Sandip Sen, Martin Paulus
{"title":"利用大型语言模型和基于代理的系统进行科学数据分析:验证研究。","authors":"Dale Peasley, Rayus Kuplicki, Sandip Sen, Martin Paulus","doi":"10.2196/68135","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Large language models have shown promise in transforming how complex scientific data are analyzed and communicated, yet their application to scientific domains remains challenged by issues of factual accuracy and domain-specific precision. The Laureate Institute for Brain Research-Tulsa University (LIBR-TU) Research Agent (LITURAt) leverages a sophisticated agent-based architecture to mitigate these limitations, using external data retrieval and analysis tools to ensure reliable, context-aware outputs that make scientific information accessible to both experts and nonexperts.</p><p><strong>Objective: </strong>The objective of this study was to develop and evaluate LITURAt to enable efficient analysis and contextualization of complex scientific datasets for diverse user expertise levels.</p><p><strong>Methods: </strong>An agent-based system based on large language models was designed to analyze and contextualize complex scientific datasets using a \"plan-and-solve\" framework. The system dynamically retrieves local data and relevant PubMed literature, performs statistical analyses, and generates comprehensive, context-aware summaries to answer user queries with high accuracy and consistency.</p><p><strong>Results: </strong>Our experiments demonstrated that LITURAt achieved an internal consistency rate of 94.8% and an external consistency rate of 91.9% across repeated and rephrased queries. Additionally, GPT-4 evaluations rated 80.3% (171/213) of the system's answers as accurate and comprehensive, with 23.5% (50/213) receiving the highest rating of 5 for completeness and precision.</p><p><strong>Conclusions: </strong>These findings highlight the potential of LITURAt to significantly enhance the accessibility and accuracy of scientific data analysis, achieving high consistency and strong performance in complex query resolution. Despite existing limitations, such as model stability for highly variable queries, LITURAt demonstrates promise as a robust tool for democratizing data-driven insights across diverse scientific domains.</p>","PeriodicalId":48616,"journal":{"name":"Jmir Mental Health","volume":"12 ","pages":"e68135"},"PeriodicalIF":4.8000,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11841814/pdf/","citationCount":"0","resultStr":"{\"title\":\"Leveraging Large Language Models and Agent-Based Systems for Scientific Data Analysis: Validation Study.\",\"authors\":\"Dale Peasley, Rayus Kuplicki, Sandip Sen, Martin Paulus\",\"doi\":\"10.2196/68135\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>Large language models have shown promise in transforming how complex scientific data are analyzed and communicated, yet their application to scientific domains remains challenged by issues of factual accuracy and domain-specific precision. 
The Laureate Institute for Brain Research-Tulsa University (LIBR-TU) Research Agent (LITURAt) leverages a sophisticated agent-based architecture to mitigate these limitations, using external data retrieval and analysis tools to ensure reliable, context-aware outputs that make scientific information accessible to both experts and nonexperts.</p><p><strong>Objective: </strong>The objective of this study was to develop and evaluate LITURAt to enable efficient analysis and contextualization of complex scientific datasets for diverse user expertise levels.</p><p><strong>Methods: </strong>An agent-based system based on large language models was designed to analyze and contextualize complex scientific datasets using a \\\"plan-and-solve\\\" framework. The system dynamically retrieves local data and relevant PubMed literature, performs statistical analyses, and generates comprehensive, context-aware summaries to answer user queries with high accuracy and consistency.</p><p><strong>Results: </strong>Our experiments demonstrated that LITURAt achieved an internal consistency rate of 94.8% and an external consistency rate of 91.9% across repeated and rephrased queries. Additionally, GPT-4 evaluations rated 80.3% (171/213) of the system's answers as accurate and comprehensive, with 23.5% (50/213) receiving the highest rating of 5 for completeness and precision.</p><p><strong>Conclusions: </strong>These findings highlight the potential of LITURAt to significantly enhance the accessibility and accuracy of scientific data analysis, achieving high consistency and strong performance in complex query resolution. Despite existing limitations, such as model stability for highly variable queries, LITURAt demonstrates promise as a robust tool for democratizing data-driven insights across diverse scientific domains.</p>\",\"PeriodicalId\":48616,\"journal\":{\"name\":\"Jmir Mental Health\",\"volume\":\"12 \",\"pages\":\"e68135\"},\"PeriodicalIF\":4.8000,\"publicationDate\":\"2025-02-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11841814/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Jmir Mental Health\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.2196/68135\",\"RegionNum\":2,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"PSYCHIATRY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Jmir Mental Health","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.2196/68135","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"PSYCHIATRY","Score":null,"Total":0}
Citations: 0

Abstract


Background: Large language models have shown promise in transforming how complex scientific data are analyzed and communicated, yet their application to scientific domains remains challenged by issues of factual accuracy and domain-specific precision. The Laureate Institute for Brain Research-Tulsa University (LIBR-TU) Research Agent (LITURAt) leverages a sophisticated agent-based architecture to mitigate these limitations, using external data retrieval and analysis tools to ensure reliable, context-aware outputs that make scientific information accessible to both experts and nonexperts.

Objective: The objective of this study was to develop and evaluate LITURAt to enable efficient analysis and contextualization of complex scientific datasets for diverse user expertise levels.

Methods: An agent-based system based on large language models was designed to analyze and contextualize complex scientific datasets using a "plan-and-solve" framework. The system dynamically retrieves local data and relevant PubMed literature, performs statistical analyses, and generates comprehensive, context-aware summaries to answer user queries with high accuracy and consistency.
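
The abstract does not include an implementation, but the "plan-and-solve" workflow it describes, in which a planner decomposes the user query into tool calls (local data retrieval, PubMed search, statistical analysis) and a solver executes them before summarizing, can be sketched roughly as follows. This is a minimal illustration under stated assumptions: `call_llm`, `query_local_data`, `search_pubmed`, and `run_statistics` are hypothetical placeholders, not LITURAt's actual tools, prompts, or code.

```python
# Illustrative sketch only: the paper does not publish LITURAt's code, and every
# name here (call_llm, query_local_data, search_pubmed, run_statistics) is a
# hypothetical placeholder, not the authors' implementation.
from typing import Callable

def call_llm(prompt: str) -> str:
    """Stand-in for an LLM API call; a real system would wrap an actual client."""
    return f"[model response to: {prompt[:60]}...]"

# Hypothetical tools the agent can dispatch to.
TOOLS: dict[str, Callable[[str], str]] = {
    "query_local_data": lambda q: f"[rows from the local dataset matching: {q}]",
    "search_pubmed":    lambda q: f"[PubMed abstracts relevant to: {q}]",
    "run_statistics":   lambda q: f"[statistical results for: {q}]",
}

def plan(question: str) -> list[tuple[str, str]]:
    """Plan phase: ask the model to decompose the query into (tool, subtask) steps.
    A real system would parse structured LLM output; here the plan is hard-coded."""
    call_llm(f"Decompose into steps using tools {list(TOOLS)}: {question}")
    return [
        ("query_local_data", question),
        ("search_pubmed", question),
        ("run_statistics", question),
    ]

def solve(question: str) -> str:
    """Solve phase: execute each planned step, then summarize with full context."""
    findings = []
    for tool_name, subtask in plan(question):
        findings.append(f"{tool_name}: {TOOLS[tool_name](subtask)}")
    return call_llm(
        "Question: " + question
        + "\nFindings:\n" + "\n".join(findings)
        + "\nWrite an accurate, context-aware summary for a non-expert."
    )

if __name__ == "__main__":
    print(solve("Is anxiety severity associated with sleep quality in this cohort?"))
```

In a production version, `call_llm` would wrap an LLM API and the plan would be parsed from structured model output rather than hard-coded; the dispatch table is only meant to show how external retrieval and analysis tools slot into the two-phase loop.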

Results: Our experiments demonstrated that LITURAt achieved an internal consistency rate of 94.8% and an external consistency rate of 91.9% across repeated and rephrased queries. Additionally, GPT-4 evaluations rated 80.3% (171/213) of the system's answers as accurate and comprehensive, with 23.5% (50/213) receiving the highest rating of 5 for completeness and precision.
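
The reported percentages follow directly from the stated counts; the short check below recomputes them. Only the totals given in the abstract are used, since the per-answer GPT-4 ratings themselves are not published here.

```python
# Arithmetic check of the GPT-4 evaluation figures reported above; only the
# counts stated in the abstract are used (per-answer ratings are not published).
total_answers = 213
accurate_comprehensive = 171  # answers rated accurate and comprehensive
top_rating_of_5 = 50          # answers receiving the highest rating of 5

print(f"accurate and comprehensive: {accurate_comprehensive / total_answers:.1%}")  # -> 80.3%
print(f"highest rating of 5:        {top_rating_of_5 / total_answers:.1%}")         # -> 23.5%
```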

Conclusions: These findings highlight the potential of LITURAt to significantly enhance the accessibility and accuracy of scientific data analysis, achieving high consistency and strong performance in complex query resolution. Despite existing limitations, such as model stability for highly variable queries, LITURAt demonstrates promise as a robust tool for democratizing data-driven insights across diverse scientific domains.

Source Journal
JMIR Mental Health (Medicine: Psychiatry and Mental Health)
CiteScore: 10.80
Self-citation rate: 3.80%
Articles published: 104
Review time: 16 weeks
Journal description: JMIR Mental Health (JMH, ISSN 2368-7959) is a PubMed-indexed, peer-reviewed sister journal of JMIR, the leading eHealth journal (Impact Factor 2016: 5.175). JMIR Mental Health focusses on digital health and Internet interventions, technologies and electronic innovations (software and hardware) for mental health, addictions, online counselling and behaviour change. This includes formative evaluation and system descriptions, theoretical papers, review papers, viewpoint/vision papers, and rigorous evaluations.