Steven C Grambow, Manisha Desai, Kevin P Weinfurt, Christopher J Lindsell, Michael J Pencina, Lacey Rende, Gina-Maria Pomann
Journal of Clinical and Translational Science, 9(1), e131. DOI: 10.1017/cts.2025.10064. Published 2025-05-30 (eCollection 2025). Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12260977/pdf/
Integrating large language models in biostatistical workflows for clinical and translational research.
Introduction: Biostatisticians increasingly use large language models (LLMs) to enhance efficiency, yet practical guidance on responsible integration is limited. This study explores current LLM usage, challenges, and training needs to support biostatisticians.
Methods: A cross-sectional survey was conducted across three biostatistics units at two academic medical centers. The survey assessed LLM usage across three key professional activities: communication and leadership, clinical and domain knowledge, and quantitative expertise. Responses were analyzed using descriptive statistics, while free-text responses underwent thematic analysis.
Results: Of 208 eligible biostatisticians (162 staff and 46 faculty), 69 (33.2%) responded. Among them, 44 (63.8%) reported using LLMs; of the 43 who answered the frequency question, 20 (46.5%) used them daily and 16 (37.2%) weekly. LLMs improved productivity in coding, writing, and literature review; however, 29 of 41 respondents (70.7%) reported significant errors, including incorrect code, statistical misinterpretations, and hallucinated functions. Key verification strategies included expertise, external validation, debugging, and manual inspection. Among 58 respondents providing training feedback, 44 (75.9%) requested case studies, 40 (69.0%) sought interactive tutorials, and 37 (63.8%) desired structured training.
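The verification strategies respondents reported (external validation, debugging, manual inspection) can be illustrated with a minimal, hypothetical sketch: checking an LLM-drafted statistical function against an independently computed reference value before trusting it. The function name, data, and reference value below are illustrative assumptions, not drawn from the study.

```python
import math
import statistics

def llm_drafted_welch_t(x, y):
    """Hypothetical LLM-drafted Welch t-statistic (two samples, unequal variances)."""
    nx, ny = len(x), len(y)
    vx, vy = statistics.variance(x), statistics.variance(y)  # sample variances
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(vx / nx + vy / ny)

# External validation: fixed toy data with a hand-checked reference result.
x = [2.1, 2.5, 2.3, 2.7]
y = [1.8, 1.9, 2.0, 1.7]
t = llm_drafted_welch_t(x, y)

# Independently computed: mean(x)=2.4, mean(y)=1.85, var(x)=0.0667, var(y)=0.0167,
# so t = 0.55 / sqrt(0.0667/4 + 0.0167/4) ≈ 3.810. Reject the draft if it disagrees.
assert abs(t - 3.810) < 0.01
```

Comparing against a trusted implementation (e.g., `scipy.stats.ttest_ind` with `equal_var=False`) on the same data is another common form of the external-validation strategy the respondents described.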
Conclusions: LLM usage is notable among respondents at two academic medical centers, though response patterns likely reflect early adopters. While LLMs enhance productivity, challenges such as errors and reliability concerns highlight the need for verification strategies and systematic validation. The strong interest in training underscores the need for structured guidance. As an initial step, we propose eight core principles for responsible LLM integration, offering a preliminary framework for structured usage, validation, and ethical considerations.