Large language models for analyzing open text in global health surveys: why children are not accessing vaccine services in the Democratic Republic of the Congo.

Impact factor 2.3 · Zone 4 (Medicine) · JCR Q2 · Public, Environmental & Occupational Health
Roy Burstein, Eric Mafuta, Joshua L Proctor
Journal: International Health
Published: 2025-03-07 (Journal Article)
DOI: 10.1093/inthealth/ihaf015 (https://doi.org/10.1093/inthealth/ihaf015)
Citation count: 0

Abstract

Background: This study evaluates the use of large language models (LLMs) to analyze free-text responses from large-scale global health surveys, using data from the Enquête de Couverture Vaccinale (ECV) household coverage surveys from 2020, 2021, 2022 and 2023 as a case study.

Methods: We tested several LLM approaches consisting of zero-shot and few-shot prompting, fine-tuning, and a natural language processing approach using semantic embeddings, to analyze responses on the reasons caregivers did not vaccinate their children.
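The difference between zero-shot and few-shot prompting can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the category labels, the instruction wording, and the example responses are hypothetical and are not taken from the ECV surveys or from the study's actual prompts. The function only builds the prompt text; sending it to a model is left out.

```python
from typing import Sequence, Tuple

# Hypothetical category labels (assumption); the study's coding scheme is not
# given in the abstract.
CATEGORIES = ["access or distance", "vaccine stock-out", "fear of side effects", "other"]


def build_prompt(response: str, examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Build a classification prompt for one free-text survey response.

    With no examples this is a zero-shot prompt; with labeled
    (text, category) examples it becomes a few-shot prompt.
    """
    lines = [
        "Classify the caregiver's reason for not vaccinating their child.",
        "Answer with exactly one of: " + "; ".join(CATEGORIES) + ".",
        "",
    ]
    # Each labeled example is shown in the same format the model must complete.
    for text, label in examples:
        lines += [f"Response: {text}", f"Category: {label}", ""]
    # The response to classify comes last, with the category left open.
    lines += [f"Response: {response}", "Category:"]
    return "\n".join(lines)


# Hypothetical labeled examples (assumption), standing in for hand-coded responses.
few_shot_examples = [
    ("The health centre is too far from our village", "access or distance"),
    ("The nurse said there was no vaccine that day", "vaccine stock-out"),
]

zero_shot_prompt = build_prompt("I was afraid the child would get a fever")
few_shot_prompt = build_prompt("I was afraid the child would get a fever", few_shot_examples)
```

Keeping the examples in the same "Response / Category" layout as the final query is a common design choice, since the model then only has to continue an established pattern.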

Results: Accuracy ranged from 61.5% to 96% when tested against a curated benchmarking dataset drawn from the ECV surveys, and improved when LLMs were fine-tuned or given examples for few-shot learning. We show that even with as few as 20 to 100 examples, LLMs can achieve high accuracy in categorizing free-text responses.
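Scoring a classifier against a hand-labeled benchmark, as described above, can be sketched as follows. This assumes the reported metric is simple accuracy (the share of responses whose predicted category matches the benchmark label); the function name and the toy labels are hypothetical and do not come from the study.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Return the fraction of predictions that match the benchmark labels.

    Labels are compared after trimming whitespace and lowercasing, so that
    minor formatting differences in model output are not counted as errors.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("benchmark must contain at least one label")
    matches = sum(
        p.strip().lower() == g.strip().lower() for p, g in zip(predicted, gold)
    )
    return matches / len(gold)


# Toy labels (assumption), not data from the ECV surveys.
gold_labels = ["access", "stock-out", "fear", "other"]
model_labels = ["access", "Stock-out ", "fear", "access"]

score = accuracy(model_labels, gold_labels)
```

In this toy case three of the four predictions match after normalization, so `score` is 0.75.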

Conclusions: This approach offers significant opportunities for reanalyzing existing datasets and designing surveys with more open-ended questions, providing a scalable, cost-effective solution for global health organizations. Despite challenges with closed-source models and computational costs, the study underscores LLMs' potential to enhance data analysis and inform global health policy.

Source journal
International Health (Public, Environmental & Occupational Health)
CiteScore: 4.50
Self-citation rate: 0.00%
Articles published: 83
Review time: >12 weeks
About the journal: International Health is an official journal of the Royal Society of Tropical Medicine and Hygiene. It publishes original, peer-reviewed articles and reviews on all aspects of global health including the social and economic aspects of communicable and non-communicable diseases, health systems research, policy and implementation, and the evaluation of disease control programmes and healthcare delivery solutions. It aims to stimulate scientific and policy debate and provide a forum for analysis and opinion sharing for individuals and organisations engaged in all areas of global health.