Case reports unlocked: Leveraging retrieval-augmented generation with large language models to advance research on psychological child maltreatment

Impact factor 3.4; Zone 2 (Psychology); JCR Q1 (Family Studies)
Dragan Stoll, Andreas Jud, Samuel Wehrli, David Lätsch, Selina Steinmann, Meret Sophie Wallimann, Julia Quehenberger
Journal: Child Abuse & Neglect, Volume 169, Article 107653
Published: 2025-08-29
DOI: 10.1016/j.chiabu.2025.107653
URL: https://www.sciencedirect.com/science/article/pii/S0145213425004090

Abstract

Background

Research on psychological child maltreatment is impeded by a lack of high-quality structured data. Crucial information is often documented in child protective services (CPS) case files, but only in narrative form. Recent research on using retrieval-augmented generation (RAG) with large language models (LLMs) to extract structured data from narratives has shown considerable promise. RAG methods can automate classification, reducing the need for laborious manual annotation.

Objective

We aimed to extract structured data from narrative casework reports by using RAG and LLMs to classify mentions of 24 CPS case factors. These factors encompass child maltreatment indicators, risk factors associated with parental, family, and child characteristics, CPS interventions, and their outcomes. We focused on the extraction of psychological abuse because of its complex nature and the difficulty of assessing it. The results were compared with those for parental lack of cooperation, a factor presumed to be of medium recognition difficulty, and parental alcohol abuse, a more straightforward factor.

Methods

We developed a four-stage workflow comprising (1) collection of case reports, (2) RAG-based assessment of case factor mentions, (3) automated extraction of case factors from the RAG assessments, and (4) case labeling. All CPS reports (N = 29,770) written between 2008 and 2022 by Switzerland's largest CPS provider were collected. Model performance was evaluated against human-coded validation data. To validate the findings, two expert human reviewers independently classified weighted random samples of reports, from which a consensus dataset was derived.
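
The four stages can be sketched in a few lines of Python. This is only an illustrative outline, not the authors' implementation: the abstract does not name the retriever, the model, or the prompt, so the term-overlap retriever, the `MENTIONED: yes/no` answer format, and every function name below are assumptions, and the language model is passed in as a plain callable so that a stub can stand in for it.

```python
import re
from typing import Callable, Optional


def split_sentences(report: str) -> list[str]:
    # Naive splitter (assumption); real case files need a language-aware tokenizer.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", report) if s.strip()]


def retrieve(report: str, query_terms: set[str], k: int = 3) -> list[str]:
    """Stage 2a: rank sentences by term overlap, a toy stand-in for embedding search."""
    scored = [
        (len(query_terms & set(re.findall(r"\w+", s.lower()))), s)
        for s in split_sentences(report)
    ]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [s for score, s in ranked[:k] if score > 0]


def assess(passages: list[str], factor: str, llm: Callable[[str], str]) -> str:
    """Stage 2b: ask the model for a free-text assessment of the retrieved passages."""
    prompt = (
        f"Case factor: {factor}\n"
        "Passages:\n" + "\n".join(f"- {p}" for p in passages) + "\n"
        "Explain briefly, then end with 'MENTIONED: yes' or 'MENTIONED: no'."
    )
    return llm(prompt)


def extract_label(assessment: str) -> Optional[bool]:
    """Stage 3: pull the structured yes/no decision out of the assessment text."""
    match = re.search(r"MENTIONED:\s*(yes|no)", assessment, flags=re.IGNORECASE)
    return None if match is None else match.group(1).lower() == "yes"


def label_case(report: str, factor: str, query_terms: set[str],
               llm: Callable[[str], str]) -> dict:
    """Stage 4: attach a label and its supporting passages to one collected report."""
    passages = retrieve(report, query_terms)
    if not passages:
        # Assumption: no retrieved evidence is treated as "not mentioned".
        return {"factor": factor, "label": False, "evidence": []}
    label = extract_label(assess(passages, factor, llm))
    return {"factor": factor, "label": label, "evidence": passages}


def stub_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model call, used only to run the sketch.
    verdict = "yes" if "intoxicated" in prompt.lower() else "no"
    return f"Assessment based on the passages. MENTIONED: {verdict}"


report = ("The mother attended the meeting. "
          "The father was intoxicated during the home visit. "
          "School reports are good.")
terms = {"alcohol", "intoxicated", "drinking", "drunk"}
print(label_case(report, "parental alcohol abuse", terms, stub_llm))
```

Keeping the free-text assessment (stage 2) separate from the label extraction (stage 3) mirrors the workflow described above and leaves the model's reasoning available for human review.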

Results

Measured against the consensus dataset, the model classified psychological abuse, lack of parental cooperation, and parental alcohol abuse with accuracies of 82%, 83%, and 95%, respectively, surpassing the agreement rates between the two human reviewers (79%, 80%, and 93%).
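
Both kinds of figure reported here are simple proportions of matching labels: accuracy compares the model with the consensus labels, and the agreement rate compares the two reviewers with each other. The following minimal sketch shows the arithmetic on made-up labels for ten reports. The data are invented for illustration and are not the study's, and the function name is an assumption.

```python
def agreement_rate(labels_a: list[bool], labels_b: list[bool]) -> float:
    """Share of reports on which two label sequences give the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


# Invented example labels for ten reports (True = factor mentioned).
reviewer_1 = [True, True, False, False, True, False, True, False, False, True]
reviewer_2 = [True, False, False, False, True, False, True, True, False, True]
consensus = [True, True, False, False, True, False, True, False, False, True]
model = [True, True, False, False, True, False, True, False, True, True]

print(agreement_rate(reviewer_1, reviewer_2))  # 0.8 (inter-rater agreement)
print(agreement_rate(model, consensus))        # 0.9 (model accuracy)
```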

Conclusions

RAG-based assessment can replicate human judgment even on complex CPS case factors. High accuracy and a complete inter-rater agreement level were achieved for factors that are straightforward to classify, such as parental alcohol abuse. The effectiveness of these methods stems from contextual clues about a case factor that appear within a few sentences in different sections of the text, rather than from characteristics of the text as a whole. For case factors such as parental lack of cooperation, both supporting and refuting evidence need to be assessed to achieve optimal accuracy. Careful consideration of potential biases and limitations in RAG methods is advised. These applications can serve as early warning systems by identifying critical factors in extensive case notes that might otherwise be overlooked, supporting professionals in making informed decisions and improving outcomes for at-risk children.
Source journal
CiteScore: 7.40
Self-citation rate: 10.40%
Articles published: 397
About the journal: Official Publication of the International Society for Prevention of Child Abuse and Neglect. Child Abuse & Neglect, The International Journal, provides an international, multidisciplinary forum on all aspects of child abuse and neglect, with special emphasis on prevention and treatment; the scope extends further to all those aspects of life which either favor or hinder child development. While contributions will primarily be from the fields of psychology, psychiatry, social work, medicine, nursing, law enforcement, legislature, education, and anthropology, the Journal encourages the concerned lay individual and child-oriented advocate organizations to contribute.