Dragan Stoll, Andreas Jud, Samuel Wehrli, David Lätsch, Selina Steinmann, Meret Sophie Wallimann, Julia Quehenberger
Child Abuse & Neglect, volume 169, Article 107653. Published 2025-08-29. DOI: 10.1016/j.chiabu.2025.107653. https://www.sciencedirect.com/science/article/pii/S0145213425004090
Case reports unlocked: Leveraging retrieval-augmented generation with large language models to advance research on psychological child maltreatment
Background
Research on psychological child maltreatment is impeded by a lack of high-quality structured data. Crucial information is often documented in child protective services (CPS) case files, but only in narrative form. Recent research has shown that retrieval-augmented generation (RAG) methods combined with large language models (LLMs) hold significant potential for extracting structured data from narratives. RAG methods can facilitate automated classification, thereby eliminating the need for laborious manual annotation.
Objective
We aimed to extract structured data from narrative casework reports by using RAG and LLMs to classify mentions of 24 CPS case factors. These factors encompass child maltreatment indicators; risk factors associated with parental, family, and child characteristics; CPS interventions; and their outcomes. We focused on the extraction of psychological abuse because the phenomenon is complex and difficult to assess. The results were compared with those for parental lack of cooperation, a factor presumed to be of medium recognition difficulty, and parental alcohol abuse, a more straightforward factor.
Methods
We developed a four-stage workflow comprising (1) collection of case reports, (2) RAG-based assessment of case factor mentions, (3) automated extraction of case factors from the RAG assessments, and (4) case labeling. All CPS reports (N = 29,770) produced between 2008 and 2022 by Switzerland's largest CPS provider were collected. Model performance was evaluated against human-coded validation data. To build these data, two expert human reviewers independently classified weighted random samples of reports, and a consensus dataset was derived from their classifications.
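The four stages can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not describe the retriever, the model, the prompts, or the labeling rule. The keyword-overlap retriever, the `stub_llm` placeholder, the "Verdict: YES/NO" output format, and the rule that a case is flagged when any of its reports mentions the factor are all hypothetical choices made for this example.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter; a real pipeline would use a proper tokenizer."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def retrieve(sentences: List[str], query_terms: List[str], k: int = 3) -> List[str]:
    """Toy retriever (assumption): rank sentences by the number of query terms
    they contain and keep the top k that match at least one term."""
    def score(sentence: str) -> int:
        lowered = sentence.lower()
        return sum(term in lowered for term in query_terms)

    ranked = sorted(sentences, key=score, reverse=True)
    return [s for s in ranked[:k] if score(s) > 0]


def stub_llm(factor: str, passages: List[str]) -> str:
    """Hypothetical stand-in for an LLM call; returns a free-text assessment."""
    verdict = "YES" if passages else "NO"
    return (f"Assessment for {factor}: evidence found in "
            f"{len(passages)} passage(s). Verdict: {verdict}")


def extract_label(assessment: str) -> bool:
    """Stage 3: pull a structured verdict out of the free-text assessment."""
    match = re.search(r"Verdict:\s*(YES|NO)", assessment)
    return bool(match) and match.group(1) == "YES"


def label_case(reports: List[str], factor: str, query_terms: List[str],
               llm: Callable[[str, List[str]], str] = stub_llm) -> bool:
    """Stages 2 to 4: assess each report, extract its label, and flag the case
    if any report mentions the factor (the aggregation rule is an assumption)."""
    labels = []
    for report in reports:
        passages = retrieve(split_sentences(report), query_terms)
        labels.append(extract_label(llm(factor, passages)))
    return any(labels)


# Stage 1 (collection) is represented here by an invented two-report case.
reports = [
    "The mother attended the meeting. The father was reported to drink heavily in the evenings.",
    "School attendance is regular. No further concerns were raised.",
]
print(label_case(reports, "parental alcohol abuse", ["drink", "alcohol", "intoxicated"]))
```

Keeping the free-text assessment (stage 2) separate from label extraction (stage 3) means the model's reasoning stays available for human review, while the extracted label is what feeds the structured dataset.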
Results
Measured against the consensus dataset, the model classified psychological abuse, parental lack of cooperation, and parental alcohol abuse with accuracies of 82%, 83%, and 95%, respectively, surpassing the agreement rates between the two human reviewers (79%, 80%, and 93%).
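The two quantities compared here, model accuracy against the consensus labels and raw agreement between the two reviewers, are both simple proportions of matching labels. The sketch below shows the arithmetic on invented toy labels; the lists are illustrative only and are not the study's data, and the abstract does not state how the consensus dataset was derived, so it is supplied here as a separate list.

```python
from typing import Sequence


def agreement_rate(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Share of items on which two label sequences agree."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Toy labels for five reports (hypothetical, for illustration only).
reviewer_1 = [True, False, True, True, False]
reviewer_2 = [True, False, False, True, True]
consensus = [True, False, True, True, True]
model = [True, False, True, True, False]

inter_rater = agreement_rate(reviewer_1, reviewer_2)  # reviewers vs. each other
model_accuracy = agreement_rate(model, consensus)     # model vs. consensus

print(f"inter-rater agreement: {inter_rater:.2f}")  # 0.60
print(f"model accuracy:        {model_accuracy:.2f}")  # 0.80
```

Because the consensus labels resolve the reviewers' disagreements, a model can match the consensus more often than the reviewers match each other, which is the pattern reported above.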
Conclusions
RAG-based assessment can replicate human judgment even on complex CPS case factors. High accuracy and complete inter-rater agreement were achieved for factors that are straightforward to classify, such as parental alcohol abuse. The effectiveness of these methods stems from the presence of contextual clues related to case factors within a few sentences across different sections of the text, rather than from characteristics inherent to the entire text. For case factors such as parental lack of cooperation, both supporting and refuting evidence need to be assessed to achieve optimal accuracy. Careful consideration of potential biases and limitations in RAG methods is advised. These applications can serve as early warning systems by identifying critical factors in extensive case notes that might otherwise be overlooked, supporting professionals in making informed decisions and improving outcomes for at-risk children.
About the journal:
Official Publication of the International Society for Prevention of Child Abuse and Neglect. Child Abuse & Neglect: The International Journal provides an international, multidisciplinary forum on all aspects of child abuse and neglect, with special emphasis on prevention and treatment; the scope extends further to all those aspects of life which either favor or hinder child development. While contributions will primarily be from the fields of psychology, psychiatry, social work, medicine, nursing, law enforcement, legislature, education, and anthropology, the Journal encourages concerned lay individuals and child-oriented advocacy organizations to contribute.