Applying language models for suicide prevention: evaluating news article adherence to WHO reporting guidelines
Zohar Elyoseph, Inbar Levkovich, Eyal Rabin, Gal Shemo, Tal Szpiler, Dorit Hadar Shoval, Yossi Levi Belz
Npj mental health research, vol. 4, no. 1, p. 25. Published 2025-06-20.
DOI: 10.1038/s44184-025-00139-5
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12181428/pdf/
Citations: 0
Abstract
The responsible reporting of suicide in the media is crucial for public health, as irresponsible coverage can promote suicidal behaviors. This study examined the capability of generative artificial intelligence, specifically large language models, to evaluate news articles on suicide against World Health Organization (WHO) guidelines, potentially offering a scalable solution to this critical issue. The research compared assessments of 40 suicide-related articles by two human reviewers and two large language models (ChatGPT-4 and Claude Opus). Results showed strong agreement between ChatGPT-4 and the human reviewers (ICC = 0.81-0.87), with no significant differences in overall evaluations. Claude Opus demonstrated good agreement with the human reviewers (ICC = 0.73-0.78) but tended to estimate lower compliance. These findings suggest that large language models have potential to promote responsible suicide reporting, with significant implications for public health. The technology could provide immediate feedback to journalists, encouraging adherence to best practices and potentially transforming public narratives around suicide.
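The abstract does not include the study's prompts or analysis code, but the workflow it describes has two computable steps: prompting a model to rate an article against WHO guideline items, and quantifying rater agreement with an intraclass correlation coefficient (ICC). The following Python sketch illustrates both steps under stated assumptions: the guideline item wording, the 1-5 scoring scale, the prompt, and the synthetic scores are all hypothetical, not the study's protocol; the OpenAI client and the pingouin statistics package stand in for whatever tooling the authors used.

```python
# Hypothetical sketch of an LLM-as-evaluator workflow for WHO
# suicide-reporting guidelines. Item wording, scale, and data are
# illustrative assumptions, not taken from the study.
import pandas as pd
import pingouin as pg
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative subset of WHO "responsible reporting" items (assumed wording).
WHO_ITEMS = [
    "Avoids detailed description of the suicide method",
    "Avoids sensational or normalizing language",
    "Includes information about help resources (e.g., hotlines)",
]

def rate_article(article_text: str) -> list[int]:
    """Ask the model to score each item from 1 (violates) to 5 (fully adheres)."""
    prompt = (
        "Rate the following news article on each WHO suicide-reporting item "
        "from 1 (violates) to 5 (fully adheres). Reply with one integer per "
        "line, in order.\n\nItems:\n"
        + "\n".join(f"{i + 1}. {item}" for i, item in enumerate(WHO_ITEMS))
        + f"\n\nArticle:\n{article_text}"
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # keep the scoring as deterministic as possible
    )
    return [int(token) for token in response.choices[0].message.content.split()]

# Agreement analysis: a long-format table of (article, rater, total score),
# here filled with tiny synthetic data for four articles and two raters.
records = [
    {"article": a, "rater": rater, "score": score}
    for a, (human, model) in enumerate([(12, 11), (8, 9), (14, 14), (10, 9)])
    for rater, score in [("human_1", human), ("gpt4", model)]
]
df = pd.DataFrame(records)

# Two-way ICC across articles (targets) and raters, as implemented in pingouin.
icc = pg.intraclass_corr(data=df, targets="article", raters="rater", ratings="score")
print(icc[["Type", "ICC", "CI95%"]])
```

In a setup like the study's, the table would instead hold the totals from the two human reviewers and the two models across all 40 articles, and the ICC rows (e.g., ICC2 for two-way random effects) would be read per rater pair.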