Prediction of Recidivism and Detection of Risk Factors Under Different Time Windows Using Machine Learning Techniques

Di Mu, Simai Zhang, Ting Zhu, Yong Zhou, Wei Zhang
Journal: Social Science Computer Review
Published: 2024-01-12
DOI: 10.1177/08944393241226607 (https://doi.org/10.1177/08944393241226607)

Abstract

Following the first three generations of prisoner risk assessment tools, the field has seen growing integration of fourth-generation tools and machine learning techniques. However, little work has addressed the explainability of data-driven prediction models or their connection to treatment recommendations. Our primary objective was to develop models that predict the likelihood of recidivism among prisoners released from their index incarceration within 1-year, 2-year, and 5-year time windows, and to make those models more interpretable using SHapley Additive exPlanations (SHAP). We collected 20,457 in-prison records, dated from February 10, 2005, to August 25, 2021, from the data management system of a prison in Southwestern China. Recidivism records were determined by mining data from an official website and combining it with identification data from neighboring prisons. We employed five machine learning algorithms, with sociodemographic characteristics, physical health, psychological assessments, criminological characteristics, crime history, social support, and in-prison behaviors as predictors. SHAP was applied to reveal each feature's contribution. Young prisoners who were accused of larceny, had previous convictions, received lower fines, and had limited family support faced a higher risk of reoffending. Conversely, middle-aged and senior prisoners with no prior convictions, lower monthly supermarket expenses, and positive psychological test results had a lower risk. We also explored interactions between important predictive features, such as between age at the start of incarceration and primary accusation, and between the duration of the current incarceration and cumulative prior incarcerations. The models performed consistently well across time windows, as measured by AUC on the test dataset. The interpretability results show how risk factors evolve over time, which is valuable for intervening with high-risk individuals. With additional validation, these insights could give stakeholders dynamic information about prisoners, and the interpretability results can be integrated into prison and court management systems as a risk assessment tool.
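To make the time-window evaluation concrete, the following is a minimal sketch of how a recidivism label could be derived for each window and how AUC could then be computed per window. It is not the authors' pipeline. The records, the scores, the function names (`window_label`, `auc`, `auc_by_window`), and the rule that excludes people whose follow-up is shorter than the window are all assumptions made for illustration; the abstract does not say how follow-up was handled. For brevity a single score is reused across windows, whereas a study of this kind would normally fit a separate model per window.

```python
from datetime import date, timedelta

# Hypothetical records: (release_date, reoffence_date or None, model_score).
# All values are invented for illustration; none come from the study.
RECORDS = [
    (date(2010, 3, 1), date(2010, 9, 15), 0.91),
    (date(2011, 6, 1), date(2013, 1, 10), 0.62),
    (date(2012, 1, 1), None, 0.20),
    (date(2014, 5, 1), date(2018, 2, 1), 0.30),
    (date(2015, 8, 1), None, 0.35),
    (date(2020, 12, 1), None, 0.15),
]

# End of the data collection period stated in the abstract.
CUTOFF = date(2021, 8, 25)


def window_label(release, reoffence, years, cutoff=CUTOFF):
    """Label one person for a given window.

    Returns 1 if they reoffended within `years` of release, 0 if they were
    observed for the whole window without reoffending, and None if the
    follow-up is too short to decide (an assumption of this sketch).
    """
    horizon = release + timedelta(days=365 * years)  # approximate years
    if reoffence is not None and reoffence <= horizon:
        return 1
    if horizon > cutoff:
        return None
    return 0


def auc(labels, scores):
    """Rank-based AUC: the share of (positive, negative) pairs in which the
    positive case has the higher score, counting ties as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_window(records, windows=(1, 2, 5)):
    """Compute AUC separately for each window, skipping undecidable cases."""
    results = {}
    for years in windows:
        usable = [
            (label, score)
            for release, reoff, score in records
            if (label := window_label(release, reoff, years)) is not None
        ]
        results[years] = auc([y for y, _ in usable], [s for _, s in usable])
    return results


if __name__ == "__main__":
    for years, value in auc_by_window(RECORDS).items():
        print(f"{years}-year window: AUC = {value:.3f}")
```

The same person can be a negative case in the 1-year window and a positive case in the 5-year window, which is one reason risk factors may rank differently across windows.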