Transparent and bias-resilient AI framework for recidivism prediction using deep learning and clustering techniques in criminal justice

IF 7.2 | Zone 1, Computer Science | Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Applied Soft Computing | Pub Date: 2025-05-01 | DOI: 10.1016/j.asoc.2025.113160

Muhammed Cavus, Muhammed Nurullah Benli, Usame Altuntas, Mahmut Sari, Huseyin Ayan, Yusuf Furkan Ugurluoglu

Applied Soft Computing, volume 176, Article 113160. Journal Article. https://www.sciencedirect.com/science/article/pii/S1568494625004715

Citations: 0

Abstract

This paper presents the Recidivism Clustering Network (RCN), an effective approach for predicting repeat offenses that combines deep learning (DL), clustering, and explainable AI (XAI). The RCN improves offender profiling to yield more accurate and interpretable recidivism predictions, in line with key legal principles such as fair sentencing, transparency, and non-discrimination. The RCN employs machine learning (ML) models optimized with Keras Tuner and uses the Synthetic Minority Over-sampling Technique (SMOTE) to handle class imbalance. With about 75% accuracy, the model shows strong recall, identifying 10,661 recidivists but producing 4,038 false positives, which indicates a trade-off between sensitivity and specificity. Beyond prediction, the RCN integrates k-means clustering with two dimensionality-reduction methods, principal component analysis (PCA) and t-distributed Stochastic Neighbor Embedding (t-SNE), to identify hidden patterns within offender data. Visualizations reveal distinct clusters that link characteristics such as age to recidivism behaviors. SHapley Additive exPlanations (SHAP) values enhance interpretability, showing that factors like time since the last conviction and age significantly impact predictions. By combining predictive power with actionable insights, the RCN approach offers substantial potential for criminal justice applications, supporting a more ethical and accountable use of ML in offender profiling and aiding fairer recidivism prevention strategies. The code and data are publicly available on GitHub at https://github.com/cavusmuhammed68/Recidivism-Clustering-Network-RCN-.
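
To make the reported trade-off concrete, the two counts given in the abstract can be turned into a precision figure. The sketch below is illustrative only and is not taken from the authors' repository: it assumes that the 10,661 identified recidivists are true positives and that the 4,038 false positives come from the same evaluation set. The function name `precision` is a hypothetical helper. Recall cannot be derived the same way, because the abstract does not report the number of false negatives.

```python
# Counts as reported in the abstract (assumed to come from one evaluation set).
TRUE_POSITIVES = 10_661   # recidivists correctly identified
FALSE_POSITIVES = 4_038   # non-recidivists flagged as recidivists


def precision(tp: int, fp: int) -> float:
    """Share of positive predictions that are correct: TP / (TP + FP)."""
    if tp + fp == 0:
        raise ValueError("precision is undefined without positive predictions")
    return tp / (tp + fp)


# Hypothetical helper applied to the reported counts.
reported_precision = precision(TRUE_POSITIVES, FALSE_POSITIVES)
false_discovery_rate = 1.0 - reported_precision

print(f"precision: {reported_precision:.3f}")
print(f"false discovery rate: {false_discovery_rate:.3f}")
```

Under these assumptions, roughly 27 to 28 of every 100 people flagged as likely recidivists would be flagged incorrectly, which is the cost side of the strong recall described above.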


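Source journal: Applied Soft Computing (Engineering & Technology, Computer Science: Interdisciplinary Applications)

CiteScore: 15.80

Self-citation rate: 6.90%

Articles published: 874

Review time: 10.9 months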

About the journal: Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is to publish the highest-quality research in the application and convergence of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets, and other similar techniques to address real-world complexities. Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. The website is therefore updated continuously with new articles, and the publication time is short.