Predicting parole hearing result using machine learning
Tribhuwan Singh, Yashvardhan Jain, Vaibhav Kumar
2017 International Conference on Emerging Trends in Computing and Communication Technologies (ICETCCT), published 2017-11-01
DOI: 10.1109/ICETCCT.2017.8280342 (https://doi.org/10.1109/ICETCCT.2017.8280342)
Citation count: 1
Abstract
Machine learning is the science of getting computers to learn without being explicitly programmed. Tom Mitchell gives a formal definition: “A computer program is said to learn from experience ‘E’ with respect to some class of tasks ‘T’ and performance measure ‘P’, if its performance at tasks in ‘T’, as measured by ‘P’, improves with experience ‘E’.” Thousands of parole-eligible criminals are denied release every year. We need to understand how different factors affect the decisions of parole commissioners so that we can estimate the probability of a prisoner being released. We aim to train a machine learning model on data for about 20,000 criminals in New York State so that it can predict whether a criminal will be granted parole. The model takes into account attributes of each criminal such as age, gender, race, charges, and interview facility. We assume that the parole commissioners consider all of these factors when making a decision. The raw data is cleaned before it is fed to a machine learning algorithm. We convert string values into numeric values by assigning a unique number to each unique string, and then normalize each attribute. We then feed the normalized data to a neural network trained with the backpropagation algorithm, and produce graphical visualizations of the results using MATLAB. The resulting model predicts the hearing outcome from these attributes with an accuracy of 76.8%.
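The preprocessing described in the abstract (mapping each unique string to a unique number, then normalizing each attribute) can be sketched as follows. This is a minimal illustration in Python rather than MATLAB, not the authors' code: the records, the column names, and the helper names `encode_column`, `min_max_normalize`, and `preprocess` are hypothetical, and min-max scaling is an assumption, since the abstract does not say which normalization was used.

```python
# Illustrative sketch only: toy data and helper names are hypothetical,
# and min-max scaling is an assumed choice of normalization.

def encode_column(values):
    """Map each distinct string to an integer code, in order of first appearance."""
    codes = {}
    encoded = []
    for value in values:
        if value not in codes:
            codes[value] = len(codes)
        encoded.append(codes[value])
    return encoded, codes


def min_max_normalize(values):
    """Rescale numeric values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread to rescale; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def preprocess(records):
    """Encode string columns as integers, then normalize every column."""
    columns = {}
    for name in records[0]:
        raw = [record[name] for record in records]
        if isinstance(raw[0], str):
            raw, _ = encode_column(raw)
        columns[name] = min_max_normalize(raw)
    return columns


# Hypothetical toy records (not taken from the paper's dataset).
records = [
    {"age": 25, "gender": "M", "facility": "A"},
    {"age": 40, "gender": "F", "facility": "B"},
    {"age": 55, "gender": "M", "facility": "C"},
]

features = preprocess(records)
for name, column in features.items():
    print(name, column)
```

One consequence of this encoding is that integer codes place an arbitrary order on categories such as facility; whether that matters depends on the model that consumes the features.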