Aditi Jain, Amelia Norman, L. Alonzi, Michael C. Smith, Neal Goodloe, K. P. White
2022 Systems and Information Engineering Design Symposium (SIEDS), published 2022-04-28. DOI: https://doi.org/10.1109/sieds55548.2022.9799372
Linking Inmates to Mental Health Services by Matching Records Between Independent Data Sets
Officials in the United States correctional system have long been aware of the significant role that serious mental illness (SMI) plays in recidivism. In a 2011 study, Bronson reported that 68% of prison inmates with diagnosed SMI returned to custody at least once within 4 years, 8% higher than those without SMI [1]. The issue is especially prevalent in regional jails, where 63% of male inmates and 75% of female inmates suffer from symptoms of serious mental illness every year, making immediate assistance to these individuals crucial [2]. In response, a team of University of Virginia (UVA) Systems Engineering students works in collaboration with an array of organizations in the Charlottesville-Albemarle region to identify local jail inmates who need mental health services, provide them with those services, and produce policy recommendations to improve conditions for individuals with SMI who are prone to exposure to the criminal justice system [3]. The current Capstone team consists of undergraduate UVA students who analyze the data provided by these organizations, enabling the community to make informed decisions. However, these decisions are hindered because the data sets from the different agencies responsible for the care and supervision of individuals suffering from SMI are not linked by a unique identifier for each individual, which makes matching individuals between data sets difficult. Recidivism exacerbates the problem: it produces multiple occurrences of similar (or identical) values, complicating typical record matching methods, which often assume one-to-one matches. Moreover, the data include protected personal identifiers (PPI) and HIPAA-protected data, which restricts data sharing among the agencies. Thus, any effort to merge the data must adhere to applicable data security rules and non-disclosure agreements.
To resolve these matching issues, we first condensed the repeated records within each data set into one line per individual and included an internal consistency metric that reflects possible changes (e.g., preferred name or address) that could affect data matching. We then developed a matching algorithm using the Record Linkage package for Python that compares two data sets containing resident information from Region Ten Community Services (R10) and the Jail Management System (JMS) at the Albemarle-Charlottesville Regional Jail (ACRJ) [4]. This process identified over 95 additional matches and another 50 uncertain matches that required human spot-checking, an improvement of 10% over the record matching methods previously applied to the data set. These results could be significant for the Capstone team as well as for other fields of research, especially those involving medical, financial, or other data that change over time.
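
The two steps described above (condensing repeated records, then matching across data sets) can be illustrated with a minimal sketch. It is not the authors' algorithm, and it uses only the Python standard library rather than the Record Linkage package so that it runs on its own. The field names, the sample rows, the definition of the consistency metric (mean share of rows agreeing with the most common value), the blocking on date of birth, and both thresholds are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher


def condense(records, key="person_id", fields=("name", "dob", "address")):
    """Collapse repeated rows into one row per individual.

    Each field keeps its most common value. `consistency` is the mean share
    of rows that agree with that value (1.0 means the rows never disagreed).
    This definition is an assumption; the source does not specify the metric.
    """
    groups = defaultdict(list)
    for row in records:
        groups[row[key]].append(row)

    condensed = []
    for pid, rows in groups.items():
        out, shares = {key: pid}, []
        for field in fields:
            value, count = Counter(r[field] for r in rows).most_common(1)[0]
            out[field] = value
            shares.append(count / len(rows))
        out["consistency"] = sum(shares) / len(shares)
        condensed.append(out)
    return condensed


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link(left, right, upper=0.9, lower=0.75):
    """Compare condensed records and sort pairs into matches and uncertain.

    Pairs are only compared when the dates of birth agree (a simple blocking
    rule, assumed here). Pairs scoring between `lower` and `upper` are set
    aside for human spot-checking; both thresholds are hypothetical.
    """
    matches, uncertain = [], []
    for l in left:
        for r in right:
            if l["dob"] != r["dob"]:
                continue
            score = similarity(l["name"], r["name"])
            pair = (l["person_id"], r["person_id"], round(score, 2))
            if score >= upper:
                matches.append(pair)
            elif score >= lower:
                uncertain.append(pair)
    return matches, uncertain


# Synthetic rows invented for this sketch; no real data is represented.
jail_rows = [
    {"person_id": "J1", "name": "Jonathan Doe", "dob": "1980-01-02", "address": "1 Main St"},
    {"person_id": "J1", "name": "Jonathan Doe", "dob": "1980-01-02", "address": "9 Oak Ave"},
    {"person_id": "J2", "name": "Maria Lopez", "dob": "1975-05-05", "address": "3 Elm St"},
]
clinic_rows = [
    {"person_id": "C7", "name": "Jonathan Doe", "dob": "1980-01-02", "address": "9 Oak Ave"},
    {"person_id": "C8", "name": "Mari Lopes", "dob": "1975-05-05", "address": "3 Elm St"},
]

jail = condense(jail_rows)
clinic = condense(clinic_rows)
matches, uncertain = link(jail, clinic)
```

In this toy data the two J1 bookings collapse into one row whose consistency falls below 1.0 because the address changed, the identical names are classified as a match, and the misspelled name lands in the uncertain list for manual review. A three-way split of this kind keeps borderline pairs in front of a human reviewer rather than forcing an automatic decision.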