Imputation and characterization of uncoded self-harm in major mental illness using machine learning

Praveen Kumar, A. Nestsiarovich, S. J. Nelson, B. Kerner, D. Perkins, Christophe Gerard Lambert
Journal: Journal of the American Medical Informatics Association (JAMIA)
Published: 2019-10-24
DOI: 10.1093/jamia/ocz173 (https://doi.org/10.1093/jamia/ocz173)
Citations: 14

Abstract

Objective: We aimed to impute uncoded self-harm in administrative claims data of individuals with major mental illness (MMI), characterize self-harm incidence, and identify factors associated with coding bias.

Materials and Methods: The IBM MarketScan database (2003-2016) was used to analyze visit-level self-harm in 10 120 030 patients with ≥2 MMI codes. Five machine learning (ML) classifiers were tested on a balanced data subset, with XGBoost selected for the full dataset. Classification performance was validated via random data mislabeling and comparison with a clinician-derived "gold standard." The incidence of coded and imputed self-harm was characterized by year, patient age, sex, U.S. state, and MMI diagnosis.

Results: Imputation identified 1 592 703 self-harm events vs 83 113 coded events, with areas under the curve >0.99 for the balanced and full datasets, and 83.5% agreement with the gold standard. The overall coded and imputed self-harm incidence were 0.28% and 5.34%, respectively, varied considerably by age and sex, and was highest in individuals with multiple MMI diagnoses. Self-harm undercoding was higher in male than in female individuals and increased with age. Substance abuse, injuries, poisoning, asphyxiation, brain disorders, harmful thoughts, and psychotherapy were the main features used by ML to classify visits.

Discussion: Only 1 of 19 self-harm events was coded for individuals with MMI. ML demonstrated excellent performance in recovering self-harm visits. Male individuals and seniors with MMI are particularly vulnerable to self-harm undercoding and may be at risk of not getting appropriate psychiatric care.

Conclusions: ML can effectively recover unrecorded self-harm in claims data and inform psychiatric epidemiological and observational studies.
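
The "1 of 19" figure in the Discussion can be checked against the counts and incidence rates reported in the Results. The short sketch below is only an arithmetic check, not the authors' analysis code: the variable and function names are illustrative, and it assumes that the imputed total is the appropriate denominator for the coded events, which the abstract does not state explicitly.

```python
# Counts and rates as reported in the abstract.
CODED_EVENTS = 83_113
IMPUTED_EVENTS = 1_592_703
CODED_INCIDENCE_PCT = 0.28
IMPUTED_INCIDENCE_PCT = 5.34


def imputed_per_coded(coded: float, imputed: float) -> float:
    """Return how many imputed units correspond to one coded unit."""
    if coded <= 0:
        raise ValueError("coded must be positive")
    return imputed / coded


# Ratio based on event counts.
event_ratio = imputed_per_coded(CODED_EVENTS, IMPUTED_EVENTS)

# Ratio based on the reported incidence percentages.
incidence_ratio = imputed_per_coded(CODED_INCIDENCE_PCT, IMPUTED_INCIDENCE_PCT)

# Share of imputed events that carried a self-harm code (assumed denominator).
coded_share_pct = 100 * CODED_EVENTS / IMPUTED_EVENTS

print(f"imputed events per coded event: {event_ratio:.1f}")
print(f"imputed incidence per coded incidence: {incidence_ratio:.1f}")
print(f"coded share of imputed events: {coded_share_pct:.1f}%")
```

Both ratios round to 19, which is consistent with the statement that roughly 1 in 19 self-harm events was coded.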