Development and validation of case-finding algorithms for identifying patients with systemic lupus erythematosus in an administrative claim database from tertiary care centers in Japan.

IF 1.9 · Region 4 (Medicine) · JCR Q3 Rheumatology
Ken-Ei Sada, Yoshia Miyawaki, Ryo Yanai, Takashi Kida, Akira Ohnishi, Ryusuke Yoshimi, Kunihiro Ichinose, Yasuhiro Shimojima
Modern Rheumatology. Journal Article. Published 2025-10-10. DOI: 10.1093/mr/roaf091 (https://doi.org/10.1093/mr/roaf091)

Abstract

Objective: To develop and validate algorithms for identifying patients with systemic lupus erythematosus (SLE) in Japanese administrative claims databases from tertiary care centers using statistical and machine learning methods.

Methods: This retrospective cross-sectional study included 13,538 patients from six hospitals. One-year claims data were linked to chart-confirmed SLE diagnoses. Patients were randomly assigned to training (n = 8,811) and test (n = 3,775) sets; an external validation set (n = 952) was drawn from another hospital. Feature selection used Least Absolute Shrinkage and Selection Operator (LASSO), Boruta, and Recursive Feature Elimination. Logistic regression, random forest, and decision tree models were trained with synthetic oversampling to address class imbalance. Model performance was evaluated using the Area Under the Receiver Operating Characteristic Curve (AUROC) and other standard performance metrics.
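
The abstract names AUROC as the headline metric but does not say how it was computed. As a minimal sketch (not the authors' code), AUROC can be read as the probability that a randomly chosen chart-confirmed SLE patient receives a higher model score than a randomly chosen non-SLE patient, with ties counted as one half. The function name `auroc` and the toy labels and scores below are illustrative assumptions, not study data.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 1 for a confirmed case, 0 for a non-case.
    scores: model output, where higher means "more likely a case".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # case ranked above non-case
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: two non-cases, two cases.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))  # 3 of 4 case/non-case pairs are ordered correctly
```

The pairwise loop is quadratic and is written for clarity; at the scale of the reported test set (n = 3,775) a rank-based implementation would be the usual choice.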

Results: The random forest model achieved the best performance (AUROC: 0.995; sensitivity: 0.971; specificity: 0.969). A simplified rule based on diagnosis code and anti-double-stranded DNA antibody testing showed high accuracy in both test and validation sets. Adding urine sediment examination modestly improved sensitivity but reduced specificity.
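
The abstract does not state how the simplified rule combines its criteria, so the following is only a sketch under stated assumptions: the base rule requires an SLE diagnosis code AND an anti-dsDNA antibody test, and the extended rule also accepts a urine sediment examination in place of the antibody test. The record fields, rule functions, and toy records are all hypothetical and invented for illustration; they are not study data and do not reproduce the reported results.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass(frozen=True)
class ClaimsRecord:
    has_sle_code: bool           # SLE diagnosis code present in claims
    dsdna_tested: bool           # anti-dsDNA antibody test claimed
    urine_sediment_tested: bool  # urine sediment examination claimed
    chart_confirmed_sle: bool    # reference standard from chart review


def rule_code_and_dsdna(r: ClaimsRecord) -> bool:
    # Assumed combination: both criteria must be met.
    return r.has_sle_code and r.dsdna_tested


def rule_with_urine_sediment(r: ClaimsRecord) -> bool:
    # Assumed extension: urine sediment can stand in for the antibody test.
    return r.has_sle_code and (r.dsdna_tested or r.urine_sediment_tested)


def sensitivity_specificity(
    records: Iterable[ClaimsRecord], rule: Callable[[ClaimsRecord], bool]
) -> Tuple[float, float]:
    tp = fn = tn = fp = 0
    for r in records:
        flagged = rule(r)
        if r.chart_confirmed_sle:
            tp, fn = (tp + 1, fn) if flagged else (tp, fn + 1)
        else:
            fp, tn = (fp + 1, tn) if flagged else (fp, tn + 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one non-case")
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy records: two confirmed cases, four non-cases.
TOY = [
    ClaimsRecord(True, True, False, True),
    ClaimsRecord(True, False, True, True),
    ClaimsRecord(True, False, True, False),
    ClaimsRecord(False, False, False, False),
    ClaimsRecord(True, False, False, False),
    ClaimsRecord(False, True, True, False),
]

print(sensitivity_specificity(TOY, rule_code_and_dsdna))
print(sensitivity_specificity(TOY, rule_with_urine_sediment))
```

On this toy data the broader rule catches one more true case and flags one more non-case, which is the same direction of trade-off the abstract reports for adding urine sediment examination.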

Conclusion: A claims-based algorithm incorporating diagnosis codes and standard laboratory tests accurately identified patients with SLE, facilitating reliable use of administrative data in real-world research.

Source journal: Modern Rheumatology (category: Rheumatology)
CiteScore: 4.90
Self-citation rate: 9.10%
Articles published: 146
Review time: 1.5 months
Journal description: Modern Rheumatology publishes original papers in English on research pertinent to rheumatology and associated areas such as pathology, physiology, clinical immunology, microbiology, biochemistry, experimental animal models, pharmacology, and orthopedic surgery. Occasional reviews of topics which may be of wide interest to the readership will be accepted. In addition, concise papers of special scientific importance that represent definitive and original studies will be considered. Modern Rheumatology is currently indexed in Science Citation Index Expanded (SciSearch), Journal Citation Reports/Science Edition, PubMed/Medline, SCOPUS, EMBASE, Chemical Abstracts Service (CAS), Google Scholar, EBSCO, CSA, Academic OneFile, Current Abstracts, Elsevier Biobase, Gale, Health Reference Center Academic, OCLC, SCImago, Summon by Serial Solutions.