Quantifying and improving rheumatoid arthritis algorithm performance in biobank settings

IF 4.6, Zone 2 (Medicine), JCR Q1 (Rheumatology)
Vanessa L. Kronzer, Katrina A. Williamson, Andrew C. Hanson, Jennifer A. Sletten, Jeffrey A. Sparks, John M. Davis III, Cynthia S. Crowson
Journal: Seminars in Arthritis and Rheumatism, volume 72, Article 152668
Published: 2025-02-22
Publication type: Journal Article
DOI: 10.1016/j.semarthrit.2025.152668
Source: https://www.sciencedirect.com/science/article/pii/S0049017225000393
Citations: 0

Abstract

Quantifying and improving rheumatoid arthritis algorithm performance in biobank settings

Objective

To quantify and improve the performance of standard rheumatoid arthritis (RA) algorithms in a biobank setting.

Methods

This retrospective cohort study within the Mayo Clinic (MC) Biobank and MC Tapestry Study identified candidate RA cases by the presence of either at least two RA codes or positive anti-cyclic citrullinated peptide (anti-CCP) antibodies plus a disease-modifying anti-rheumatic drug (DMARD) prescription, as of July 18, 2022. Rheumatology physicians manually verified all RA cases using RA criteria and/or a rheumatology physician diagnosis plus DMARD use. All other biobank participants served as non-RA controls. We defined seropositivity as rheumatoid factor and/or anti-CCP positivity. We assessed rules-based and Electronic Medical Records and Genomics (eMERGE) RA algorithms using positive predictive value (PPV). Finally, we developed a novel RA algorithm using a LASSO-based machine learning approach with five-fold cross-validation.
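The screening rule and the PPV metric described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the `Participant` fields, the function names, and the toy cohort are all hypothetical, and reusing the three-code threshold inside the combined rules is an assumption, since the abstract does not state which code count those rules use.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    # All field names are hypothetical; they mirror variables named in the abstract.
    ra_code_count: int        # number of RA diagnosis codes in the record
    anti_ccp_positive: bool   # anti-CCP antibody result
    rf_positive: bool         # rheumatoid factor result
    on_dmard: bool            # any DMARD prescription
    confirmed_ra: bool        # manual physician verification (reference standard)


def is_seropositive(p):
    # Seropositivity: rheumatoid factor and/or anti-CCP positivity.
    return p.rf_positive or p.anti_ccp_positive


def screen_candidate(p):
    # Candidate definition: at least two RA codes, OR anti-CCP positive plus DMARD.
    return p.ra_code_count >= 2 or (p.anti_ccp_positive and p.on_dmard)


def three_codes(p):
    return p.ra_code_count >= 3


def codes_plus_dmard(p):
    # Assumption: the combined rule keeps the three-code threshold.
    return three_codes(p) and p.on_dmard


def codes_dmard_seropositive(p):
    return codes_plus_dmard(p) and is_seropositive(p)


def ppv(participants, rule):
    # PPV = confirmed cases among flagged participants / all flagged participants.
    flagged = [p for p in participants if rule(p)]
    if not flagged:
        return None  # PPV is undefined when the rule flags nobody
    return sum(p.confirmed_ra for p in flagged) / len(flagged)


# Synthetic toy cohort for illustration only; not study data.
cohort = [
    Participant(3, True, False, True, True),
    Participant(4, False, True, True, True),
    Participant(3, False, False, True, False),
    Participant(5, False, False, False, False),
    Participant(0, True, False, True, True),
    Participant(1, False, False, False, False),
]

for rule in (three_codes, codes_plus_dmard, codes_dmard_seropositive):
    print(rule.__name__, ppv(cohort, rule))
```

Each added requirement shrinks the flagged group, so PPV can rise while fewer true cases are captured. Stricter rules therefore trade sensitivity for PPV.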

Results

We identified 1,316 confirmed RA cases (968 MC Biobank, 348 Tapestry, 70% seropositive) and 82,123 non-RA controls (mean age 65 years, 61% female). The PPV of 3 RA codes was 43%, codes plus DMARD was 54%, and codes plus DMARD plus seropositivity was 85%. The PPV of eMERGE was 77%. Self-reported RA (PPV 10%), which was available in the MC Biobank, only minimally improved algorithm performance (PPV from 83% to 85%), whereas family history of RA (PPV 3%) worsened performance. At 90% PPV, the novel RA algorithm incorporating key variables such as anti-CCP and DMARD use increased sensitivity by 4–11% compared to eMERGE.
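Comparing algorithms "at 90% PPV" means fixing a PPV target and asking how much sensitivity each algorithm retains at that operating point. The sketch below shows one way to compute this from model scores. The function name, the threshold search, and the toy scores are assumptions for illustration, not the procedure reported in the paper.

```python
def sensitivity_at_ppv(scores, labels, target_ppv=0.90):
    """Return (threshold, sensitivity, ppv) for the score cutoff that gives the
    highest sensitivity while keeping PPV >= target_ppv, or None if no cutoff
    reaches the target. labels are 1 for confirmed RA and 0 for non-RA."""
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one confirmed case is required")

    best = None
    # Try every observed score as a cutoff; participants with score >= cutoff are flagged.
    for threshold in sorted(set(scores), reverse=True):
        flagged = [y for s, y in zip(scores, labels) if s >= threshold]
        true_positives = sum(flagged)
        ppv = true_positives / len(flagged)
        sensitivity = true_positives / total_positives
        if ppv >= target_ppv and (best is None or sensitivity > best[1]):
            best = (threshold, sensitivity, ppv)
    return best


# Synthetic scores and labels for illustration only; not study data.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
toy_labels = [1, 1, 0, 1, 0, 1]

print(sensitivity_at_ppv(toy_scores, toy_labels, target_ppv=0.90))
print(sensitivity_at_ppv(toy_scores, toy_labels, target_ppv=0.75))
```

Running the same function on two algorithms' scores with the same `target_ppv` gives sensitivities that can be compared directly. Lowering the target admits looser cutoffs and therefore higher sensitivity.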

Conclusion

Rules-based and eMERGE RA algorithms performed worse in biobank settings than in administrative settings. Our novel RA algorithm outperformed these standard algorithms.
Source journal
CiteScore: 9.20
Self-citation rate: 4.00%
Articles published: 176
Review time: 46 days
About the journal: Seminars in Arthritis and Rheumatism provides access to the highest-quality clinical, therapeutic and translational research about arthritis, rheumatology and musculoskeletal disorders that affect the joints and connective tissue. Each bimonthly issue includes articles giving you the latest diagnostic criteria, consensus statements, systematic reviews and meta-analyses as well as clinical and translational research studies. Read this journal for the latest groundbreaking research and to gain insights from scientists and clinicians on the management and treatment of musculoskeletal and autoimmune rheumatologic diseases. The journal is of interest to rheumatologists, orthopedic surgeons, internal medicine physicians, immunologists and specialists in bone and mineral metabolism.