Comparing broad and narrow phenotype algorithms: differences in performance characteristics and immortal time incurred

Journal of Pharmacy & Pharmaceutical Sciences. Pub Date: 2024-01-03. DOI: 10.3389/jpps.2023.12095

Joel N. Swerdel, Mitchell M. Conover

Abstract

Introduction: When developing phenotype algorithms for observational research, there is usually a trade-off between sensitivity and specificity. The objective of this study was to estimate the performance characteristics of phenotype algorithms designed to increase specificity and to estimate the immortal time associated with each algorithm.

Materials and methods: We examined algorithms for 11 chronic health conditions using data from five databases. For each health condition, we created five algorithms to examine differences in performance (sensitivity and positive predictive value (PPV)): one broad algorithm using a single code for the health condition and four narrow algorithms in which a second diagnosis code was required 1–30 days, 1–90 days, 1–365 days, or 1 to all days in a subject’s continuous observation period after the first code. We also examined the proportion of immortal time relative to time-at-risk (TAR) for four outcomes. The TARs were: 0–30 days after the first condition occurrence (the index date), 0–90 days post-index, 0–365 days post-index, and 0–1,095 days post-index. Performance of algorithms for chronic health conditions was estimated using PheValuator (V2.1.4) from the OHDSI toolstack. Immortal time was calculated as the time from the index date until the first of the following: 1) the outcome; 2) the end of the outcome TAR; 3) the occurrence of the second code for the chronic health condition.

Results: In the first analysis, the narrow phenotype algorithms, i.e., those requiring a second condition code, produced higher estimates for PPV and lower estimates for sensitivity than the single-code algorithm. For all conditions, lengthening the window in which the second code was required increased the sensitivity of the algorithm. In the second analysis, the amount of immortal time increased as the window used to identify the second diagnosis code increased. The proportion of TAR that was immortal was highest in the 30-day TAR analyses, compared with the 1,095-day TAR analyses.

Conclusion: Requiring a second code is a potentially valid approach to increasing the specificity of a health condition algorithm, albeit at the cost of incurring immortal time.
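
The immortal-time rule stated in the methods (time from the index date until the first of the outcome, the end of the TAR, or the second code) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the representation of events as days since index (day 0), and the choice of denominator (at-risk time ending at the outcome or the end of the TAR, whichever comes first) are assumptions made here for illustration; the abstract does not specify how the denominator was computed.

```python
from typing import Iterable, Optional, Tuple

# Each subject is (outcome_day, second_code_day), both measured in days
# since the index date (day 0). None means the event was not observed.
# This data layout is a hypothetical choice, not taken from the paper.
Subject = Tuple[Optional[int], Optional[int]]


def immortal_days(tar_end: int,
                  outcome_day: Optional[int],
                  second_code_day: Optional[int]) -> int:
    """Days from index until the first of: outcome, end of TAR, second code."""
    stops = [tar_end]
    stops += [d for d in (outcome_day, second_code_day) if d is not None]
    return min(stops)


def at_risk_days(tar_end: int, outcome_day: Optional[int]) -> int:
    """ASSUMPTION: at-risk time ends at the outcome or the end of the TAR."""
    return tar_end if outcome_day is None else min(tar_end, outcome_day)


def immortal_proportion(subjects: Iterable[Subject], tar_end: int) -> float:
    """Total immortal days divided by total at-risk days for one TAR."""
    subjects = list(subjects)
    immortal = sum(immortal_days(tar_end, o, s) for o, s in subjects)
    at_risk = sum(at_risk_days(tar_end, o) for o, _ in subjects)
    return immortal / at_risk if at_risk else 0.0


# Made-up subjects, purely illustrative (not study data).
example_subjects = [(None, 10), (5, 20), (None, 200)]

for tar in (30, 90, 365, 1095):
    print(tar, round(immortal_proportion(example_subjects, tar), 3))
```

With these made-up subjects, the proportion is 45/65 for the 30-day TAR and 215/2195 for the 1,095-day TAR, which mirrors the direction of the reported result: a fixed wait for the second code consumes a larger share of a short TAR than of a long one.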
