A constrained maximum likelihood approach to developing well-calibrated models for predicting binary outcomes.

IF 1.2, Zone 3 (Mathematics), Q3, MATHEMATICS, INTERDISCIPLINARY APPLICATIONS
Lifetime Data Analysis, Pub Date: 2024-07-01, Epub Date: 2024-05-08, DOI: 10.1007/s10985-024-09628-9
Yaqi Cao, Weidong Ma, Ge Zhao, Anne Marie McCarthy, Jinbo Chen
Citations: 0

Abstract



The added value of candidate predictors for risk modeling is routinely evaluated by comparing the performance of models with and without the candidate predictors. Such a comparison is most meaningful when the risks estimated by the two models are both unbiased in the target population. Very often, data for candidate predictors are sourced from nonrepresentative convenience samples. Updating the base model using the study data without acknowledging the discrepancy between the underlying distribution of the study data and that in the target population can lead to biased risk estimates and therefore an unfair evaluation of candidate predictors. To address this issue, assuming access to a well-calibrated base model, we propose a semiparametric method for model fitting that enforces good calibration. The central idea is to calibrate the fitted model against the base model by enforcing suitable constraints when maximizing the likelihood function. This approach enables unbiased assessment of the model improvement offered by candidate predictors without requiring a representative sample from the target population, thus overcoming a significant practical challenge. We study theoretical properties of the model parameter estimates and demonstrate improvement in model calibration via extensive simulation studies. Finally, we apply the proposed method to data extracted from the Penn Medicine Biobank to inform the added value of breast density for breast cancer risk assessment in the population of Caucasian women.
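
The abstract does not state the constraints themselves, so the following is only an illustrative sketch of what "calibrating the fitted model against the base model" could look like, not the paper's formulation. All notation is assumed for illustration: $Y$ is the binary outcome, $X$ the base-model predictors, $Z$ the candidate predictors, $p_0(X)$ the well-calibrated base-model risk, $p_\theta(X, Z)$ a logistic expanded model, and $I_1, \dots, I_K$ hypothetical intervals partitioning the range of the base-model risk. The sketch is motivated by the tower property $\mathrm{E}[\Pr(Y = 1 \mid X, Z) \mid X] = \Pr(Y = 1 \mid X)$, which implies that a correctly specified expanded model must average back to the base-model risk within any stratum defined by $X$.

```latex
% Illustrative sketch only; symbols and constraint form are assumptions,
% not taken from the paper.
\begin{align*}
  p_\theta(x, z)
    &= \frac{\exp(\theta_0 + \theta_X^\top x + \theta_Z^\top z)}
            {1 + \exp(\theta_0 + \theta_X^\top x + \theta_Z^\top z)}, \\[4pt]
  \hat{\theta}
    &= \arg\max_{\theta} \sum_{i=1}^{n}
       \Bigl[\, y_i \log p_\theta(x_i, z_i)
              + (1 - y_i) \log\bigl\{1 - p_\theta(x_i, z_i)\bigr\} \Bigr] \\[4pt]
  \text{subject to}\quad
    & \mathrm{E}_{\text{target}}\bigl[\, p_\theta(X, Z) - p_0(X)
        \,\bigm|\, p_0(X) \in I_k \bigr] = 0,
      \qquad k = 1, \dots, K.
\end{align*}
```

In this sketch the likelihood is evaluated on the $n$ study subjects, while each constraint is an expectation over the target population. How that expectation is handled without a representative sample is the semiparametric part of the proposed method and is not reproduced here.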

Source journal
Lifetime Data Analysis (Mathematics: Mathematics, Interdisciplinary Applications)
CiteScore: 2.30
Self-citation rate: 7.70%
Articles published: 43
Review time: 3 months
About the journal: The objective of Lifetime Data Analysis is to advance and promote statistical science in the various applied fields that deal with lifetime data, including Actuarial Science, Economics, Engineering Sciences, Environmental Sciences, Management Science, Medicine, Operations Research, Public Health, and Social and Behavioral Sciences.