{"title":"不适合所有人的量身定制的拟合:诊断研究中的阈值过拟合问题。","authors":"Javier Arredondo Montero","doi":"10.1515/dx-2025-0096","DOIUrl":null,"url":null,"abstract":"<p><strong>Objectives: </strong>To critically examine the phenomenon of threshold overfitting in diagnostic accuracy research and evaluate its methodological implications through a structured review of relevant literature.</p><p><strong>Methods: </strong>This article presents a narrative and critical review of methodological studies and reporting guidelines related to threshold selection in diagnostic test accuracy. It focuses on the misuse of <i>post hoc</i> thresholds, the misapplication of bias assessment tools such as QUADAS-2, and the frequent absence of independent validation. In addition to identifying these structural flaws, the article proposes a set of five concrete safeguards - ranging from transparent reporting to rigorous risk of bias classification - designed to mitigate threshold-related bias in future diagnostic studies.</p><p><strong>Results: </strong>Thresholds are frequently derived and evaluated within the same dataset, inflating sensitivity and specificity estimates. This overfitting is seldom acknowledged and is often misclassified as low risk of bias. QUADAS-2 is frequently misapplied, with reviewers mistaking the mere presence of a threshold for proper pre-specification. The article identifies five key safeguards to mitigate this bias: (1) clear declaration of pre-specification, (2) justification of threshold choice, (3) independent validation, (4) full performance reporting across thresholds, and (5) rigorous application of bias assessment tools.</p><p><strong>Conclusions: </strong>Threshold overfitting remains an underrecognized but methodologically critical source of bias in diagnostic accuracy studies. Addressing it requires more than awareness - it demands transparent reporting, proper validation, and stricter adherence to methodological standards.</p>","PeriodicalId":11273,"journal":{"name":"Diagnosis","volume":" ","pages":""},"PeriodicalIF":2.0000,"publicationDate":"2025-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A tailored fit that doesn't fit all: the problem of threshold overfitting in diagnostic studies.\",\"authors\":\"Javier Arredondo Montero\",\"doi\":\"10.1515/dx-2025-0096\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Objectives: </strong>To critically examine the phenomenon of threshold overfitting in diagnostic accuracy research and evaluate its methodological implications through a structured review of relevant literature.</p><p><strong>Methods: </strong>This article presents a narrative and critical review of methodological studies and reporting guidelines related to threshold selection in diagnostic test accuracy. It focuses on the misuse of <i>post hoc</i> thresholds, the misapplication of bias assessment tools such as QUADAS-2, and the frequent absence of independent validation. In addition to identifying these structural flaws, the article proposes a set of five concrete safeguards - ranging from transparent reporting to rigorous risk of bias classification - designed to mitigate threshold-related bias in future diagnostic studies.</p><p><strong>Results: </strong>Thresholds are frequently derived and evaluated within the same dataset, inflating sensitivity and specificity estimates. This overfitting is seldom acknowledged and is often misclassified as low risk of bias. 
QUADAS-2 is frequently misapplied, with reviewers mistaking the mere presence of a threshold for proper pre-specification. The article identifies five key safeguards to mitigate this bias: (1) clear declaration of pre-specification, (2) justification of threshold choice, (3) independent validation, (4) full performance reporting across thresholds, and (5) rigorous application of bias assessment tools.</p><p><strong>Conclusions: </strong>Threshold overfitting remains an underrecognized but methodologically critical source of bias in diagnostic accuracy studies. Addressing it requires more than awareness - it demands transparent reporting, proper validation, and stricter adherence to methodological standards.</p>\",\"PeriodicalId\":11273,\"journal\":{\"name\":\"Diagnosis\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":2.0000,\"publicationDate\":\"2025-09-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Diagnosis\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1515/dx-2025-0096\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"MEDICINE, GENERAL & INTERNAL\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Diagnosis","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1515/dx-2025-0096","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"MEDICINE, GENERAL & INTERNAL","Score":null,"Total":0}
A tailored fit that doesn't fit all: the problem of threshold overfitting in diagnostic studies.
Objectives: To critically examine the phenomenon of threshold overfitting in diagnostic accuracy research and evaluate its methodological implications through a structured review of relevant literature.
Methods: This article presents a narrative and critical review of methodological studies and reporting guidelines related to threshold selection in diagnostic test accuracy. It focuses on the misuse of post hoc thresholds, the misapplication of bias assessment tools such as QUADAS-2, and the frequent absence of independent validation. In addition to identifying these structural flaws, the article proposes a set of five concrete safeguards - ranging from transparent reporting to rigorous risk of bias classification - designed to mitigate threshold-related bias in future diagnostic studies.
Results: Thresholds are frequently derived and evaluated within the same dataset, inflating sensitivity and specificity estimates. This overfitting is seldom acknowledged and is often misclassified as low risk of bias. QUADAS-2 is frequently misapplied, with reviewers mistaking the mere presence of a threshold for proper pre-specification. The article identifies five key safeguards to mitigate this bias: (1) clear declaration of pre-specification, (2) justification of threshold choice, (3) independent validation, (4) full performance reporting across thresholds, and (5) rigorous application of bias assessment tools.
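To make the mechanism concrete, below is a minimal, purely illustrative Python sketch (not from the article) using synthetic data and hypothetical variable names: a cutoff optimised post hoc on the same sample via Youden's J will typically show more favourable apparent sensitivity and specificity than that same cutoff applied to an independent validation sample. The distributions, sample sizes, and the use of Youden's J are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: a continuous biomarker in diseased (1) and healthy (0) subjects.
n = 100
disease = np.concatenate([np.ones(n), np.zeros(n)])
marker = np.where(disease == 1,
                  rng.normal(1.0, 1.0, 2 * n),   # diseased: shifted distribution
                  rng.normal(0.0, 1.0, 2 * n))   # healthy: reference distribution

def sens_spec(values, labels, threshold):
    """Sensitivity and specificity when 'value >= threshold' is called test-positive."""
    pred = values >= threshold
    sensitivity = np.mean(pred[labels == 1])
    specificity = np.mean(~pred[labels == 0])
    return sensitivity, specificity

# Post hoc threshold: chosen to maximise Youden's J (sens + spec - 1) on the SAME data,
# i.e. the practice the article describes as threshold overfitting.
candidates = np.unique(marker)
youden = [sum(sens_spec(marker, disease, t)) - 1 for t in candidates]
best_t = candidates[int(np.argmax(youden))]
print("in-sample (overfitted):", sens_spec(marker, disease, best_t))

# Independent validation: the locked threshold applied to a new sample from the same process.
disease_val = np.concatenate([np.ones(n), np.zeros(n)])
marker_val = np.where(disease_val == 1,
                      rng.normal(1.0, 1.0, 2 * n),
                      rng.normal(0.0, 1.0, 2 * n))
print("independent validation:", sens_spec(marker_val, disease_val, best_t))
```

Re-running the sketch with different seeds shows that, on average, the in-sample estimates are optimistic relative to the validation estimates, which is the inflation the Results section refers to.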
Conclusions: Threshold overfitting remains an underrecognized but methodologically critical source of bias in diagnostic accuracy studies. Addressing it requires more than awareness - it demands transparent reporting, proper validation, and stricter adherence to methodological standards.
Journal introduction:
Diagnosis focuses on how diagnosis can be advanced, how it is taught, and how and why it can fail, leading to diagnostic errors. The journal welcomes both fundamental and applied works, improvement initiatives, opinions, and debates to encourage new thinking on improving this critical aspect of healthcare quality.

Topics:
- Factors that promote diagnostic quality and safety
- Clinical reasoning
- Diagnostic errors in medicine
- The factors that contribute to diagnostic error: human factors, cognitive issues, and system-related breakdowns
- Improving the value of diagnosis – eliminating waste and unnecessary testing
- How culture and removing blame promote awareness of diagnostic errors
- Training and education related to clinical reasoning and diagnostic skills
- Advances in laboratory testing and imaging that improve diagnostic capability
- Local, national and international initiatives to reduce diagnostic error