Milena Wünsch, Moritz Herrmann, Elisa Noltenius, Mattia Mohr, Tim P Morris, Anne-Laure Boulesteix
{"title":"对比较研究中方法失败处理的再思考。","authors":"Milena Wünsch, Moritz Herrmann, Elisa Noltenius, Mattia Mohr, Tim P Morris, Anne-Laure Boulesteix","doi":"10.1002/sim.70257","DOIUrl":null,"url":null,"abstract":"<p><p>Comparison studies in methodological research are intended to compare methods in an evidence-based manner to help data analysts select a suitable method for their application. To provide trustworthy evidence, they must be carefully designed, implemented, and reported, especially given the many decisions made in planning and running. A common challenge in comparison studies is to handle the \"failure\" of one or more methods to produce a result for some (real or simulated) data sets, such that their performances cannot be measured in those instances. Despite an increasing emphasis on this topic in recent literature (focusing on non-convergence as a common manifestation), there is little guidance on proper handling and interpretation, and reporting of the chosen approach is often neglected. This paper aims to fill this gap and offers practical guidance on handling method failure in comparison studies. After exploring common handlings across various published comparison studies from classical statistics and predictive modeling, we show that the popular approaches of discarding data sets yielding failure (either for all or the failing methods only) and imputing are inappropriate in most cases. We then recommend a different perspective on method failure-viewing it as the result of a complex interplay of several factors rather than just its manifestation. Building on this, we provide recommendations on more adequate handling of method failure derived from realistic considerations. In particular, we propose considering fallback strategies that directly reflect the behavior of real-world users. Finally, we illustrate our recommendations and the dangers of inadequate handling of method failure through two exemplary comparison studies.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"44 23-24","pages":"e70257"},"PeriodicalIF":1.8000,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12509789/pdf/","citationCount":"0","resultStr":"{\"title\":\"Rethinking the Handling of Method Failure in Comparison Studies.\",\"authors\":\"Milena Wünsch, Moritz Herrmann, Elisa Noltenius, Mattia Mohr, Tim P Morris, Anne-Laure Boulesteix\",\"doi\":\"10.1002/sim.70257\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Comparison studies in methodological research are intended to compare methods in an evidence-based manner to help data analysts select a suitable method for their application. To provide trustworthy evidence, they must be carefully designed, implemented, and reported, especially given the many decisions made in planning and running. A common challenge in comparison studies is to handle the \\\"failure\\\" of one or more methods to produce a result for some (real or simulated) data sets, such that their performances cannot be measured in those instances. Despite an increasing emphasis on this topic in recent literature (focusing on non-convergence as a common manifestation), there is little guidance on proper handling and interpretation, and reporting of the chosen approach is often neglected. This paper aims to fill this gap and offers practical guidance on handling method failure in comparison studies. 
After exploring common handlings across various published comparison studies from classical statistics and predictive modeling, we show that the popular approaches of discarding data sets yielding failure (either for all or the failing methods only) and imputing are inappropriate in most cases. We then recommend a different perspective on method failure-viewing it as the result of a complex interplay of several factors rather than just its manifestation. Building on this, we provide recommendations on more adequate handling of method failure derived from realistic considerations. In particular, we propose considering fallback strategies that directly reflect the behavior of real-world users. Finally, we illustrate our recommendations and the dangers of inadequate handling of method failure through two exemplary comparison studies.</p>\",\"PeriodicalId\":21879,\"journal\":{\"name\":\"Statistics in Medicine\",\"volume\":\"44 23-24\",\"pages\":\"e70257\"},\"PeriodicalIF\":1.8000,\"publicationDate\":\"2025-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12509789/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Statistics in Medicine\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1002/sim.70257\",\"RegionNum\":4,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"MATHEMATICAL & COMPUTATIONAL BIOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Statistics in Medicine","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1002/sim.70257","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"MATHEMATICAL & COMPUTATIONAL BIOLOGY","Score":null,"Total":0}
Rethinking the Handling of Method Failure in Comparison Studies.
Comparison studies in methodological research are intended to compare methods in an evidence-based manner to help data analysts select a suitable method for their application. To provide trustworthy evidence, they must be carefully designed, implemented, and reported, especially given the many decisions made in planning and running them. A common challenge in comparison studies is handling the "failure" of one or more methods to produce a result for some (real or simulated) data sets, such that their performance cannot be measured in those instances. Despite an increasing emphasis on this topic in recent literature (which focuses on non-convergence as a common manifestation), there is little guidance on proper handling and interpretation, and reporting of the chosen approach is often neglected. This paper aims to fill this gap by offering practical guidance on handling method failure in comparison studies. After reviewing common handling approaches across various published comparison studies from classical statistics and predictive modeling, we show that the popular approaches of discarding data sets that yield failures (either for all methods or for the failing methods only) and imputation are inappropriate in most cases. We then recommend a different perspective on method failure: viewing it as the result of a complex interplay of several factors rather than focusing only on its manifestation. Building on this, we provide recommendations, derived from realistic considerations, on more adequate handling of method failure. In particular, we propose considering fallback strategies that directly reflect the behavior of real-world users. Finally, we illustrate our recommendations and the dangers of inadequate handling of method failure through two exemplary comparison studies.
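To make the fallback idea concrete, the following minimal Python sketch (not taken from the paper; the estimators, the failure condition, and the simulation settings are hypothetical illustrations) shows a simulation loop that records a result for every data set by switching to a simple fallback whenever the primary method fails, rather than discarding those data sets or imputing a performance value.

```python
import numpy as np

rng = np.random.default_rng(seed=2025)

def primary_method(data):
    """Hypothetical 'primary' estimator that fails (e.g., does not converge)
    when the sample is very small or degenerate."""
    if len(data) < 5 or np.std(data) == 0:
        raise RuntimeError("non-convergence")
    return np.mean(data)  # stand-in for a more complex model fit

def fallback_method(data):
    """Simple, robust estimator a real-world user might switch to after a failure."""
    return np.median(data)

def run_comparison(n_datasets=1000, true_mean=1.0):
    """Evaluate the full pipeline (primary method plus fallback) over simulated data sets."""
    results = []
    for _ in range(n_datasets):
        # Vary the sample size so that the primary method sometimes fails.
        n = rng.integers(3, 30)
        data = rng.normal(loc=true_mean, scale=1.0, size=n)
        try:
            estimate = primary_method(data)
            strategy = "primary"
        except RuntimeError:
            # Fallback strategy: mimic what an applied user would do instead of
            # silently dropping the data set or imputing a performance value.
            estimate = fallback_method(data)
            strategy = "fallback"
        results.append((strategy, abs(estimate - true_mean)))  # absolute error
    return results

results = run_comparison()
n_fallback = sum(1 for strategy, _ in results if strategy == "fallback")
mean_error = np.mean([error for _, error in results])
print(f"fallback used in {n_fallback} of {len(results)} data sets; "
      f"mean absolute error = {mean_error:.3f}")
```

In this sketch, performance is aggregated over all data sets, including those where the fallback was used, so the reported error reflects the pipeline a user would actually run; the frequency of fallback use is reported alongside it rather than hidden by discarding failures.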
Journal introduction:
The journal aims to influence practice in medicine and its associated sciences through the publication of papers on statistical and other quantitative methods. Papers will explain new methods and demonstrate their application, preferably through a substantive, real, motivating example or a comprehensive evaluation based on an illustrative example. Alternatively, papers will report on case studies where creative use or technical generalization of established methodology is directed towards a substantive application. Reviews of, and tutorials on, general topics relevant to the application of statistics to medicine will also be published. The main criteria for publication are appropriateness of the statistical methods to a particular medical problem and clarity of exposition. Papers with primarily mathematical content will be excluded. The journal aims to enhance communication between statisticians, clinicians, and medical researchers.