Investigating item response theory model performance in the context of evaluating clinical outcome assessments in clinical trials.
Nicolai D Ayasse, Cheryl D Coon
Quality of Life Research, pp. 1125-1136. Published 2025-04-01 (Epub 2024-12-12). DOI: 10.1007/s11136-024-03873-z
Purpose: Item response theory (IRT) models are an increasingly popular method choice for evaluating clinical outcome assessments (COAs) for use in clinical trials. Given common constraints in clinical trial design, such as limits on sample size and assessment lengths, the current study aimed to examine the appropriateness of commonly used polytomous IRT models, specifically the graded response model (GRM) and partial credit model (PCM), in the context of how they are frequently used for psychometric evaluation of COAs in clinical trials.
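For reference, the two polytomous models under comparison have standard formulations. The graded response model (GRM) gives each item its own slope and defines category probabilities through differences of cumulative curves, while the partial credit model (PCM) constrains all items to a common (unit) slope, which is why slope equality is the key assumption contrasted in this study:

```latex
% Graded response model (Samejima): cumulative probability of
% responding in category k or higher on item i, with item slope a_i
% and ordered thresholds b_{ik}:
P^{*}_{ik}(\theta) = \frac{1}{1 + \exp\!\left[-a_i(\theta - b_{ik})\right]},
\qquad
P(X_i = k \mid \theta) = P^{*}_{ik}(\theta) - P^{*}_{i,k+1}(\theta),

% with the conventions P^{*}_{i0}(\theta) = 1 and P^{*}_{iK}(\theta) = 0
% for K response categories (k = 0, \dots, K-1).

% Partial credit model (Masters): an adjacent-category (Rasch-family)
% model with step parameters \delta_{ij} and no item-specific slope:
P(X_i = k \mid \theta) =
\frac{\exp\!\left[\sum_{j=0}^{k} (\theta - \delta_{ij})\right]}
     {\sum_{m=0}^{K-1} \exp\!\left[\sum_{j=0}^{m} (\theta - \delta_{ij})\right]},
\qquad \sum_{j=0}^{0}(\theta - \delta_{ij}) \equiv 0.
```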
Methods: Data were simulated under varying sample sizes, measure lengths, response category numbers, and slope strengths, as well as under conditions that violated some model assumptions, namely, unidimensionality and equality of item slopes. Model fit, detection of item local dependence, and detection of item misfit were all examined to identify conditions where one model may be preferable or results may contain a degree of bias.
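To make the data-generation step concrete, the following is a minimal sketch (not the authors' actual simulation code) of drawing polytomous responses under a GRM with NumPy; the sample size, number of items, slope range, and category count are illustrative stand-ins for the conditions varied in the study:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_grm(theta, a, b):
    """Simulate graded responses under the GRM.

    theta : (N,) latent trait values
    a     : (J,) item slopes
    b     : (J, K-1) ordered category thresholds per item
    Returns an (N, J) array of integer responses in 0..K-1.
    """
    N, J = theta.size, a.size
    # Cumulative probabilities P(X_i >= k | theta) for k = 1..K-1
    z = a[None, :, None] * (theta[:, None, None] - b[None, :, :])
    p_star = 1.0 / (1.0 + np.exp(-z))                 # (N, J, K-1)
    # Category probabilities: adjacent differences of cumulative curves,
    # padding with P(X >= 0) = 1 and P(X >= K) = 0
    upper = np.concatenate([np.ones((N, J, 1)), p_star], axis=2)
    lower = np.concatenate([p_star, np.zeros((N, J, 1))], axis=2)
    probs = upper - lower                              # (N, J, K), rows sum to 1
    # Inverse-CDF draw of one category per person-item pair
    cum = probs.cumsum(axis=2)
    u = rng.random((N, J, 1))
    return (u > cum).sum(axis=2)

theta = rng.normal(size=500)                    # 500 simulated respondents
a = rng.uniform(1.0, 2.5, size=10)              # 10 items with varying slopes
b = np.sort(rng.normal(size=(10, 4)), axis=1)   # 5 response categories per item
X = simulate_grm(theta, a, b)
```

Varying the equality of the entries in `a` (e.g., setting them all to a common value) yields data consistent with the PCM's equal-slope assumption, which is how the two models' assumptions can be crossed with sample size and measure length in a design like the one described.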
Results: For unidimensional item sets with equal item slopes, the PCM and GRM performed similarly, and GRM performance remained consistent as slope variability increased. For multidimensional item sets, the PCM was somewhat more sensitive to the violation of unidimensionality. Looking across conditions, the PCM did not demonstrate a clear advantage over the GRM for small sample sizes or shorter measure lengths.
Conclusion: Overall, the GRM and the PCM each demonstrated advantages and disadvantages depending on underlying data conditions and the model outcome investigated. We recommend careful consideration of the known, or expected, data characteristics when choosing a model and interpreting its results.
Journal description:
Quality of Life Research is an international, multidisciplinary journal devoted to the rapid communication of original research, theoretical articles and methodological reports related to the field of quality of life, in all the health sciences. The journal also offers editorials, literature, book and software reviews, correspondence and abstracts of conferences.
Quality of life has become a prominent issue in biometry, philosophy, social science, clinical medicine, health services and outcomes research. The journal's scope reflects the wide application of quality of life assessment and research in the biological and social sciences. All original work is subject to peer review for originality, scientific quality and relevance to a broad readership.
This is an official journal of the International Society of Quality of Life Research.