SUPERVISED LEARNING OF OUTCOME-RELEVANT ITEMS FROM A QUESTIONNAIRE VIA MIXED INTEGER OPTIMIZATION.

IF 1.4 · Zone 4 (Mathematics) · Q2 STATISTICS & PROBABILITY
Annals of Applied Statistics Pub Date : 2025-12-01 Epub Date: 2025-12-05 DOI:10.1214/25-AOAS2093
Leyao Zhang, Wen Wang, Mengtong Hu, Alan P Baptist, Peng Wang, Peter X K Song
Annals of Applied Statistics, Volume 19, Issue 4, pages 3157-3178. Publication type: Journal Article. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12869357/pdf/ · https://doi.org/10.1214/25-AOAS2093
Citation count: 0

Abstract

Questionnaires are among the oldest and most widely used instruments for measuring variables related to traits of interest that cannot easily be measured by physical devices, for example, depression. In many clinical settings, an existing questionnaire is ill-suited to a new study population whose underlying characteristics differ from those of the original population used to develop and/or validate the questionnaire. Motivated by a cohort study of elderly asthma patients, we aim to examine associations between clinical outcomes and quality of life (QoL) as measured by a QoL questionnaire. To increase comparability, we consider a supervised learning method that identifies a subset of questions whose summary score is strongly associated with a specific clinical outcome under investigation. The resulting set of selected items gives an optimal summary metric of the questionnaire, which improves both statistical power and clinical interpretation. Our item extraction procedure is built upon the best subset algorithm implemented via mixed integer programming, which enjoys both a theoretical guarantee of selection consistency and the flexibility to handle missing data due to nonresponse. Moreover, estimation uncertainty is analyzed by means of noise perturbation. Our methodology is first evaluated in extensive simulation studies, with comparisons to existing methods, and then applied to derive tailored QoL scores adapted to two clinical outcomes, a lung function measure (FEV1) and the asthma control test (ACT), among elderly people with persistent asthma.
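
The item-selection idea in the abstract can be illustrated with a toy sketch. This is not the authors' procedure: the paper solves the best subset problem with mixed integer programming, whereas the sketch below swaps in a brute-force search over all subsets of a fixed size, which is only feasible for a handful of items. The function names (`pearson_r`, `best_item_subset`, `simulate`), the simulated 5-point responses, and the choice of squared correlation between the summed score and the outcome as the selection criterion are all assumptions made for illustration.

```python
import itertools
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def best_item_subset(responses, outcome, k):
    """Try every size-k subset of items (exhaustive stand-in for an MIO solver).

    The summary score of a subset is the plain sum of its item responses;
    the subset whose score has the largest squared correlation with the
    outcome is returned together with that squared correlation.
    """
    p = len(responses[0])
    best_subset, best_r2 = None, -1.0
    for subset in itertools.combinations(range(p), k):
        scores = [sum(row[j] for j in subset) for row in responses]
        r2 = pearson_r(scores, outcome) ** 2
        if r2 > best_r2:
            best_subset, best_r2 = subset, r2
    return best_subset, best_r2


def simulate(n=300, p=8, relevant=(0, 3, 5), seed=0):
    """Hypothetical data: p items on a 1-5 scale, outcome driven by `relevant`."""
    rng = random.Random(seed)
    responses = [[rng.randint(1, 5) for _ in range(p)] for _ in range(n)]
    outcome = [sum(row[j] for j in relevant) + rng.gauss(0, 1.0)
               for row in responses]
    return responses, outcome


responses, outcome = simulate()
subset, r2 = best_item_subset(responses, outcome, k=3)
print("selected items:", subset, "squared correlation:", round(r2, 3))
```

The exhaustive loop visits C(p, k) subsets, so it stops being practical well before questionnaire sizes seen in practice; that combinatorial growth is the reason a mixed integer formulation is attractive for the real problem.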

Source journal
Annals of Applied Statistics
Category: Social Sciences - Statistics & Probability
CiteScore: 3.10
Self-citation rate: 5.60%
Articles published: 131
Review time: 6-12 weeks
Journal description: Statistical research spans an enormous range from direct subject-matter collaborations to pure mathematical theory. The Annals of Applied Statistics, the newest journal from the IMS, is aimed at papers in the applied half of this range. Published quarterly in both print and electronic form, the journal aims to provide a timely and unified forum for all areas of applied statistics.