Population modeling with machine learning can enhance measures of mental health - Open-data replication

JCR: Q4 (Neuroscience)

Neuroimage. Reports. Pub Date: 2023-06-01. DOI: 10.1016/j.ynirp.2023.100163

Ty Easley, Ruiqi Chen, Kayla Hannon, Rosie Dutt, Janine Bijsterbosch

Volume 3, Issue 2, Article 100163

https://www.sciencedirect.com/science/article/pii/S2666956023000089

Citations: 1

Abstract

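Efforts to predict trait phenotypes based on functional MRI data from large cohorts have been hampered by low prediction accuracy and/or small effect sizes. Although these findings are highly replicable, the small effect sizes are somewhat surprising given the presumed brain basis of phenotypic traits such as neuroticism and fluid intelligence. We aim to replicate previous work and additionally test multiple data manipulations that may improve prediction accuracy by addressing data pollution challenges. Specifically, we added additional fMRI features, averaged the target phenotype across multiple measurements to obtain more accurate estimates of the underlying trait, balanced the target phenotype's distribution through undersampling of majority scores, and identified data-driven subtypes to investigate the impact of between-participant heterogeneity. Our results replicated prior results from Dadi et al. (2021) in a larger sample. Each data manipulation further led to small but consistent improvements in prediction accuracy, which were largely additive when combining multiple data manipulations. Combining data manipulations (i.e., extended fMRI features, averaged target phenotype, balanced target phenotype distribution) led to a three-fold increase in prediction accuracy for fluid intelligence compared to prior work. These findings highlight the benefit of several relatively easy and low-cost data manipulations, which may positively impact future work.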
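Two of the manipulations named in the abstract, averaging the target phenotype over repeated measurements and undersampling majority scores, can be illustrated with a minimal sketch. The abstract does not describe the actual procedure, so everything here is an assumption for illustration: the function names `average_target` and `undersample_majority` are hypothetical, and equal-width binning with `n_bins` bins is only one possible way to define "majority scores".

```python
import numpy as np


def average_target(scores):
    """Average each participant's repeated measurements, ignoring missing (NaN) visits.

    `scores` has shape (n_participants, n_measurements). Hypothetical helper.
    """
    scores = np.asarray(scores, dtype=float)
    return np.nanmean(scores, axis=1)


def undersample_majority(y, n_bins=5, seed=0):
    """Return sorted indices that keep the same number of participants per score bin.

    Assumption: scores are grouped into equal-width bins, and every occupied bin
    is randomly cut down to the size of the smallest occupied bin.
    """
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    edges = np.linspace(y.min(), y.max(), n_bins + 1)
    # Use interior edges only, so the maximum score lands in the last bin.
    bins = np.digitize(y, edges[1:-1])
    occupied = np.unique(bins)
    n_keep = min(int(np.sum(bins == b)) for b in occupied)
    keep = [
        rng.choice(np.flatnonzero(bins == b), size=n_keep, replace=False)
        for b in occupied
    ]
    return np.sort(np.concatenate(keep))


# Toy usage: three participants with up to two measurements each.
averaged = average_target([[1.0, 3.0], [2.0, np.nan], [4.0, 6.0]])

# Toy usage: four low scores (majority) and three high scores (minority).
y = np.array([0.0, 0.1, 0.2, 0.3, 5.0, 9.0, 10.0])
idx = undersample_majority(y, n_bins=2)
```

The returned indices would then be used to subset both the fMRI feature matrix and the target vector before model fitting, so that the two stay aligned.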

Source journal

Neuroimage. Reports (Neuroscience (General))

CiteScore: 1.90

Self-citation rate: 0.00%

Publication count: (not given)

Review time: 87 days