On the Search for Data-Driven and Reproducible Schizophrenia Subtypes Using Resting State fMRI Data From Multiple Sites

IF 2.1 | Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neural Computation Pub Date : 2024-08-19 DOI:10.1162/neco_a_01689

Lærke Gebser Krohne;Ingeborg Helbech Hansen;Kristoffer H. Madsen

Neural Computation, volume 36, issue 9, pages 1799-1831. https://ieeexplore.ieee.org/document/10661276/

Citations: 0

Abstract

For decades, fMRI data have been used to search for biomarkers for patients with schizophrenia. Firm conclusions have yet to be drawn, which is often attributed to the high internal heterogeneity of the disorder. A promising way to disentangle the heterogeneity is to search for subgroups of patients with more homogeneous biological profiles. We applied an unsupervised multiple co-clustering (MCC) method to identify subtypes using functional connectivity data from a multisite resting-state data set. We merged data from two publicly available databases and split the data into a discovery data set (143 patients and 143 healthy controls (HC)) and an external test data set (63 patients and 63 HC) from independent sites. On the discovery data, we investigated the stability of the clustering with respect to data splits and initializations. Subsequently, we searched for cluster solutions, also called “views,” with a significant diagnosis association and evaluated these based on their subject and feature cluster separability and on their correlation to clinical manifestations as measured with the positive and negative syndrome scale (PANSS). Finally, we validated our findings by testing the diagnosis association on the external test data. A major finding of our study was that the stability of the clustering was highly dependent on variations in the data set, and even across initializations, we found only a moderate subject clustering stability. Nevertheless, we still discovered one view with a significant diagnosis association. This view reproducibly showed an overrepresentation of schizophrenia patients in three subject clusters, and one feature cluster showed a continuous trend, ranging from positive to negative connectivity values, when sorted according to the proportions of patients with schizophrenia. When investigating all patients, none of the feature clusters in the view were associated with severity of positive, negative, and generalized symptoms, indicating that the cluster solutions reflect other disease-related mechanisms.
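
The two checks described in the abstract, clustering stability across initializations and the association between subject clusters and diagnosis, can be illustrated with a minimal sketch. The abstract does not say which stability index or statistical test the authors used, so the choices here are assumptions: the adjusted Rand index as the agreement measure and a permutation test on a chi-square statistic for the association. The function names and the toy labels are hypothetical, and the multiple co-clustering (MCC) step itself is not reproduced.

```python
import random
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two clusterings of the same subjects."""
    n = len(labels_a)
    if n != len(labels_b) or n < 2:
        raise ValueError("need two labelings of equal length >= 2")
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:  # degenerate case, e.g. one cluster in both labelings
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def chi_square_statistic(clusters, diagnosis):
    """Pearson chi-square statistic of the cluster-by-diagnosis contingency table."""
    n = len(clusters)
    cluster_totals = Counter(clusters)
    diagnosis_totals = Counter(diagnosis)
    observed = Counter(zip(clusters, diagnosis))
    stat = 0.0
    for c, n_c in cluster_totals.items():
        for d, n_d in diagnosis_totals.items():
            expected = n_c * n_d / n
            stat += (observed[(c, d)] - expected) ** 2 / expected
    return stat


def permutation_p_value(clusters, diagnosis, n_permutations=2000, seed=0):
    """P-value for the cluster/diagnosis association, shuffling diagnosis labels."""
    rng = random.Random(seed)
    observed = chi_square_statistic(clusters, diagnosis)
    shuffled = list(diagnosis)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point summation order.
        if chi_square_statistic(clusters, shuffled) >= observed - 1e-12:
            exceed += 1
    return (exceed + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Toy labels only (not data from the study): two runs of a clustering
    # with different initializations, plus a diagnosis label per subject.
    run_1 = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
    run_2 = [1, 1, 0, 0, 0, 0, 2, 2, 2, 1]
    diagnosis = ["SZ", "SZ", "SZ", "HC", "HC", "SZ", "HC", "HC", "HC", "HC"]
    print("stability (ARI):", adjusted_rand_index(run_1, run_2))
    print("association p-value:", permutation_p_value(run_1, diagnosis))
```

An index near 1 means two runs assign subjects to nearly the same clusters, while values near 0 indicate agreement no better than chance. The permutation test avoids distributional assumptions when some cluster-by-diagnosis cells are small.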

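Source journal

Neural Computation (Engineering & Technology; Computer Science: Artificial Intelligence)

CiteScore: 6.30

Self-citation rate: 3.40%

Articles published:

Review time: 3.0 months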

About the journal: Neural Computation is uniquely positioned at the crossroads between neuroscience and TMCS and welcomes the submission of original papers from all areas of TMCS, including: Advanced experimental design; Analysis of chemical sensor data; Connectomic reconstructions; Analysis of multielectrode and optical recordings; Genetic data for cell identity; Analysis of behavioral data; Multiscale models; Analysis of molecular mechanisms; Neuroinformatics; Analysis of brain imaging data; Neuromorphic engineering; Principles of neural coding, computation, circuit dynamics, and plasticity; Theories of brain function.