Impact of temporal order selection on clustering intensive longitudinal data based on vector autoregressive models.
Authors: Yaqi Li, Hairong Song, Bertus Jeronimus
Journal: Psychological Methods (impact factor 7.6; JCR Q1, Psychology, Multidisciplinary)
Publication date: 2025-03-03
Publication type: Journal Article
DOI: 10.1037/met0000747 (https://doi.org/10.1037/met0000747)
Citation count: 0
Abstract
When multivariate intensive longitudinal data are collected from a sample of individuals, a model-based clustering approach (e.g., one based on vector autoregressive [VAR] models) can be used to cluster the individuals by the (dis)similarity of their person-specific dynamics. Implementing such a clustering procedure requires setting the same temporal order for all individuals; however, between-individual differences in temporal order have been evident for psychological and behavioral processes. One existing method applies the most complex structure, or the highest order (HO), to all processes, while the other uses the most parsimonious structure, or the lowest order (LO). To date, the impact of these methods has not been well studied. In our simulation study, we examined the performance of HO and LO in conjunction with Gaussian mixture model (GMM) and k-means algorithms when a two-step VAR-based clustering procedure is implemented across various data conditions. We found that (a) the LO outperformed the HO in cluster identification, (b) the HO was more favorable than the LO in estimating cluster-specific dynamics, (c) the GMM generally outperformed k-means, and (d) the LO in conjunction with the GMM produced the best cluster identification outcome. We demonstrated the use of the VAR-based clustering technique with data collected from the "How Nuts are the Dutch" project. We then discussed the results of all our analyses, the limitations of our study, and directions for future research, and offered recommendations on the empirical use of model-based clustering techniques. (PsycInfo Database Record (c) 2025 APA, all rights reserved).
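As a rough illustration of the two-step VAR-based clustering procedure the abstract describes, the Python sketch below fits a person-specific VAR with one common order to every individual and then clusters the estimated coefficients. It is not the authors' implementation. The simulated coefficient matrices, sample sizes, function names, the choice of order 1 as the common order, and the farthest-point initialisation of k-means are all assumptions made for this example. Only NumPy is used; a GMM (for instance scikit-learn's `GaussianMixture`) could stand in for the k-means step.

```python
import numpy as np

def simulate_var1(A, n_obs, rng, burn_in=50):
    """Simulate y_t = A @ y_{t-1} + e_t with standard normal noise (hypothetical data)."""
    k = A.shape[0]
    y = np.zeros((n_obs + burn_in, k))
    for t in range(1, n_obs + burn_in):
        y[t] = A @ y[t - 1] + rng.standard_normal(k)
    return y[burn_in:]

def fit_var_ols(y, order):
    """Step 1: fit a person-specific VAR(order) by OLS; return flattened lag coefficients."""
    n_obs, _ = y.shape
    # Design matrix: intercept column plus lags 1..order of every variable.
    lags = [y[order - lag : n_obs - lag] for lag in range(1, order + 1)]
    X = np.hstack([np.ones((n_obs - order, 1))] + lags)
    coef, *_ = np.linalg.lstsq(X, y[order:], rcond=None)
    return coef[1:].ravel()  # drop the intercept row, keep the dynamics

def kmeans(features, n_clusters, n_iter=100):
    """Step 2: plain Lloyd's k-means on the coefficient vectors."""
    # Deterministic farthest-point initialisation (an illustrative choice).
    centers = [features[0]]
    while len(centers) < n_clusters:
        d = np.min([((features - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(features[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            features[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

rng = np.random.default_rng(0)
# Two hypothetical clusters with clearly different bivariate dynamics.
A_cluster1 = np.array([[0.6, 0.0], [0.0, 0.6]])
A_cluster2 = np.array([[-0.4, 0.3], [0.3, -0.4]])
n_per_cluster, n_obs = 20, 200
common_order = 1  # the same temporal order is imposed on every individual

series = [simulate_var1(A, n_obs, rng)
          for A in (A_cluster1, A_cluster2) for _ in range(n_per_cluster)]
features = np.array([fit_var_ols(y, common_order) for y in series])
labels, centers = kmeans(features, n_clusters=2)
print(labels)
```

Raising `common_order` lengthens each individual's coefficient vector, so the clustering step works in a higher-dimensional space; this is the practical point at which the HO and LO choices diverge.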
About the journal:
Psychological Methods is devoted to the development and dissemination of methods for collecting, analyzing, understanding, and interpreting psychological data. Its purpose is the dissemination of innovations in research design, measurement, methodology, and quantitative and qualitative analysis to the psychological community; its further purpose is to promote effective communication about related substantive and methodological issues. The audience is expected to be diverse and to include those who develop new procedures, those who are responsible for undergraduate and graduate training in design, measurement, and statistics, as well as those who employ those procedures in research.