Clustering methods: To optimize or to not optimize?
Authors: Michael Brusco, Douglas Steinley, Ashley L Watts
Journal: Psychological Methods (JCR Q1, Psychology, Multidisciplinary; impact factor 7.6)
Published: 2024-09-12
Publication type: Journal Article
DOI: 10.1037/met0000688 (https://doi.org/10.1037/met0000688)
Citations: 0
Abstract
Many clustering problems are associated with a particular objective criterion that is sought to be optimized. There are often several methods that can be used to tackle the optimization problem, and one or more of them might guarantee a globally optimal solution. However, it is quite possible that, relative to one or more suboptimal solutions, a globally optimal solution might be less interpretable from the standpoint of psychological theory or be less in accordance with some known (i.e., true) cluster structure. For example, in simulation experiments, it has sometimes been observed that there is not a perfect correspondence between the optimized clustering criterion and recovery of the underlying known cluster structure. This can lead to the misconception that clustering methods with a tendency to produce suboptimal solutions might, in some instances, be preferable to superior methods that provide globally optimal (or at least better locally optimal) solutions. In this article, we present results from simulation studies in the context of K-median clustering where departure from global optimality was carefully controlled. Although the results showed that suboptimal solutions sometimes produced marginally better recovery for experimental cells where the known cluster structure was less well-defined, capriciously accepting inferior solutions is an unwise practice. However, there are instances in which some sacrifice in the optimization criterion value to meet certain desirable constraints or to improve the value of one or more other relevant criteria is principled. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
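The distinction the abstract draws between the optimization criterion and recovery of a known cluster structure can be made concrete with a small sketch. The abstract does not give the article's formulation, so everything here is an assumption for illustration: K-median is taken in its exemplar-based form (each cluster center is one of the data points), distance is Manhattan, recovery is scored with the adjusted Rand index, and the six toy points and the helper names (`k_median_cost`, `assign`, `adjusted_rand_index`) are hypothetical. Exhaustive enumeration stands in for a globally optimal method and is only feasible on tiny data.

```python
from itertools import combinations
from math import comb


def manhattan(a, b):
    """Manhattan (city-block) distance between two points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def k_median_cost(points, exemplars):
    """Criterion value: sum of distances from each point to its nearest exemplar."""
    return sum(min(manhattan(p, points[e]) for e in exemplars) for p in points)


def assign(points, exemplars):
    """Label each point with the position of its nearest exemplar."""
    return [
        min(range(len(exemplars)), key=lambda j: manhattan(p, points[exemplars[j]]))
        for p in points
    ]


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two partitions of the same objects."""
    n = len(labels_a)
    cells, rows, cols = {}, {}, {}
    for a, b in zip(labels_a, labels_b):
        cells[(a, b)] = cells.get((a, b), 0) + 1
        rows[a] = rows.get(a, 0) + 1
        cols[b] = cols.get(b, 0) + 1
    index = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case, e.g. both partitions trivial
        return 1.0
    return (index - expected) / (max_index - expected)


# Hypothetical toy data: two well-separated groups with known labels.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]
true_labels = [0, 0, 0, 1, 1, 1]
k = 2

# Enumerate every exemplar set and rank by criterion value (lowest first).
solutions = sorted(
    (k_median_cost(points, ex), ex) for ex in combinations(range(len(points)), k)
)
best_cost, best_exemplars = solutions[0]
best_ari = adjusted_rand_index(true_labels, assign(points, best_exemplars))

# Print criterion value and recovery side by side for every candidate solution.
for cost, ex in solutions:
    ari = adjusted_rand_index(true_labels, assign(points, ex))
    print(f"exemplars={ex} cost={cost:.1f} ARI={ari:.3f}")
```

In this well-separated toy case the lowest-cost solution also recovers the known labels exactly. According to the abstract, suboptimal solutions gave marginally better recovery only in experimental cells where the known structure was less well-defined, so listing cost and recovery together, as the loop does, is one way to inspect how closely the two track each other on a given dataset.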
About the journal:
Psychological Methods is devoted to the development and dissemination of methods for collecting, analyzing, understanding, and interpreting psychological data. Its purpose is the dissemination of innovations in research design, measurement, methodology, and quantitative and qualitative analysis to the psychological community; its further purpose is to promote effective communication about related substantive and methodological issues. The audience is expected to be diverse and to include those who develop new procedures, those who are responsible for undergraduate and graduate training in design, measurement, and statistics, as well as those who employ those procedures in research.