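Comparison between Two Algorithms for Computing the Weighted Generalized Affinity Coefficient in the Case of Interval Data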

IF 0.9 · JCR Q4 · MATHEMATICS, INTERDISCIPLINARY APPLICATIONS
Stats · Pub Date: 2023-10-13 · DOI: 10.3390/stats6040068
Áurea Sousa, Osvaldo Silva, Leonor Bacelar-Nicolau, João Cabral, Helena Bacelar-Nicolau
Citations: 1

Abstract

From the affinity coefficient between two discrete probability distributions proposed by Matusita, Bacelar-Nicolau introduced the affinity coefficient in a cluster analysis context and extended it to different types of data, including for the case of complex and heterogeneous data within the scope of symbolic data analysis (SDA). In this study, we refer to the most significant partitions obtained using the hierarchical cluster analysis (h.c.a.) of two well-known datasets that were taken from the literature on complex (symbolic) data analysis. h.c.a. is based on the weighted generalized affinity coefficient for the case of interval data and on probabilistic aggregation criteria from a VL parametric family. To calculate the values of this coefficient, two alternative algorithms were used and compared. Both algorithms were able to detect clusters of macrodata (aggregated data into groups of interest) that were consistent and consonant with those reported in the literature, but one performed better than the other in some specific cases. Moreover, both approaches allow for the treatment of large microdatabases (non-aggregated data) after their transformation into macrodata from the huge microdata.
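
As a rough illustration of the starting point named in the abstract, Matusita's affinity between two discrete probability distributions p and q is the sum over categories of sqrt(p_i * q_i). The Python sketch below computes it. It also shows one plausible way the idea carries over to interval data: each interval is treated as a uniform density, so the affinity of two intervals becomes their overlap length divided by the geometric mean of their lengths, and per-variable values are combined with weights. The uniform-density reading, the function names, and the weighting scheme are assumptions of this sketch. They are not taken from the abstract and are not a reproduction of either algorithm compared in the paper.

```python
import math
from typing import Sequence, Tuple

Interval = Tuple[float, float]


def matusita_affinity(p: Sequence[float], q: Sequence[float]) -> float:
    """Affinity between two discrete distributions: sum_i sqrt(p_i * q_i)."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of categories")
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))


def uniform_interval_affinity(a: Interval, b: Interval) -> float:
    """Affinity of two uniform densities supported on intervals a and b.

    Assumption of this sketch: each interval is read as a uniform
    distribution, so the integral of sqrt(f * g) reduces to
    overlap_length / sqrt(len(a) * len(b)).
    """
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    if a_hi <= a_lo or b_hi <= b_lo:
        raise ValueError("intervals must have positive length")
    overlap = max(0.0, min(a_hi, b_hi) - max(a_lo, b_lo))
    return overlap / math.sqrt((a_hi - a_lo) * (b_hi - b_lo))


def weighted_interval_affinity(
    x: Sequence[Interval], y: Sequence[Interval], weights: Sequence[float]
) -> float:
    """Weighted sum of per-variable interval affinities (hypothetical weighting)."""
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y and weights must have the same length")
    return sum(
        w * uniform_interval_affinity(a, b) for w, a, b in zip(weights, x, y)
    )
```

With weights that sum to 1, the weighted value stays in [0, 1]: it is 1 when every pair of intervals coincides and 0 when no pair overlaps. For example, `uniform_interval_affinity((0, 2), (1, 3))` has overlap 1 and lengths 2 and 2, giving 1 / sqrt(4) = 0.5.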