Impact of Varying Distance-Based Fingerprint Similarity Metrics on Affinity Propagation Clustering Performance in Received Signal Strength-Based Fingerprint Databases
Authors: Abdulmalik Shehu Yaro; Filip Maly; Karel Maly; Pavel Prazak
Journal: IEEE Open Journal of Signal Processing, vol. 5, pp. 1005-1014
Published: 2024-08-26
DOI: 10.1109/OJSP.2024.3449816
URL: https://ieeexplore.ieee.org/document/10646489/
Citations: 0
Abstract
The affinity propagation clustering (APC) algorithm is popular for fingerprint database clustering because it does not require the number of clusters to be defined in advance. However, its clustering performance depends heavily on the chosen fingerprint similarity metric, with distance-based metrics being the most commonly used. Despite the algorithm's popularity, there has been no comprehensive study of how distance-based metrics affect its clustering performance, particularly in fingerprint databases. This paper investigates the impact of various distance-based fingerprint similarity metrics on the clustering performance of the APC algorithm and identifies the metric that gives the best clustering performance for a given fingerprint database. The analysis is conducted across five experimentally generated online fingerprint databases, using seven distance-based metrics: Euclidean, squared Euclidean, Manhattan, Spearman, cosine, Canberra, and Chebyshev distances. With the silhouette score as the performance metric, the simulation results indicate that structural characteristics of the fingerprint database, such as the distribution of fingerprint vectors, play a key role in selecting the best fingerprint similarity metric. Nevertheless, Euclidean and Manhattan distances are generally the preferable fingerprint similarity metrics for the APC algorithm across most fingerprint databases, regardless of their structural characteristics. It is recommended that other factors, such as computational intensity and the presence or absence of outliers, be considered alongside the structural characteristics of the fingerprint database when choosing a fingerprint similarity metric for maximum clustering performance.
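To make the seven distances and the silhouette score concrete, here is a minimal standard-library Python sketch. It is an illustration only, not the authors' implementation, and the abstract fixes none of these details. The function names and the toy RSS values are hypothetical. The Spearman distance is assumed to be one minus the rank correlation, and the cosine distance one minus the cosine similarity. Taking the negative distance as the APC similarity is a common convention that is also assumed here, as is scoring the silhouette with the same metric used for clustering.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def canberra(a, b):
    # Assumption: terms where both entries are zero contribute 0.
    return sum(abs(x - y) / (abs(x) + abs(y))
               for x, y in zip(a, b) if abs(x) + abs(y) > 0)

def cosine(a, b):
    # Assumption: distance = 1 - cosine similarity; vectors must be non-zero.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def _ranks(v):
    # 1-based ranks; tied values share the average of their positions.
    order = sorted(range(len(v)), key=lambda i: v[i])
    ranks = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(a, b):
    # Assumption: distance = 1 - Spearman rank correlation; vectors must not be constant.
    ra, rb = _ranks(a), _ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    sa = math.sqrt(sum((x - ma) ** 2 for x in ra))
    sb = math.sqrt(sum((y - mb) ** 2 for y in rb))
    return 1.0 - cov / (sa * sb)

METRICS = {"euclidean": euclidean, "squared_euclidean": squared_euclidean,
           "manhattan": manhattan, "spearman": spearman, "cosine": cosine,
           "canberra": canberra, "chebyshev": chebyshev}

def similarity_matrix(fingerprints, metric):
    # Assumed convention: APC similarity s(i, k) = -distance(i, k).
    return [[-metric(p, q) for q in fingerprints] for p in fingerprints]

def silhouette_score(fingerprints, labels, metric):
    # Mean of (b - a) / max(a, b) over all points; needs at least two clusters.
    n = len(fingerprints)
    total = 0.0
    for i in range(n):
        sums, counts = {}, {}
        for j in range(n):
            if j != i:
                d = metric(fingerprints[i], fingerprints[j])
                sums[labels[j]] = sums.get(labels[j], 0.0) + d
                counts[labels[j]] = counts.get(labels[j], 0) + 1
        own = labels[i]
        if counts.get(own, 0) == 0:
            continue  # a singleton cluster contributes a silhouette of 0
        a = sums[own] / counts[own]                             # mean intra-cluster distance
        b = min(sums[c] / counts[c] for c in sums if c != own)  # nearest other cluster
        total += (b - a) / max(a, b)
    return total / n

# Toy RSS fingerprints in dBm (made-up values) with a hand-assigned clustering.
toy_fingerprints = [[-40, -70, -80], [-42, -68, -82], [-75, -45, -60], [-77, -43, -62]]
toy_labels = [0, 0, 1, 1]
for name, metric in METRICS.items():
    print(name, round(silhouette_score(toy_fingerprints, toy_labels, metric), 3))
```

In practice the similarity matrix would be passed to an affinity propagation implementation, and the silhouette score would be computed on the cluster labels it returns instead of the hand-assigned labels used here.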