Principal Component Analysis in Space Forms

IF 4.6 2区 工程技术 Q1 ENGINEERING, ELECTRICAL & ELECTRONIC
Puoya Tabaghi;Michael Khanzadeh;Yusu Wang;Siavash Mirarab
{"title":"Principal Component Analysis in Space Forms","authors":"Puoya Tabaghi;Michael Khanzadeh;Yusu Wang;Siavash Mirarab","doi":"10.1109/TSP.2024.3457529","DOIUrl":null,"url":null,"abstract":"Principal Component Analysis (PCA) is a workhorse of modern data science. While PCA assumes the data conforms to Euclidean geometry, for specific data types, such as hierarchical and cyclic data structures, other spaces are more appropriate. We study PCA in space forms; that is, those with constant curvatures. At a point on a Riemannian manifold, we can define a Riemannian affine subspace based on a set of tangent vectors. Finding the optimal low-dimensional affine subspace for given points in a space form amounts to dimensionality reduction. Our Space Form PCA (SFPCA) seeks the affine subspace that best represents a set of manifold-valued points with the minimum projection cost. We propose proper cost functions that enjoy two properties: (1) their optimal affine subspace is the solution to an eigenequation, and (2) optimal affine subspaces of different dimensions form a nested set. These properties provide advances over existing methods, which are mostly iterative algorithms with slow convergence and weaker theoretical guarantees. We evaluate the proposed SFPCA on real and simulated data in spherical and hyperbolic spaces. We show that it outperforms alternative methods in estimating true subspaces (in simulated data) with respect to convergence speed or accuracy, often both.","PeriodicalId":13330,"journal":{"name":"IEEE Transactions on Signal Processing","volume":"72 ","pages":"4428-4443"},"PeriodicalIF":4.6000,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10670424","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Signal Processing","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10670424/","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0

Abstract

Principal Component Analysis (PCA) is a workhorse of modern data science. While PCA assumes the data conforms to Euclidean geometry, for specific data types, such as hierarchical and cyclic data structures, other spaces are more appropriate. We study PCA in space forms; that is, those with constant curvatures. At a point on a Riemannian manifold, we can define a Riemannian affine subspace based on a set of tangent vectors. Finding the optimal low-dimensional affine subspace for given points in a space form amounts to dimensionality reduction. Our Space Form PCA (SFPCA) seeks the affine subspace that best represents a set of manifold-valued points with the minimum projection cost. We propose proper cost functions that enjoy two properties: (1) their optimal affine subspace is the solution to an eigenequation, and (2) optimal affine subspaces of different dimensions form a nested set. These properties provide advances over existing methods, which are mostly iterative algorithms with slow convergence and weaker theoretical guarantees. We evaluate the proposed SFPCA on real and simulated data in spherical and hyperbolic spaces. We show that it outperforms alternative methods in estimating true subspaces (in simulated data) with respect to convergence speed or accuracy, often both.
空间形式的主成分分析
主成分分析(PCA)是现代数据科学的主要工具。虽然 PCA 假设数据符合欧几里得几何学,但对于特定的数据类型,如分层和循环数据结构,其他空间更为合适。我们研究的是空间形式的 PCA,即具有恒定曲率的空间形式。在黎曼流形上的某一点,我们可以根据一组切向量定义一个黎曼仿射子空间。为空间形式中的给定点找到最佳低维仿射子空间相当于降维。我们的空间形式 PCA(Space Form PCA,SFPCA)寻求的是以最小投影成本最好地代表一组流形值点的仿射子空间。我们提出的适当成本函数具有两个特性:(1)其最优仿射子空间是一个特征方程的解,(2)不同维度的最优仿射子空间形成一个嵌套集。这些特性是现有方法的进步所在,现有方法大多是迭代算法,收敛速度慢,理论保证较弱。我们在球面空间和双曲空间的真实数据和模拟数据上对所提出的 SFPCA 进行了评估。结果表明,在估计真实子空间(模拟数据)方面,SFPCA 在收敛速度或准确性(通常两者兼而有之)方面优于其他方法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
IEEE Transactions on Signal Processing
IEEE Transactions on Signal Processing 工程技术-工程:电子与电气
CiteScore
11.20
自引率
9.30%
发文量
310
审稿时长
3.0 months
期刊介绍: The IEEE Transactions on Signal Processing covers novel theory, algorithms, performance analyses and applications of techniques for the processing, understanding, learning, retrieval, mining, and extraction of information from signals. The term “signal” includes, among others, audio, video, speech, image, communication, geophysical, sonar, radar, medical and musical signals. Examples of topics of interest include, but are not limited to, information processing and the theory and application of filtering, coding, transmitting, estimating, detecting, analyzing, recognizing, synthesizing, recording, and reproducing signals.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信