Harmonic Alignment.

Jay S Stanley, Scott Gigante, Guy Wolf, Smita Krishnaswamy
{"title":"Harmonic Alignment.","authors":"Jay S Stanley,&nbsp;Scott Gigante,&nbsp;Guy Wolf,&nbsp;Smita Krishnaswamy","doi":"10.1137/1.9781611976236.36","DOIUrl":null,"url":null,"abstract":"<p><p>We propose a novel framework for combining datasets via alignment of their intrinsic geometry. This alignment can be used to fuse data originating from disparate modalities, or to correct batch effects while preserving intrinsic data structure. Importantly, we do not assume any pointwise correspondence between datasets, but instead rely on correspondence between a (possibly unknown) subset of data features. We leverage this assumption to construct an isometric alignment between the data. This alignment is obtained by relating the expansion of data features in harmonics derived from diffusion operators defined over each dataset. These expansions encode each feature as a function of the data geometry. We use this to relate the diffusion coordinates of each dataset through our assumption of partial feature correspondence. Then, a unified diffusion geometry is constructed over the aligned data, which can also be used to correct the original data measurements. We demonstrate our method on several datasets, showing in particular its effectiveness in biological applications including fusion of single-cell RNA sequencing (scRNA-seq) and single-cell ATAC sequencing (scATAC-seq) data measured on the same population of cells, and removal of batch effect between biological samples.</p>","PeriodicalId":74533,"journal":{"name":"Proceedings of the ... SIAM International Conference on Data Mining. SIAM International Conference on Data Mining","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1137/1.9781611976236.36","citationCount":"14","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ... SIAM International Conference on Data Mining. SIAM International Conference on Data Mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1137/1.9781611976236.36","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 14

Abstract

We propose a novel framework for combining datasets via alignment of their intrinsic geometry. This alignment can be used to fuse data originating from disparate modalities, or to correct batch effects while preserving intrinsic data structure. Importantly, we do not assume any pointwise correspondence between datasets, but instead rely on correspondence between a (possibly unknown) subset of data features. We leverage this assumption to construct an isometric alignment between the data. This alignment is obtained by relating the expansion of data features in harmonics derived from diffusion operators defined over each dataset. These expansions encode each feature as a function of the data geometry. We use this to relate the diffusion coordinates of each dataset through our assumption of partial feature correspondence. Then, a unified diffusion geometry is constructed over the aligned data, which can also be used to correct the original data measurements. We demonstrate our method on several datasets, showing in particular its effectiveness in biological applications including fusion of single-cell RNA sequencing (scRNA-seq) and single-cell ATAC sequencing (scATAC-seq) data measured on the same population of cells, and removal of batch effect between biological samples.

谐波对齐。
我们提出了一种新的框架,通过对数据集的内在几何形状进行对齐来组合数据集。这种对齐可以用于融合来自不同模式的数据,或者在保留固有数据结构的同时纠正批处理效果。重要的是,我们不假设数据集之间有任何点向对应,而是依赖于(可能未知的)数据特征子集之间的对应。我们利用这个假设来构建数据之间的等距对齐。这种对齐是通过在每个数据集上定义的扩散算子派生的谐波中关联数据特征的扩展而获得的。这些扩展将每个特征编码为数据几何的函数。我们通过部分特征对应的假设来关联每个数据集的扩散坐标。然后,在对齐的数据上构造统一的扩散几何,该几何也可以用于校正原始数据测量。我们在几个数据集上展示了我们的方法,特别显示了它在生物学应用中的有效性,包括在同一细胞群上测量的单细胞RNA测序(scRNA-seq)和单细胞ATAC测序(scATAC-seq)数据的融合,以及去除生物样品之间的批效应。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信