SIAM journal on mathematics of data science: latest publications

Test Data Reuse for the Evaluation of Continuously Evolving Classification Algorithms Using the Area under the Receiver Operating Characteristic Curve
SIAM journal on mathematics of data science Pub Date : 2021-01-01 DOI: 10.1137/20M1333110
Alexej Gossmann, Aria Pezeshk, Yu-ping Wang, B. Sahiner
{"title":"Test Data Reuse for the Evaluation of Continuously Evolving Classification Algorithms Using the Area under the Receiver Operating Characteristic Curve","authors":"Alexej Gossmann, Aria Pezeshk, Yu-ping Wang, B. Sahiner","doi":"10.1137/20M1333110","DOIUrl":"https://doi.org/10.1137/20M1333110","url":null,"abstract":"","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"43 1","pages":"692-714"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91101484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 6
Error Bounds for Dynamical Spectral Estimation.
SIAM journal on mathematics of data science Pub Date : 2021-01-01 Epub Date: 2021-02-11 DOI: 10.1137/20m1335984
Robert J Webber, Erik H Thiede, Douglas Dow, Aaron R Dinner, Jonathan Weare
{"title":"Error Bounds for Dynamical Spectral Estimation.","authors":"Robert J Webber, Erik H Thiede, Douglas Dow, Aaron R Dinner, Jonathan Weare","doi":"10.1137/20m1335984","DOIUrl":"10.1137/20m1335984","url":null,"abstract":"<p><p>Dynamical spectral estimation is a well-established numerical approach for estimating eigenvalues and eigenfunctions of the Markov transition operator from trajectory data. Although the approach has been widely applied in biomolecular simulations, its error properties remain poorly understood. Here we analyze the error of a dynamical spectral estimation method called \"the variational approach to conformational dynamics\" (VAC). We bound the approximation error and estimation error for VAC estimates. Our analysis establishes VAC's convergence properties and suggests new strategies for tuning VAC to improve accuracy.</p>","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"3 1","pages":"225-252"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8336423/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39281319","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
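VAC-style estimates are computed by solving a generalized eigenvalue problem built from instantaneous and time-lagged correlation matrices of user-chosen basis functions evaluated along a trajectory. The sketch below is a minimal illustration of that construction on a toy double-well diffusion; the Gaussian-bump basis, lag time, and simulation parameters are illustrative assumptions, not the paper's.

```python
# Minimal sketch (not the paper's code) of VAC-style dynamical spectral estimation:
# transition-operator eigenvalues are estimated from one trajectory via the
# generalized eigenproblem  Ct v = lambda C0 v  built from correlation matrices.
import numpy as np
from scipy.linalg import eigh

def vac_eigenvalues(traj_features, lag):
    """traj_features: (T, n) array of basis functions evaluated along a trajectory."""
    X0, Xt = traj_features[:-lag], traj_features[lag:]
    C0 = X0.T @ X0 / len(X0)                          # instantaneous correlations
    Ct = 0.5 * (X0.T @ Xt + Xt.T @ X0) / len(X0)      # symmetrized lagged correlations
    C0 += 1e-10 * np.eye(len(C0))                     # tiny ridge for numerical safety
    evals, evecs = eigh(Ct, C0)                       # generalized symmetric eigenproblem
    order = np.argsort(evals)[::-1]                   # largest eigenvalue (~1) first
    return evals[order], evecs[:, order]

# Toy data: an overdamped double-well diffusion sampled by Euler-Maruyama,
# with Gaussian bumps as the basis functions.
rng = np.random.default_rng(0)
x, xs = 0.0, []
for _ in range(20000):
    x += -4.0 * x * (x**2 - 1.0) * 1e-3 + np.sqrt(2e-3) * rng.standard_normal()
    xs.append(x)
xs = np.array(xs)
centers = np.linspace(-1.5, 1.5, 8)
feats = np.exp(-(xs[:, None] - centers[None, :])**2 / 0.1)
evals, _ = vac_eigenvalues(feats, lag=100)
print("leading VAC eigenvalue estimates:", np.round(evals[:3], 3))
```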
Global Minima of Overparameterized Neural Networks
SIAM journal on mathematics of data science Pub Date : 2021-01-01 DOI: 10.1137/19M1308943
Y. Cooper
{"title":"Global Minima of Overparameterized Neural Networks","authors":"Y. Cooper","doi":"10.1137/19M1308943","DOIUrl":"https://doi.org/10.1137/19M1308943","url":null,"abstract":"","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"3 2","pages":"676-691"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72476727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 20
Approximation Properties of Ridge Functions and Extreme Learning Machines
SIAM journal on mathematics of data science Pub Date : 2021-01-01 DOI: 10.1137/20m1356348
P. Jorgensen, D. Stewart
{"title":"Approximation Properties of Ridge Functions and Extreme Learning Machines","authors":"P. Jorgensen, D. Stewart","doi":"10.1137/20m1356348","DOIUrl":"https://doi.org/10.1137/20m1356348","url":null,"abstract":"For a compact set $Dsubsetmathbb{R}^{m}$ we consider the problem of approximating a function $f$ over $D$ by sums of ridge functions ${x}mapstovarphi({w}^{T}{x})$ with ${w}$ in a given set $ma...","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"100 1","pages":"815-832"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75679201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
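An extreme learning machine is one concrete way to fit a sum of ridge functions: the inner directions $w$ and biases are drawn at random, and only the outer coefficients are fit by least squares. The sketch below is a generic minimal ELM; the activation, number of hidden units, and target function are illustrative assumptions rather than choices taken from the paper.

```python
# Minimal extreme learning machine: a random sum of ridge functions phi(w^T x + b)
# with outer coefficients fit by ordinary least squares.
import numpy as np

def elm_fit(X, y, n_hidden=200, phi=np.tanh, seed=None):
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))   # random ridge directions w
    b = rng.standard_normal(n_hidden)                 # random biases
    H = phi(X @ W + b)                                # hidden ridge-function features
    coef, *_ = np.linalg.lstsq(H, y, rcond=None)      # least-squares outer weights
    return W, b, coef

def elm_predict(X, W, b, coef, phi=np.tanh):
    return phi(X @ W + b) @ coef

# Approximate f(x1, x2) = sin(3 x1) * cos(2 x2) on [0, 1]^2.
rng = np.random.default_rng(1)
X = rng.uniform(size=(1000, 2))
y = np.sin(3 * X[:, 0]) * np.cos(2 * X[:, 1])
W, b, coef = elm_fit(X, y, seed=2)
Xt = rng.uniform(size=(200, 2))
err = np.abs(elm_predict(Xt, W, b, coef) - np.sin(3 * Xt[:, 0]) * np.cos(2 * Xt[:, 1]))
print("max test error:", err.max())
```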
Nonbacktracking Eigenvalues under Node Removal: X-Centrality and Targeted Immunization
SIAM journal on mathematics of data science Pub Date : 2021-01-01 DOI: 10.1137/20M1352132
Leonardo A. B. Tôrres, Kevin S. Chan, Hanghang Tong, T. Eliassi-Rad
{"title":"Nonbacktracking Eigenvalues under Node Removal: X-Centrality and Targeted Immunization","authors":"Leonardo A. B. Tôrres, Kevin S. Chan, Hanghang Tong, T. Eliassi-Rad","doi":"10.1137/20M1352132","DOIUrl":"https://doi.org/10.1137/20M1352132","url":null,"abstract":". The non-backtracking matrix and its eigenvalues have many applications in network science and 5 graph mining, such as node and edge centrality, community detection, length spectrum theory, 6 graph distance, and epidemic and percolation thresholds. In network epidemiology, the reciprocal 7 of the largest eigenvalue of the non-backtracking matrix is a good approximation for the epidemic 8 threshold of certain network dynamics. In this work, we develop techniques that identify which 9 nodes have the largest impact on this leading eigenvalue. We do so by studying the behavior of 10 the spectrum of the non-backtracking matrix after a node is removed from the graph. From this 11 analysis we derive two new centrality measures: X -degree and X-non-backtracking centrality . We 12 perform extensive experimentation with targeted immunization strategies derived from these two 13 centrality measures. Our spectral analysis and centrality measures can be broadly applied, and will 14 be of interest to both theorists and practitioners alike. the perturbation of quadratic eigenvalue problems, with applications to the NB- eigenvalues of the stochastic block","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"31 1","pages":"656-675"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85132979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 23
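The quantity at the heart of this paper is how much the leading non-backtracking eigenvalue drops when a node is removed; X-degree and X-non-backtracking centrality are designed to predict that drop without recomputation. The brute-force sketch below only illustrates the underlying quantity on a small toy graph, building the non-backtracking matrix explicitly and recomputing its leading eigenvalue after each removal; it is not the paper's centrality computation.

```python
# Brute-force illustration: rank nodes by the drop in the leading non-backtracking
# eigenvalue caused by removing them from a small undirected graph.
import numpy as np

def nb_leading_eigenvalue(edges, removed=frozenset()):
    darts = []
    for u, v in edges:
        if u in removed or v in removed:
            continue
        darts += [(u, v), (v, u)]                     # directed half-edges
    if not darts:
        return 0.0
    index = {d: i for i, d in enumerate(darts)}
    B = np.zeros((len(darts), len(darts)))
    for (u, v) in darts:
        for (x, w) in darts:
            if x == v and w != u:                     # (u->v) then (v->w), no backtrack
                B[index[(u, v)], index[(x, w)]] = 1.0
    # For a nonnegative matrix the spectral radius is a real eigenvalue.
    return max(np.linalg.eigvals(B).real)

# Toy graph: a 4-clique with a pendant path attached at node 3.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (5, 6)]
base = nb_leading_eigenvalue(edges)
nodes = sorted({u for e in edges for u in e})
drops = {v: base - nb_leading_eigenvalue(edges, removed={v}) for v in nodes}
for v, d in sorted(drops.items(), key=lambda kv: -kv[1]):
    print(f"node {v}: eigenvalue drop {d:.3f}")
```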
Spectral neighbor joining for reconstruction of latent tree models.
SIAM journal on mathematics of data science Pub Date : 2021-01-01 Epub Date: 2021-02-01 DOI: 10.1137/20m1365715
Ariel Jaffe, Noah Amsel, Yariv Aizenbud, Boaz Nadler, Joseph T Chang, Yuval Kluger
{"title":"Spectral neighbor joining for reconstruction of latent tree Models.","authors":"Ariel Jaffe,&nbsp;Noah Amsel,&nbsp;Yariv Aizenbud,&nbsp;Boaz Nadler,&nbsp;Joseph T Chang,&nbsp;Yuval Kluger","doi":"10.1137/20m1365715","DOIUrl":"https://doi.org/10.1137/20m1365715","url":null,"abstract":"<p><p>A common assumption in multiple scientific applications is that the distribution of observed data can be modeled by a latent tree graphical model. An important example is phylogenetics, where the tree models the evolutionary lineages of a set of observed organisms. Given a set of independent realizations of the random variables at the leaves of the tree, a key challenge is to infer the underlying tree topology. In this work we develop Spectral Neighbor Joining (SNJ), a novel method to recover the structure of latent tree graphical models. Given a matrix that contains a measure of similarity between all pairs of observed variables, SNJ computes a spectral measure of cohesion between groups of observed variables. We prove that SNJ is consistent, and derive a sufficient condition for correct tree recovery from an estimated similarity matrix. Combining this condition with a concentration of measure result on the similarity matrix, we bound the number of samples required to recover the tree with high probability. We illustrate via extensive simulations that in comparison to several other reconstruction methods, SNJ requires fewer samples to accurately recover trees with a large number of leaves or long edges.</p>","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"3 1","pages":"113-141"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8194222/pdf/nihms-1702804.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39091867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
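The sketch below illustrates one reading of the "spectral measure of cohesion" the abstract refers to: in latent tree models with multiplicative similarities, the similarity block between a clade and its complement is approximately rank one, so the ratio of its second to first singular values scores how well a group of leaves hangs together as a subtree. This is an illustrative paraphrase of the idea under that rank-one assumption, not the paper's full SNJ reconstruction algorithm, and the toy tree and score definition are my own choices.

```python
# Spectral cohesion score for a candidate clade: second-to-first singular value
# ratio of the off-diagonal similarity block S[A, A^c] (near zero for a clade).
import numpy as np

def cohesion_score(S, subset):
    mask = np.zeros(S.shape[0], dtype=bool)
    mask[np.asarray(subset)] = True
    block = S[np.ix_(mask, ~mask)]            # similarity block between A and its complement
    s = np.linalg.svd(block, compute_uv=False)
    return s[1] / s[0] if len(s) > 1 else 0.0

# Toy tree: leaves 0,1,2 hang off one internal node, leaves 3,4,5 off another,
# all edges of length 1, so within-group distance 2 and across-group distance 3.
D = np.array([[0, 2, 2, 3, 3, 3],
              [2, 0, 2, 3, 3, 3],
              [2, 2, 0, 3, 3, 3],
              [3, 3, 3, 0, 2, 2],
              [3, 3, 3, 2, 0, 2],
              [3, 3, 3, 2, 2, 0]], float)
S = np.exp(-D)                                # multiplicative similarity along the tree
print("clade {0,1,2} score:    ", round(cohesion_score(S, [0, 1, 2]), 4))  # ~0 (rank-one block)
print("non-clade {0,1,3} score:", round(cohesion_score(S, [0, 1, 3]), 4))  # clearly > 0
```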
Doubly Stochastic Normalization of the Gaussian Kernel Is Robust to Heteroskedastic Noise.
SIAM journal on mathematics of data science Pub Date : 2021-01-01 Epub Date: 2021-03-23 DOI: 10.1137/20M1342124
Boris Landa, Ronald R Coifman, Yuval Kluger
{"title":"Doubly Stochastic Normalization of the Gaussian Kernel Is Robust to Heteroskedastic Noise.","authors":"Boris Landa,&nbsp;Ronald R Coifman,&nbsp;Yuval Kluger","doi":"10.1137/20M1342124","DOIUrl":"https://doi.org/10.1137/20M1342124","url":null,"abstract":"<p><p>A fundamental step in many data-analysis techniques is the construction of an affinity matrix describing similarities between data points. When the data points reside in Euclidean space, a widespread approach is to from an affinity matrix by the Gaussian kernel with pairwise distances, and to follow with a certain normalization (e.g. the row-stochastic normalization or its symmetric variant). We demonstrate that the doubly-stochastic normalization of the Gaussian kernel with zero main diagonal (i.e., no self loops) is robust to heteroskedastic noise. That is, the doubly-stochastic normalization is advantageous in that it automatically accounts for observations with different noise variances. Specifically, we prove that in a suitable high-dimensional setting where heteroskedastic noise does not concentrate too much in any particular direction in space, the resulting (doubly-stochastic) noisy affinity matrix converges to its clean counterpart with rate <i>m</i> <sup>-1/2</sup>, where <i>m</i> is the ambient dimension. We demonstrate this result numerically, and show that in contrast, the popular row-stochastic and symmetric normalizations behave unfavorably under heteroskedastic noise. Furthermore, we provide examples of simulated and experimental single-cell RNA sequence data with intrinsic heteroskedasticity, where the advantage of the doubly-stochastic normalization for exploratory analysis is evident.</p>","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"3 1","pages":"388-413"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8194191/pdf/nihms-1702812.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39091868","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 17
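The object studied in the paper is the doubly stochastic normalization of a zero-diagonal Gaussian kernel, i.e., a symmetric scaling $D K D$ whose rows and columns all sum to one. The sketch below computes it with a standard symmetric Sinkhorn-style iteration; the solver, kernel bandwidth, and heteroskedastic toy data are illustrative choices, and the paper analyzes the resulting affinity matrix rather than any particular solver.

```python
# Doubly stochastic normalization of a Gaussian kernel with zero main diagonal,
# computed by a damped symmetric Sinkhorn-style fixed-point iteration.
import numpy as np
from scipy.spatial.distance import pdist, squareform

def doubly_stochastic_affinity(X, eps, n_iter=500, tol=1e-8):
    K = np.exp(-squareform(pdist(X, "sqeuclidean")) / eps)
    np.fill_diagonal(K, 0.0)                      # zero main diagonal (no self loops)
    d = np.ones(len(K))
    for _ in range(n_iter):
        d = np.sqrt(d / (K @ d))                  # enforce diag(d) K diag(d) row sums = 1
        W = d[:, None] * K * d[None, :]
        if np.abs(W.sum(axis=1) - 1.0).max() < tol:
            break
    return W

rng = np.random.default_rng(0)
# Heteroskedastic noise: the second half of the points is noisier than the first.
clean = rng.standard_normal((200, 50))
scale = np.r_[0.2 * np.ones(100), 1.0 * np.ones(100)]
X = clean + scale[:, None] * rng.standard_normal((200, 50))
W = doubly_stochastic_affinity(X, eps=100.0)
print("row sums:", np.round(W.sum(axis=1)[:3], 6), "col sums:", np.round(W.sum(axis=0)[:3], 6))
```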
MR-GAN: Manifold Regularized Generative Adversarial Networks for Scientific Data
SIAM journal on mathematics of data science Pub Date : 2021-01-01 DOI: 10.1137/20m1344299
Qunwei Li, B. Kailkhura, R. Anirudh, Jize Zhang, Yi Zhou, Yingbin Liang, T. Y. Han, P. Varshney
{"title":"MR-GAN: Manifold Regularized Generative Adversarial Networks for Scientific Data","authors":"Qunwei Li, B. Kailkhura, R. Anirudh, Jize Zhang, Yi Zhou, Yingbin Liang, T. Y. Han, P. Varshney","doi":"10.1137/20m1344299","DOIUrl":"https://doi.org/10.1137/20m1344299","url":null,"abstract":"","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"15 1","pages":"1197-1222"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75872271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
An Optimal Algorithm for Strict Circular Seriation
SIAM journal on mathematics of data science Pub Date : 2021-01-01 DOI: 10.1137/21m139356x
Santiago Armstrong, Cristóbal Guzmán, C. Sing-Long
{"title":"An Optimal Algorithm for Strict Circular Seriation","authors":"Santiago Armstrong, Crist'obal Guzm'an, C. Sing-Long","doi":"10.1137/21m139356x","DOIUrl":"https://doi.org/10.1137/21m139356x","url":null,"abstract":"We study the problem of circular seriation, where we are given a matrix of pairwise dissimilarities between $n$ objects, and the goal is to find a {em circular order} of the objects in a manner that is consistent with their dissimilarity. This problem is a generalization of the classical {em linear seriation} problem where the goal is to find a {em linear order}, and for which optimal ${cal O}(n^2)$ algorithms are known. Our contributions can be summarized as follows. First, we introduce {em circular Robinson matrices} as the natural class of dissimilarity matrices for the circular seriation problem. Second, for the case of {em strict circular Robinson dissimilarity matrices} we provide an optimal ${cal O}(n^2)$ algorithm for the circular seriation problem. Finally, we propose a statistical model to analyze the well-posedness of the circular seriation problem for large $n$. In particular, we establish ${cal O}(log(n)/n)$ rates on the distance between any circular ordering found by solving the circular seriation problem to the underlying order of the model, in the Kendall-tau metric.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"25 1","pages":"1223-1250"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84685942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
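The paper's optimal $\mathcal{O}(n^2)$ algorithm is not reproduced here. As a quick illustration of the problem setup only, the sketch below takes a circular Robinson dissimilarity observed in shuffled order and recovers a circular order with a common spectral-embedding heuristic (angles in the plane spanned by the two lowest nontrivial Laplacian eigenvectors); this heuristic is a different, non-optimal approach.

```python
# Illustration of circular seriation: recover a circular order of shuffled objects
# from a circular Robinson dissimilarity via a 2D Laplacian eigenmap heuristic.
import numpy as np

def circular_order_spectral(D):
    S = D.max() - D                             # turn dissimilarity into similarity
    L = np.diag(S.sum(axis=1)) - S              # graph Laplacian of the similarity
    vals, vecs = np.linalg.eigh(L)
    u, v = vecs[:, 1], vecs[:, 2]               # two lowest nontrivial eigenvectors
    return np.argsort(np.arctan2(v, u))         # read objects off by angle

# Ground truth: n points on a circle, dissimilarity = arc distance (circular Robinson).
n = 12
theta = 2 * np.pi * np.arange(n) / n
arc = np.abs(theta[:, None] - theta[None, :])
D_true = np.minimum(arc, 2 * np.pi - arc)

rng = np.random.default_rng(0)
perm = rng.permutation(n)                       # hide the true order
D_obs = D_true[np.ix_(perm, perm)]
order = circular_order_spectral(D_obs)
# Up to rotation and reflection, the labels should come out circularly consecutive.
print("recovered circular order (original labels):", perm[order])
```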
k-Variance: A Clustered Notion of Variance
SIAM journal on mathematics of data science Pub Date : 2020-12-13 DOI: 10.1137/20m1385895
J. Solomon, Kristjan H. Greenewald, H. Nagaraja
{"title":"k-Variance: A Clustered Notion of Variance","authors":"J. Solomon, Kristjan H. Greenewald, H. Nagaraja","doi":"10.1137/20m1385895","DOIUrl":"https://doi.org/10.1137/20m1385895","url":null,"abstract":"We introduce $k$-variance, a generalization of variance built on the machinery of random bipartite matchings. $K$-variance measures the expected cost of matching two sets of $k$ samples from a distribution to each other, capturing local rather than global information about a measure as $k$ increases; it is easily approximated stochastically using sampling and linear programming. In addition to defining $k$-variance and proving its basic properties, we provide in-depth analysis of this quantity in several key cases, including one-dimensional measures, clustered measures, and measures concentrated on low-dimensional subsets of $mathbb R^n$. We conclude with experiments and open problems motivated by this new way to summarize distributional shape.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"59 1","pages":"957-978"},"PeriodicalIF":0.0,"publicationDate":"2020-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90851468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
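As the abstract notes, the quantity is easily approximated by sampling and linear programming: draw two independent batches of $k$ samples and solve the optimal bipartite matching between them. The Monte Carlo sketch below does exactly that; the Euclidean ground cost and the per-point averaging are illustrative normalization choices and may differ from the paper's exact definition.

```python
# Monte Carlo estimate of the expected cost of optimally matching two independent
# sets of k samples from the same distribution (the quantity behind k-variance).
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def matching_cost(sample, k, n_trials=200, seed=None):
    rng = np.random.default_rng(seed)
    costs = []
    for _ in range(n_trials):
        A = sample(k, rng)                       # first batch of k samples
        B = sample(k, rng)                       # second, independent batch
        C = cdist(A, B)                          # pairwise Euclidean matching costs
        rows, cols = linear_sum_assignment(C)    # optimal bipartite matching (LP/assignment)
        costs.append(C[rows, cols].mean())       # average per-point matching cost
    return float(np.mean(costs))

# Example: a standard Gaussian in R^2; larger k probes increasingly local structure,
# and the per-point matching cost shrinks accordingly.
gaussian = lambda k, rng: rng.standard_normal((k, 2))
for k in (1, 4, 16, 64):
    print(f"k={k:3d}  expected matching cost ~ {matching_cost(gaussian, k, seed=0):.3f}")
```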