SIAM journal on mathematics of data science: latest publications

Test Data Reuse for the Evaluation of Continuously Evolving Classification Algorithms Using the Area under the Receiver Operating Characteristic Curve
SIAM journal on mathematics of data science Pub Date : 2021-01-01 DOI: 10.1137/20M1333110
Alexej Gossmann, Aria Pezeshk, Yu-ping Wang, B. Sahiner
{"title":"Test Data Reuse for the Evaluation of Continuously Evolving Classification Algorithms Using the Area under the Receiver Operating Characteristic Curve","authors":"Alexej Gossmann, Aria Pezeshk, Yu-ping Wang, B. Sahiner","doi":"10.1137/20M1333110","DOIUrl":"https://doi.org/10.1137/20M1333110","url":null,"abstract":"","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"43 1","pages":"692-714"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91101484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 6
Error Bounds for Dynamical Spectral Estimation.
SIAM journal on mathematics of data science Pub Date : 2021-01-01 Epub Date: 2021-02-11 DOI: 10.1137/20m1335984
Robert J Webber, Erik H Thiede, Douglas Dow, Aaron R Dinner, Jonathan Weare
{"title":"Error Bounds for Dynamical Spectral Estimation.","authors":"Robert J Webber, Erik H Thiede, Douglas Dow, Aaron R Dinner, Jonathan Weare","doi":"10.1137/20m1335984","DOIUrl":"10.1137/20m1335984","url":null,"abstract":"<p><p>Dynamical spectral estimation is a well-established numerical approach for estimating eigenvalues and eigenfunctions of the Markov transition operator from trajectory data. Although the approach has been widely applied in biomolecular simulations, its error properties remain poorly understood. Here we analyze the error of a dynamical spectral estimation method called \"the variational approach to conformational dynamics\" (VAC). We bound the approximation error and estimation error for VAC estimates. Our analysis establishes VAC's convergence properties and suggests new strategies for tuning VAC to improve accuracy.</p>","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"3 1","pages":"225-252"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8336423/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39281319","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
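VAC-style estimates are computed by solving a generalized eigenvalue problem built from instantaneous and time-lagged correlation matrices of user-chosen basis functions evaluated along a trajectory. The sketch below is a minimal illustration of that construction on a toy double-well diffusion; the Gaussian-bump basis, lag time, and simulation parameters are illustrative assumptions, not the paper's.

```python
# Minimal sketch (not the paper's code) of VAC-style dynamical spectral estimation:
# transition-operator eigenvalues are estimated from one trajectory via the
# generalized eigenproblem  Ct v = lambda C0 v  built from correlation matrices.
import numpy as np
from scipy.linalg import eigh

def vac_eigenvalues(traj_features, lag):
    """traj_features: (T, n) array of basis functions evaluated along a trajectory."""
    X0, Xt = traj_features[:-lag], traj_features[lag:]
    C0 = X0.T @ X0 / len(X0)                          # instantaneous correlations
    Ct = 0.5 * (X0.T @ Xt + Xt.T @ X0) / len(X0)      # symmetrized lagged correlations
    C0 += 1e-10 * np.eye(len(C0))                     # tiny ridge for numerical safety
    evals, evecs = eigh(Ct, C0)                       # generalized symmetric eigenproblem
    order = np.argsort(evals)[::-1]                   # largest eigenvalue (~1) first
    return evals[order], evecs[:, order]

# Toy data: an overdamped double-well diffusion sampled by Euler-Maruyama,
# with Gaussian bumps as the basis functions.
rng = np.random.default_rng(0)
x, xs = 0.0, []
for _ in range(20000):
    x += -4.0 * x * (x**2 - 1.0) * 1e-3 + np.sqrt(2e-3) * rng.standard_normal()
    xs.append(x)
xs = np.array(xs)
centers = np.linspace(-1.5, 1.5, 8)
feats = np.exp(-(xs[:, None] - centers[None, :])**2 / 0.1)
evals, _ = vac_eigenvalues(feats, lag=100)
print("leading VAC eigenvalue estimates:", np.round(evals[:3], 3))
```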
Global Minima of Overparameterized Neural Networks
SIAM journal on mathematics of data science Pub Date : 2021-01-01 DOI: 10.1137/19M1308943
Y. Cooper
{"title":"Global Minima of Overparameterized Neural Networks","authors":"Y. Cooper","doi":"10.1137/19M1308943","DOIUrl":"https://doi.org/10.1137/19M1308943","url":null,"abstract":"","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"3 2","pages":"676-691"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72476727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 20
Approximation Properties of Ridge Functions and Extreme Learning Machines
SIAM journal on mathematics of data science Pub Date : 2021-01-01 DOI: 10.1137/20m1356348
P. Jorgensen, D. Stewart
{"title":"Approximation Properties of Ridge Functions and Extreme Learning Machines","authors":"P. Jorgensen, D. Stewart","doi":"10.1137/20m1356348","DOIUrl":"https://doi.org/10.1137/20m1356348","url":null,"abstract":"For a compact set $Dsubsetmathbb{R}^{m}$ we consider the problem of approximating a function $f$ over $D$ by sums of ridge functions ${x}mapstovarphi({w}^{T}{x})$ with ${w}$ in a given set $ma...","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"100 1","pages":"815-832"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75679201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
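An extreme learning machine is one concrete way to fit a sum of ridge functions: the inner directions $w$ and biases are drawn at random, and only the outer coefficients are fit by least squares. The sketch below is a generic minimal ELM; the activation, number of hidden units, and target function are illustrative assumptions rather than choices taken from the paper.

```python
# Minimal extreme learning machine: a random sum of ridge functions phi(w^T x + b)
# with outer coefficients fit by ordinary least squares.
import numpy as np

def elm_fit(X, y, n_hidden=200, phi=np.tanh, seed=None):
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))   # random ridge directions w
    b = rng.standard_normal(n_hidden)                 # random biases
    H = phi(X @ W + b)                                # hidden ridge-function features
    coef, *_ = np.linalg.lstsq(H, y, rcond=None)      # least-squares outer weights
    return W, b, coef

def elm_predict(X, W, b, coef, phi=np.tanh):
    return phi(X @ W + b) @ coef

# Approximate f(x1, x2) = sin(3 x1) * cos(2 x2) on [0, 1]^2.
rng = np.random.default_rng(1)
X = rng.uniform(size=(1000, 2))
y = np.sin(3 * X[:, 0]) * np.cos(2 * X[:, 1])
W, b, coef = elm_fit(X, y, seed=2)
Xt = rng.uniform(size=(200, 2))
err = np.abs(elm_predict(Xt, W, b, coef) - np.sin(3 * Xt[:, 0]) * np.cos(2 * Xt[:, 1]))
print("max test error:", err.max())
```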
Nonbacktracking Eigenvalues under Node Removal: X-Centrality and Targeted Immunization
SIAM journal on mathematics of data science Pub Date : 2021-01-01 DOI: 10.1137/20M1352132
Leonardo A. B. Tôrres, Kevin S. Chan, Hanghang Tong, T. Eliassi-Rad
{"title":"Nonbacktracking Eigenvalues under Node Removal: X-Centrality and Targeted Immunization","authors":"Leonardo A. B. Tôrres, Kevin S. Chan, Hanghang Tong, T. Eliassi-Rad","doi":"10.1137/20M1352132","DOIUrl":"https://doi.org/10.1137/20M1352132","url":null,"abstract":". The non-backtracking matrix and its eigenvalues have many applications in network science and 5 graph mining, such as node and edge centrality, community detection, length spectrum theory, 6 graph distance, and epidemic and percolation thresholds. In network epidemiology, the reciprocal 7 of the largest eigenvalue of the non-backtracking matrix is a good approximation for the epidemic 8 threshold of certain network dynamics. In this work, we develop techniques that identify which 9 nodes have the largest impact on this leading eigenvalue. We do so by studying the behavior of 10 the spectrum of the non-backtracking matrix after a node is removed from the graph. From this 11 analysis we derive two new centrality measures: X -degree and X-non-backtracking centrality . We 12 perform extensive experimentation with targeted immunization strategies derived from these two 13 centrality measures. Our spectral analysis and centrality measures can be broadly applied, and will 14 be of interest to both theorists and practitioners alike. the perturbation of quadratic eigenvalue problems, with applications to the NB- eigenvalues of the stochastic block","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"31 1","pages":"656-675"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85132979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 23
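The quantity at the heart of this paper is how much the leading non-backtracking eigenvalue drops when a node is removed; X-degree and X-non-backtracking centrality are designed to predict that drop without recomputation. The brute-force sketch below only illustrates the underlying quantity on a small toy graph, building the non-backtracking matrix explicitly and recomputing its leading eigenvalue after each removal; it is not the paper's centrality computation.

```python
# Brute-force illustration: rank nodes by the drop in the leading non-backtracking
# eigenvalue caused by removing them from a small undirected graph.
import numpy as np

def nb_leading_eigenvalue(edges, removed=frozenset()):
    darts = []
    for u, v in edges:
        if u in removed or v in removed:
            continue
        darts += [(u, v), (v, u)]                     # directed half-edges
    if not darts:
        return 0.0
    index = {d: i for i, d in enumerate(darts)}
    B = np.zeros((len(darts), len(darts)))
    for (u, v) in darts:
        for (x, w) in darts:
            if x == v and w != u:                     # (u->v) then (v->w), no backtrack
                B[index[(u, v)], index[(x, w)]] = 1.0
    # For a nonnegative matrix the spectral radius is a real eigenvalue.
    return max(np.linalg.eigvals(B).real)

# Toy graph: a 4-clique with a pendant path attached at node 3.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (5, 6)]
base = nb_leading_eigenvalue(edges)
nodes = sorted({u for e in edges for u in e})
drops = {v: base - nb_leading_eigenvalue(edges, removed={v}) for v in nodes}
for v, d in sorted(drops.items(), key=lambda kv: -kv[1]):
    print(f"node {v}: eigenvalue drop {d:.3f}")
```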
Spectral neighbor joining for reconstruction of latent tree models.
SIAM journal on mathematics of data science Pub Date : 2021-01-01 Epub Date: 2021-02-01 DOI: 10.1137/20m1365715
Ariel Jaffe, Noah Amsel, Yariv Aizenbud, Boaz Nadler, Joseph T Chang, Yuval Kluger
{"title":"Spectral neighbor joining for reconstruction of latent tree Models.","authors":"Ariel Jaffe,&nbsp;Noah Amsel,&nbsp;Yariv Aizenbud,&nbsp;Boaz Nadler,&nbsp;Joseph T Chang,&nbsp;Yuval Kluger","doi":"10.1137/20m1365715","DOIUrl":"https://doi.org/10.1137/20m1365715","url":null,"abstract":"<p><p>A common assumption in multiple scientific applications is that the distribution of observed data can be modeled by a latent tree graphical model. An important example is phylogenetics, where the tree models the evolutionary lineages of a set of observed organisms. Given a set of independent realizations of the random variables at the leaves of the tree, a key challenge is to infer the underlying tree topology. In this work we develop Spectral Neighbor Joining (SNJ), a novel method to recover the structure of latent tree graphical models. Given a matrix that contains a measure of similarity between all pairs of observed variables, SNJ computes a spectral measure of cohesion between groups of observed variables. We prove that SNJ is consistent, and derive a sufficient condition for correct tree recovery from an estimated similarity matrix. Combining this condition with a concentration of measure result on the similarity matrix, we bound the number of samples required to recover the tree with high probability. We illustrate via extensive simulations that in comparison to several other reconstruction methods, SNJ requires fewer samples to accurately recover trees with a large number of leaves or long edges.</p>","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"3 1","pages":"113-141"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8194222/pdf/nihms-1702804.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39091867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
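The sketch below illustrates one reading of the "spectral measure of cohesion" the abstract refers to: in latent tree models with multiplicative similarities, the similarity block between a clade and its complement is approximately rank one, so the ratio of its second to first singular values scores how well a group of leaves hangs together as a subtree. This is an illustrative paraphrase of the idea under that rank-one assumption, not the paper's full SNJ reconstruction algorithm, and the toy tree and score definition are my own choices.

```python
# Spectral cohesion score for a candidate clade: second-to-first singular value
# ratio of the off-diagonal similarity block S[A, A^c] (near zero for a clade).
import numpy as np

def cohesion_score(S, subset):
    mask = np.zeros(S.shape[0], dtype=bool)
    mask[np.asarray(subset)] = True
    block = S[np.ix_(mask, ~mask)]            # similarity block between A and its complement
    s = np.linalg.svd(block, compute_uv=False)
    return s[1] / s[0] if len(s) > 1 else 0.0

# Toy tree: leaves 0,1,2 hang off one internal node, leaves 3,4,5 off another,
# all edges of length 1, so within-group distance 2 and across-group distance 3.
D = np.array([[0, 2, 2, 3, 3, 3],
              [2, 0, 2, 3, 3, 3],
              [2, 2, 0, 3, 3, 3],
              [3, 3, 3, 0, 2, 2],
              [3, 3, 3, 2, 0, 2],
              [3, 3, 3, 2, 2, 0]], float)
S = np.exp(-D)                                # multiplicative similarity along the tree
print("clade {0,1,2} score:    ", round(cohesion_score(S, [0, 1, 2]), 4))  # ~0 (rank-one block)
print("non-clade {0,1,3} score:", round(cohesion_score(S, [0, 1, 3]), 4))  # clearly > 0
```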
Doubly Stochastic Normalization of the Gaussian Kernel Is Robust to Heteroskedastic Noise.
SIAM journal on mathematics of data science Pub Date : 2021-01-01 Epub Date: 2021-03-23 DOI: 10.1137/20M1342124
Boris Landa, Ronald R Coifman, Yuval Kluger
{"title":"Doubly Stochastic Normalization of the Gaussian Kernel Is Robust to Heteroskedastic Noise.","authors":"Boris Landa,&nbsp;Ronald R Coifman,&nbsp;Yuval Kluger","doi":"10.1137/20M1342124","DOIUrl":"https://doi.org/10.1137/20M1342124","url":null,"abstract":"<p><p>A fundamental step in many data-analysis techniques is the construction of an affinity matrix describing similarities between data points. When the data points reside in Euclidean space, a widespread approach is to from an affinity matrix by the Gaussian kernel with pairwise distances, and to follow with a certain normalization (e.g. the row-stochastic normalization or its symmetric variant). We demonstrate that the doubly-stochastic normalization of the Gaussian kernel with zero main diagonal (i.e., no self loops) is robust to heteroskedastic noise. That is, the doubly-stochastic normalization is advantageous in that it automatically accounts for observations with different noise variances. Specifically, we prove that in a suitable high-dimensional setting where heteroskedastic noise does not concentrate too much in any particular direction in space, the resulting (doubly-stochastic) noisy affinity matrix converges to its clean counterpart with rate <i>m</i> <sup>-1/2</sup>, where <i>m</i> is the ambient dimension. We demonstrate this result numerically, and show that in contrast, the popular row-stochastic and symmetric normalizations behave unfavorably under heteroskedastic noise. Furthermore, we provide examples of simulated and experimental single-cell RNA sequence data with intrinsic heteroskedasticity, where the advantage of the doubly-stochastic normalization for exploratory analysis is evident.</p>","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"3 1","pages":"388-413"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8194191/pdf/nihms-1702812.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39091868","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 17
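The object studied in the paper is the doubly stochastic normalization of a zero-diagonal Gaussian kernel, i.e., a symmetric scaling $D K D$ whose rows and columns all sum to one. The sketch below computes it with a standard symmetric Sinkhorn-style iteration; the solver, kernel bandwidth, and heteroskedastic toy data are illustrative choices, and the paper analyzes the resulting affinity matrix rather than any particular solver.

```python
# Doubly stochastic normalization of a Gaussian kernel with zero main diagonal,
# computed by a damped symmetric Sinkhorn-style fixed-point iteration.
import numpy as np
from scipy.spatial.distance import pdist, squareform

def doubly_stochastic_affinity(X, eps, n_iter=500, tol=1e-8):
    K = np.exp(-squareform(pdist(X, "sqeuclidean")) / eps)
    np.fill_diagonal(K, 0.0)                      # zero main diagonal (no self loops)
    d = np.ones(len(K))
    for _ in range(n_iter):
        d = np.sqrt(d / (K @ d))                  # enforce diag(d) K diag(d) row sums = 1
        W = d[:, None] * K * d[None, :]
        if np.abs(W.sum(axis=1) - 1.0).max() < tol:
            break
    return W

rng = np.random.default_rng(0)
# Heteroskedastic noise: the second half of the points is noisier than the first.
clean = rng.standard_normal((200, 50))
scale = np.r_[0.2 * np.ones(100), 1.0 * np.ones(100)]
X = clean + scale[:, None] * rng.standard_normal((200, 50))
W = doubly_stochastic_affinity(X, eps=100.0)
print("row sums:", np.round(W.sum(axis=1)[:3], 6), "col sums:", np.round(W.sum(axis=0)[:3], 6))
```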
MR-GAN: Manifold Regularized Generative Adversarial Networks for Scientific Data
SIAM journal on mathematics of data science Pub Date : 2021-01-01 DOI: 10.1137/20m1344299
Qunwei Li, B. Kailkhura, R. Anirudh, Jize Zhang, Yi Zhou, Yingbin Liang, T. Y. Han, P. Varshney
{"title":"MR-GAN: Manifold Regularized Generative Adversarial Networks for Scientific Data","authors":"Qunwei Li, B. Kailkhura, R. Anirudh, Jize Zhang, Yi Zhou, Yingbin Liang, T. Y. Han, P. Varshney","doi":"10.1137/20m1344299","DOIUrl":"https://doi.org/10.1137/20m1344299","url":null,"abstract":"","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"15 1","pages":"1197-1222"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75872271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
An Optimal Algorithm for Strict Circular Seriation
SIAM journal on mathematics of data science Pub Date : 2021-01-01 DOI: 10.1137/21m139356x
Santiago Armstrong, Cristóbal Guzmán, C. Sing-Long
{"title":"An Optimal Algorithm for Strict Circular Seriation","authors":"Santiago Armstrong, Crist'obal Guzm'an, C. Sing-Long","doi":"10.1137/21m139356x","DOIUrl":"https://doi.org/10.1137/21m139356x","url":null,"abstract":"We study the problem of circular seriation, where we are given a matrix of pairwise dissimilarities between $n$ objects, and the goal is to find a {em circular order} of the objects in a manner that is consistent with their dissimilarity. This problem is a generalization of the classical {em linear seriation} problem where the goal is to find a {em linear order}, and for which optimal ${cal O}(n^2)$ algorithms are known. Our contributions can be summarized as follows. First, we introduce {em circular Robinson matrices} as the natural class of dissimilarity matrices for the circular seriation problem. Second, for the case of {em strict circular Robinson dissimilarity matrices} we provide an optimal ${cal O}(n^2)$ algorithm for the circular seriation problem. Finally, we propose a statistical model to analyze the well-posedness of the circular seriation problem for large $n$. In particular, we establish ${cal O}(log(n)/n)$ rates on the distance between any circular ordering found by solving the circular seriation problem to the underlying order of the model, in the Kendall-tau metric.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"25 1","pages":"1223-1250"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84685942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
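The paper's optimal $\mathcal{O}(n^2)$ algorithm is not reproduced here. As a quick illustration of the problem setup only, the sketch below takes a circular Robinson dissimilarity observed in shuffled order and recovers a circular order with a common spectral-embedding heuristic (angles in the plane spanned by the two lowest nontrivial Laplacian eigenvectors); this heuristic is a different, non-optimal approach.

```python
# Illustration of circular seriation: recover a circular order of shuffled objects
# from a circular Robinson dissimilarity via a 2D Laplacian eigenmap heuristic.
import numpy as np

def circular_order_spectral(D):
    S = D.max() - D                             # turn dissimilarity into similarity
    L = np.diag(S.sum(axis=1)) - S              # graph Laplacian of the similarity
    vals, vecs = np.linalg.eigh(L)
    u, v = vecs[:, 1], vecs[:, 2]               # two lowest nontrivial eigenvectors
    return np.argsort(np.arctan2(v, u))         # read objects off by angle

# Ground truth: n points on a circle, dissimilarity = arc distance (circular Robinson).
n = 12
theta = 2 * np.pi * np.arange(n) / n
arc = np.abs(theta[:, None] - theta[None, :])
D_true = np.minimum(arc, 2 * np.pi - arc)

rng = np.random.default_rng(0)
perm = rng.permutation(n)                       # hide the true order
D_obs = D_true[np.ix_(perm, perm)]
order = circular_order_spectral(D_obs)
# Up to rotation and reflection, the labels should come out circularly consecutive.
print("recovered circular order (original labels):", perm[order])
```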
k-Variance: A Clustered Notion of Variance
SIAM journal on mathematics of data science Pub Date : 2020-12-13 DOI: 10.1137/20m1385895
J. Solomon, Kristjan H. Greenewald, H. Nagaraja
{"title":"k-Variance: A Clustered Notion of Variance","authors":"J. Solomon, Kristjan H. Greenewald, H. Nagaraja","doi":"10.1137/20m1385895","DOIUrl":"https://doi.org/10.1137/20m1385895","url":null,"abstract":"We introduce $k$-variance, a generalization of variance built on the machinery of random bipartite matchings. $K$-variance measures the expected cost of matching two sets of $k$ samples from a distribution to each other, capturing local rather than global information about a measure as $k$ increases; it is easily approximated stochastically using sampling and linear programming. In addition to defining $k$-variance and proving its basic properties, we provide in-depth analysis of this quantity in several key cases, including one-dimensional measures, clustered measures, and measures concentrated on low-dimensional subsets of $mathbb R^n$. We conclude with experiments and open problems motivated by this new way to summarize distributional shape.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"59 1","pages":"957-978"},"PeriodicalIF":0.0,"publicationDate":"2020-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90851468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
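As the abstract notes, the quantity is easily approximated by sampling and linear programming: draw two independent batches of $k$ samples and solve the optimal bipartite matching between them. The Monte Carlo sketch below does exactly that; the Euclidean ground cost and the per-point averaging are illustrative normalization choices and may differ from the paper's exact definition.

```python
# Monte Carlo estimate of the expected cost of optimally matching two independent
# sets of k samples from the same distribution (the quantity behind k-variance).
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def matching_cost(sample, k, n_trials=200, seed=None):
    rng = np.random.default_rng(seed)
    costs = []
    for _ in range(n_trials):
        A = sample(k, rng)                       # first batch of k samples
        B = sample(k, rng)                       # second, independent batch
        C = cdist(A, B)                          # pairwise Euclidean matching costs
        rows, cols = linear_sum_assignment(C)    # optimal bipartite matching (LP/assignment)
        costs.append(C[rows, cols].mean())       # average per-point matching cost
    return float(np.mean(costs))

# Example: a standard Gaussian in R^2; larger k probes increasingly local structure,
# and the per-point matching cost shrinks accordingly.
gaussian = lambda k, rng: rng.standard_normal((k, 2))
for k in (1, 4, 16, 64):
    print(f"k={k:3d}  expected matching cost ~ {matching_cost(gaussian, k, seed=0):.3f}")
```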