Foundations of data science (Springfield, Mo.): latest articles

Random Walks and Markov Chains
Foundations of data science (Springfield, Mo.) Pub Date : 2020-01-01 DOI: 10.1017/9781108755528.004
Avrim Blum, J. Hopcroft, R. Kannan
Citations: 1
Stability of non-linear filter for deterministic dynamics
Foundations of data science (Springfield, Mo.) Pub Date : 2019-10-31 DOI: 10.3934/fods.2021025
A. Reddy, A. Apte
Abstract: This paper shows that the nonlinear filter in the case of deterministic dynamics is stable with respect to the initial conditions, provided that the observations are sufficiently rich, in the context of both continuous- and discrete-time filters. Earlier works on the stability of nonlinear filters are in the context of stochastic dynamics and assume conditions such as a compact state space or a time-independent observation model, whereas we prove filter stability for deterministic dynamics with more general assumptions on the state space and observation process. We give several examples of systems that satisfy these assumptions. We also show that the asymptotic structure of the filtering distribution is related to the dynamical properties of the signal.
Citations: 3
A Bayesian nonparametric test for conditional independence
Foundations of data science (Springfield, Mo.) Pub Date : 2019-10-24 DOI: 10.3934/FODS.2020009
Onur Teymur, S. Filippi
Abstract: This article introduces a Bayesian nonparametric method for quantifying the relative evidence in a dataset in favour of the dependence or independence of two variables conditional on a third. The approach uses Polya tree priors on spaces of conditional probability densities, accounting for uncertainty in the form of the underlying distributions in a nonparametric way. The Bayesian perspective provides an inherently symmetric probability measure of conditional dependence or independence, a feature particularly advantageous in causal discovery and not employed in existing procedures of this type.
Citations: 2
Modelling dynamic network evolution as a Pitman-Yor process
Foundations of data science (Springfield, Mo.) Pub Date : 2019-08-28 DOI: 10.3934/fods.2019013
Francesco Sanna Passino, N. Heard
Abstract: Dynamic interaction networks frequently arise in biology, communications technology and the social sciences, representing, for example, neuronal connectivity in the brain, internet connections between computers and human interactions within social networks. The evolution and strengthening of the links in such networks can be observed through sequences of connection events occurring between network nodes over time. In some of these applications, the identity and size of the network may be unknown a priori and may change over time. In this article, a model for the evolution of dynamic networks based on the Pitman-Yor process is proposed. This model explicitly admits power-laws in the number of connections on each edge, often present in real world networks, and, for careful choices of the parameters, power-laws for the degree distribution of the nodes. A novel empirical method for the estimation of the hyperparameters of the Pitman-Yor process is proposed, and some necessary corrections for uniform discrete base distributions are carefully addressed. The methodology is tested on synthetic data and in an anomaly detection study on the enterprise computer network of the Los Alamos National Laboratory, and successfully detects connections from a red-team penetration test.
Citations: 3
Bayesian inference for latent chain graphs
Foundations of data science (Springfield, Mo.) Pub Date : 2019-08-12 DOI: 10.3934/fods.2020003
Deng Lu, M. Iorio, A. Jasra, G. Rosner
Abstract: In this article we consider Bayesian inference for partially observed Andersson-Madigan-Perlman (AMP) Gaussian chain graph (CG) models. Such models are of particular interest in applications such as biological networks and financial time series. The model itself features a variety of constraints which make both prior modeling and computational inference challenging. We develop a framework for the aforementioned challenges, using a sequential Monte Carlo (SMC) method for statistical inference. Our approach is illustrated on simulated data as well as real case studies from university graduation rates and a pharmacokinetics study.
Citations: 1
EmT: Locating empty territories of homology group generators in a dataset
Foundations of data science (Springfield, Mo.) Pub Date : 2019-06-03 DOI: 10.3934/FODS.2019010
Xin Xu, J. Cisewski-Kehe
Abstract: Persistent homology is a tool within topological data analysis to detect different dimensional holes in a dataset. The boundaries of the empty territories (i.e., holes) are not well-defined and each has multiple representations. The proposed method, Empty Territory (EmT), provides representations of different dimensional holes with a specified level of complexity of the territory boundary. EmT is designed for the setting where persistent homology uses a Vietoris-Rips complex filtration, and works as a post-analysis to refine the hole representation of the persistent homology algorithm. In particular, EmT uses alpha shapes to obtain a special class of representations that captures the empty territories with a complexity determined by the size of the alpha balls. With a fixed complexity, EmT returns the representation that contains the most points within the special class of representations. This method is limited to finding 1D holes in 2D data and 2D holes in 3D data, and is illustrated on simulation datasets of a homogeneous Poisson point process in 2D and a uniform sampling in 3D. Furthermore, the method is applied to a 2D cell tower location geography dataset and 3D Sloan Digital Sky Survey (SDSS) galaxy dataset, where it works well in capturing the empty territories.
Citations: 0
Levels and trends in the sex ratio at birth and missing female births for 29 states and union territories in India 1990–2016: A Bayesian modeling study
Foundations of data science (Springfield, Mo.) Pub Date : 2019-06-03 DOI: 10.3934/FODS.2019008
Fengqing Chao, A. Yadav
Abstract: The sex ratio at birth (SRB) has risen in India since the 1970s, reaching well beyond the levels seen under normal circumstances. The lasting imbalanced SRB has resulted in many more males than females in India. A population with a severely distorted sex ratio is more likely to have a prolonged struggle for stability and sustainability. It is crucial to estimate SRB and its imbalance for India at the state level and to assess the uncertainty around the estimates. We develop a Bayesian model to estimate SRB in India from 1990 to 2016 for 29 states and union territories. Our analyses are based on a comprehensive database on state-level SRB with data from the sample registration system, census and Demographic and Health Surveys. The SRB varies greatly across Indian states and union territories in 2016: ranging from 1.026 (95% uncertainty interval [0.971; 1.087]) in Mizoram to 1.181 [1.143; 1.128] in Haryana. We identify 18 states and union territories with imbalanced SRB during 1990–2016, resulting in 14.9 [13.2; 16.5] million missing female births in India. Uttar Pradesh has the largest share of the missing female births among all states and union territories, accounting for 32.8% [29.5%; 36.3%] of the total number.
Citations: 9
Power weighted shortest paths for clustering Euclidean data
Foundations of data science (Springfield, Mo.) Pub Date : 2019-05-30 DOI: 10.3934/fods.2019014
Daniel Mckenzie, S. Damelin
Abstract: We study the use of power weighted shortest path distance functions for clustering high dimensional Euclidean data, under the assumption that the data is drawn from a collection of disjoint low dimensional manifolds. We argue, theoretically and experimentally, that this leads to higher clustering accuracy. We also present a fast algorithm for computing these distances.
Citations: 16
General risk measures for robust machine learning
Foundations of data science (Springfield, Mo.) Pub Date : 2019-04-26 DOI: 10.3934/fods.2019011
É. Chouzenoux, Henri Gérard, J. Pesquet
Abstract: A wide array of machine learning problems are formulated as the minimization of the expectation of a convex loss function on some parameter space. Since the probability distribution of the data of interest is usually unknown, it is often estimated from training sets, which may lead to poor out-of-sample performance. In this work, we bring new insights into this problem by using the framework which has been developed in quantitative finance for risk measures. We show that the original min-max problem can be recast as a convex minimization problem under suitable assumptions. We discuss several important examples of robust formulations, in particular by defining ambiguity sets based on $\varphi$-divergences and the Wasserstein metric. We also propose an efficient algorithm for solving the corresponding convex optimization problems involving complex convex constraints. Through simulation examples, we demonstrate that this algorithm scales well on real data sets.
Citations: 6
Estimation and uncertainty quantification for the output from quantum simulators
Foundations of data science (Springfield, Mo.) Pub Date : 2019-03-07 DOI: 10.3934/FODS.2019007
R. Bennink, A. Jasra, K. Law, P. Lougovski
Abstract: The problem of estimating certain distributions over $\{0, 1\}^d$ is considered here. The distribution represents a quantum system of $d$ qubits, where there are non-trivial dependencies between the qubits. A maximum entropy approach is adopted to reconstruct the distribution from exact moments or observed empirical moments. The Robbins-Monro algorithm is used to solve the intractable maximum entropy problem, by constructing an unbiased estimator of the un-normalized target with a sequential Monte Carlo sampler at each iteration. In the case of empirical moments, this coincides with a maximum likelihood estimator. A Bayesian formulation is also considered in order to quantify uncertainty a posteriori. Several approaches are proposed in order to tackle this challenging problem, based on recently developed methodologies. In particular, unbiased estimators of the gradient of the log posterior are constructed and used within a provably convergent Langevin-based Markov chain Monte Carlo method. The methods are illustrated on classically simulated output from quantum simulators.
Citations: 2