arXiv - STAT - Applications: Latest Papers

Privacy risk from synthetic data: practical proposals
arXiv - STAT - Applications Pub Date : 2024-09-06 DOI: arxiv-2409.04257
Gillian M Raab
Abstract: This paper proposes and compares measures of identity and attribute disclosure risk for synthetic data. Data custodians can use the methods proposed here to inform the decision as to whether to release synthetic versions of confidential data. Different measures are evaluated on two data sets. Insight into the measures is obtained by examining the details of the records identified as posing a disclosure risk. This leads to methods to identify, and possibly exclude, apparently risky records where the identification or attribution would be expected by someone with background knowledge of the data. The methods described are available as part of the synthpop package for R.
Citations: 0
Enhancing Electrocardiography Data Classification Confidence: A Robust Gaussian Process Approach (MuyGPs)
arXiv - STAT - Applications Pub Date : 2024-09-06 DOI: arxiv-2409.04642
Ukamaka V. Nnyaba, Hewan M. Shemtaga, David W. Collins, Amanda L. Muyskens, Benjamin W. Priest, Nedret Billor
Abstract: Analyzing electrocardiography (ECG) data is essential for diagnosing and monitoring various heart diseases. The clinical adoption of automated methods requires accurate confidence measurements, which are largely absent from existing classification methods. In this paper, we present a robust Gaussian Process classification hyperparameter training model (MuyGPs) for discerning normal heartbeat signals from the signals affected by different arrhythmias and myocardial infarction. We compare the performance of MuyGPs with a traditional Gaussian process classifier as well as conventional machine learning models, such as Random Forest, Extra Trees, k-Nearest Neighbors and Convolutional Neural Network. Comparing these models reveals MuyGPs as the most performant model for making confident predictions on individual patient ECGs. Furthermore, we explore the posterior distribution obtained from the Gaussian process to interpret the prediction and quantify uncertainty. In addition, we provide a guideline on obtaining the prediction confidence of the machine learning models and quantitatively compare the uncertainty measures of these models. In particular, we identify a class of less-accurate (ambiguous) signals for further diagnosis by an expert.
Citations: 0
Towards Measuring Sell Side Outcomes in Buy Side Marketplace Experiments using In-Experiment Bipartite Graph
arXiv - STAT - Applications Pub Date : 2024-09-06 DOI: arxiv-2409.04174
Vaiva Pilkauskaitė, Jevgenij Gamper, Rasa Giniūnaitė, Agne Reklaitė
Abstract: In this study, we evaluate causal inference estimators for online controlled bipartite graph experiments in a real marketplace setting. Our novel contribution is constructing a bipartite graph using in-experiment data, rather than relying on prior knowledge or historical data, the common approach in the literature published to date. We build the bipartite graph from various interactions between buyers and sellers in the marketplace, establishing a novel research direction at the intersection of bipartite experiments and mediation analysis. This approach is crucial for modern marketplaces aiming to evaluate seller-side causal effects in buyer-side experiments, or vice versa. We demonstrate our method using historical buyer-side experiments conducted at Vinted, the largest second-hand marketplace in Europe with over 80M users.
Citations: 0
Deep Clustering of Remote Sensing Scenes through Heterogeneous Transfer Learning
arXiv - STAT - Applications Pub Date : 2024-09-05 DOI: arxiv-2409.03938
Isaac Ray, Alexei Skurikhin
Abstract: This paper proposes a method for unsupervised whole-image clustering of a target dataset of remote sensing scenes with no labels. The method consists of three main steps: (1) finetuning a pretrained deep neural network (DINOv2) on a labelled source remote sensing imagery dataset and using it to extract a feature vector from each image in the target dataset, (2) reducing the dimension of these deep features via manifold projection into a low-dimensional Euclidean space, and (3) clustering the embedded features using a Bayesian nonparametric technique to infer the number and membership of clusters simultaneously. The method takes advantage of heterogeneous transfer learning to cluster unseen data with different feature and label distributions. We demonstrate that this approach outperforms state-of-the-art zero-shot classification methods on several remote sensing scene classification datasets.
Citations: 0
Causal effect of the infield shift in the MLB
arXiv - STAT - Applications Pub Date : 2024-09-05 DOI: arxiv-2409.03940
Sonia Markes, Linbo Wang, Jessica Gronsbell, Katherine Evans
Abstract: The infield shift has been increasingly used as a defensive strategy in baseball in recent years. Along with the upward trend in its usage, the notoriety of the shift has grown, as it is believed to be responsible for the recent decline in offence. In the 2023 season, Major League Baseball (MLB) implemented a rule change prohibiting the infield shift. However, there has been no systematic analysis of the effectiveness of infield shift to determine if it is a cause of the cooling in offence. We used publicly available data on MLB from 2015-2022 to evaluate the causal effect of the infield shift on the expected runs scored. We employed three methods for drawing causal conclusions from observational data (nearest neighbour matching, inverse probability of treatment weighting, and instrumental variable analysis) and evaluated the causal effect in subgroups defined by batter-handedness. The results of all methods showed the shift is effective at preventing runs, but primarily for left-handed batters.
Citations: 0
A Stochastic Weather Model: A Case of Bono Region of Ghana
arXiv - STAT - Applications Pub Date : 2024-09-04 DOI: arxiv-2409.06731
Bernard Gyamfi
Abstract: The paper sought to fit an Ornstein-Uhlenbeck model with seasonal mean and volatility, where the residuals are generated by a Brownian motion, to Ghanaian daily average temperature. This paper employed the modified Ornstein-Uhlenbeck model proposed by Bhowan, which has a seasonal mean and stochastic volatility process. The findings revealed that the Bono region experiences warm temperatures and maximum precipitation up to 32.67 degrees Celsius and 126.51 mm respectively. It was observed that the Daily Average Temperature (DAT) of the region reverts to a temperature of approximately 26 degrees Celsius at a rate of 18.72%, with maximum and minimum temperatures of 32.67 degrees Celsius and 19.75 degrees Celsius respectively. Although the region is in the middle belt of Ghana, it still experiences warm (hot) temperatures daily and experiences dry seasons relatively more than wet seasons in the number of years considered for our analysis. Our model explained approximately 50% of the variations in the daily average temperature of the region, which can be regarded as a relatively good model. The findings of this paper are relevant in the pricing of weather derivatives with temperature as an underlying variable in the Ghanaian financial and agricultural sector. Furthermore, it would assist in the development and design of tailored agriculture/crop insurance models which would incorporate temperature dynamics rather than extreme weather conditions/events such as floods, drought and wildfires.
Citations: 0
Meal-taking activity monitoring in the elderly based on sensor data: Comparison of unsupervised classification methods
arXiv - STAT - Applications Pub Date : 2024-09-04 DOI: arxiv-2409.02971
Abderrahim Derouiche (LAAS-S4M, UT3), Damien Brulin (LAAS-S4M, UT2J), Eric Campo (LAAS-S4M, UT2J), Antoine Piau
Abstract: In an era marked by a demographic change towards an older population, there is an urgent need to improve nutritional monitoring in view of the increase in frailty. This research aims to enhance the identification of meal-taking activities by combining K-Means, GMM, and DBSCAN techniques. Using the Davies-Bouldin Index (DBI) for the optimal meal taking activity clustering, the results show that K-Means seems to be the best solution, thanks to its unrivalled efficiency in data demarcation, compared with the capabilities of GMM and DBSCAN. Although capable of identifying complex patterns and outliers, the latter methods are limited by their operational complexities and dependence on precise parameter configurations. In this paper, we have processed data from 4 houses equipped with sensors. The findings indicate that applying the K-Means method results in high performance, evidenced by a particularly low Davies-Bouldin Index (DBI), illustrating optimal cluster separation and cohesion. Calculating the average duration of each activity using the GMM algorithm allows distinguishing various categories of meal-taking activities. Alternatively, this can correspond to different times of the day fitting to each meal-taking activity. Using K-Means, GMM, and DBSCAN clustering algorithms, the study demonstrates an effective strategy for thoroughly understanding the data. This approach facilitates the comparison and selection of the most suitable method for optimal meal-taking activity clustering.
Citations: 0
Fundamental properties of linear factor models
arXiv - STAT - Applications Pub Date : 2024-09-04 DOI: arxiv-2409.02521
Damir Filipovic, Paul Schneider
Abstract: We study conditional linear factor models in the context of asset pricing panels. Our analysis focuses on conditional means and covariances to characterize the cross-sectional and inter-temporal properties of returns and factors as well as their interrelationships. We also review the conditions outlined in Kozak and Nagel (2024) and show how the conditional mean-variance efficient portfolio of an unbalanced panel can be spanned by low-dimensional factor portfolios, even without assuming invertibility of the conditional covariance matrices. Our analysis provides a comprehensive foundation for the specification and estimation of conditional linear factor models.
Citations: 0
Neural Networks with LSTM and GRU in Modeling Active Fires in the Amazon
arXiv - STAT - Applications Pub Date : 2024-09-04 DOI: arxiv-2409.02681
Ramon Tavares
Abstract: This study presents a comprehensive methodology for modeling and forecasting the historical time series of fire spots detected by the AQUA_M-T satellite in the Amazon, Brazil. The approach utilizes a mixed Recurrent Neural Network (RNN) model, combining Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) architectures to predict monthly accumulations of daily detected fire spots. A summary of the data revealed a consistent seasonality over time, with annual maximum and minimum fire spot values tending to repeat at the same periods each year. The primary objective is to verify whether the forecasts capture this inherent seasonality through rigorous statistical analysis. The methodology involved careful data preparation, model configuration, and training using cross-validation with two seeds, ensuring that the data generalizes well to the test and validation sets, and confirming the convergence of the model parameters. The results indicate that the mixed LSTM and GRU model offers improved accuracy in forecasting 12 months ahead, demonstrating its effectiveness in capturing complex temporal patterns and modeling the observed time series. This research significantly contributes to the application of deep learning techniques in environmental monitoring, specifically in fire spot forecasting. In addition to improving forecast accuracy, the proposed approach highlights the potential for adaptation to other time series forecasting challenges, opening new avenues for research and development in machine learning and natural phenomenon prediction. Keywords: Time Series Forecasting, Recurrent Neural Networks, Deep Learning.
Citations: 0
Bayesian Dynamic Generalized Additive Model for Mortality during COVID-19 Pandemic
arXiv - STAT - Applications Pub Date : 2024-09-04 DOI: arxiv-2409.02378
Wei Zhang, Antonietta Mira, Ernst C. Wit
Abstract: While COVID-19 has resulted in a significant increase in global mortality rates, the impact of the pandemic on mortality from other causes remains uncertain. To gain insight into the broader effects of COVID-19 on various causes of death, we analyze an Italian dataset that includes monthly mortality counts for different causes from January 2015 to December 2020. Our approach involves a generalized additive model enhanced with correlated random effects. The generalized additive model component effectively captures non-linear relationships between various covariates and mortality rates, while the random effects are multivariate time series observations recorded in various locations, and they embody information on the dependence structure present among geographical locations and different causes of mortality. Adopting a Bayesian framework, we impose suitable priors on the model parameters. For efficient posterior computation, we employ variational inference; specifically, for fixed effect coefficients and random effects, a Gaussian variational approximation is assumed, which streamlines the analysis process. The optimisation is performed using a coordinate ascent variational inference algorithm, and several computational strategies are implemented along the way to address the issues arising from the high dimensional nature of the data, providing accelerated and stabilised parameter estimation and statistical inference.
Citations: 0