Journal of data science, statistics, and visualisation: latest articles, page 2

Casting multiple shadows: interactive data visualisation with tours and embeddings

Journal of data science, statistics, and visualisation Pub Date : 2022-05-30 DOI: 10.52933/jdssv.v2i3.21

Stuart Lee, U. Laa, D. Cook

Abstract: Non-linear dimensionality reduction (NLDR) methods such as t-distributed stochastic neighbour embedding (t-SNE) are ubiquitous in the natural sciences; however, the appropriate use of these methods is difficult because of their complex parameterisations, and analysts must make trade-offs in order to identify structure in the visualisation of an NLDR technique. We present visual diagnostics for the pragmatic usage of NLDR methods by combining them with a technique called the tour. A tour is a sequence of interpolated linear projections of multivariate data onto a lower dimensional space. The sequence is displayed as a dynamic visualisation, allowing a user to see the shadows the high-dimensional data casts in a lower dimensional view. By linking the tour to an NLDR view, we can preserve global structure and, through user interactions like linked brushing, observe where the NLDR view may be misleading. We display several case studies, from both simulations and single-cell transcriptomics, that show our approach is useful for cluster orientation tasks. The implementation of our framework is available as an R package called liminal at https://github.com/sa-lee/liminal.
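
The abstract above describes a tour as a sequence of interpolated linear projections. The following is a minimal Python sketch of that idea only, not the liminal implementation: the function names are hypothetical, the data are simulated, and the interpolation is a plain blend followed by re-orthonormalisation, which is an assumption made for brevity rather than the geodesic path that tour software typically uses.

```python
import math
import random

def orthonormalise(a, b):
    """Gram-Schmidt: turn two p-vectors into an orthonormal pair."""
    norm_a = math.sqrt(sum(x * x for x in a))
    a = [x / norm_a for x in a]
    dot = sum(x * y for x, y in zip(a, b))
    b = [y - dot * x for x, y in zip(a, b)]
    norm_b = math.sqrt(sum(y * y for y in b))
    return a, [y / norm_b for y in b]

def random_frame(p, rng):
    """Draw a random orthonormal 2-frame in p dimensions."""
    a = [rng.gauss(0, 1) for _ in range(p)]
    b = [rng.gauss(0, 1) for _ in range(p)]
    return orthonormalise(a, b)

def interpolate(f0, f1, t):
    """Blend two frames and re-orthonormalise (assumption: not a geodesic path)."""
    a = [(1 - t) * x + t * y for x, y in zip(f0[0], f1[0])]
    b = [(1 - t) * x + t * y for x, y in zip(f0[1], f1[1])]
    return orthonormalise(a, b)

def project(data, frame):
    """Project each p-dimensional row onto the 2-D frame."""
    return [tuple(sum(x * v for x, v in zip(row, axis)) for axis in frame)
            for row in data]

rng = random.Random(0)
# Simulated 5-dimensional data standing in for a real dataset.
data = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(100)]
start, end = random_frame(5, rng), random_frame(5, rng)
# Eleven intermediate frames; each one yields a 2-D "shadow" of the data.
frames = [interpolate(start, end, k / 10) for k in range(11)]
views = [project(data, f) for f in frames]
```

Drawing the entries of `views` one after another as scatterplots would give the animated sequence of projections that the abstract refers to.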

Citations: 0

INTEREST: INteractive Tool for Exploring REsults from Simulation sTudies

Journal of data science, statistics, and visualisation Pub Date : 2021-12-31 DOI: 10.52933/jdssv.v1i4.9

Alessandro Gasparini, Tim P Morris, Michael J Crowther

Abstract: Simulation studies allow us to explore the properties of statistical methods. They provide a powerful tool with a multiplicity of aims, among others: evaluating and comparing new or existing statistical methods, assessing violations of modelling assumptions, helping with the understanding of statistical concepts, and supporting the design of clinical trials. The increased availability of powerful computational tools and usable software has contributed to the rise of simulation studies in the current literature. However, simulation studies involve increasingly complex designs, making it difficult to provide all relevant results clearly. Dissemination of results plays a focal role in simulation studies: it can drive applied analysts to use methods that have been shown to perform well in their settings, guide researchers to develop new methods in a promising direction, and provide insights into less established methods. It is crucial that we can digest relevant results of simulation studies. Therefore, we developed INTEREST: an INteractive Tool for Exploring REsults from Simulation sTudies. The tool has been developed using the Shiny framework in R and is available as a web app or as a standalone package. It requires uploading a tidy format dataset with the results of a simulation study in R, Stata, SAS, SPSS, or comma-separated format. A variety of performance measures are estimated automatically along with Monte Carlo standard errors; results and performance summaries are displayed both in tabular and graphical fashion, with a wide variety of available plots. Consequently, the reader can focus on simulation parameters and estimands of most interest. In conclusion, INTEREST can facilitate the investigation of results from simulation studies and supplement the reporting of results, allowing researchers to share detailed results from their simulations and readers to explore them freely.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7612246/pdf/EMS140699.pdf
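
The abstract above mentions performance measures reported together with Monte Carlo standard errors. As an illustration of one such measure, the sketch below computes the bias of an estimator and its Monte Carlo standard error using the standard formulas (mean of the estimates minus the true value, and the sample standard deviation of the estimates divided by the square root of the number of repetitions). The function name and the toy numbers are hypothetical, and this is not code from INTEREST itself.

```python
import math
import statistics

def bias_with_mcse(estimates, true_value):
    """Return (bias, Monte Carlo standard error of the bias)."""
    n_sim = len(estimates)
    bias = statistics.fmean(estimates) - true_value
    # The Monte Carlo SE of the bias uses the empirical SD (n - 1 denominator).
    mcse = statistics.stdev(estimates) / math.sqrt(n_sim)
    return bias, mcse

# Toy point estimates from five simulated repetitions (made-up values).
estimates = [0.9, 1.1, 1.0, 1.2, 0.8]
bias, mcse = bias_with_mcse(estimates, true_value=1.0)
print(f"bias = {bias:.4f}, Monte Carlo SE = {mcse:.4f}")
```

In a real simulation study the list would hold one estimate per repetition for a given method and scenario, which matches the tidy layout the abstract describes.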

Citations: 0

On Generalization and Computation of Tukey's Depth: Part II

Journal of data science, statistics, and visualisation Pub Date : 2021-12-15 DOI: 10.52933/jdssv.v2i2.61

Yiyuan She, Shao Tang, Jingze Liu

Citations: 2

On Generalization and Computation of Tukey's Depth: Part I

Journal of data science, statistics, and visualisation Pub Date : 2021-12-15 DOI: 10.52933/jdssv.v2i1.23

Yiyuan She, S. Tang, Jingze Liu

Citations: 3

Editorial Founding Issue

Journal of data science, statistics, and visualisation Pub Date : 2021-09-30 DOI: 10.52933/jdssv.v1i1.52

S. Aelst, P. Groenen

Abstract: The Journal of Data Science, Statistics, and Visualisation (JDSSV) is an electronic journal which welcomes contributions to data science, statistics, and visualisation, and in particular, those aspects which link and integrate these subject areas. Articles can cover topics such as machine learning and statistical learning, the visualisation and verbalisation of data, visual analytics, big data infrastructures and analytics, interactive learning, and advanced computing. Articles that discuss two or more research areas of the journal are favoured. Scientific contributions should be of a high standard. Articles should be oriented towards a wide scientific audience of statisticians, data scientists, computer scientists, data analysts, etc. The journal welcomes original contributions that are not being considered for publication elsewhere and contain a high level of novelty. Articles with a thorough but concise review of a certain topic with the potential to provide new insights are also welcome. Manuscripts submitted to the journal generally are accompanied by supplementary material containing software code, data, technical derivations or detailed explanations, additional examples, etc. All submitted material will be reviewed by the assigned associate editor and reviewers of the manuscript.

Citations: 0

A Spatial SEIR Model for COVID-19 in South Africa

Journal of data science, statistics, and visualisation Pub Date : 2021-06-09 DOI: 10.20944/PREPRINTS202106.0262.V1

I. Fabris-Rotelli, Jenny P. Holloway, Zaid Kimmie, S. Archibald, P. Debba, Raeesa Manjoo-Docrat, A. Roux, Nontembeko Dudeni-Tlhone, Charl Janse van Rensburg, R. Thiede, N. Abdelatif, Sibusisiwe Makhanya, Arminn Potgieter

Citations: 3

A Review of Containerization for Interactive and Reproducible Analysis

Journal of data science, statistics, and visualisation Pub Date : 2021-03-30 DOI: 10.52933/jdssv.v3i1.53

Gregory J. Hunt, Johann A. Gagnon-Bartsch

Abstract: In recent decades the analysis of data has become increasingly computational. Correspondingly, this has changed how scientific and statistical work is shared. For example, it is now commonplace for underlying analysis code and data to be proffered alongside journal publications and conference talks. Unfortunately, sharing code faces several challenges. First, it is often difficult to take code from one computer and run it on another. Code configuration, version, and dependency issues often make this challenging. Secondly, even if the code runs, it is often hard to understand or interact with the analysis. This makes it difficult to assess the code and its findings, for example, in a peer review process. In this review we describe the combination of two computing technologies that help make analyses shareable, interactive, and completely reproducible. These technologies are (1) analysis containerization, which leverages virtualization to fully encapsulate analysis, data, code and dependencies into an interactive and shareable format, and (2) code notebooks, a literate programming format for interacting with analyses. The fusion of these two technologies offers significant advantages over using either individually. This review surveys how the combination enhances the accessibility and reproducibility of code, analyses, and ideas.

Citations: 0

Robust Model-Based Clustering

Journal of data science, statistics, and visualisation Pub Date : 2021-02-13 DOI: 10.1201/b18358-20

Juan D. González, R. Maronna, V. Yohai, R. Zamar

Citations: 0

Handling Cellwise Outliers by Sparse Regression and Robust Covariance

Journal of data science, statistics, and visualisation Pub Date : 2020-12-07 DOI: 10.52933/jdssv.v1i3.18

Jakob Raymaekers, P. Rousseeuw

Citations: 8

Compressed sensing with a jackknife and a bootstrap

Journal of data science, statistics, and visualisation Pub Date : 2018-09-18 DOI: 10.52933/jdssv.v2i4.43

Aaron Defazio, M. Tygert, Rachel A. Ward, Jure Zbontar

Abstract: Compressed sensing proposes to reconstruct more degrees of freedom in a signal than the number of values actually measured (based on a potentially unjustified regularizer or prior distribution). Compressed sensing therefore risks introducing errors: inserting spurious artifacts or masking the abnormalities that medical imaging seeks to discover. Estimating errors using the standard statistical tools of a jackknife and a bootstrap yields "error bars" in the form of full images that are remarkably qualitatively representative of the actual errors (at least when evaluated and validated on data sets for which the ground truth and hence the actual error is available). These images show the structure of possible errors, without recourse to measuring the entire ground truth directly, and build confidence in regions of the images where the estimated errors are small. Further visualizations and summary statistics can aid in the interpretation of such error estimates. Visualizations include suitable colorizations of the reconstruction, as well as the obvious "correction" of the reconstruction by subtracting off the error estimates. The canonical summary statistic would be the root-mean-square of the error estimates. Unfortunately, colorizations appear likely to be too distracting for actual clinical practice in medical imaging, and the root-mean-square gets swamped by background noise in the error estimates. Fortunately, straightforward displays of the error estimates and of the "corrected" reconstruction are illuminating, and the root-mean-square improves greatly after mild blurring of the error estimates; the blurring is barely perceptible to the human eye yet smooths away background noise that would otherwise overwhelm the root-mean-square.
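
The abstract above relies on the jackknife as a standard error-estimation tool. The sketch below shows the generic leave-one-out jackknife standard error for a scalar statistic, which is the textbook form of the technique. It is an illustration only: the paper applies the idea to image reconstructions, and the function name and sample values here are hypothetical.

```python
import math
import statistics

def jackknife_se(sample, statistic):
    """Leave-one-out jackknife estimate of the standard error of `statistic`."""
    n = len(sample)
    # Recompute the statistic n times, each time omitting one observation.
    leave_one_out = [statistic(sample[:i] + sample[i + 1:]) for i in range(n)]
    centre = statistics.fmean(leave_one_out)
    return math.sqrt((n - 1) / n * sum((v - centre) ** 2 for v in leave_one_out))

# Made-up measurements; for the sample mean the jackknife SE equals sd / sqrt(n).
sample = [2.0, 4.0, 4.0, 5.0, 7.0]
se = jackknife_se(sample, statistics.fmean)
print(f"jackknife SE of the mean = {se:.4f}")
```

Passing a different callable as `statistic` (for example `statistics.median`) reuses the same resampling loop, which is the main reason for writing it as a higher-order function.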

Citations: 4