{"title":"Bayesian Modeling and Computation in Python","authors":"Shuai Huang","doi":"10.1080/00224065.2022.2041379","DOIUrl":"https://doi.org/10.1080/00224065.2022.2041379","url":null,"abstract":"This book is useful for readers who want to hone their skills in Bayesian modeling and computation. Written by experts in the area of Bayesian software and major contributors to some existing widely used Bayesian computational tools, this book covers not only basic Bayesian probabilistic inference but also a range of models from linear models (and mixed effect models, hierarchical models, splines, etc) to time series models such as the state space model. It also covers the Bayesian additive regression trees. Almost all the concepts and techniques are implemented using PyMC3, Tensorflow Probability (TFP), ArviZ and other libraries. By doing all the modeling, computation, and data analysis, the authors not only show how these things work, but also show how and why things don’t work by emphasis on exploratory data analysis, model comparison, and diagnostics. To learn from the book, readers may need some statistical background such as basic training in statistics and probability theory. Some understanding of Bayesian modeling and inference is also needed, such as the concepts of prior, likelihood, posterior, the bayes’s law, and Monte Carlo sampling. Some experience with Python would also be very beneficial for readers to get started on this journey of Bayesian modeling. The authors suggested a few books as possible preliminaries for their book. I feel that the readers may also benefit from reading Andrew Gelman’s book, Bayesian Data Analysis, Chapman & Hall/CRC, 3rd Edition, 2013. Of course, as the authors pointed it out, this book is not for a Bayesian Reader but a Bayesian practitioner. The book is more of an interactive experience for Bayesian practitioners by learning all the computational tools to model and to negotiate with data for a good modeling practice. On the other hand, if readers have already had experience with real-world data analysis using Python or R or other similar tools, even if this book is their first experience with Bayesian modeling and computation, readers may still learn a lot from this book. There are an abundance of figures and detailed explanations of how things are done and how the results are interpreted. Picking up these details would need some trained sensibility when dealing with real-world data, but aspiring and experienced practitioners should find all the details useful and impressive. And there are also many big picture schematic drawings to help readers connect all the details with overall concepts such as end-to-end workflows. The Figure 9.1 is a remarkable example. Overall, as Kevin Murphy pointed out in the Forward, “this is a valuable addition to the literature, which should hopefully further the adoption of Bayesian methods”. 
I highly recommend that readers interested in learning Bayesian models and their practical applications keep this book on their bookshelf.","PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"30 1","pages":"266 - 266"},"PeriodicalIF":2.5,"publicationDate":"2022-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78872910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
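Since the book's hands-on focus is PyMC3 and ArviZ, a minimal sketch of the kind of model-build, sample, and diagnose loop it teaches is shown below; the synthetic data and the simple normal model are illustrative assumptions, not an example taken from the book.

```python
# Minimal Bayesian workflow sketch with PyMC3 and ArviZ (illustrative only;
# the data and model are assumptions, not an example from the book).
import numpy as np
import pymc3 as pm
import arviz as az

rng = np.random.default_rng(0)
y_obs = rng.normal(loc=2.0, scale=1.5, size=50)  # synthetic data

with pm.Model() as model:
    mu = pm.Normal("mu", mu=0.0, sigma=10.0)       # weakly informative prior
    sigma = pm.HalfNormal("sigma", sigma=5.0)
    y = pm.Normal("y", mu=mu, sigma=sigma, observed=y_obs)
    idata = pm.sample(1000, tune=1000, return_inferencedata=True)

print(az.summary(idata, var_names=["mu", "sigma"]))  # posterior summaries and diagnostics
az.plot_trace(idata)                                 # exploratory check of the chains
```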
{"title":"cpss: an package for change-point detection by sample-splitting methods","authors":"Guanghui Wang, Changliang Zou","doi":"10.1080/00224065.2022.2035284","DOIUrl":"https://doi.org/10.1080/00224065.2022.2035284","url":null,"abstract":"Abstract Change-point detection is a popular statistical method for Phase I analysis in statistical process control. The cpss package has been developed to provide users with multiple choices of change-point searching algorithms for a variety of frequently considered parametric change-point models, including the univariate and multivariate mean and/or (co)variance change models, changes in linear models and generalized linear models, and change models in exponential families. In particular, it integrates the recently proposed COPSS criterion to determine the number of change-points in a data-driven fashion that avoids selecting or specifying additional tuning parameters in existing approaches. Hence it is more convenient to use in practical applications. In addition, the cpss package brings great possibilities to handle user-customized change-point models.","PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"100 1","pages":"61 - 74"},"PeriodicalIF":2.5,"publicationDate":"2022-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85866418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design strategies and approximation methods for high-performance computing variability management","authors":"Yueyao Wang, Li Xu, Yili Hong, Rong Pan, Tyler H. Chang, T. Lux, Jon Bernard, L. Watson, K. Cameron","doi":"10.1080/00224065.2022.2035285","DOIUrl":"https://doi.org/10.1080/00224065.2022.2035285","url":null,"abstract":"Abstract Performance variability management is an active research area in high-performance computing (HPC). In this article, we focus on input/output (I/O) variability, which is a complicated function that is affected by many system factors. To study the performance variability, computer scientists often use grid-based designs (GBDs) which are equivalent to full factorial designs to collect I/O variability data, and use mathematical approximation methods to build a prediction model. Mathematical approximation models, as deterministic methods, could be biased particularly if extrapolations are needed. In statistics literature, space-filling designs (SFDs) and surrogate models such as Gaussian process (GP) are popular for data collection and building predictive models. The applicability of SFDs and surrogates in the HPC variability management setting, however, needs investigation. In this case study, we investigate their applicability in the HPC setting in terms of design efficiency, prediction accuracy, and scalability. We first customize the existing SFDs so that they can be applied in the HPC setting. We conduct a comprehensive investigation of design strategies and the prediction ability of approximation methods. We use both synthetic data simulated from three test functions and the real data from the HPC setting. We then compare different methods in terms of design efficiency, prediction accuracy, and scalability. In our synthetic and real data analysis, GP with SFDs outperforms in most scenarios. With respect to the choice of approximation models, GP is recommended if the data are collected by SFDs. If data are collected using GBDs, both GP and Delaunay can be considered. With the best choice of approximation method, the performance of SFDs and GBD depends on the property of the underlying surface. For the cases in which SFDs perform better, the number of design points needed for SFDs is about half of or less than that of the GBD to achieve the same prediction accuracy. Although we observe that the GBD can also outperform SFDs for smooth underlying surface, GBD is not scalable to high dimensional experimental regions. Therefore, SFDs that can be tailored to high dimension and non-smooth surface are recommended especially when large numbers of input factors need to be considered in the model. This article has online supplementary materials.","PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"39 1","pages":"88 - 103"},"PeriodicalIF":2.5,"publicationDate":"2022-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82735014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Knots and their effect on the tensile strength of lumber: A case study","authors":"Shuxiang Fan, S. Wong, J. Zidek","doi":"10.1080/00224065.2023.2180457","DOIUrl":"https://doi.org/10.1080/00224065.2023.2180457","url":null,"abstract":"Abstract When assessing the strength of sawn lumber for use in engineering applications, the sizes and locations of knots are an important consideration. Knots are the most common visual characteristics of lumber, that result from the growth of tree branches. Large individual knots, as well as clusters of distinct knots, are known to have strength-reducing effects. However, industry grading rules that govern knots are informed by subjective judgment to some extent, particularly the spatial interaction of knots and their relationship with lumber strength. This case study reports the results of an experiment that investigated and modeled the strength-reducing effects of knots on a sample of Douglas Fir lumber. Experimental data were obtained by taking scans of lumber surfaces and applying tensile strength testing. The modeling approach presented incorporates all relevant knot information in a Bayesian framework, thereby contributing a more refined way of managing the quality of manufactured lumber.","PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"8 1","pages":"510 - 522"},"PeriodicalIF":2.5,"publicationDate":"2022-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85147875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mathematical Statistics","authors":"Shuai Huang","doi":"10.1080/00224065.2020.1764418","DOIUrl":"https://doi.org/10.1080/00224065.2020.1764418","url":null,"abstract":"As a sister book of the book Probability by the same author, this book is supposed to be the second course in a mathematical statistics sequence of classes. The readers should have learned calculus and completed a calculusbased course in probability. As with most mathematical statistics textbooks, point estimations, interval estimation, and hypothesis testing are the core concepts. This book is particularly written for students who would have their first exposure to mathematical statistics, so the author carefully selected his materials and had focused on the understanding of statistics such as the sample mean and sample variance being also random variables as well. R is used throughout the text for graphics, computation, and Monte Carlo simulation. The homework is comprehensive. From all these aspects, this book has a similar style as the other book of Probability by the same author. The book’s organization is deceptively simple: it only has four chapters. Chapter 1, almost 100 pages, is about random sampling. Chapter 2, another 100 pages, is about point estimation. Chapter 3, 135 pages, is about interval estimation. Chapter 4, 133 pages, is about hypothesis testing. This “simple” structure makes the four pillars of mathematical statistics very clear to readers who first learn the topic. Within each chapter, just like in the book Probability, each concept is presented in detail and in multiple aspects. And when calculation is involved, enough middle steps are preserved so readers can easily follow the steps. One notable example is the presentation of the hypothesis testing. Not like many other textbooks that start with proven methods such as the Z-test, this book introduces the big picture first, and this big picture includes “a hunch”: it presents in the very beginning a clear outline of the 12 steps for hypothesis testing, starts with “a hunch, or theory, concerning a problem of interest,” then moves to the second step “translate the theory into a question concerning an unknown parameter theta,” then “state the null hypothesis of theta.” ... Then technical explanation of many of these steps is given in detail. The type 1 and type 2 errors are also presented right along with this 12-step outline. What is more, strange (i.e., idiosyncratic) forms of hypothesis testing are presented! It concerns three brothers, Chico, Harpo, and Groucho. Each of them comes up with their own testing statistics, e.g., x1þ x2, min (x1, x2), or max(x1, x2), where x1 and x2 are random samples of size 2 from a uniform distribution U(0, theta). Is theta 1⁄4 5, or theta 1⁄4 2? Is this an allusion to the three little pigs? 
Nonetheless, this is a hilarious example that not only teaches the technical details of hypothesis testing very effectively but also revives this “ancient” technique, reminding readers that, in using the proven hypothesis testing methods, we have actually made choices (i.e., each of the three brothers’ proposals has pros and cons in terms of the Type I and Type II errors).","PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"78 1 1","pages":"118 - 118"},"PeriodicalIF":2.5,"publicationDate":"2021-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85795417","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
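The three brothers' statistics mentioned in this review (x1 + x2, min(x1, x2), and max(x1, x2) for a sample of size 2 from U(0, theta)) invite a quick simulation. The rejection rules below are my own assumptions, calibrated by Monte Carlo to roughly 5% Type I error under theta = 2; they are a sketch of how one might compare the three proposals, not the book's exact setup.

```python
# Monte Carlo comparison of the "three brothers" test statistics from the review:
# T1 = x1 + x2, T2 = min(x1, x2), T3 = max(x1, x2), with x1, x2 ~ U(0, theta),
# testing H0: theta = 2 against H1: theta = 5. Critical values are set by
# simulation to give ~5% Type I error; the rejection rules are an assumption,
# not the book's exact setup.
import numpy as np

rng = np.random.default_rng(3)
stats = {
    "x1 + x2": lambda x: x.sum(axis=1),
    "min(x1, x2)": lambda x: x.min(axis=1),
    "max(x1, x2)": lambda x: x.max(axis=1),
}

n_sim = 200_000
x_null = rng.uniform(0, 2, size=(n_sim, 2))  # samples under H0: theta = 2
x_alt = rng.uniform(0, 5, size=(n_sim, 2))   # samples under H1: theta = 5

for name, T in stats.items():
    crit = np.quantile(T(x_null), 0.95)      # reject H0 when T > crit
    type1 = np.mean(T(x_null) > crit)        # ~0.05 by construction
    power = np.mean(T(x_alt) > crit)         # 1 - Type II error under theta = 5
    print(f"{name:12s} crit={crit:.3f} type I={type1:.3f} power={power:.3f}")
```

In this particular setup, the statistic based on max(x1, x2), which is sufficient for theta, attains the highest power; this is exactly the kind of trade-off among the brothers' proposals that the reviewer highlights.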
{"title":"Addendum to “Estimating pure-error from near replicates in design of experiments”","authors":"Caleb King, T. Bzik, Peter A. Parker, M. Wells, Benjamin R. Baer","doi":"10.1080/00224065.2021.2019569","DOIUrl":"https://doi.org/10.1080/00224065.2021.2019569","url":null,"abstract":"","PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"385 1","pages":"123 - 123"},"PeriodicalIF":2.5,"publicationDate":"2021-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76614565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Industrial Data Analytics for Diagnosis and Prognosis: A Random Effects Modeling Approach","authors":"Jing Li","doi":"10.1080/00224065.2021.2006583","DOIUrl":"https://doi.org/10.1080/00224065.2021.2006583","url":null,"abstract":"In Industrial Data Analytics for Diagnosis and Prognosis A Random Effects Modelling Approach, distinguished engineers Shiyu Zhou and Yong Chen deliver a rigorous and practical introduction to the random effects modeling approach for industrial system diagnosis and prognosis. In the book’s two parts, general statistical concepts and useful theory are described and explained, as are industrial diagnosis and prognosis methods. The accomplished authors describe and model fixed effects, random effects, and variation in univariate and multivariate datasets and cover the application of the random effects approach to diagnosis of variation sources in industrial processes. They offer a detailed performance comparison of different diagnosis methods before moving on to the application of the random effects approach to failure prognosis in industrial processes and systems.","PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"10 1","pages":"606 - 606"},"PeriodicalIF":2.5,"publicationDate":"2021-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89676795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Utilizing individual clear effects for intelligent factor allocations and design selections","authors":"Qi Zhou, William Li, Hongquan Xu","doi":"10.1080/00224065.2021.1991863","DOIUrl":"https://doi.org/10.1080/00224065.2021.1991863","url":null,"abstract":"Abstract Extensive studies have been conducted on how to select efficient designs with respect to a criterion. Most design criteria aim to capture the overall efficiency of the design across all columns. When prior information indicated that a small number of factors and their two-factor interactions (2fi’s) are likely to be more significant than other effects, commonly used minimum aberration designs may no longer be the best choice. Motivated by a real-life experiment, we propose a new class of regular fractional factorial designs that focus on estimating a subset of columns and their corresponding 2fi’s clear of other important effects. After introducing the concept of individual clear effects (iCE) to describe clear 2fi’s involving a specific factor, we define the clear effect pattern criterion to characterize the distribution of iCE’s over all columns. We then obtain a new class of designs that sequentially maximize the clear effect pattern. These newly constructed designs are often different from existing optimal designs. We develop a series of theoretical results that can be particularly useful for constructing designs with large run sizes, for which algorithmic construction becomes computationally challenging. We also provide some practical guidelines on how to choose appropriate designs with respect to different run size, the number of factors, and the number of 2fi’s that need to be clear.","PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"171 1","pages":"3 - 17"},"PeriodicalIF":2.5,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78446053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Self-starting process monitoring based on transfer learning","authors":"Zhijun Wang, Chunjie Wu, Miaomiao Yu, F. Tsung","doi":"10.1080/00224065.2021.1991251","DOIUrl":"https://doi.org/10.1080/00224065.2021.1991251","url":null,"abstract":"Abstract Conventional self-starting control schemes can perform poorly when monitoring processes with early shifts, being limited by the number of historical observations sampled. In real applications, pre-observed data sets from other production lines are always available, prompting us to propose a scheme that monitors the target process using historical data obtained from other sources. The methodology of self-taught clustering from unsupervised transfer learning is revised to transfer knowledge from previous observations and improve out-of-control (OC) performance, especially for processes with early shifts. However, if the difference in distribution between the target process and the pre-observed data set is large, our scheme may not be the best. Simulation results and two illustrative examples demonstrate the superiority of the proposed scheme.","PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"8 1","pages":"589 - 604"},"PeriodicalIF":2.5,"publicationDate":"2021-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82710881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Phase I analysis of high-dimensional processes in the presence of outliers","authors":"M. Ebadi, Shoja'eddin Chenouri, Stefan H. Steiner","doi":"10.1080/00224065.2023.2196034","DOIUrl":"https://doi.org/10.1080/00224065.2023.2196034","url":null,"abstract":"Abstract One of the significant challenges in monitoring the quality of products today is the high dimensionality of quality characteristics. In this paper, we address Phase I analysis of high-dimensional processes with individual observations when the available number of samples collected over time is limited. Using a new charting statistic, we propose a robust procedure for parameter estimation in Phase I. This robust procedure is efficient in parameter estimation in the presence of outliers or contamination in the data. A consistent estimator is proposed for parameter estimation and a finite sample correction coefficient is derived and evaluated through simulation. We assess the statistical performance of the proposed method in Phase I. This assessment is carried out in the absence and presence of outliers. We show that, in both cases, the proposed control chart scheme effectively detects various kinds of shifts in the process mean. Besides, we present two real-world examples to illustrate the applicability of our proposed method.","PeriodicalId":54769,"journal":{"name":"Journal of Quality Technology","volume":"35 1","pages":"469 - 488"},"PeriodicalIF":2.5,"publicationDate":"2021-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80554616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}