A simple classification approach to build a bathtub
Author: B. Haan
Published in: 2008 Annual Reliability and Maintainability Symposium
Publication date: 2008-01-28
DOI: 10.1109/RAMS.2008.4925796 (https://doi.org/10.1109/RAMS.2008.4925796)
Citations: 3
Abstract
The notional bathtub curve is often cited to describe how a device's failure rate may change with age. The bathtub curve, or any other undulating function that captures the reliability-centric phases of life, can be modeled with the mixed-Weibull distribution. Unfortunately, fitting failure data directly to the mixed-Weibull distribution typically requires assuming the number of subpopulations in advance and involves difficult computations that often call for complex algorithms. The fitting approach described in this paper performs the fit without assuming a set number of subpopulations and can be implemented in a basic spreadsheet. The paper begins with a brief examination of a common mixed-Weibull form. It is observed that the likelihood function of this form implicitly handles the data in aggregate; ironically, not as a mixture. This can be addressed with a modest adjustment, but at the cost of greatly increasing the number of parameters that must be considered to fit the distribution. Two separate derivations of the introduced approach are outlined. The first originates within an Artificial-Life framework used for constructing reliability models: processes within this framework are taken to a conceptual limit, and addressing the resulting computational-time issues yields the presented approach. Because the Artificial-Life framework is still largely unproven, a second derivation based on the well-established k-means clustering algorithm is provided as an alternative. Because k-means clustering is well known, its behavior predicts the behavior of the approach being introduced. The mechanics of the approach are outlined and detailed using sample data. One simple sample set demonstrates the mechanics, while a second, more contextually rich data set illustrates a more realistic application and the behavior of the approach. In each, individual reliability data are classified and subpopulations emerge, quickly yielding parameter estimates for a mixed-Weibull distribution. Performance characteristics are noted to be very similar to those of the k-means algorithm. Termination requires little iteration, so even very complex mixtures can be assessed quickly. As predicted by its k-means derivation, the approach is mildly chaotic, so multiple trials may yield better solutions. Fortunately, its speed and ease of implementation compensate for this shortcoming. Additionally, repeated application of the method to a set of data is shown to yield a discrete probabilistic estimate of the number of subpopulations contained within the dataset. The approach is found to be a convenient addition to the reliability analyst's toolbox.
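The abstract does not spell out the classification rule, so the following is only a rough sketch of what a k-means-style mixed-Weibull fit could look like, not the author's published procedure. It assumes hard assignment of each failure time to the subpopulation with the highest weighted Weibull density, and it refits each subpopulation by median rank regression, chosen here because it needs only logarithms and a straight-line fit, as a spreadsheet would. The names `fit_weibull`, `weibull_logpdf`, and `classify_fit`, and the parameters `k_max` (an upper bound on the number of subpopulations) and `min_size` (the smallest group that is still refit), are hypothetical.

```python
import math
import random


def fit_weibull(times):
    """Median rank regression on a Weibull plot; returns (shape, scale).

    Needs at least two distinct, strictly positive failure times.
    """
    ts = sorted(times)
    n = len(ts)
    xs = [math.log(t) for t in ts]
    # Bernard's approximation to the median rank of the i-th ordered failure.
    ys = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4)))
          for i in range(1, n + 1)]
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    shape = sxy / sxx                  # slope of y = shape*x - shape*ln(scale)
    scale = math.exp(mx - my / shape)  # recovered from the intercept
    return shape, scale


def weibull_logpdf(t, shape, scale):
    """Log of the two-parameter Weibull density at t > 0."""
    log_z = math.log(t / scale)
    power = shape * log_z
    if power > 700.0:  # exp() would overflow; the density is effectively zero
        return -math.inf
    return math.log(shape / scale) + (shape - 1.0) * log_z - math.exp(power)


def classify_fit(times, k_max=4, min_size=3, max_iter=100, seed=None):
    """Hard-classification fit; returns ([(weight, shape, scale), ...], labels).

    Assumption: groups smaller than min_size are dropped and their points
    are reassigned, so the number of subpopulations can shrink below k_max.
    """
    rng = random.Random(seed)
    labels = [rng.randrange(k_max) for _ in times]  # random initial classes
    comps = {}
    for _ in range(max_iter):
        # Update step: refit a Weibull to every group that is large enough.
        groups = {}
        for t, lab in zip(times, labels):
            groups.setdefault(lab, []).append(t)
        kept = {lab: g for lab, g in groups.items()
                if len(g) >= min_size and len(set(g)) > 1}
        if not kept:  # fall back to a single population
            kept = {0: list(times)}
        total = sum(len(g) for g in kept.values())
        comps = {lab: (len(g) / total, *fit_weibull(g))
                 for lab, g in kept.items()}
        # Assignment step: each time joins its most likely subpopulation.
        new_labels = [
            max(comps, key=lambda lab: math.log(comps[lab][0])
                + weibull_logpdf(t, comps[lab][1], comps[lab][2]))
            for t in times
        ]
        if new_labels == labels:  # classes stopped changing
            break
        labels = new_labels
    return list(comps.values()), labels
```

Because the starting classes are random, different seeds can end in different solutions, which matches the mildly chaotic behavior the abstract describes. Under the same assumptions, tallying `len(classify_fit(data, seed=s)[0])` over many seeds `s` gives a rough empirical distribution for the number of subpopulations in `data`.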