{"title":"Continuous state reliability analysis","authors":"Kai Yang, Jianan Xue","doi":"10.1109/RAMS.1996.500670","DOIUrl":"https://doi.org/10.1109/RAMS.1996.500670","url":null,"abstract":"In this paper, the authors extend binary state reliability analysis to continuous state reliability analysis. This extension enables the analysis of both catastrophic failure and performance degradation simultaneously. The modeling of degradation is based on an independent increment random process or a normal random process. Regression analysis is used to estimate degradation parameters. The state tree method is introduced to conduct system reliability analysis for both degradation and catastrophic failures. ANOVA and DOE techniques are used to assess the criticality of product parameters or components to performance degradation.","PeriodicalId":393833,"journal":{"name":"Proceedings of 1996 Annual Reliability and Maintainability Symposium","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120855066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Software Reliability, Availability, and Maintainability Engineering System (SOFT-RAMES)","authors":"B. Edson, B. Hansen, P. Larter","doi":"10.1109/RAMS.1996.500680","DOIUrl":"https://doi.org/10.1109/RAMS.1996.500680","url":null,"abstract":"The Software Reliability, Availability, and Maintainability Engineering System (SOFT-RAMES) has been developed for the Air Force Materiel Command Space Systems Support Group as a software reliability and maintainability engineering tool to aid in the management and implementation of a post deployment support process for mission computer software. Using failure, change, and source code data, it provides Pareto analyses, R&M trends and predictions, and checklists to identify problems and impacts of software changes. Initial results, and lessons learned are described, along with the capabilities of SOFT-RAMES. Important initial lessons learned are the need to calibrate metrics models to the application, a quantitative means to set metrics based design guidelines, and the usefulness of failure rate trends to investigate the effects of the underlying software change process on software reliability. SOFT-RAMES is implemented for the Defense Meteorological Satellite Program (DMSP) satellite operations centers mission software.","PeriodicalId":393833,"journal":{"name":"Proceedings of 1996 Annual Reliability and Maintainability Symposium","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115756067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reliability analysis using the ADEPT-REST interface","authors":"R. Rao, A. Rahmam, B.W. Johnson","doi":"10.1109/RAMS.1996.500645","DOIUrl":"https://doi.org/10.1109/RAMS.1996.500645","url":null,"abstract":"The Advanced Design Environment Prototype Tool (ADEPT) brings dependability analysis and trade-offs into the mainstream of the design process. ADEPT models are constructed using a collection of predefined library elements, called ADEPT modules. Each ADEPT module has an unambiguous mathematical definition in the form of a colored Petri net (CPN) and a corresponding VHSIC (very high speed integrated circuit) hardware description language (VHDL) description. One of the key features of ADEPT is that the designer need deal with only one model of the system, the ADEPT model, from which alternate representations, for performance and dependability analysis, are derived using provably correct transformations. The use of a single model eliminates the problem of inconsistency between the different models used to perform system-level analysis and trade-offs. The ADEPT toolset supports several simulation and analytical based approaches to dependability analysis of ADEPT models. The focus of this paper is on describing an approach to integrating the ADEPT-VHDL simulation model and the Reliability Estimation System Testbed (REST) engine in order to estimate system reliability from ADEPT models. This paper presents an overview of the ADEPT methodology, the ADEPT-REST interface, and examples which illustrate the capabilities of the methodology.","PeriodicalId":393833,"journal":{"name":"Proceedings of 1996 Annual Reliability and Maintainability Symposium","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131581510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proposed new DoD standard for product acceptance","authors":"D. Ermer, D. Kedzie","doi":"10.1109/RAMS.1996.500637","DOIUrl":"https://doi.org/10.1109/RAMS.1996.500637","url":null,"abstract":"Total quality and productivity improvement is a basic tenet in the DoD/Defense industries Quality Excellence Program. As part of this program, the DoD stated that Military and Federal Specifications which prescribe fixed levels of defects, such as acceptable quality levels and lot tolerance percent defectives, inhibit quality and productivity improvement and have been eliminated. In response to this mandate, the Technical Concepts Committee of the American Defense Preparedness Association (ADPA) has developed a new Standard (herein called STD-XXX) for product acceptance which emphasizes prevention versus detection. This paper describes STD-XXX, how it is intended to be used, and the underlying details of its development. This standard offers a practical approach for continuous improvement of the procurement process, and encourages industry innovation and flexibility to achieve the benefits of total quality. Furthermore, the new standard allows the transition from ineffective, inefficient, and costly sampling inspection to prevention by process control and improvement to be made in partnership with the Government.","PeriodicalId":393833,"journal":{"name":"Proceedings of 1996 Annual Reliability and Maintainability Symposium","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128473105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inferring coverage probabilities by optimum 3-stage sampling","authors":"C. Constantinescu","doi":"10.1109/RAMS.1996.500643","DOIUrl":"https://doi.org/10.1109/RAMS.1996.500643","url":null,"abstract":"Reliability assessment is an important step in the development of fault-tolerant computing systems. Availability, MTTF, and, in general, any reliability measure is determined by the system ability to handle faults and errors and the rate of occurrence of these events. A special parameter, the coverage probability, provides information about the effectiveness of the fault tolerance mechanisms embedded into the system. Practically, physical or simulated fault injection experiments are conducted for evaluating the coverage. Unfortunately, the extremely large number of events which can perturb the operation of a computing system makes exhaustive testing intractable. As a consequence, statistical inference has been employed to derive meaningful results after performing a relatively small number of fault injection experiments. This paper presents a new method for inferring the coverage probability by means of optimum 3-stage sampling. A three-dimensional space of events is considered. It is represented by the cross product of system inputs, times of injection, and fault locations. The fault injection consists of a pilot experiment followed by the main injection experiment. The sample size of the main experiment is chosen to minimize the cost of the fault injection for a fixed value of the variance. This approach is used for estimating the coverage probability of a hypothetical fault-tolerant system. Based on our experiments, we conclude that the optimum 3-stage sampling method is especially useful when a low variance of the coverage probability is required.","PeriodicalId":393833,"journal":{"name":"Proceedings of 1996 Annual Reliability and Maintainability Symposium","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131723919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fault tree analysis and binary decision diagrams","authors":"Roslyn M. Sinnamon, John Andrews","doi":"10.1109/RAMS.1996.500665","DOIUrl":"https://doi.org/10.1109/RAMS.1996.500665","url":null,"abstract":"Fault tree analysis is now commonly used to assess the adequacy, in reliability terms, of industrial systems. For complex systems, an analysis may produce thousands of combinations of events which can cause system failure (minimal cut sets). The determination of these minimal cut sets can be a very time consuming process even on modern high speed digital computers. Also, if the fault tree has many minimal cut sets, calculating the exact top event probability will require extensive calculations. For many complex fault trees this requirement is beyond the capability of the available machines, thus approximation techniques need to be introduced resulting in loss of accuracy. This paper describes the use of a binary decision diagram for fault tree analysis and some ways in which it can be efficiently implemented on a computer. The work to date shows a substantial improvement in computational effort for large, complex fault trees analysed with this method in comparison to the traditional approach. The binary decision diagram method has the additional advantage that as approximations are not required, exact calculations for the top event parameters can be performed.","PeriodicalId":393833,"journal":{"name":"Proceedings of 1996 Annual Reliability and Maintainability Symposium","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115625875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Designing for fault-tolerance in the commercial environment","authors":"M. Karyagina","doi":"10.1109/RAMS.1996.500671","DOIUrl":"https://doi.org/10.1109/RAMS.1996.500671","url":null,"abstract":"Fault-tolerant techniques have been successfully used for implementing highly reliable electronic systems. There is a range of fault-tolerant techniques to cater for any desired level of fault-tolerance. However, only cost-effective techniques can be used for commercial systems. The cost of implementing fault-tolerance can be estimated from the extra hardware required. The possible savings to the user can be estimated from the cost of \"prevented failures\". To estimate the cost of electronic failures to the user of computer numerically controlled (CNC) machines, maintenance records from several machine tools were analysed. The results of the study show that electronic failures constitute less than 10% of all failures of CNC machines. However, they top the list of average repair costs for different failure categories. Permanent electronic failures also have longest down-times and may result in substantial losses. Fault-tolerance techniques can be used to make machine controllers more reliable. The challenge is how to do it cost-effectively. Apart from the original investment, some fault-tolerant implementations increase maintenance expenses thus offsetting the benefits to the user. An example of the cost-benefit analysis for a double-redundant system shows how the major components of the life cycle cost change depending on the implementation.","PeriodicalId":393833,"journal":{"name":"Proceedings of 1996 Annual Reliability and Maintainability Symposium","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126126104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FMEA automation for the complete design process","authors":"T. A. Montgomery, D. Pugh, S.T. Leedham, S.R. Twitchett","doi":"10.1109/RAMS.1996.500638","DOIUrl":"https://doi.org/10.1109/RAMS.1996.500638","url":null,"abstract":"Performing an FMEA during the design stage is a valuable technique for improving the reliability of a product. Unfortunately, the traditional brainstorming approach is also very tedious, time consuming, and error prone. Automating the process promises the generation of a more complete, consistent FMEA worksheet in a fraction of the time currently required. However, to be truly valuable, this automation must follow the product through the entire design cycle at each level of design: architecture; subsystem; and component. This paper presents an FMEA automation approach that spans the entire design cycle for electrical/electronic circuits. Brainstorming is replaced by computer simulation of failure modes and their effects. Qualitative simulation is used in the early (architectural) stages when design detail is not available. As the design progresses, the qualitative simulation gives way to quantitative simulation. Throughout, the information required to perform the FMEA is gleaned from that used to understand the nominal behavior of the circuit; thus the relief from brainstorming is not offset by a new modeling burden. Sample results from software supporting this approach are presented.","PeriodicalId":393833,"journal":{"name":"Proceedings of 1996 Annual Reliability and Maintainability Symposium","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125166587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Environmental stress screening for a massively parallel vision computer","authors":"A. Kostić, R. Wallace","doi":"10.1109/RAMS.1996.500659","DOIUrl":"https://doi.org/10.1109/RAMS.1996.500659","url":null,"abstract":"AISI began experiencing a severe field reliability problem with their computers. There was a single point source of failure in the systems in which they were incorporated. An issue with programmable array logic (PAL) had driven the customer return rate to approximately 45% and caused severe production problems for the ultimate users of the computers. The legacy screening process used by AISI was ineffective at screening out the problems. The limited amount of failure analysis performed was inconclusive at identifying root cause of the failures. A screen was developed based on generic information on technology failure mechanism and circumstantial evidence gathered by AISI. The resultant screening used both temperature and voltage stress. Combined with part level screening and change of suppliers the customer return rate was reduced to 1%. Further improvements for part level screening were developed using Iddq as a parametric screen. The board-level screening program required a capital investment of only $50,000. Part screening increased the price of the parts by an additional 10%.","PeriodicalId":393833,"journal":{"name":"Proceedings of 1996 Annual Reliability and Maintainability Symposium","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128418571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Planning and optimizing environmental stress screening","authors":"Y. Mok, M. Xie","doi":"10.1109/RAMS.1996.500662","DOIUrl":"https://doi.org/10.1109/RAMS.1996.500662","url":null,"abstract":"Environmental stress screening (ESS) is widely used in the electronics industries as a means to remove early failures. It is a process that calls for proper planning as inadequate duration is ineffective while prolonged screening can incur unnecessary cost. This note describes an approach utilizing mathematical programming to ensure that the right amount of screening is in place at each assembly level. The factors considered include the screening cost and desired operational reliability.","PeriodicalId":393833,"journal":{"name":"Proceedings of 1996 Annual Reliability and Maintainability Symposium","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127043764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}