{"title":"Method for designing and placing check sets based on control flow analysis of programs","authors":"S. Geoghegan, D. Avresky","doi":"10.1109/ISSRE.1996.558838","DOIUrl":"https://doi.org/10.1109/ISSRE.1996.558838","url":null,"abstract":"Proposes a formal approach for adding fault detection to software. An assertion-based formalism is used to represent algorithm specifications. This representation is then used to generate a flowgraph or decision-to-decision graph (ddgraph), which is used to construct an execution path tree. The information gained from this algorithm representation is used to aid in the design of software-based fault tolerance techniques. Algorithm-based fault tolerance (ABFT) techniques are used to detect data structure-corrupting faults and checks are added to detect program flow errors. Flowgraph and ddgraph representations provide information to predict future program flow from the current flow. During execution, the current program location is recorded, along with the expected flow. Checks are placed to verify that the program flow follows the predicted flow. Fault coverage has been estimated through experiments with SOFIT (SOftware-based Fault Injection Tool), and the data is presented to demonstrate the effectiveness of the method.","PeriodicalId":441362,"journal":{"name":"Proceedings of ISSRE '96: 7th International Symposium on Software Reliability Engineering","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128579702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DQS's experience with SRE","authors":"W. Everett, J. Gobat","doi":"10.1109/ISSRE.1996.558820","DOIUrl":"https://doi.org/10.1109/ISSRE.1996.558820","url":null,"abstract":"This is an experience report on the application of software reliability engineering (SRE) in developing the Data Quality System (DQS). DQS is a knowledge-based software system developed by Lucent Technologies. It was developed to synchronize data spread among several disparate databases. The first application of DQS was to telecommunications network databases. Because the development of DQS was a large custom software development project, SRE was employed to assist in measuring product quality and readiness for production use. In this report, we first describe what we wanted to accomplish in applying SRE methods in the DQS project. The AT&T best current practice outlined the prerequisites for applying SRE during testing. We describe how we addressed these prerequisites in the DQS project. In particular, we discuss specifics on: what assumptions were made regarding an operational profile; how test cases were selected to conform to operational profile usage; and how test log forms were designed and used to collect failure data and to manage the test effort itself. Next, we share the results of reliability growth modeling during the system testing and field trials. Finally, we highlight what we learned in our initial implementation of SRE and what plans were recommended for subsequent implementations.","PeriodicalId":441362,"journal":{"name":"Proceedings of ISSRE '96: 7th International Symposium on Software Reliability Engineering","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121582262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An on-line algorithm for checkpoint placement","authors":"A. Ziv, Jehoshua Bruck","doi":"10.1109/ISSRE.1996.558869","DOIUrl":"https://doi.org/10.1109/ISSRE.1996.558869","url":null,"abstract":"Checkpointing is a common technique for reducing the time to recover from faults in computer systems. By saving intermediate states of programs in a reliable storage device, checkpointing enables one to reduce the processing time loss caused by faults. The length of the intervals between the checkpoints affects the execution time of the programs. Long intervals lead to a long re-processing time, while too-frequent checkpointing leads to a high checkpointing overhead. In this paper, we present an online algorithm for the placement of checkpoints. The algorithm uses online knowledge of the current cost of a checkpoint when it decides whether or not to place a checkpoint. We show how the execution time of a program using this algorithm can be analyzed. The total overhead of the execution time when the proposed algorithm is used is smaller than the overhead when fixed intervals are used. Although the proposed algorithm uses only online knowledge about the cost of checkpointing, its behavior is close to that of the off-line optimal algorithm that uses the complete knowledge of the checkpointing cost.","PeriodicalId":441362,"journal":{"name":"Proceedings of ISSRE '96: 7th International Symposium on Software Reliability Engineering","volume":"107 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129099982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reliability of a commercial telecommunications system","authors":"M. Kaâniche, K. Kanoun","doi":"10.1109/ISSRE.1996.558807","DOIUrl":"https://doi.org/10.1109/ISSRE.1996.558807","url":null,"abstract":"Analyzes data collected on a commercial telecommunications system and summarizes some of the lessons learned from this study. The data correspond to failure and fault information recorded during system validation and operation: 3,063 trouble reports, corresponding to a five-year period during which five versions of the system have been developed and more than 100 systems have been introduced in the field. The failure information includes software failures as well as hardware failures due to design faults.","PeriodicalId":441362,"journal":{"name":"Proceedings of ISSRE '96: 7th International Symposium on Software Reliability Engineering","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130027562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unification of finite failure non-homogeneous Poisson process models through test coverage","authors":"S. Gokhale, T. Philip, P. Marinos, Kishor S. Trivedi","doi":"10.1109/ISSRE.1996.558886","DOIUrl":"https://doi.org/10.1109/ISSRE.1996.558886","url":null,"abstract":"A number of analytical software reliability models have been proposed for estimating the reliability growth of a software product. We present an Enhanced Non-Homogeneous Poisson Process (ENHPP) model and show that previously reported Non-Homogeneous Poisson Process (NHPP) based models, with bounded mean value functions, are special cases of the ENHPP model. The ENHPP model differs from previous models in that it incorporates explicitly the time varying test coverage function in its analytical formulation, and provides for defective fault detection and test coverage during the testing and operational phases. The ENHPP model is validated using several available failure data sets.","PeriodicalId":441362,"journal":{"name":"Proceedings of ISSRE '96: 7th International Symposium on Software Reliability Engineering","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127943481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data partition based reliability modeling","authors":"J. Tian, Joe Palma","doi":"10.1109/ISSRE.1996.558895","DOIUrl":"https://doi.org/10.1109/ISSRE.1996.558895","url":null,"abstract":"The paper presents an approach to software reliability modeling using data partitions derived from tree based models. We use these data sensitive partitions to group data into clusters with similar failure intensities. The series of data clusters associated with different time segments forms a piecewise linear model for the assessment and short term prediction of reliability. Long term prediction can be provided by the dual model that uses these grouped data as input fitted to some failure count variations of the traditional software reliability growth models. These partition based reliability models can be used effectively to measure and predict the reliability of software systems and can be readily integrated into our strategy of reliability assessment and improvement using tree based modeling.","PeriodicalId":441362,"journal":{"name":"Proceedings of ISSRE '96: 7th International Symposium on Software Reliability Engineering","volume":"47 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115941551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Supervision of real-time software systems using optimistic path prediction and rollbacks","authors":"D. Simser, R. Seviora","doi":"10.1109/ISSRE.1996.558892","DOIUrl":"https://doi.org/10.1109/ISSRE.1996.558892","url":null,"abstract":"Real time supervision is a technique for automatically detecting and reporting failures in the external behaviour of real time software systems. Failure detection is achieved by monitoring the target system's external inputs and outputs, in a 'black box' manner and comparing its behaviour with the formally specified behaviour of the system. The paper presents the Optimistic Path Prediction and Rollbacks (OPPR) approach to real time supervision. In this technique, the supervisor predicts a single likely behaviour of the target system and, if the observed behaviour does not match the prediction, rolls back and creates a new prediction of the legal behaviour. A failure is detected when the supervisor has explored all valid behaviours without matching the observed behaviour. The paper opens by introducing the field of real time supervision and examining existing techniques. The core of the paper presents the basic algorithm of the OPPR method, with an example to illustrate its operation. The paper closes by describing an evaluation system, summarizing the experimental results and examining the performance of the OPPR scheme.","PeriodicalId":441362,"journal":{"name":"Proceedings of ISSRE '96: 7th International Symposium on Software Reliability Engineering","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116450179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integration testing using interface mutation","authors":"M. Delamaro, J. Maldonado, A. Mathur","doi":"10.1109/ISSRE.1996.558719","DOIUrl":"https://doi.org/10.1109/ISSRE.1996.558719","url":null,"abstract":"A criterion for assessing the adequacy of test sets during integration testing is proposed. The criterion is based on a testing technique named Interface Mutation. The technique itself is designed to be scalable with the size of the software under test; the size being measured in the number of subsystems integrated. Using Interface Mutation it is possible to assess the adequacy of tests incrementally while integrating various subsystems. Also reported are results from a pilot experiment conducted to study the cost and error detection effectiveness of Interface Mutation.","PeriodicalId":441362,"journal":{"name":"Proceedings of ISSRE '96: 7th International Symposium on Software Reliability Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129491027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Object state testing and fault analysis for reliable software systems","authors":"D. Kung, Y. Lu, N. Venugopalan, P. Hsia, Y. Toyoshima, Cris Chen, J. Gao","doi":"10.1109/ISSRE.1996.558704","DOIUrl":"https://doi.org/10.1109/ISSRE.1996.558704","url":null,"abstract":"Object state behavior implies that the effect of an operation on an object may depend on the states of the object and other objects. It may cause state changes to more than one object. Thus, the combined or composite effects of the object operations must be analyzed and tested. We show that certain object state behavior errors cannot be detected readily by conventional testing methods. We describe an object state test method consisting of an object state model, a reverse engineering tool, and a composite object state testing tool. The object state test model is an aggregation of hierarchical, concurrent, communicating state machines envisioned mainly for object state testing. The reverse engineering tool produces an object state model from any C++ program. The composite object state testing tool analyzes the object state behaviors and generates test cases for testing object state interactions. We show the detection of several composite object state behavior errors that exist in a well-known thermostat example.","PeriodicalId":441362,"journal":{"name":"Proceedings of ISSRE '96: 7th International Symposium on Software Reliability Engineering","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114063170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Error injection aimed at fault removal in fault tolerance mechanisms-criteria for error selection using field data on software faults","authors":"J. Christmansson, P. Santhanam","doi":"10.1109/ISSRE.1996.558785","DOIUrl":"https://doi.org/10.1109/ISSRE.1996.558785","url":null,"abstract":"Fault injection allows a detailed study of complex interactions between faults and fault handling mechanisms. It can be a useful complement to analytical modeling and formal verification techniques in the testing of fault tolerant systems. However, work on fault injection has not matured adequately to provide industry with cost effective alternatives for the validation of fault tolerant systems. This study analyzes 408 customer discovered faults (defects) in a release of a large operating system product. We discuss methods to select the error types for an error injection experiment in the system test environment, aimed at fault removal. Using four levels of severity and a total of 24 error types as recorded in the customer defects records, we analyze the faults in terms of fault types and system test triggers as defined in ODC. Our work shows examples of criteria that can be used to select errors for injection that use the information from the field reported defects.","PeriodicalId":441362,"journal":{"name":"Proceedings of ISSRE '96: 7th International Symposium on Software Reliability Engineering","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127810399","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}